CS302 Operating System Week4 Note

Lecture 3: Process I

What is a process?

  • A process is a program in execution
  • It holds all the accounting information of that running program, e.g.
    • Current program counter
    • Accumulated running time
    • The list of files that are currently opened by that program
    • The page table
  • Important concept: process control block (PCB)

What is a process

$ ls | cat | cat
[ctrl + C]
$
  • The command involves three processes
  • It will stop early if I send a signal to interrupt it
  • Its progress is determined by the scheduler
  • The three processes cooperate to give useful output

What are those two "cats"?

  • Two different processes running the same code, "/bin/cat"

Our Roadmap

  1. How to distinguish the two cats?
  2. Who creates the processes, and how?
  3. Which should run first?
  4. What are those pipes?
  5. What if "ls" is feeding data too fast? Will "cat" feel full and die?

Process identification

  • How can we tell one process from another?
    • Each process is given a unique ID number, called the process ID, or PID
    • The system call getpid() returns the PID of the calling process.

Process creation

  • To create a process, we use the system call fork().

Process creation - fork() system call

  • So, how do fork() and the processes behave?
  • What do we know?
    • Both the parent and the child execute the same program
    • The child process starts its execution at the point where fork() returns, not from the beginning of the program

Let there be only ONE CPU

  • Only one process is allowed to be executed at one time
  • However, we can't predict which process will be chosen by the OS
  • That is controlled by the OS's scheduler

IMPORTANT: For the child, the return value of fork() is zero

For how fork() manages to appear to return two values, see this article.

fork() behaves like "cell division"

  • It creates the child process by cloning from the parent process, including all user-space data e.g.
Cloned items | Descriptions
Program counter [CPU register] | That's why they both execute from the same line of code after fork() returns
Program code [file & memory] | They share the same piece of code
Memory | Including local variables, global variables, and dynamically allocated memory
Opened files [kernel's internal] | If the parent has opened a file "A", then the child will also have file "A" opened automatically
  • However, fork() does not clone the following
    • Note: the PCB is in kernel space
Distinct items | Parent | Child
Return value of fork() | PID of the child process | 0
PID | Unchanged | Different, not necessarily "parent PID + 1"
Parent process | Unchanged | The process that called fork()
Running time | Accumulated | Just created, so it should be 0
[Advanced] File locks | Unchanged | None

fork() can only duplicate

  • If a process can only duplicate itself and always runs the same program, it is not very useful
    • How can we execute other programs?
  • We want CHANGE!
    • Meet the exec*() system call family

exec

  • execl() - a member of the exec system call family (the family has 6 members)
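#include <stdio.h>
#include <unistd.h>

int main(void) {
    printf("before execl ...\n");
    /* On success, execl() never returns: this process now runs /bin/ls. */
    execl("/bin/ls", "/bin/ls", (char *)NULL);
    /* Reached only if execl() fails. */
    printf("after execl ...\n");
    return 0;
}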

Arguments of the execl() call

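  • 1st argument: the path of the program to execute, “/bin/ls” in the example.
  • 2nd argument: argument[0] to the program (by convention, the program's name).
  • 3rd argument onward: argument[1], argument[2], ... to the program. The list must end with NULL. In the example the 3rd argument is NULL, so "ls" receives no further arguments.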

exec

  • The process changes the code it is executing and never returns to the original code (unless the call fails)
    • The last two lines of code (the second printf() and the return) are therefore not executed
  • A process that calls an exec*() system call has its user-space info replaced, e.g.,
    • Program code
    • Memory: local variables, global variables, and dynamically allocated memory
    • Register values: e.g., the program counter
  • But, the kernel-space info of that process is preserved, including:
    • PID
    • Process relationship
    • etc.

When fork() meets exec*()

  • To implement the core part of a shell
  • To implement the C library call system()

fork() + exec*() = system()

  • It is very odd to allow different execution orders
  • How can we let the child execute first?
    • But... we can't control the OS scheduler
  • Then, our problem becomes...
    • How to suspend the execution of the parent process?
    • How to wake the parent up after the child is terminated?

fork() + exec*() + wait() = system()

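#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int system_ver_CS302(const char *cmd_str) {
    pid_t pid;

    if (cmd_str == NULL)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: replace itself with a shell that runs the command. */
        execl("/bin/sh", "/bin/sh", "-c", cmd_str, (char *)NULL);
        /* Reached only if execl() fails. */
        fprintf(stderr, "%s: command not found\n", cmd_str);
        exit(-1);
    }
    /* Parent: sleep until the child terminates. */
    wait(NULL);
    return 0;
}

int main(void) {
    printf("before...\n\n");
    system_ver_CS302("/bin/ls");
    printf("\nafter...\n");
    return 0;
}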

Process Life Cycle (user-space)

wait() - user-space

wait() system call

  • Suspends the calling process (puts it in the waiting state) and returns (wakes up) when
    • one of its child processes changes from running to terminated
    • or a signal is received (will be covered later)
  • Returns immediately (i.e., does nothing) if
    • it has no children
    • or a child has already terminated before the parent calls wait()

wait() vs waitpid()

wait()

  • Waits for any one of the children
  • Detects child termination only

waitpid()

  • Depending on the parameters, waitpid() waits for a particular child only
  • Depending on the parameters, waitpid() can also detect other changes in a child's status (e.g., a child being stopped or resumed), not only termination

Summary

  • A new process is created by fork()
    • Who is the first process?
  • A process is a program brought into memory by exec
    • It has a state (initial state = ready)
    • It waits for the OS to schedule it onto the CPU
  • Can a process execute more than one program?
    • Yes, by repeatedly calling members of the exec system call family
  • You now know how the C library call system() is implemented with the system calls fork(), exec*(), and wait()

exec*() – arguments explained

  • Environment variables
    • A set of strings maintained by the shell.
#include <stdio.h>

/* envp is a NULL-terminated array of "NAME=value" strings. */
int main(int argc, char **argv, char **envp) {
    int i;
    for (i = 0; envp[i]; i++)
        printf("%s\n", envp[i]);
    return 0;
}

The “**envp” variable is an array of strings, terminated by a NULL pointer

A string is an array of characters

  • Quite a number of programs read and make use of environment variables, for example:
Variable name | Description
SHELL | The path to the shell that you're using
PWD | The full path to the directory that you’re currently in
HOME | The full path to your home directory
USER | Your login name
EDITOR | Your default text editor
PRINTER | Your default printer