Lecture 3: Process I
What is a process?
- A process is a program in execution
- It contains all the accounting information of that running program, e.g.
- Current program counter
- Accumulated running time
- The list of files that are currently opened by that program
- The page table
- Important concept: Process control block
What is a process
$ ls | cat | cat
- The command involves three processes
- It will stop early if I send a signal to interrupt it
- Its progress is determined by the scheduler
- The three processes cooperate to give useful output
What are those two "cats"?
- 2 different processes using the same code "/bin/cat"
Our Roadmap
- How to distinguish the two cats?
- Who creates the processes, and how?
- Which should run first?
- What are those pipes?
- What if "ls" is feeding data too fast? Will the "cat" feel full and die?
Process identification
- How can we identify processes from one to another?
- Each process is given a unique ID number, called the process ID, or PID
- The system call getpid() returns the PID of the calling process.
Process creation
- To create a process, we use the system call fork().
Process creation - fork() system call
- So, how do fork() and the processes behave?
- What do we know?
- Both the parent and the child execute the same program
- The child process starts its execution at the point where fork() returns, not from the beginning of the program
Let there be only ONE CPU
- Only one process is allowed to be executed at one time
- However, we can't predict which process will be chosen by the OS
- That is controlled by the OS's scheduler
IMPORTANT: For the child, the return value of fork() is zero
For details on how fork() appears to return two values, see this article
fork() behaves like "cell division"
- It creates the child process by cloning the parent process, including all user-space data, e.g.
Cloned items | Descriptions |
---|---|
Program counter [CPU register] | That's why both processes continue from the same line of code after fork() returns |
Program code [file & memory] | Both processes share the same piece of code. |
Memory | Including local variables, global variables, and dynamically allocated memory |
Opened files [kernel's internal] | If the parent has opened a file "A", then the child also has file "A" open automatically. |
- However
- fork() does not clone the following
- Note: PCB is in the kernel space
Distinct items | Parent | Child |
---|---|---|
Return value of fork() | PID of the child process | 0 |
PID | Unchanged | Different; not necessarily "Parent PID + 1" |
Parent process | Unchanged | The process that called fork() |
Running time | Accumulated | Just created, so it starts at 0 |
[Advanced] File locks | Unchanged | None |
fork() can only duplicate
- If a process can only duplicate itself and always runs the same program, it's not quite meaningful
- How can we execute other programs?
- We want CHANGE!
- Meet the exec*() system call family
exec
- execl() is a member of the exec system call family (the family has 6 members)
int main(void) {
Arguments of the execl() call
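- 1st argument: the path of the program to run, “/bin/ls” in the example.
- 2nd argument: argument[0] to the program (by convention, the program's name).
- 3rd argument: argument[1] to the program.
- The argument list must end with a NULL pointer.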
exec
- The process changes the code it is executing and, if the call succeeds, never returns to the original code
- The last two lines of code are therefore not executed
- A process that calls an exec*() system call has its user-space info replaced, e.g.,
- Program Code
- Memory: local variables, global variables, and dynamically allocated memory
- Register values, e.g., the program counter
- But, the kernel-space info of that process is preserved, including:
- PID
- Process relationship
- etc.
When fork() meets exec*()
- To implement the core part of a shell
- To implement the C library call system()
fork() + exec*() = system()
- It is very odd to allow different execution orders
- How do we let the child execute first?
- But... we can't control the OS scheduler
- Then, our problem becomes...
- How to suspend the execution of the parent process?
- How to wake the parent up after the child is terminated?
fork() + exec*() + wait() = system()
int system_ver_CS302(const char *cmd_str) {
Process Life Cycle (user-space)
wait() - user-space
wait() system call
- Suspends the calling process (moves it to the waiting state) and returns (wakes up) when
- one of its child processes changes from running to terminated
- or a signal is received (covered later)
- Returns immediately (i.e., does nothing) if
- it has no children
- or a child has already terminated before the parent calls wait()
wait() vs waitpid()
wait()
- Waits for any one of the children
- Detects child termination only
waitpid()
- depending on the parameters, waitpid() will wait for a particular child only
- Depending on the parameters, waitpid() can also detect other changes in a child's status (e.g., stopped or resumed)
Summary
- A new process is created by fork()
- Who is the first process?
- A process is a program brought into memory by exec
- It has a state (initial state = ready)
- waiting for the OS to schedule the CPU to run it
- Can a process execute more than one program?
- Yes, by repeatedly calling members of the exec system call family
- You now know how the system() C library call is implemented with the syscalls fork(), exec*(), and wait()
exec*() – arguments explained
- Environment variables
- A set of strings maintained by the shell.
int main(int argc, char **argv, char **envp) {
The “**envp” variable is an array of strings
A string is an array of characters
- Quite a number of programs read and make use of environment variables.
Variable name | Description |
---|---|
SHELL | The path to the shell that you're using |
PWD | The full path to the directory that you’re currently in. |
HOME | The full path to your home directory |
USER | Your login name. |
EDITOR | Your default text editor. |
PRINTER | Your default printer |