COMP 3511: Lecture 7

Date: 2024-09-24 14:35:41

Reviewed:

Topic / Chapter:

summary

❓Questions

Notes

Operation on Processes (cont.)
  • Process creation

    • fork()

      • parent & child: both can be added to the ready queue and run concurrently
        • behavior: cannot be determined from the code alone
      • parent & child: both resume execution after fork w/ the same PC
        • i.e. the return of fork() call
      • after
      • return values of fork()
        • each process: receives exactly one return value
        • -1: unsuccessful (to parent)
        • 0: successful (to child)
        • >0: successful (to parent)
          • returned value: child's PID
          • 👨‍🏫 losing track of PID / etc.: not discussed here
        • 👨‍🏫 this value: used to distinguish post-fork behavior of process
      • almost everything of the parent gets copied
        • memory / file descriptors / etc.
        • i.e. copying everything: very costly & time consuming
          • parent can't access child's cloned memory
          • nor can the child access the parent's original memory
    • UNIX fork

      • create & initialize process control block (PCB) in kernel
      • create new address space / allocate memory
      • initialize address space w/ a copy of the parent's entire contents
        • time consuming!
      • inherit the execution context of the parent (e.g. open files)
        • i.e. stack and other state information
      • inform the CPU scheduler: child process is ready to run
    • parent & child comparison

      • after fork()
      Duplicated             | Different
      address space          | PID
      global / local vars    | fork() return value
      current working dir    | running time
      root dir               | running state
      process resources      |
      resource limits        |
      program counter        |
      ...                    |
      example
      #include <stdlib.h>
      #include <stdio.h>
      #include <string.h>
      #include <unistd.h>
      #include <sys/types.h>
      
      #define BUFSIZE 1024
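      int main(int argc, char *argv[]) {
          char buf[BUFSIZE];               // unused in this excerpt
          size_t readlen, writelen, slen;  // unused in this excerpt
          pid_t cpid, mypid;
          pid_t pid = getpid(); // PID of the original (parent) process
          printf("Parent pid: %d\n", (int)pid);
          fflush(stdout); // avoid the child re-printing buffered output
          cpid = fork(); // branch: both processes continue from here
          if (cpid > 0) {
              mypid = getpid(); // parent: fork() returned the child's PID
              printf("[%d] parent of [%d]\n", (int)mypid, (int)cpid);
          } else if (cpid == 0) {
              mypid = getpid(); // child: fork() returned 0
              printf("[%d] child\n", (int)mypid);
          } else {
              perror("Fork failed"); // fork() returned -1
              exit(1);
          }
          return 0;
      }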
      
      • note that
        • the order of the two lines printed after fork() (i.e. inside the if / else branches) is not determined
          • it depends on the CPU scheduler
  • system calls

    • exec(), execlp(): syscall to change program in current process

      • creates a new process image from: regular executable file
      • 👨‍🎓 ≈ jump to another program?
        • ⭐ on success: no return to the original program!
    • wait(): syscall to wait for child process to finish

      • or, more generally: wait for an event
      • 👨‍🏫 caller enters the waiting state & gives up the CPU
        • 👨‍🎓 if the parent kept occupying the CPU: a single-core machine could never run the child, so a fork followed by wait() would never finish
        • ⭐ very very important!
    • exit(): syscall to terminate current process

        • free all resources
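    • signal(): syscall to send a notification to another process
      • POSIX naming: kill() sends the signal; signal() / sigaction() install the handler that receives it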

    • implementing a shell

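      // sketch: readAndParseCmdLine() is a helper assumed to be defined elsewhere
      // needs <stdio.h>, <unistd.h>, <sys/types.h>, <sys/wait.h>
      char *prog, **args;
      pid_t child_pid;
      
      while (readAndParseCmdLine(&prog, &args)) {
          child_pid = fork();
          if (child_pid == 0) {
              execvp(prog, args); // run command in child (one member of the exec family)
              // reached only if exec fails
              _exit(127);
          } else if (child_pid > 0) {
              waitpid(child_pid, NULL, 0); // shell waits for the command to finish
          } else {
              perror("fork");
          }
      }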
      
    • fork tracing

      • to trace forks in loops, try to expand the loop
        • e.g. for (int i=0; i < 10; ++i) fork(); into fork(); fork(); fork();...
      fork diagrams (credit: IA Peter)
---
    title: fork()
    ---
    flowchart LR
        p0((p0)) --> f1{fork 1}
        f1 --> p0_2((p0))
        f1 --0--> p1((p1))
---
    title: fork(); fork()
    ---
    flowchart LR
        p0((p0)) --> f1{fork 1}
        f1 --> p0_2((p0))
        f1 --0--> p1((p1))
        p0_2 --> f2{fork 2}
        f2 --> p0_3((p0))
        f2 --0--> p2((p2))
        p1 --> f3{fork 3}
        f3 --> p1_2((p1))
        f3 --0--> p3((p3))
---
    title: fork()&&fork()
    ---
    flowchart LR
        p0((p0)) --> f1{fork 1}
        f1 --> p0_2((p0))
        f1 --0--> p1((p1))
        p0_2 --> f2{fork 2}
        f2 --> p0_3((p0))
        f2 --0--> p2((p2))
---
    title: fork()||fork()
    ---
    flowchart LR
        p0((p0)) --> f1{fork 1}
        f1 --> p0_2((p0))
        f1 --0--> p1((p1))
        p1 --> f2{fork 2}
        f2 --> p1_2((p1))
        f2 --0--> p2((p2))
</details>
  • Final notes on fork

    • process creation in Unix: unique (no pun)
      • most OSes: create a process in a new address space, read in an executable file, and execute it
      • Unix: separates this into fork() and exec()
    • Linux: fork() is implemented via copy-on-write
      • usually: we don't need the entire copy
        • thus we can delay / avoid copying the data
          • until the content is changed / written to
          • improves speed a lot!
      • child process: initially points to the parent process's pages (shared until one side writes)
    • Linux also implements fork() via clone() (more general)
      • clone(): takes a set of flags that specify which resources are shared between parent & child
  • Process termination

    • termination
      • after process executes the last statement
        • exit syscall: used to ask OS to delete it
    • some OSes: do not allow a child to exist if its parent has terminated
      • i.e. all children are terminated when the parent terminates
        • cascading termination, initiated by the OS
    • process termination:
      • deallocation: must involve the OS
      • e.g. kernel data, etc.: cannot be accessed / modified by the user application
    • concepts
      • zombie process: process has terminated, but its parent has not called wait() yet
        • what remains: its entry in the process table, i.e. PID and PCB
        • 👨‍🏫 every process enters this stage, at least for a moment after termination
          • nothing wrong. It's just "we are almost over"
        • but zombies can accumulate, and it was a problem back then
          • because the memory restriction was very tight ≈ 30 years ago
        • once the parent calls wait(): PID of the zombie process and its other entries are released
        • such design: lets the parent tell the OS that it has acknowledged the child's termination
          • 👨‍🏫 not the best design, nor the only one, but the design of Unix
      • if parent terminates without invoking wait()
        • child becomes an orphan
        • without cascading, the process might still be runnable
        • or become a zombie
          • which will never be released, as no parent exists
        • thus: all processes (except root or so): must have a parent
          • (or kill them all using cascading)
          • 👨‍🎓 can we assign
      • 👨‍🏫 this is the design chosen by UNIX
        • 👨‍🎓 can't we ensure the child calls the OS and takes care of itself?
          • maybe we can, but this is the choice / trade-off made by Unix
        • 👨‍🏫 & 👨‍🎓 parent has the ability to track all its children, but it is not required.