CS110 Lecture 7: waitpid and execvp

CS110: Principles of Computer Systems

Winter 2021-2022

Stanford University

Instructors: Nick Troccoli and Jerry Cain

Illustration courtesy of Roz Cyrus.

CS110 Topic 2: How can our program create and interact with other programs?

Learning About Processes

Creating processes and running other programs

Inter-process communication

Signals

Race Conditions

This lecture

Lecture 8/9

Lecture 10/11

Lecture 11

assign3: implement multiprocessing programs like "trace" (to trace another program's behavior) and "farm" (parallelize tasks)

assign4: implement your own shell!

Learning Goals

Get more practice with using fork() to create new processes
Understand how to use waitpid() to coordinate between processes
Learn how execvp() lets us execute another program within a process
Goal: write our first implementation of a shell!

first-shell-soln.c

Lecture Plan

Recap: fork()
waitpid() and waiting for child processes
Demo: Waiting For Children
execvp()
Putting it all together: first-shell
Background execution

fork()

A system call that creates a new child process
The "parent" is the process that creates the other "child" process
From then on, both processes are running the code after the fork
fork() is called once, but returns twice: once in the parent and once in the child
The child process is identical to the parent, except:
- it has a new Process ID (PID)
- for the parent, fork() returns the PID of the child; for the child, fork() returns 0

pid_t pidOrZero = fork();
// both parent and child run code here onwards
printf("This is printed by two processes.\n");

Virtual Memory and Copy on Write

How can the parent and child use the same address to store different data?

Each program thinks it is given all memory addresses to use
The operating system maps these virtual addresses to physical addresses
When a process forks, the child gets a copy of the parent's virtual address space, so the same virtual addresses are valid in both processes
The operating system lazily maps the child's virtual addresses to different physical addresses than the parent's
Parent and child share physical memory until one of them writes to it; only then does the OS give the writer its own copy of the affected memory
This is called copy on write (copies are made only when memory is written to).

It would be nice if there were a function we could call that would "stall" our program until the child is finished.

waitpid()

A function that a parent can call to wait for its child to exit:

pid_t waitpid(pid_t pid, int *status, int options);

pid: the PID of the child to wait on (we'll see other options later)
status: where to put info about the child's termination (or NULL)
options: optional flags to customize behavior (always 0 for now)
the function returns when the specified child process exits
the return value is the PID of the child that exited, or -1 on error (e.g. no child to wait on)
If the child process has already exited, this returns immediately - otherwise, it blocks

waitpid()

int main(int argc, char *argv[]) {
    printf("Before.\n");
    pid_t pidOrZero = fork();
    
    if (pidOrZero == 0) {
        sleep(2);
        printf("I (the child) slept and the parent still waited up for me.\n");
    } else {
        pid_t result = waitpid(pidOrZero, NULL, 0);
        printf("I (the parent) finished waiting for the child.  This always prints last.\n");
    }

    return 0;
}

$ ./waitpid
Before.
I (the child) slept and the parent still waited up for me.
I (the parent) finished waiting for the child.  This always prints last.
$

waitpid.c

waitpid()

We can use provided macros (see man page for full list) to extract info from the status. (full program, with error checking, linked below)
- WIFEXITED: check if child terminated normally
- WEXITSTATUS: get exit status of child
The output will be the same every time! The parent will always wait for the child to finish before continuing.

int main(int argc, char *argv[]) {
    pid_t pid = fork();
    if (pid == 0) {
        printf("I'm the child, and the parent will wait up for me.\n");
        return 110; // contrived exit status (not a bad number, though)
    } else {
        int status;
        pid_t result = waitpid(pid, &status, 0);  // returns the child's PID, or -1 on error

        if (WIFEXITED(status)) {
            printf("Child exited with status %d.\n", WEXITSTATUS(status));
        } else {
            printf("Child terminated abnormally.\n");
        }
        return 0;
    }
}

Pass in the address of an integer as the second parameter to get the child's status.

$ ./waitpid-status
I'm the child, and the parent will wait up for me.
Child exited with status 110.
$

waitpid-status.c

A parent process should always wait on its children processes.

A process that finished but was not waited on by its parent is called a zombie 🧟‍♂️.
Zombies take up system resources (until they are ultimately cleaned up later by the OS)
Calling waitpid in the parent "reaps" the child process (cleans it up)
- If a child is still running, waitpid in the parent will block until it finishes, and then clean it up
- If a child process is a zombie, waitpid will return immediately and clean it up
Orphaned child processes get "adopted" by the init process (PID 1)

waitpid()

Make sure to reap your zombie children.

(Wait, what?)

What if we want to wait for children in the order in which they were created?

Check out the abbreviated program below (link to full program at the bottom):

int main(int argc, char *argv[]) {
    pid_t children[kNumChildren];

    for (size_t i = 0; i < kNumChildren; i++) {
        children[i] = fork();
        if (children[i] == 0) return 110 + i;
    }

    for (size_t i = 0; i < kNumChildren; i++) {
        int status;
        pid_t pid = waitpid(children[i], &status, 0);
        assert(WIFEXITED(status));
        printf("Child with pid %d accounted for (return status of %d).\n", children[i], WEXITSTATUS(status));
    }

    return 0;
}

Waiting On Multiple Children, In Order

reap-in-fork-order.c

This program reaps processes in the order they were spawned.
Child processes may not finish in this order, but they are reaped in this order.
- E.g. first child could finish last, holding up first loop iteration
Sample run below - the pids change between runs, but the children are always reported in the order they were forked (so the return statuses always appear in increasing order).

Waiting On Multiple Children, In Order

$ ./reap-in-fork-order 
Child with pid 12649 accounted for (return status of 110).
Child with pid 12650 accounted for (return status of 111).
Child with pid 12651 accounted for (return status of 112).
Child with pid 12652 accounted for (return status of 113).
Child with pid 12653 accounted for (return status of 114).
Child with pid 12654 accounted for (return status of 115).
Child with pid 12655 accounted for (return status of 116).
Child with pid 12656 accounted for (return status of 117).
$

A parent can call fork multiple times, but must reap all the child processes.

A parent can use waitpid to wait on any of its children by passing in -1 as the PID.
Key Idea: The children may terminate in any order!
If waitpid returns -1 and sets errno to ECHILD, this means there are no more children.

Demo: Let's see how we might use this.

Waiting On Multiple Children, No Order

reap-as-they-exit.c

The most common use for fork is not to spawn multiple processes to split up work, but instead to run a completely separate program under your control and communicate with it.

This is what a shell is; it is a program that prompts you for commands, and it executes those commands in separate processes.

execvp()

execvp is a function that lets us run another program in the current process.

int execvp(const char *path, char *const argv[]);

execvp()

It runs the executable at the specified path, completely cannibalizing the current process.

If successful, execvp never returns in the calling process
If unsuccessful, execvp returns -1

To run another executable, we must specify the (NULL-terminated) arguments to be passed into its main function, via the argv parameter.

For our programs, path and argv[0] will be the same

execvp has many variants (execle, execlp, and so forth; type man execvp for more). We rely on execvp in CS110.

int main(int argc, char *argv[]) {
    char *args[] = {"/bin/ls", "-l", "/usr/class/cs110/lecture-examples", NULL};
    execvp(args[0], args);
    printf("This only prints if an error occurred.\n");
    return 0;
}

$ ./execvp-demo 
total 26
drwx------ 2 troccoli operator 2048 Jan 11 21:03 cpp-primer
drwx------ 3 troccoli operator 2048 Jan 15 12:43 cs107review
drwx------ 2 troccoli operator 2048 Jan 13 14:15 filesystems
drwx------ 2 troccoli operator 2048 Jan 13 14:14 lambda
drwxr-xr-x 3 poohbear root     2048 Nov 19 13:24 map-reduce
drwx------ 2 poohbear root     4096 Nov 19 13:25 networking
drwxr-xr-x 2 poohbear root     6144 Jan 22 08:58 processes
drwxr-xr-x 2 poohbear root     2048 Oct 29 06:57 threads-c
drwxr-xr-x 2 poohbear root     4096 Oct 29 06:57 threads-cpp
$

execvp-demo.c

A shell is essentially a program that repeatedly asks the user for a command and runs that command (Demo: first-shell-soln.c)

Component 1: loop for asking for user input
Component 2: way to run an arbitrary command

What Is A Shell?

first-shell-soln.c

system()

The C standard library's system function can execute a given shell command.

int system(const char *command);

command is a shell command (like you would type in the terminal); e.g. "ls" or "./myProgram"
system forks off a child process that executes the given shell command, and waits for it
on success, system returns the termination status of the child

int main(int argc, char *argv[]) {
    int status = system(argv[1]);
    printf("system returned %d\n", status);
    return 0;
}

$ ./system-demo "ls -l"
total 26
drwx------ 2 troccoli operator 2048 Jan 11 21:03 cpp-primer
drwx------ 3 troccoli operator 2048 Jan 15 12:43 cs107review
drwx------ 2 troccoli operator 2048 Jan 13 14:15 filesystems
drwx------ 2 troccoli operator 2048 Jan 13 14:14 lambda
drwxr-xr-x 3 poohbear root     2048 Nov 19 13:24 map-reduce
drwx------ 2 poohbear root     4096 Nov 19 13:25 networking
drwxr-xr-x 2 poohbear root     6144 Jan 21 19:38 processes
drwxr-xr-x 2 poohbear root     2048 Oct 29 06:57 threads-c
drwxr-xr-x 2 poohbear root     4096 Oct 29 06:57 threads-cpp
system returned 0
$

system-demo.c

mysystem()

We can implement our own version of system with fork(), waitpid() and execvp()!

int mysystem(const char *command);

call fork to create a child process
In the child, call execvp with the command to execute
In the parent, wait for the child with waitpid and then return exit status info

One twist: not all shell commands are executable programs, and some need parsing.

We can't just pass the command to execvp
Solution: there is a program called sh that runs any shell command
- e.g. /bin/sh -c "ls -a" runs the command "ls -a"
- We can call execvp to run /bin/sh with -c and the command as arguments

first-shell-soln.c

Here's the implementation, with minimal error checking (the full version is linked at the bottom):

static int mysystem(char *command) {
    pid_t pidOrZero = fork();
    if (pidOrZero == 0) {
        // If we are the child, execute the shell command
        char *arguments[] = {"/bin/sh", "-c", command, NULL};
        execvp(arguments[0], arguments);
        // If the child gets here, there was an error
        exitIf(true, kExecFailed, stderr, "execvp failed to invoke this: %s.\n", command);
    }

    // If we are the parent, wait for the child
    int status;
    waitpid(pidOrZero, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -WTERMSIG(status);
}

If execvp returns at all, an error occurred
Why not call execvp inside the parent and forgo the child process altogether? Because execvp would consume the calling process, and that's not what we want.
Why must the child exit rather than return? Because returning would send the child back into the caller's code, so the child would also execute code in main!

mysystem()

first-shell-soln.c

`first-shell` Takeaways

A shell is a program that repeats: read command from the user, execute that command
In order to execute a program and continue running the shell afterwards, we fork off another process and run the program in that process
We rely on fork, execvp, and waitpid to do this!
Real shells have more advanced functionality that we will add going forward.
For your fourth assignment, you'll build on this with your own shell, stsh ("Stanford shell"), with much of the functionality of real Unix shells.

More Shell Functionality

Shells have a variety of supported commands:

emacs & - create an emacs process and run it in the background
cat file.txt | uniq | sort - pipe the output of one command to the input of another
uniq < file.txt | sort > list.txt - make file.txt the input of uniq and output sort to list.txt
In lecture and assign4, we will see all these features!
Today, we'll focus on background execution
- only difference is specifying & with command
- shell immediately re-prompts the user
- a process doesn't know whether it is in the "foreground" or "background"; the & only determines whether or not the shell waits for it

first-shell-soln-bg.c

Supporting Background Execution

Let's make an updated version of mysystem called executeCommand.

Takes an additional parameter bool inBackground
- If false, same behavior as mysystem (spawn child, execvp, wait for child)
- If true, spawn child, execvp, but don't wait for child

first-shell-soln-bg.c

Supporting Background Execution

static void executeCommand(char *command, bool inBackground) {
    pid_t pidOrZero = fork();
    if (pidOrZero == 0) {
        // If we are the child, execute the shell command
        char *arguments[] = {"/bin/sh", "-c", command, NULL};
        execvp(arguments[0], arguments);
        // If the child gets here, there was an error
        exitIf(true, kExecFailed, stderr, "execvp failed to invoke this: %s.\n", command);
    }

    // If we are the parent, either wait or return immediately
    if (inBackground) {
        printf("%d %s\n", pidOrZero, command);
    } else {
        waitpid(pidOrZero, NULL, 0);
    }
}

Line 1: Now, the caller can optionally run the command in the background.

Lines 11-16: The parent waits on a foreground child, but not a background child.

first-shell-soln-bg.c

Supporting Background Execution

int main(int argc, char *argv[]) {
    char command[kMaxLineLength];
    while (true) {
        printf("> ");
        fgets(command, sizeof(command), stdin);
    
        // If the user entered Ctl-d, stop
        if (feof(stdin)) {
            break;
        }
    
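        // Remove the \n that fgets puts at the end (if there is one)
        size_t length = strlen(command);
        if (length > 0 && command[length - 1] == '\n') command[--length] = '\0';

        if (strcmp(command, "quit") == 0) break;

        // A trailing '&' requests background execution; the length check
        // keeps us from indexing before the start of an empty command
        bool isbg = length > 0 && command[length - 1] == '&';
        if (isbg) {
            command[length - 1] = '\0';
        }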

        executeCommand(command, isbg);
    }
  
    printf("\n");
    return 0;
}

In main, we add two additional things:

Check for the "quit" command to exit
Allow the user to add "&" at the end of a command to run that command in the background

Note that a background child isn't reaped! This is a problem - one we'll learn how to fix soon.

first-shell-soln-bg.c

Lecture Recap

Recap: fork()
waitpid() and waiting for child processes
Demo: Waiting For Children
execvp()
Putting it all together: first-shell
Background execution

Next time: interprocess communication

Practice Problems

What if we want to spawn a single child and wait for that child before spawning another child?

static const int kNumChildren = 8;
int main(int argc, char *argv[]) {
    for (size_t i = 0; i < kNumChildren; i++) {
        pid_t pidOrZero = fork();
        if (pidOrZero == 0) {
            printf("Hello from child %d!\n", getpid());
            return 110 + i;
        }

        int status;
        pid_t pid = waitpid(pidOrZero, &status, 0);        
        if (WIFEXITED(status)) {
            printf("Child with pid %d exited normally with status %d\n", pid, WEXITSTATUS(status));
        } else {
            printf("Child with pid %d exited abnormally\n", pid);
        }
    }

    return 0;
}

Waiting On Children

spawn-and-reap.c

The program above is abbreviated (link to full program at the bottom).
