CS110 Lecture 06: Pipes, Signals, and Concurrency

Principles of Computer Systems

Winter 2020

Stanford University

Computer Science Department

Instructors: Chris Gregg and Nick Troccoli

CS110 Topic 2: How can our programs create and interact with other programs?

Learning About Processes

Creating processes and running other programs

Inter-process communication

Signals

Race Conditions

Today's Learning Goals

Get more practice using fork() and execvp()
Learn about pipe and dup2 to create and manipulate file descriptors
Introduce signals as another way for processes to communicate

Plan For Today

Review: fork() and execvp()
Practice: Revisiting first-shell
Running in the background
Break: Announcements
Introducing Pipes
Practice: Implementing subprocess
Introducing Signals
Demo: Disneyland

fork()

A system call that creates a new child process
The "parent" is the process that creates the other "child" process
From then on, both processes are running the code after the fork
The child process is a copy of the parent, except:
- it has a new Process ID (PID)
- for the parent, fork() returns the PID of the child; for the child, fork() returns 0
fork() is called once, but returns twice: once in the parent and once in the child

pid_t pidOrZero = fork();
// both parent and child run code here onwards
printf("This is printed by two processes.\n");

waitpid()

A function that a parent can call to wait for its child to exit:

pid_t waitpid(pid_t pid, int *status, int options);

pid: the PID of the child to wait on, or -1 to wait on any of our children
status: where to put info about the child's termination (or NULL)
options: optional flags to customize behavior (always 0 for now)

The function returns when the specified child process exits.

The return value is the PID of the child that exited, or -1 on error (e.g. no child to wait on)
If the child process has already exited, this returns immediately - otherwise, it blocks
It's important to wait on all children to clean up system resources

execvp()

execvp is a function that lets us run another program in the current process.

int execvp(const char *path, char *argv[]);

It runs the specified program executable, completely cannibalizing the current process.

path identifies the name of the executable to be invoked.
argv is the NULL-terminated argument vector that should be passed to the new executable's main function.
For the purposes of CS110, path and argv[0] end up being the same exact string.
If execvp fails to cannibalize the process and install a new executable image within it, it returns -1 to express failure.
If execvp succeeds, it never returns in the calling process.
execvp has many variants (execle, execlp, and so forth); type man execvp to see all of them. We generally rely on execvp in this course.

execvp() Example

What does the following code output, assuming execvp executes successfully?

int main(int argc, char *argv[]) {
    char *args[] = {"/bin/ls", "-l", "/usr/class/cs110", NULL};
    execvp(args[0], args);
    printf("Hello world!\n");
    return 0;
}

This process will be completely consumed by the new program being run (ls).
Lines 4+ will never execute unless an error occurs in execvp.

Revisiting mysystem

mysystem is our own version of the built-in function system.

It takes in a terminal command (e.g. "ls -l /usr/class/cs110"), executes it in a separate process, and returns when that process is finished.
- We can use fork to create the child process
- We can use execvp in that child process to execute the terminal command
- We can use waitpid in the parent process to wait for the child to terminate

static int mysystem(char *command) {
    pid_t pidOrZero = fork();
    if (pidOrZero == 0) {
        // If we are the child, execute the shell command
        char *arguments[] = {"/bin/sh", "-c", command, NULL};
        execvp(arguments[0], arguments);
        // If the child gets here, there was an error
        exitIf(true, kExecFailed, stderr, "execvp failed to invoke this: %s.\n", command);
    }

    // If we are the parent, wait for the child
    int status;
    waitpid(pidOrZero, &status, 0);
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status);
    } else {
        return -WTERMSIG(status);
    }
}

Line 2: First, fork off a child process.

Lines 4-6: In the child, execute the /bin/sh program, which can execute any shell command.

Line 8: The child will only get to this line if execvp fails.

Lines 11-13: In the parent, wait for the child to terminate.

Lines 14-18: In the parent, after the child terminates, return its status.

Revisiting first-shell

int main(int argc, char *argv[]) {
    char command[kMaxLineLength];
    while (true) {
        printf("> ");
        fgets(command, sizeof(command), stdin);
    
        // If the user entered Ctrl-D, stop
        if (feof(stdin)) {
            break;
        }
    
        // Remove the \n that fgets puts at the end
        command[strlen(command) - 1] = '\0';

        int commandReturnCode = mysystem(command);
        printf("return code = %d\n", commandReturnCode);
    }
  
    printf("\n");
    return 0;
}

Our first-shell program is a loop in main that parses the user input and passes it to mysystem.

first-shell Takeaways

A shell is a program that repeats: read command from the user, execute that command
In order to execute a program and continue running the shell afterwards, we fork off another process and run the program in that process
We rely on fork, execvp, and waitpid to do this!
Real shells have more advanced functionality that we will add going forward.
For your fourth assignment, you'll build on this with your own shell, stsh ("Stanford shell") with much of the functionality of real Unix shells.

More Shell Functionality

Shells support more than just running a single command in the foreground:

emacs & - create an emacs process and run it in the background
cat file.txt | uniq | sort - pipe the output of one command to the input of another
uniq < file.txt | sort > list.txt - make file.txt the input of uniq and write the output of sort to list.txt
Let's see how we can implement these - but first, a demo.

Supporting Background Execution

static void executeCommand(char *command, bool inBackground) {
    pid_t pidOrZero = fork();
    if (pidOrZero == 0) {
        // If we are the child, execute the shell command
        char *arguments[] = {"/bin/sh", "-c", command, NULL};
        execvp(arguments[0], arguments);
        // If the child gets here, there was an error
        exitIf(true, kExecFailed, stderr, "execvp failed to invoke this: %s.\n", command);
    }

    // If we are the parent, either wait or return immediately
    if (inBackground) {
        printf("%d %s\n", pidOrZero, command);
    } else {
        waitpid(pidOrZero, NULL, 0);
    }
}

Line 1: Now, the caller can optionally run the command in the background.

Lines 11-16: The parent waits on a foreground child, but not a background child.

Supporting Background Execution

int main(int argc, char *argv[]) {
    char command[kMaxLineLength];
    while (true) {
        printf("> ");
        fgets(command, sizeof(command), stdin);
    
        // If the user entered Ctrl-D, stop
        if (feof(stdin)) {
            break;
        }
    
        // Remove the \n that fgets puts at the end
        command[strlen(command) - 1] = '\0';

        if (strcmp(command, "quit") == 0) break;

        bool isbg = command[0] != '\0' && command[strlen(command) - 1] == '&';
        if (isbg) {
            command[strlen(command) - 1] = '\0';
        }

        executeCommand(command, isbg);
    }
  
    printf("\n");
    return 0;
}

In main, on lines 15-22, we check for the "quit" command, and also for whether to run the command in the background.

Announcements

Assign2 due tomorrow at 11:59PM PST
Assign3 goes out tomorrow - all about multiprocessing
- This Monday's lecture needed for the last part
Section 2 starts tomorrow
- previous week's section solutions released tomorrow

Mid-Lecture Checkin

Now we can answer the following questions:

When writing a shell, why is it essential to call execvp in the child process?
How can we update our shell to support background execution of commands?

Interprocess Communication

It's useful for a parent process to be able to communicate with its child (and vice versa)
There are two key ways we will learn to do this: pipes and signals
- Pipes let two processes send and receive arbitrary data
- Signals let two processes send and receive certain "signals" that indicate something special has happened.

Pipes

How can we let two processes send arbitrary data back and forth?
A core Unix principle is that many things can be modeled as files. Could we use a "file"?
Idea: what if we used a file that one process could write to, and another process could read from?
Problem: we don't want to clutter the filesystem with actual files every time two processes want to communicate.
Solution: have the operating system set this up for us.
- It will give us two new file descriptors - one for writing, another for reading.
- If someone writes data to the write FD, it can be read from the read FD.
- It's not actually a physical file on disk - we are just using files as an abstraction

pipe()

int pipe(int fds[]);

The pipe system call takes an uninitialized array of two integers and populates it with two file descriptors such that everything written to fds[1] can be read from fds[0].
pipe can allow parent processes to communicate with spawned child processes.
- Because they're file descriptors, there's no global name for the pipe (another process can't "connect" to the pipe)
- The parent's file descriptor table is replicated in the child, so as long as the pipe is created before the fork, the child automatically gets access to the same file descriptors