CS110 Lecture 06: Pipes, Signals, and Concurrency

Principles of Computer Systems

Winter 2021

Stanford University

Computer Science Department

Instructors: Chris Gregg and Nick Troccoli

CS110 Topic 2: How can our programs create and interact with other programs?

Learning About Processes

  • Creating processes and running other programs (Lecture 5)
  • Inter-process communication (this lecture)
  • Signals (Lecture 7)
  • Race Conditions (Lecture 8)

Learning Goals

  • Get more practice using fork() and execvp()
  • Learn about pipe and dup2 to create and manipulate file descriptors
  • Use pipes to redirect process input and output

Lecture Plan

  • Review: fork() and execvp()
  • Running in the background
  • Introducing Pipes
  • Practice: Implementing subprocess

fork()

  • A system call that creates a new child process
  • The "parent" is the process that creates the other "child" process
  • From then on, both processes are running the code after the fork
  • The child process is identical to the parent, except:
    • it has a new Process ID (PID)
    • for the parent, fork() returns the PID of the child; for the child, fork() returns 0
    • fork() is called once, but returns twice
pid_t pidOrZero = fork();
// both parent and child run code here onwards
printf("This is printed by two processes.\n");

waitpid()

A function that a parent can call to wait for its child to exit:

pid_t waitpid(pid_t pid, int *status, int options);
  • pid: the PID of the child to wait on, or -1 to wait on any of our children
  • status: where to put info about the child's termination (or NULL)
  • options: optional flags to customize behavior (always 0 for now)

 

The function returns when the specified child process exits.

  • the return value is the PID of the child that exited, or -1 on error (e.g. no child to wait on)
  • If the child process has already exited, this returns immediately - otherwise, it blocks
  • It's important to wait on all children to clean up system resources
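
As a small illustration of fork and waitpid working together, here is a sketch in which the child exits with a chosen code and the parent collects it. The helper name childExitStatus is made up for this example, and the child uses _exit (an assumption of this sketch, not something the slides prescribe) so that it skips any stdio buffers it inherited from the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not from the lecture): fork a child that exits with
// `code`, wait for it in the parent, and return the exit status the parent sees.
int childExitStatus(int code) {
    pid_t pidOrZero = fork();
    if (pidOrZero == -1) return -1;   // fork failed, so no child was created

    if (pidOrZero == 0) {
        // Child: fork() returned 0 in this process.
        _exit(code);
    }

    // Parent: fork() returned the child's PID; block until that child exits.
    int status;
    if (waitpid(pidOrZero, &status, 0) != pidOrZero) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the parent waits on the exact PID that fork returned, the child is cleaned up before the function returns.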

execvp is a function that lets us run another program in the current process.

int execvp(const char *path, char *const argv[]);

execvp()

It runs the executable at the specified path, completely cannibalizing the current process.

  • If successful, execvp never returns in the calling process
  • If unsuccessful, execvp returns -1

To run another executable, we must specify the (NULL-terminated) arguments to be passed into its main function, via the argv parameter.

  • For our programs, path and argv[0] will be the same

execvp has many variants (execle, execlp, and so forth); type man execvp for more. We rely on execvp in CS110.
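
To make the "never returns on success" behavior concrete, here is a minimal sketch that runs a program in a child process and reports its exit status. The name runProgram and the choice of 127 as the failure code are assumptions of this example (127 mirrors what shells conventionally report when a command cannot be run); the lecture's own code uses the course helper exitIf instead.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run argv[0] (looked up on PATH) in a child process and
// return its exit status, or -1 if it could not be started or did not exit normally.
int runProgram(char *const argv[]) {
    pid_t pidOrZero = fork();
    if (pidOrZero == -1) return -1;

    if (pidOrZero == 0) {
        execvp(argv[0], argv);
        // Only reached if execvp failed. On success, the child is now running
        // the other program and this code no longer exists in it.
        perror("execvp");
        _exit(127);
    }

    int status;
    if (waitpid(pidOrZero, &status, 0) != pidOrZero) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing the NULL-terminated array {"ls", "-l", NULL} would run ls -l and return once it finishes.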

Revisiting mysystem

mysystem is our own version of the built-in function system.

  • It takes in a terminal command (e.g. "ls -l /usr/class/cs110"), executes it in a separate process, and returns when that process is finished.
    • We can use fork to create the child process
    • We can use execvp in that child process to execute the terminal command
    • We can use waitpid in the parent process to wait for the child to terminate
static int mysystem(char *command) {
    pid_t pidOrZero = fork();
    if (pidOrZero == 0) {
        char *arguments[] = {"/bin/sh", "-c", command, NULL};
        execvp(arguments[0], arguments);
        // If the child gets here, there was an error
        exitIf(true, kExecFailed, stderr, "execvp failed to invoke this: %s.\n", command);
    }

    // If we are the parent, wait for the child
    int status;
    waitpid(pidOrZero, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -WTERMSIG(status);
}

first-shell-soln.c

Revisiting first-shell

int main(int argc, char *argv[]) {
    char command[kMaxLineLength];
    while (true) {
        printf("> ");
        fgets(command, sizeof(command), stdin);
    
        // If the user entered Ctl-d, stop
        if (feof(stdin)) {
            break;
        }
    
        // Remove the \n that fgets puts at the end
        command[strlen(command) - 1] = '\0';

        int commandReturnCode = mysystem(command);
        printf("return code = %d\n", commandReturnCode);
    }
  
    printf("\n");
    return 0;
}

Our first-shell program is a loop in main that parses the user input and passes it to mysystem.

first-shell-soln.c

first-shell Takeaways

  • A shell is a program that repeats: read command from the user, execute that command
  • In order to execute a program and continue running the shell afterwards, we fork off another process and run the program in that process
  • We rely on fork, execvp, and waitpid to do this!
  • Real shells have more advanced functionality that we will add going forward.
  • For your fourth assignment, you'll build on this with your own shell, stsh ("Stanford shell") with much of the functionality of real Unix shells.

More Shell Functionality

Shells have a variety of supported commands:

  • emacs &  - create an emacs process and run it in the background
  • cat file.txt | uniq | sort - pipe the output of one command to the input of another
  • uniq < file.txt | sort > list.txt - make file.txt the input of uniq and output sort to list.txt
  • Let's see how we can implement these - but first, a demo.

Lecture Plan

  • Review: fork() and execvp()
  • Running in the background
  • Introducing Pipes
  • Practice: Implementing subprocess

Supporting Background Execution

Let's make an updated version of mysystem called executeCommand.

  • Takes an additional parameter bool inBackground
    • If false, same behavior as mysystem (spawn child, execvp, wait for child)
    • If true, spawn child, execvp, but don't wait for child

Supporting Background Execution

static void executeCommand(char *command, bool inBackground) {
    pid_t pidOrZero = fork();
    if (pidOrZero == 0) {
        // If we are the child, execute the shell command
        char *arguments[] = {"/bin/sh", "-c", command, NULL};
        execvp(arguments[0], arguments);
        // If the child gets here, there was an error
        exitIf(true, kExecFailed, stderr, "execvp failed to invoke this: %s.\n", command);
    }

    // If we are the parent, either wait or return immediately
    if (inBackground) {
        printf("%d %s\n", pidOrZero, command);
    } else {
        waitpid(pidOrZero, NULL, 0);
    }
}
second-shell-start.c

Line 1: Now, the caller can optionally run the command in the background.

Lines 11-16: The parent waits on a foreground child, but not a background child.

second-shell-start.c

Supporting Background Execution

int main(int argc, char *argv[]) {
    char command[kMaxLineLength];
    while (true) {
        printf("> ");
        fgets(command, sizeof(command), stdin);
    
        // If the user entered Ctl-d, stop
        if (feof(stdin)) {
            break;
        }
    
        // Remove the \n that fgets puts at the end
        command[strlen(command) - 1] = '\0';

        if (strcmp(command, "quit") == 0) break;

        // Guard against an empty line before looking at the last character
        size_t length = strlen(command);
        bool isbg = length > 0 && command[length - 1] == '&';
        if (isbg) {
            command[length - 1] = '\0';
        }

        executeCommand(command, isbg);
    }
  
    printf("\n");
    return 0;
}

In main, we must add two additional things:

  • Check for the "quit" command to exit
  • Allow the user to add "&" at the end of a command to run that command in the background

Note that a background child isn't reaped! This is a problem - one we'll learn how to fix soon.

second-shell-start.c

Lecture Plan

  • Review: fork() and execvp()
  • Running in the background
  • Introducing Pipes
    • What are pipes?
    • Pipes between processes
    • Redirecting process I/O
  • Practice: Implementing subprocess

Is there a way that the parent and child processes can communicate?

Interprocess Communication

  • It's useful for a parent process to communicate with its child (and vice versa)
  • There are two key ways we will learn to do this: pipes and signals
    • Pipes let two processes send and receive arbitrary data
    • Signals let two processes send and receive certain "signals" that indicate something special has happened. 

Pipes

  • How can we let two processes send arbitrary data back and forth?
  • A core Unix principle is modeling things as files.  Could we use a "file"?
  • Idea: a file that one process could write, and another process could read?
  • Problem: we don't want to clutter the filesystem with actual files every time two processes want to communicate.
  • Solution: have the operating system set this up for us.  
    • It will give us two new file descriptors - one for writing, another for reading.
    • If someone writes data to the write FD, it can be read from the read FD.
    • It's not actually a physical file on disk - we are just using files as an abstraction

The pipe system call populates the 2-element array fds with two file descriptors such that everything written to fds[1] can be read from fds[0].  Returns 0 on success, or -1 on error.

int pipe(int fds[]);

pipe()

#include <stdio.h>
#include <string.h>
#include <unistd.h>

static const char * kPipeMessage = "Hello, this message is coming through a pipe.";
int main(int argc, char *argv[]) {
    int fds[2];
    int result = pipe(fds);
    if (result == -1) {
        perror("pipe");
        return 1;
    }

    // Write message to pipe (assuming here all bytes written immediately)
    write(fds[1], kPipeMessage, strlen(kPipeMessage) + 1);
    close(fds[1]);

    // Read message from pipe
    char receivedMessage[strlen(kPipeMessage) + 1];
    read(fds[0], receivedMessage, sizeof(receivedMessage));
    close(fds[0]);
    printf("Message read: %s\n", receivedMessage);
  
    return 0;
}
pipe-demo.c

Tip: you learn to read before you learn to write (read = fds[0], write = fds[1]).
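
The demo assumes that a single write call transfers the whole message. As a sketch of what dropping that assumption could look like, the helpers below loop until every byte has been written or read. The names writeAll, readUpTo and pipeRoundTrip are hypothetical and are not part of pipe-demo.c.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

// Keep calling write until all `count` bytes are sent. Returns 0, or -1 on error.
static int writeAll(int fd, const char *buffer, size_t count) {
    size_t sent = 0;
    while (sent < count) {
        ssize_t n = write(fd, buffer + sent, count - sent);
        if (n == -1) return -1;
        sent += (size_t)n;
    }
    return 0;
}

// Keep calling read until `count` bytes arrive or every write end is closed.
// Returns the number of bytes read, or -1 on error.
static ssize_t readUpTo(int fd, char *buffer, size_t count) {
    size_t received = 0;
    while (received < count) {
        ssize_t n = read(fd, buffer + received, count - received);
        if (n == -1) return -1;
        if (n == 0) break;   // end-of-file: no writers remain
        received += (size_t)n;
    }
    return (ssize_t)received;
}

// Send `message` (including its '\0') through a fresh pipe and read it back
// into `out`. Returns 0 on success, -1 on any failure.
int pipeRoundTrip(const char *message, char *out, size_t outSize) {
    size_t length = strlen(message) + 1;
    if (length > outSize) return -1;

    int fds[2];
    if (pipe(fds) == -1) return -1;

    int ok = writeAll(fds[1], message, length);
    close(fds[1]);
    if (ok == 0 && readUpTo(fds[0], out, length) != (ssize_t)length) ok = -1;
    close(fds[0]);
    return ok;
}
```

Since one process does both the writing and the reading here, this sketch only suits short messages: a write larger than the pipe's internal buffer would block with nobody reading.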

$ ./pipe-demo
Message read: Hello, this message is coming through a pipe.
pipe-demo.c

Lecture Plan

  • Review: fork() and execvp()
  • Running in the background
  • Introducing Pipes
    • What are pipes?
    • Pipes between processes
    • Redirecting process I/O
  • Practice: Implementing subprocess

pipe can allow processes to communicate!

  • The parent's file descriptor table is replicated in the child - both have pipe access
  • E.g. the parent can write to the "write" end and the child can read from the "read" end
  • Because they're file descriptors, there's no global name for the pipe (another process can't "connect" to the pipe).
  • Each pipe is uni-directional (one end is read, the other write)
int pipe(int fds[]);

The pipe system call populates the 2-element array fds with two file descriptors such that everything written to fds[1] can be read from fds[0].  Returns 0 on success, or -1 on error.

pipe()

Here's an example program showing how pipe works across processes (full program link at bottom).

static const char * kPipeMessage = "Hello, this message is coming through a pipe.";
int main(int argc, char *argv[]) {
    int fds[2];
    pipe(fds);
    size_t bytesSent = strlen(kPipeMessage) + 1;

    pid_t pidOrZero = fork();
    if (pidOrZero == 0) {
        // In the child, we only read from the pipe
        close(fds[1]);
        char buffer[bytesSent];
        read(fds[0], buffer, sizeof(buffer));
        close(fds[0]);
        printf("Message from parent: %s\n", buffer);
        return 0;
    }

    // In the parent, we only write to the pipe (assume everything is written)
    close(fds[0]);
    write(fds[1], kPipeMessage, bytesSent);
    close(fds[1]);
    waitpid(pidOrZero, NULL, 0);
    return 0;
}

pipe()

Walking through the program above, step by step:

  • Make a pipe just like before.  Right before the fork call, the parent has 2 open file descriptors (besides 0-2): the pipe Read FD and Write FD.
  • Key Idea: when we call fork, the child gets a copy of the parent's file descriptor table.  Any open FDs in the parent at the time fork is called must be closed in both the parent and the child.
  • Therefore, when the child is spawned, it also has the same 2 open file descriptors (besides 0-2): the pipe Read FD and Write FD.
  • More specifically, this duplication means the child's file descriptor table entries point to the same open file table entries as the parent.  Thus, the open file table entries for the two pipe FDs both have reference counts of 2.

In the parent:

  • The parent must close all its open FDs.  It never uses the Read FD, so it closes it right away with close(fds[0]).
  • It writes to the Write FD to send a message to the child: write(fds[1], kPipeMessage, bytesSent).
  • It is now done with the Write FD, so it closes it with close(fds[1]).
  • It waits for the child to terminate with waitpid(pidOrZero, NULL, 0).

In the child:

  • The child must close all its open FDs too.  It never uses the Write FD, so it closes it right away with close(fds[1]).
  • It reads from the Read FD to receive the message from the parent: read(fds[0], buffer, sizeof(buffer)).
  • It is now done with the Read FD, so it closes it with close(fds[0]), and then prints the received message.

Trying Out Pipes

This method of communication between processes relies on the fact that file descriptors are duplicated when forking.

  • each process has its own copy of both file descriptors for the pipe
  • both processes could read or write to the pipe if they wanted.
  • each process must therefore close both file descriptors for the pipe when finished
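
One practical reason the closing rule matters: a read on a pipe reports end-of-file (returns 0) only once every copy of the write end has been closed. The sketch below has the child read until end-of-file and report how many bytes it saw. The helper name bytesSeenByChild is hypothetical, and passing the count back through the exit status is a shortcut that limits this example to messages shorter than 256 bytes.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: the parent writes `message` into a pipe; the child
// reads until end-of-file and returns the byte count as its exit status.
int bytesSeenByChild(const char *message) {
    int fds[2];
    if (pipe(fds) == -1) return -1;

    pid_t pidOrZero = fork();
    if (pidOrZero == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pidOrZero == 0) {
        // Child: close our copy of the write end first. If we kept it open,
        // read would never return 0, because a writer would still exist.
        close(fds[1]);
        char buffer[64];
        int total = 0;
        ssize_t n;
        while ((n = read(fds[0], buffer, sizeof(buffer))) > 0) total += (int)n;
        close(fds[0]);
        _exit(total);
    }

    // Parent: we only write, so the read end goes first.
    close(fds[0]);
    ssize_t written = write(fds[1], message, strlen(message));
    close(fds[1]);   // the child's read loop can end only after this

    int status;
    if (waitpid(pidOrZero, &status, 0) != pidOrZero) return -1;
    if (written == -1 || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

If either process forgot to close its copy of fds[1], the child's loop would block forever waiting for more data, and the parent's waitpid would never return.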

 

This is the core idea behind how a shell can support piping between processes
(e.g. cat file.txt | uniq | sort).  Let's see how this works in a shell.

 

Pipes

Lecture Plan

  • Review: fork() and execvp()
  • Running in the background
  • Introducing Pipes
    • What are pipes?
    • Pipes between processes
    • Redirecting process I/O
  • Practice: Implementing subprocess

Redirecting Process I/O

  • Each process has the special file descriptors STDIN (0), STDOUT (1) and STDERR (2)
  • Programs assume these descriptor numbers always refer to these channels (e.g. printf always outputs to file descriptor 1, STDOUT).

Idea: what happens if we change FD 1 to point somewhere else?

[Diagram: file descriptors 0 1 2 3, with the Terminal and a File as possible targets]

Redirecting Process I/O

[Diagram: file descriptors 0 1 2, all connected to the Terminal]

int main() {
    printf("This will print to the terminal\n");
    close(STDOUT_FILENO);
    
    // open returns the lowest unused FD: we just closed 1 and 0 is still open, so fd is 1
    int fd = open("myfile.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
	
    printf("This will print to myfile.txt!\n");
    close(fd);
    return 0;
}

[Diagram: after close and open, file descriptor 1 refers to myfile.txt, while 0 and 2 still refer to the Terminal]

Idea: what happens if we change a special FD to point somewhere else?

Could we do this with a pipe?

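[Diagram: Process 1 and Process 2 joined by a pipe, with one process attached to pipe WRITE and the other to pipe READ through its file descriptors 0 1 2]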

Why would this be useful?

Redirecting Process I/O

I/O redirection and pipes allow us to handle piping in our shell: e.g. cat file.txt | sort

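[Diagram: cat and sort, each with file descriptors 0 1 2; cat's output goes to pipe WRITE, sort's input comes from pipe READ, and the remaining descriptors stay on the Terminal]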

  • Shell creates three child processes: cat, sort and uniq
  • Shell creates two pipes: one between cat and sort, one between sort and uniq

[Diagram: terminal in, then cat, pipe1, sort, pipe2, uniq, then terminal out]

Process   stdin      stdout
cat       terminal   pipe1[1]
sort      pipe1[0]   pipe2[1]
uniq      pipe2[0]   terminal

int pipe1[2];
int pipe2[2];
pipe(pipe1);
pipe(pipe2);

Redirecting Process I/O

I/O redirection and pipes allow us to handle piping in our shell: e.g. cat file.txt | sort | uniq

One last issue: how do we "connect" our pipe FDs to STDIN/STDOUT?

Redirecting Process I/O

dup2 makes a copy of a file descriptor entry and puts it in another file descriptor index.  If the second parameter is an already-open file descriptor, it is closed before being used.

int dup2(int oldfd, int newfd);

Example: we can use dup2 to copy the pipe read file descriptor into standard input!

dup2(fds[0], STDIN_FILENO);

Second key detail: execvp consumes the process, except for the file descriptor table!
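
Putting pipe, fork, dup2 and execvp together, here is one possible sketch of how a shell might wire up a three-stage pipeline such as cat file.txt | sort | uniq. The names spawnStage and runThreeStagePipeline are hypothetical, and this is only an illustration of the technique, not the course's shell implementation. It relies on the detail above: the rewired descriptors survive execvp.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork a child whose STDIN/STDOUT are rewired to
// inFd/outFd (pass -1 to leave one untouched), then exec argv.
static pid_t spawnStage(char *const argv[], int inFd, int outFd,
                        const int toClose[], size_t nToClose) {
    pid_t pidOrZero = fork();
    if (pidOrZero != 0) return pidOrZero;   // parent, or -1 if fork failed

    if (inFd != -1) dup2(inFd, STDIN_FILENO);
    if (outFd != -1) dup2(outFd, STDOUT_FILENO);
    // The copies at 0 and 1 are all this stage needs; close every pipe FD.
    for (size_t i = 0; i < nToClose; i++) close(toClose[i]);
    execvp(argv[0], argv);
    _exit(127);   // only reached if execvp failed
}

// Runs the equivalent of "a | b | c". Returns c's exit status, or -1.
int runThreeStagePipeline(char *const a[], char *const b[], char *const c[]) {
    int pipe1[2], pipe2[2];
    if (pipe(pipe1) == -1) return -1;
    if (pipe(pipe2) == -1) {
        close(pipe1[0]);
        close(pipe1[1]);
        return -1;
    }
    int all[] = {pipe1[0], pipe1[1], pipe2[0], pipe2[1]};

    pid_t pids[3];
    pids[0] = spawnStage(a, -1, pipe1[1], all, 4);        // stdout -> pipe1
    pids[1] = spawnStage(b, pipe1[0], pipe2[1], all, 4);  // pipe1 -> pipe2
    pids[2] = spawnStage(c, pipe2[0], -1, all, 4);        // stdin <- pipe2

    // The parent uses none of the pipe FDs, so it closes all four.
    for (size_t i = 0; i < 4; i++) close(all[i]);

    int lastStatus = -1;
    for (size_t i = 0; i < 3; i++) {
        int status;
        if (pids[i] == -1 || waitpid(pids[i], &status, 0) != pids[i]) continue;
        if (i == 2 && WIFEXITED(status)) lastStatus = WEXITSTATUS(status);
    }
    return lastStatus;
}
```

With a = {"cat", "file.txt", NULL}, b = {"sort", NULL} and c = {"uniq", NULL}, this matches the stdin/stdout assignments described above. If the parent kept any write end open, the downstream stage would never see end-of-file.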

Lecture Plan

  • Review: fork() and execvp()
  • Running in the background
  • Introducing Pipes
  • Practice: Implementing subprocess

subprocess File Descriptor Diagram

To practice this piping technique, let's implement a custom function called subprocess.

 

subprocess_t subprocess(char *command);

subprocess is the same as mysystem, except it also sets up a pipe we can use to write to the child process's STDIN.

It returns a struct containing:

  • the PID of the child process
  • a file descriptor we can use to write to the child's STDIN
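
A minimal sketch of how subprocess could be written from the pieces covered so far. The struct layout and its field names (pid, fd) are assumptions made for this example, perror and _exit stand in for the course helper exitIf, and finishSubprocess is a hypothetical convenience that is not part of the lecture.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef struct {
    pid_t pid;   // PID of the child process
    int fd;      // write here to feed the child's STDIN (field name assumed)
} subprocess_t;

subprocess_t subprocess(char *command) {
    int fds[2];
    if (pipe(fds) == -1) return (subprocess_t) { -1, -1 };

    pid_t pidOrZero = fork();
    if (pidOrZero == -1) {
        close(fds[0]);
        close(fds[1]);
        return (subprocess_t) { -1, -1 };
    }

    if (pidOrZero == 0) {
        // We are not writing to the pipe, only reading from it
        close(fds[1]);

        // Duplicate the read end of the pipe into STDIN
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);

        char *arguments[] = {"/bin/sh", "-c", command, NULL};
        execvp(arguments[0], arguments);
        perror("execvp");
        _exit(127);
    }

    // The parent only writes, so it closes the read end
    close(fds[0]);
    return (subprocess_t) { pidOrZero, fds[1] };
}

// Hypothetical convenience: signal end-of-input by closing the write end,
// then wait for the child. Returns its exit status, or -1.
int finishSubprocess(subprocess_t sp) {
    close(sp.fd);
    int status;
    if (waitpid(sp.pid, &status, 0) != sp.pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would write its data to sp.fd and then close it, which is how the child learns that its input has ended.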

Demo: subprocess

Lecture Recap

  • Review: fork() and execvp()
  • Running in the background
  • Introducing Pipes
  • Practice: Implementing subprocess

 

 

Next time: introducing signals

Practice Problems

The program below takes an arbitrary number of filenames as arguments and attempts to publish the date and time to each one.  The desired behavior is shown below the code:

static void publish(const char *name) {
    printf("Publishing date and time to file named \"%s\".\n", name);
    int outfile = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    dup2(outfile, STDOUT_FILENO);
    close(outfile);
    if (fork() > 0) return;
    char *argv[] = { "date", NULL };
    execvp(argv[0], argv);
}
 
int main(int argc, char *argv[]) {
    for (size_t i = 1; i < argc; i++) publish(argv[i]);
    return 0;
}

A Publishing Error

publish.c
myth62:~$ ./publish one two three four
Publishing date and time to file named "one".
Publishing date and time to file named "two".
Publishing date and time to file named "three".
Publishing date and time to file named "four".

However, the program is buggy!

  • What text is actually printed to standard output?
  • What does each of the four files contain?
  • How can we fix the issue?

Because the child processes (and only the child processes) should be redirecting, we should open, dup2, and close in child-specific code. A happy side effect of the change is that we never muck with STDOUT_FILENO in the parent if we confine the redirection code to the child.  Solution:

static void publish(const char *name) {
    printf("Publishing date and time to file named \"%s\".\n", name); 
    if (fork() > 0) return;
    int outfile = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644); 
    dup2(outfile, STDOUT_FILENO);
    close(outfile);
    char *argv[] = { "date", NULL };
    execvp(argv[0], argv);
}

A Publishing Error

publish.c

captureProcess

Let's implement a custom function called captureProcess, like subprocess except instead of setting up a pipe to write to the child's STDIN, it's a pipe to read from its STDOUT.

subprocess_t captureProcess(char *command);

It returns a struct containing:

  • the PID of the child process
  • a file descriptor we can use to read from the child's STDOUT

subprocess_t captureProcess(char *command) {
    int fds[2];
    pipe(fds);
    
    pid_t pidOrZero = fork();
    if (pidOrZero == 0) {
        // We are not reading from the pipe, only writing to it
        close(fds[0]);

        // Duplicate the write end of the pipe into STDOUT
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);

        char *arguments[] = {"/bin/sh", "-c", command, NULL};
        execvp(arguments[0], arguments);
        exitIf(true, kExecFailed, stderr, "execvp failed to invoke this: %s.\n", command);
    }

    close(fds[1]);
    return (subprocess_t) { pidOrZero, fds[0] };
}
captureProcess.c
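
To show how a caller might consume the descriptor that captureProcess returns, here is a self-contained sketch. It repeats a captureProcess with the same shape as above so the block compiles on its own. The struct field names (pid, fd), the use of perror and _exit in place of exitIf, and the helper captureOutput are all assumptions of this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef struct {
    pid_t pid;   // PID of the child process
    int fd;      // read from here to see the child's STDOUT (field name assumed)
} subprocess_t;

static subprocess_t captureProcess(char *command) {
    int fds[2];
    if (pipe(fds) == -1) return (subprocess_t) { -1, -1 };

    pid_t pidOrZero = fork();
    if (pidOrZero == -1) {
        close(fds[0]);
        close(fds[1]);
        return (subprocess_t) { -1, -1 };
    }
    if (pidOrZero == 0) {
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        char *arguments[] = {"/bin/sh", "-c", command, NULL};
        execvp(arguments[0], arguments);
        perror("execvp");
        _exit(127);
    }
    close(fds[1]);
    return (subprocess_t) { pidOrZero, fds[0] };
}

// Hypothetical caller: run `command`, copy up to size - 1 bytes of its output
// into `out` (always '\0'-terminated), then reap the child.
// Returns the number of bytes stored, or -1 on failure.
ssize_t captureOutput(char *command, char *out, size_t size) {
    if (size == 0) return -1;
    subprocess_t sp = captureProcess(command);
    if (sp.pid == -1) return -1;

    size_t used = 0;
    while (used < size - 1) {
        ssize_t n = read(sp.fd, out + used, size - 1 - used);
        if (n <= 0) break;   // 0: the child closed its STDOUT; -1: read error
        used += (size_t)n;
    }
    out[used] = '\0';

    close(sp.fd);
    waitpid(sp.pid, NULL, 0);
    return (ssize_t)used;
}
```

The read loop ends on end-of-file, which arrives only because the parent closed its own copy of the write end inside captureProcess.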

CS110 Lecture 06: Pipes, Signals and Concurrency (w21)

By Nick Troccoli
