CS110: Principles of Computer Systems
Winter 2021-2022
Stanford University
Instructors: Nick Troccoli and Jerry Cain
Creating processes and running other programs
Inter-process communication and Pipes
Signals
Race Conditions
assign3: implement multiprocessing programs like "trace" (to trace another program's behavior) and "farm" (parallelize tasks)
assign4: implement your own shell!
I/O redirection and pipes allow us to handle piping in our shell: e.g. cat file.txt | sort
[Diagram: file descriptor tables (0, 1, 2) for cat and sort, with entries pointing to the terminal, the pipe WRITE end, and the pipe READ end]
Last time, we implemented a custom function called pipeline.
void pipeline(char *argv1[], char *argv2[], pid_t pids[]);
pipeline is similar to subprocess, except it spawns two children and connects them with a pipe: the first child's STDOUT writes to the pipe, and the second child's STDIN reads from it. Both children should run in parallel.
It doesn't return anything, but it writes the two children's PIDs to the specified pids array.
void pipeline(char *argv1[], char *argv2[], pid_t pids[]) {
    int fds[2];
    pipe(fds);
    // Spawn the first child
    pids[0] = fork();
    if (pids[0] == 0) {
        // The first child's STDOUT should be the write end of the pipe
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execvp(argv1[0], argv1);
    }
    // We no longer need the write end of the pipe
    close(fds[1]);
    // Spawn the second child
    pids[1] = fork();
    if (pids[1] == 0) {
        // The second child's STDIN should be the read end of the pipe
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        execvp(argv2[0], argv2);
    }
    // We no longer need the read end of the pipe
    close(fds[0]);
}
There were a lot of close() calls! Is there a way for any of them to be done automatically?
int pipe2(int fds[], int flags);
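pipe2 is the same as pipe except it lets you customize the pipe with some optional flags. One such flag is O_CLOEXEC, which marks both pipe descriptors to be closed automatically when the process successfully calls execvp.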
void pipeline(char *argv1[], char *argv2[], pid_t pids[]) {
    int fds[2];
    pipe(fds);
    pids[0] = fork();
    if (pids[0] == 0) {
        close(fds[0]); // no longer needed with O_CLOEXEC
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]); // no longer needed with O_CLOEXEC
        execvp(argv1[0], argv1);
    }
    close(fds[1]);
    pids[1] = fork();
    if (pids[1] == 0) {
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]); // no longer needed with O_CLOEXEC
        execvp(argv2[0], argv2);
    }
    close(fds[0]);
}
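The close() calls made inside the two children would no longer be necessary if we use pipe2 with O_CLOEXEC, because each child goes on to call execvp, and a successful execvp closes every descriptor marked close-on-exec. The copies made by dup2 (STDOUT_FILENO and STDIN_FILENO) are not marked close-on-exec, so they stay open in the new program.
Note that the parent must still close its copies of the pipe descriptors because it never calls execvp.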
void pipeline(char *argv1[], char *argv2[], pid_t pids[]) {
    int fds[2];
    pipe2(fds, O_CLOEXEC);
    pids[0] = fork();
    if (pids[0] == 0) {
        dup2(fds[1], STDOUT_FILENO);
        execvp(argv1[0], argv1);
    }
    close(fds[1]);
    pids[1] = fork();
    if (pids[1] == 0) {
        dup2(fds[0], STDIN_FILENO);
        execvp(argv2[0], argv2);
    }
    close(fds[0]);
}
This version of pipeline uses pipe2 with O_CLOEXEC.
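To make the close-on-exec behavior concrete, here is a minimal sketch that inspects the flag directly. It is not the lecture's code: the helper names isCloexec and makeCloexecPipe are hypothetical, and it swaps in pipe followed by fcntl as a portable stand-in for pipe2(fds, O_CLOEXEC) (unlike pipe2, this sets the flag in a separate step rather than atomically).

```c
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: returns 1 if fd is marked close-on-exec,
// 0 if it is not, and -1 on error.
int isCloexec(int fd) {
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1) return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}

// Hypothetical stand-in for pipe2(fds, O_CLOEXEC): create the pipe,
// then mark both ends close-on-exec. Returns 0 on success, -1 on error.
int makeCloexecPipe(int fds[2]) {
    if (pipe(fds) == -1) return -1;
    for (int i = 0; i < 2; i++) {
        int flags = fcntl(fds[i], F_GETFD);
        if (flags == -1 || fcntl(fds[i], F_SETFD, flags | FD_CLOEXEC) == -1) {
            close(fds[0]);
            close(fds[1]);
            return -1;
        }
    }
    return 0;
}
```

Both pipe ends report the flag as set, while a descriptor produced by dup or dup2 reports it as clear. That is why the dup2 targets in pipeline survive execvp even though the original pipe descriptors do not.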
A signal is a way to notify a process that an event has occurred
myth$ ./my-program
Segmentation fault (core dumped)
myth$
A segmentation fault is actually a signal (SIGSEGV) sent from the OS to your program.
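Some examples of signals are SIGSEGV (invalid memory access), SIGINT (interrupt, e.g. Ctrl-C at the terminal), SIGSTOP and SIGCONT (suspend and resume a process), SIGKILL (forced termination), and SIGCHLD (a child process changed state).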
Signals can change a process's state. A process is in one of these states:
Running - a process is either executing or waiting to execute
Stopped - a process is suspended due to receiving a SIGSTOP or similar signal. A process will resume if it receives a SIGCONT signal.
Terminated - a process is permanently stopped, either due to finishing, or receiving a signal such as SIGSEGV or SIGKILL whose default behavior is to terminate the process.
waitpid()
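waitpid can be used to wait on children to terminate or change state:
pid_t waitpid(pid_t pid, int *status, int options);
The default behavior is to wait for the specified child process to exit. options lets us customize this further with flags that can be combined using | : WUNTRACED (also return when a child stops), WCONTINUED (also return when a stopped child resumes), and WNOHANG (return immediately instead of blocking if no child has changed state).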
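As an illustration of the options argument, the sketch below polls for a child with WNOHANG instead of blocking. The function names pollForChild and shortPause are hypothetical and the delays are arbitrary; this is one possible shape, not the lecture's code.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

// Hypothetical helper: pause for roughly the given number of milliseconds.
static void shortPause(long milliseconds) {
    struct timespec ts = { milliseconds / 1000, (milliseconds % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

// Forks a child that exits with `code` after a short delay, then polls
// for it with WNOHANG. Returns the child's exit status, or -1 on error.
int pollForChild(int code) {
    pid_t pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) {
        shortPause(100);
        _exit(code);
    }
    int status;
    for (;;) {
        pid_t result = waitpid(pid, &status, WNOHANG);
        if (result == pid) break;      // the child has terminated
        if (result == -1) return -1;   // waitpid itself failed
        // result == 0: the child is still running, so the parent is
        // free to do other work before checking again
        shortPause(10);
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With WNOHANG, a return value of 0 means "no state change yet", which is distinct from both a reaped PID and the -1 error case.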
The operating system sends many signals, but we can also send signals manually.
int kill(pid_t pid, int signum);
// same as kill(getpid(), signum)
int raise(int signum);
There are two main ways we can respond to signals we have received: installing a signal handler, or blocking until a signal is received.
Signal handlers are versatile but fraught with potential issues. We will learn about them first to motivate the second approach.
We can have a function of our choice execute when a certain signal is received.
typedef void (*sighandler_t)(int);
...
sighandler_t signal(int signum, sighandler_t handler);
static void handleSIGINT(int sig) {
    printf("Sigint received!\n");
}
int main(int argc, char *argv[]) {
    signal(SIGINT, handleSIGINT);
    printf("Just try to interrupt me!\n");
    while (true) {
        sleep(1);
    }
    return 0;
}
Key insight: when a child changes state, the kernel sends a SIGCHLD signal to its parent.
Let's write a program where a parent spawns off five children to go play, and does something else (sleeps 😴) until all the children are done.
static const size_t kNumChildren = 5;
int main(int argc, char *argv[]) {
    printf("Let my five children play while I take a nap.\n");
    for (size_t kid = 1; kid <= kNumChildren; kid++) {
        if (fork() == 0) {
            sleep(3 * kid); // sleep emulates "play" time
            printf("Child #%zu tired... returns to parent.\n", kid);
            return 0;
        }
    }
    // parent goes and does other work
    snooze(5); // custom fn to sleep uninterrupted
    return 0;
}
static const size_t kNumChildren = 5;
static size_t numChildrenDonePlaying = 0;
static void reapChild(int sig) {
    waitpid(-1, NULL, 0);
    numChildrenDonePlaying++;
}
int main(int argc, char *argv[]) {
    printf("Let my five children play while I take a nap.\n");
    signal(SIGCHLD, reapChild);
    for (size_t kid = 1; kid <= kNumChildren; kid++) {
        if (fork() == 0) {
            sleep(3 * kid); // sleep emulates "play" time
            printf("Child #%zu tired... returns to parent.\n", kid);
            return 0;
        }
    }
    while (numChildrenDonePlaying < kNumChildren) {
        printf("At least one child still playing, so parent nods off.\n");
        snooze(5); // custom fn to sleep uninterrupted
        printf("Parent wakes up! ");
    }
    printf("All children accounted for. Good job, parent!\n");
    return 0;
}
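The program relies on snooze because plain sleep returns early, with the number of seconds left, whenever a signal handler runs during the sleep. The course's actual snooze is not shown here, so the following is only a guess at how such a helper could work. The interruption count it returns, the ignoreAlarm handler, and countInterruptions are all hypothetical additions for illustration.

```c
#include <signal.h>
#include <unistd.h>

// Handler that does nothing; it exists only so SIGALRM interrupts sleep
// instead of terminating the process.
static void ignoreAlarm(int sig) {
    (void)sig;
}

// Hypothetical snooze: keeps sleeping until the full duration has elapsed,
// restarting with the remaining time after each interruption.
// Returns how many times the sleep was interrupted.
unsigned int snooze(unsigned int seconds) {
    unsigned int interruptions = 0;
    unsigned int remaining = seconds;
    while ((remaining = sleep(remaining)) > 0) {
        interruptions++;
    }
    return interruptions;
}

// Demo: schedule a SIGALRM one second into a three-second snooze and
// report how many times the underlying sleep was interrupted.
unsigned int countInterruptions(void) {
    signal(SIGALRM, ignoreAlarm);
    alarm(1);
    unsigned int interruptions = snooze(3);
    signal(SIGALRM, SIG_DFL);
    return interruptions;
}
```

In the five-children program, each SIGCHLD would cut a plain sleep(5) short in the same way, which is why the parent uses a helper that resumes sleeping.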
A signal can be received at any time, and a signal handler can execute at any time.
// five-children.c
static const size_t kNumChildren = 5;
static size_t numChildrenDonePlaying = 0;
static void reapChild(int sig) {
    waitpid(-1, NULL, 0);
    numChildrenDonePlaying++;
}
int main(int argc, char *argv[]) {
    printf("Let my five children play while I take a nap.\n");
    signal(SIGCHLD, reapChild);
    for (size_t kid = 1; kid <= kNumChildren; kid++) {
        if (fork() == 0) {
            sleep(3); // sleep emulates "play" time
            printf("Child #%zu tired... returns to parent.\n", kid);
            return 0;
        }
    }
    while (numChildrenDonePlaying < kNumChildren) {
        printf("At least one child still playing, so parent nods off.\n");
        snooze(5); // custom fn to sleep uninterrupted
        printf("Parent wakes up! ");
    }
    printf("All children accounted for. Good job, parent!\n");
    return 0;
}
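What happens if all children sleep for the same amount of time? (E.g. change sleep(3 * kid) to sleep(3) in the children, as in the version above.)
Problem: pending signals of the same type are not queued, so if several SIGCHLD signals arrive close together, the handler may be called only once for all of them. With one waitpid call per handler invocation, some children are never counted and the parent's loop never finishes.
Solution: the signal handler should clean up as many children as are available each time it runs, rather than exactly one.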
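One common way to reap as many children as possible is to call waitpid with WNOHANG in a loop until it reports that no more children are ready. The sketch below is an assumption about how that could look, not necessarily the version the course presents. The names reapChildren and playAndReap are hypothetical, and all children deliberately finish at about the same time to provoke the problem.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t numChildrenDonePlaying = 0;

// Reaps every child that has already terminated, not just one.
static void reapChildren(int sig) {
    (void)sig;
    int savedErrno = errno;  // waitpid may overwrite errno
    while (waitpid(-1, NULL, WNOHANG) > 0) {
        numChildrenDonePlaying++;
    }
    errno = savedErrno;
}

// Spawns up to numChildren children that all finish at about the same
// time, waits until every spawned child is reaped, and returns the count.
int playAndReap(int numChildren) {
    numChildrenDonePlaying = 0;
    signal(SIGCHLD, reapChildren);
    int spawned = 0;
    for (int kid = 0; kid < numChildren; kid++) {
        pid_t pid = fork();
        if (pid == 0) {
            sleep(1);  // every child "plays" for the same amount of time
            _exit(0);
        }
        if (pid > 0) spawned++;
    }
    while (numChildrenDonePlaying < spawned) {
        sleep(1);  // returns early when SIGCHLD arrives; the loop rechecks
    }
    signal(SIGCHLD, SIG_DFL);
    return numChildrenDonePlaying;
}
```

The loop stops when waitpid returns 0 (children exist but none have finished) or -1 (no children remain), so the handler never blocks waiting for a child that is still playing.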
Next time: more signal handlers and another approach to signals