Principles of Computer Systems
Winter 2020
Stanford University
Computer Science Department
Lecturers: Chris Gregg and Nick Troccoli
Layering: decomposing systems into components with well-defined responsibilities, specifying precise APIs between them (above and below)
$ cat people.txt | uniq | sort > list.txt
vnode abstraction of a file within the kernel
process control blocks, and they are stored in the process table
file descriptor table
read, write, and close)
$ cat in.txt > out.txt works)
$ ./main 1> log.txt 2> log.txt
Opens log.txt twice (two file table entries)
$ ./main 1> log.txt 2>&1
Opens log.txt once, two descriptors for same file table entry
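The difference between the two redirections can be reproduced without a shell. The following is a minimal sketch, not what the shell literally executes: the function names `write_with_two_opens` and `write_with_shared_entry` and the helper `write_text` are made up for illustration, and ordinary descriptors returned by `open` and `dup` stand in for descriptors 1 and 2.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

static const char *kError = "One plus one is\ntwo.\n"; // plays the role of stderr output
static const char *kMsg = "One plus two is\n";          // plays the role of stdout output

// Hypothetical helper: writes all of text to fd; returns 0 on success, -1 otherwise.
static int write_text(int fd, const char *text) {
    size_t len = strlen(text);
    return write(fd, text, len) == (ssize_t)len ? 0 : -1;
}

// Approximates `1> path 2> path`: two open() calls, so two file table
// entries, each with its own position that starts at 0.
int write_with_two_opens(const char *path) {
    assert(path != NULL);
    int out = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out == -1) return -1;
    int err = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (err == -1) { close(out); return -1; }
    int result = 0;
    if (write_text(err, kError) == -1) result = -1; // written at offset 0
    if (write_text(out, kMsg) == -1) result = -1;   // also written at offset 0, overwriting
    close(err);
    close(out);
    return result;
}

// Approximates `1> path 2>&1`: one open() call, then dup(), so both
// descriptors share one file table entry and one position.
int write_with_shared_entry(const char *path) {
    assert(path != NULL);
    int out = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out == -1) return -1;
    int err = dup(out);
    if (err == -1) { close(out); return -1; }
    int result = 0;
    if (write_text(err, kError) == -1) result = -1; // advances the shared position
    if (write_text(out, kMsg) == -1) result = -1;   // continues where the first write ended
    close(err);
    close(out);
    return result;
}
```

With two opens the second write starts again at offset 0 and overwrites the first line of the error text. With the shared entry the second write is appended after the first.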
// file: testfd.c
#include <stdio.h>
#include <unistd.h>
#include <string.h>

int main(int argc, char **argv)
{
    const char* error = "One plus one is\ntwo.\n";
    const char* msg = "One plus two is\n";
    write(2, error, strlen(error)); // descriptor 2 is stderr
    write(1, msg, strlen(msg));     // descriptor 1 is stdout
    return 0;
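}
[Figure: fd table entries 1 and 2 point to two separate file table entries (each pos: 0), both referring to the vnode for log.txt]
[Figure: fd table entries 1 and 2 point to one shared file table entry (pos: 0), referring to the vnode for log.txt]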
cgregg@myth60:$ ./testfd 1> log.txt 2> log.txt
cgregg@myth60:$ cat log.txt
One plus two is
two.
cgregg@myth60:$
cgregg@myth60:$ ./testfd 1> log.txt 2>&1
cgregg@myth60:$ cat log.txt
One plus one is
two.
One plus two is
cgregg@myth60:$
bash shell calls make, which itself calls g++, each of them inserts text into the same terminal window: those three files could be stdin, stdout, and stderr for a terminal
[Figure: process address space from 0x0 to 0xFFFFFFFFFFFFFFFF, with regions labeled data, heap, stack, bash, libc.so, and libdl.so]
read() and write() operate on the buffer cache
sync() system call flushes buffers associated with file
[Figure: Process A and Process B, each with libc.so, and the Buffer Cache; labeled mmap()]
In the CS curriculum so far, your programs have operated in a single process, meaning, basically, that one program was running your code. The operating system gave your program the illusion that it was the only thing running, and that was that.
Now, we are going to move into the realm of multiprocessing, where you control more than one process at a time with your programs. You will ask the OS, “do these things concurrently”, and it will.
sleep(), a read(), etc.
// file: getpidEx.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> // getpid

int main(int argc, char **argv)
{
    pid_t pid = getpid();
    printf("My process id: %d\n", pid);
    return 0;
}
cgregg@myth57$ ./getpidEx
My process id: 7526
fork() system call creates a new process
fork()
fork() does exactly this:
fork call returns a pid_t (an integer) to both processes. Neither is the actual pid of the process that receives it:
getpid itself to retrieve it.
fork is twofold:
getppid)
fork, and it is useful for a process to know whether it is the parent or the child.
fork, getpid, and getppid. The full program can be viewed right here.
int main(int argc, char *argv[]) {
    printf("Greetings from process %d! (parent %d)\n", getpid(), getppid());
    pid_t pid = fork(); // from here on, two processes run the same code
    assert(pid >= 0);
    printf("Bye-bye from process %d! (parent %d)\n", getpid(), getppid());
    return 0;
}
myth60$ ./basic-fork
Greetings from process 29686! (parent 29351)
Bye-bye from process 29686! (parent 29351)
Bye-bye from process 29687! (parent 29686)
myth60$ ./basic-fork
Greetings from process 29688! (parent 29351)
Bye-bye from process 29688! (parent 29351)
Bye-bye from process 29689! (parent 29688)
gdb has built-in support for debugging multiple processes, as follows:
set detach-on-fork off
gdb to capture any fork'd processes, though it pauses them upon the fork.
info inferiors
gdb has captured.
inferior X
detach inferior X
gdb to stop watching the process, and continue it
basic-fork program right here.
fork calls
fork this way, it's instructive to trace through a short program where spawned processes themselves call fork. The full program can be viewed right here.
static const char * const kTrail = "abcd";
int main(int argc, char *argv[]) {
    size_t trailLength = strlen(kTrail);
    for (size_t i = 0; i < trailLength; i++) {
        printf("%c\n", kTrail[i]);
        pid_t pid = fork(); // both parent and child continue the loop from here
        assert(pid >= 0);
    }
    return 0;
}
a is printed by the soon-to-be-great-grandparent process.
fork and continue running in mirror processes, each with their own copy of the global "abcd" string, and each advancing to the i++ line within a loop that promotes a 0 to 1. It's hopefully clear now that two b's will be printed.
b's always consecutive?
c's get printed?
d's get printed?
myth60$ ./fork-puzzle
a
b
c
b
d
c
d
c
c
d
d
d
d
d
d
myth60$
myth60$ ./fork-puzzle
a
b
b
c
d
c
d
c
d
d
c
d
myth60$ d
d
d
waitpid can be used to temporarily block a process until a child process exits.
waitpid can return.
NULL if we don't care for the information).
waitpid should only return when a process in the supplied wait set exits.
waitpid was called and there were no child processes in the supplied wait set.
pid_t waitpid(pid_t pid, int *status, int options);
waitpid
fork really gets used in practice (full program, with error checking, is right here):
int main(int argc, char *argv[]) {
    printf("Before.\n");
    pid_t pid = fork();
    printf("After.\n");
    if (pid == 0) {
        printf("I am the child, and the parent will wait up for me.\n");
        return 110; // contrived exit status
    } else {
        int status;
        waitpid(pid, &status, 0); // blocks until the child exits
        if (WIFEXITED(status)) {
            printf("Child exited with status %d.\n", WEXITSTATUS(status));
        } else {
            printf("Child terminated abnormally.\n");
        }
        return 0;
    }
}
waitpid.
waitpid call, and uses the WIFEXITED
WEXITSTATUS macro to extract the lower eight bits of its argument to produce the child return value (which we can see is, and should be, 110).
waitpid call also donates child process-oriented resources back to the system.
myth60$ ./separate
Before.
After.
After.
I am the child, and the parent will wait up for me.
Child exited with status 110.
myth60$
fork really is (full program, with more error checking, is right here).
printf gets executed twice. The child is always the first to execute it, because
waitpid call until the child executes everything.
int main(int argc, char *argv[]) {
    printf("I'm unique and just get printed once.\n");
    pid_t pid = fork(); // 0 in the child, the child's pid in the parent
    bool parent = pid != 0;
    if ((random() % 2 == 0) == parent) sleep(1); // force exactly one of the two to sleep
    if (parent) waitpid(pid, NULL, 0); // parent shouldn't exit until child has finished
    printf("I get printed twice (this one is being printed from the %s).\n",
           parent ? "parent" : "child");
    return 0;
}
fork multiple times, provided it reaps the child processes (via waitpid) once they exit. If we want to reap processes as they exit without concern for the order they were spawned, then this does the trick (full program, with error checking, is right here):
int main(int argc, char *argv[]) {
    for (size_t i = 0; i < 8; i++) {
        if (fork() == 0) exit(110 + i); // each child exits immediately with its own status
    }
    while (true) {
        int status;
        pid_t pid = waitpid(-1, &status, 0); // -1: wait for any child
        if (pid == -1) { assert(errno == ECHILD); break; } // no children left
        if (WIFEXITED(status)) {
            printf("Child %d exited: status %d\n", pid, WEXITSTATUS(status));
        } else {
            printf("Child %d exited abnormally.\n", pid);
        }
    }
    return 0;
}
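The else branches above report only that a child "exited abnormally". As a hedged illustration of what that can mean, the sketch below uses the standard WIFSIGNALED and WTERMSIG macros to tell a normal exit apart from termination by a signal. The function names `describe_status` and `spawn_and_wait` are hypothetical, and the child calls `_exit` rather than `exit` so that it does not flush stdio buffers inherited from the parent.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: formats a description of a waitpid status into buf.
void describe_status(int status, char *buf, size_t size) {
    if (WIFEXITED(status)) {
        snprintf(buf, size, "exited with status %d", WEXITSTATUS(status));
    } else if (WIFSIGNALED(status)) {
        snprintf(buf, size, "terminated by signal %d", WTERMSIG(status));
    } else {
        snprintf(buf, size, "did not terminate");
    }
}

// Hypothetical helper: forks one child and returns the raw status reported
// by waitpid. The child either exits with exit_code or, if kill_self is
// nonzero, terminates itself with SIGKILL.
int spawn_and_wait(int exit_code, int kill_self) {
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        if (kill_self) raise(SIGKILL); // abnormal termination: the child never reaches _exit
        _exit(exit_code);
    }
    int status = 0;
    pid_t reaped = waitpid(pid, &status, 0); // blocks until this child terminates
    assert(reaped == pid);
    return status;
}
```

A caller could pass the result of `spawn_and_wait(110, 0)` or `spawn_and_wait(0, 1)` to `describe_status` to see the two cases side by side.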
waitpid. man waitpid:
"The value of pid can be:
< -1 meaning wait for any child process whose process group ID is equal to the absolute value of pid.
-1 meaning wait for any child process.
0 meaning wait for any child process whose process group ID is equal to that of the calling process.
> 0 meaning wait for the child whose process ID is equal to the value of pid."
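The "> 0" case of the man page excerpt can be used to reap children in the order they were spawned instead of the order they exit. The following is a minimal sketch under stated assumptions: the function name `reap_in_spawn_order` and the constant `MAX_CHILDREN` are made up for illustration, and the calling process is assumed to have no other children.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16 // arbitrary cap chosen for this sketch

// Spawns count children, where child i exits with status base + i, then
// waits for each one by its specific pid, in spawn order. statuses[i]
// receives child i's exit status, or -1 if it did not exit normally.
// Returns 1 if no children remain afterwards, 0 otherwise.
int reap_in_spawn_order(size_t count, int base, int statuses[]) {
    assert(count <= MAX_CHILDREN);
    pid_t children[MAX_CHILDREN];
    for (size_t i = 0; i < count; i++) {
        children[i] = fork();
        assert(children[i] >= 0);
        if (children[i] == 0) _exit(base + (int)i); // child: exit right away
    }
    for (size_t i = 0; i < count; i++) {
        int status;
        pid_t pid = waitpid(children[i], &status, 0); // pid > 0: wait for this child only
        assert(pid == children[i]);
        statuses[i] = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
    // With every child reaped, a wait on any child fails with ECHILD.
    return waitpid(-1, NULL, 0) == -1 && errno == ECHILD;
}
```

Calling `reap_in_spawn_order(8, 110, statuses)` mirrors the eight-child program above, except that the statuses arrive as 110 through 117 in spawn order regardless of which child finishes first.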
waitpid correctly returns -1 to signal there are no more processes under the parent's jurisdiction.
waitpid returns -1, it sets a global variable called errno to the constant ECHILD to signal waitpid returned -1 because all child processes have terminated. That's the "error" we want.
$ vim main.c
$ vim main.c &