Principles of Computer Systems
Winter 2020
Stanford University
Computer Science Department
Lecturers: Chris Gregg and Nick Troccoli
Layering: decomposing systems into components with well-defined responsibilities, specifying precise APIs between them (above and below)
$ cat people.txt | uniq | sort > list.txt
vnode: abstraction of a file within the kernel
process control blocks, and they are stored in the process table
file descriptor table
read, write, and close)
$ cat in.txt > out.txt
works)
$ ./main 1> log.txt 2> log.txt
Opens log.txt twice (two file table entries)
$ ./main 1> log.txt 2>&1
Opens log.txt once, two descriptors for same file table entry
// file: testfd.c
#include <stdio.h>
#include <unistd.h>
#include <string.h>
int main(int argc, char **argv)
{
    const char* error = "One plus one is\ntwo.\n";
    const char* msg = "One plus two is\n";
    write(2, error, strlen(error)); // descriptor 2 is stderr
    write(1, msg, strlen(msg));     // descriptor 1 is stdout
    return 0;
}
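[Figure: fd table entries 1 and 2 point to two separate file table entries (each with pos: 0), both of which refer to the vnode for log.txt]
[Figure: fd table entries 1 and 2 point to a single shared file table entry (pos: 0), which refers to the vnode for log.txt]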
cgregg@myth60:$ ./testfd 1> log.txt 2> log.txt
cgregg@myth60:$ cat log.txt
One plus two is
two.
cgregg@myth60:$
cgregg@myth60:$ ./testfd 1> log.txt 2>&1
cgregg@myth60:$ cat log.txt
One plus one is
two.
One plus two is
cgregg@myth60:$
bash shell calls make, which itself calls g++, each of them inserts text into the same terminal window: those three files could be stdin, stdout, and stderr for a terminal
[Figure: virtual address space of a process, running from 0x0 to 0xFFFFFFFFFFFFFFFF, with regions labeled bash, data, heap, libdl.so, libc.so, and stack]
read() and write() operate on the buffer cache
sync() system call flushes buffers associated with file
[Figure: Process A and Process B, each with libc.so mapped, above a shared Buffer Cache]
mmap()
[Figure: Process A and Process B, each with libc.so mapped, above a shared Buffer Cache]
In the CS curriculum so far, your programs have operated in a single process: one running program executed your code, and the operating system gave it the illusion that it was the only thing running on the machine.
Now we move into the realm of multiprocessing, where your programs control more than one process at a time. You will ask the OS to "do these things concurrently", and it will.
sleep(), a read(), etc.
// file: getpidEx.c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h> // getpid
int main(int argc, char **argv)
{
    pid_t pid = getpid();
    printf("My process id: %d\n", pid);
    return 0;
}
cgregg@myth57$ ./getpidEx
My process id: 7526
The fork() system call creates a new process
fork()
fork() does exactly this:
fork call returns a pid_t (an integer) to both processes. Neither is the actual pid of the process that receives it: the parent receives the pid of the child, and the child receives 0 and can call getpid itself to retrieve it.
fork is twofold:
getppid)
fork, and it is useful for a process to know whether it is the parent or the child.
fork, getpid, and getppid. The full program can be viewed right here.
#include <assert.h>
#include <stdio.h>
#include <unistd.h> // fork, getpid, getppid
int main(int argc, char *argv[]) {
    printf("Greetings from process %d! (parent %d)\n", getpid(), getppid());
    pid_t pid = fork(); // both processes continue from here
    assert(pid >= 0);   // -1 would mean fork failed
    printf("Bye-bye from process %d! (parent %d)\n", getpid(), getppid());
    return 0;
}
myth60$ ./basic-fork
Greetings from process 29686! (parent 29351)
Bye-bye from process 29686! (parent 29351)
Bye-bye from process 29687! (parent 29686)
myth60$ ./basic-fork
Greetings from process 29688! (parent 29351)
Bye-bye from process 29688! (parent 29351)
Bye-bye from process 29689! (parent 29688)
gdb has built-in support for debugging multiple processes, as follows:
set detach-on-fork off: tells gdb to capture any fork'd processes, though it pauses them upon the fork.
info inferiors: lists the processes gdb has captured.
inferior X: switches to process X.
detach inferior X: tells gdb to stop watching the process, and continue it
basic-fork program right here.
fork calls
fork this way, it's instructive to trace through a short program where spawned processes themselves call fork. The full program can be viewed right here.
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
static const char *const kTrail = "abcd";
int main(int argc, char *argv[]) {
    size_t trailLength = strlen(kTrail);
    for (size_t i = 0; i < trailLength; i++) {
        printf("%c\n", kTrail[i]);
        pid_t pid = fork(); // every process that reaches this line spawns a child
        assert(pid >= 0);
    }
    return 0;
}
fork calls
a is printed by the soon-to-be-great-grandparent process.
fork and continue running in mirror processes, each with their own copy of the global "abcd" string, and each advancing to the i++ line within a loop that promotes a 0 to 1. It's hopefully clear now that two b's will be printed.
Are the two b's always consecutive?
How many c's get printed?
How many d's get printed?
myth60$ ./fork-puzzle
a
b
c
b
d
c
d
c
c
d
d
d
d
d
d
myth60$
myth60$ ./fork-puzzle
a
b
b
c
d
c
d
c
d
d
c
d
myth60$ d
d
d
waitpid can be used to temporarily block a process until a child process exits.
waitpid can return.
NULL if we don't care for the information).
waitpid should only return when a process in the supplied wait set exits.
waitpid was called and there were no child processes in the supplied wait set.
pid_t waitpid(pid_t pid, int *status, int options);
waitpid
fork really gets used in practice (full program, with error checking, is right here):
#include <stdio.h>
#include <sys/wait.h> // waitpid, WIFEXITED, WEXITSTATUS
#include <unistd.h>
int main(int argc, char *argv[]) {
    printf("Before.\n");
    pid_t pid = fork();
    printf("After.\n");
    if (pid == 0) {
        printf("I am the child, and the parent will wait up for me.\n");
        return 110; // contrived exit status
    } else {
        int status;
        waitpid(pid, &status, 0); // block until the child exits
        if (WIFEXITED(status)) {
            printf("Child exited with status %d.\n", WEXITSTATUS(status));
        } else {
            printf("Child terminated abnormally.\n");
        }
        return 0;
    }
}
waitpid.
waitpid call, and uses the WIFEXITED macro (true when the child exited normally)
WEXITSTATUS macro to extract the child's exit status from it, that is, the lower eight bits of the value the child returned (which we can see is, and should be, 110).
waitpid call also donates child process-oriented resources back to the system.
myth60$ ./separate
Before.
After.
After.
I am the child, and the parent will wait up for me.
Child exited with status 110.
myth60$
fork really is (full program, with more error checking, is right here).
printf gets executed twice. The child is always the first to execute it, because the parent is blocked in its waitpid call until the child executes everything.
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>   // random
#include <sys/wait.h> // waitpid
#include <unistd.h>   // fork, sleep
int main(int argc, char *argv[]) {
    printf("I'm unique and just get printed once.\n");
    pid_t pid = fork(); // 0 in the child, the child's pid in the parent
    bool parent = pid != 0;
    if ((random() % 2 == 0) == parent) sleep(1); // force exactly one of the two to sleep
    if (parent) waitpid(pid, NULL, 0); // parent shouldn't exit until child has finished
    printf("I get printed twice (this one is being printed from the %s).\n",
           parent ? "parent" : "child");
    return 0;
}
fork multiple times, provided it reaps the child processes (via waitpid) once they exit. If we want to reap processes as they exit without concern for the order they were spawned, then this does the trick (full program, with error checking, is right here):
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>   // exit
#include <sys/wait.h> // waitpid
#include <unistd.h>   // fork
int main(int argc, char *argv[]) {
    for (size_t i = 0; i < 8; i++) {
        if (fork() == 0) exit(110 + i); // each child exits with its own status
    }
    while (true) {
        int status;
        pid_t pid = waitpid(-1, &status, 0); // -1: wait for any child
        if (pid == -1) { assert(errno == ECHILD); break; } // no children left
        if (WIFEXITED(status)) {
            printf("Child %d exited: status %d\n", pid, WEXITSTATUS(status));
        } else {
            printf("Child %d exited abnormally.\n", pid);
        }
    }
    return 0;
}
waitpid.
man waitpid:
"The value of pid can be:
< -1 meaning wait for any child process whose process group ID is equal to the absolute value of pid.
-1 meaning wait for any child process.
0 meaning wait for any child process whose process group ID is equal to that of the calling process.
> 0 meaning wait for the child whose process ID is equal to the value of pid."
waitpid correctly returns -1 to signal there are no more processes under the parent's jurisdiction.
When waitpid returns -1, it sets a global variable called errno to the constant ECHILD to signal waitpid returned -1 because all child processes have terminated. That's the "error" we want.
$ vim main.c
$ vim main.c &