Spring 2022
Jerry Cain
A file can be opened, and created if need be, with the open system call, and you can set the permissions at that time, as well. The open function comes with the following signatures:
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
The first argument names the file you'd like to interact with, e.g. "sh111.cc".
The second argument is a bitwise or'ed collection of flags that specifies how you'd like to interact with the file. The argument must include exactly one of the following:
O_RDONLY: read only
O_WRONLY: write only
O_RDWR: read and write (this one won't come up in Project 1)
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
ssize_t read(int fd, char buffer[], size_t len);
ssize_t write(int fd, char buffer[], size_t len);
int close(int fd);
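To see the five signatures above cooperate, here is a minimal sketch that copies one file to another using nothing but descriptors. The name copyFile and the 0644 permissions are assumptions made for illustration, and a production version would likely report errno instead of collapsing every failure into false.

```cpp
#include <fcntl.h>     // open and the O_* flags
#include <unistd.h>    // read, write, close

// Hypothetical helper: copies the file at src to dst, creating or truncating dst.
// Returns true on success and false if any system call fails.
bool copyFile(const char *src, const char *dst) {
    int in = open(src, O_RDONLY);
    if (in == -1) return false;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out == -1) { close(in); return false; }

    char buffer[1024];
    bool ok = true;
    while (ok) {
        ssize_t numRead = read(in, buffer, sizeof(buffer));
        if (numRead == 0) break;                    // end of file
        if (numRead == -1) { ok = false; break; }   // read error
        ssize_t numWritten = 0;
        while (numWritten < numRead) {              // write may accept fewer bytes than requested
            ssize_t n = write(out, buffer + numWritten, numRead - numWritten);
            if (n == -1) { ok = false; break; }
            numWritten += n;
        }
    }
    close(in);
    close(out);
    return ok;
}
```

A call such as copyFile("vowels.txt", "one.txt") would leave one.txt with the same bytes as vowels.txt. The inner loop is there because write is allowed to transfer only part of the buffer.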
The tee program that ships with Linux copies everything from standard input to standard output, making zero or more extra copies in the named files supplied as user program arguments. Given several file names, such as one.txt, two.txt, and three.txt, each named file receives its own copy.
If the file vowels.txt contains the five vowels and the newline character, and tee is invoked as follows, one.txt would be rewritten to contain only the English vowels.
$ cat vowels.txt | ./tee one.txt
aeiou
$ cat one.txt
aeiou
$ cat alphabet.txt | tee one.txt two.txt three.txt
abcdefghijklmnopqrstuvwxyz
$ cat one.txt
abcdefghijklmnopqrstuvwxyz
$ cat two.txt
abcdefghijklmnopqrstuvwxyz
$ diff one.txt two.txt
$ diff one.txt three.txt
$
We'll work through an implementation of our tee program during the review session; that implementation follows.
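#include <fcntl.h>     // open and the O_* flags
#include <unistd.h>    // read, write, close, STDIN_FILENO, STDOUT_FILENO

int main(int argc, char *argv[]) {
    int fds[argc];  // one descriptor for standard output, plus one per named file
    fds[0] = STDOUT_FILENO;
    for (int i = 1; i < argc; i++)
        fds[i] = open(argv[i], O_WRONLY | O_CREAT | O_TRUNC, 0644);

    char buffer[2048];
    while (true) {
        ssize_t numRead = read(STDIN_FILENO, buffer, sizeof(buffer));
        if (numRead <= 0) break;  // 0 means end of input, -1 means error
        for (int i = 0; i < argc; i++) write(fds[i], buffer, numRead);
    }

    for (int i = 1; i < argc; i++) close(fds[i]);
    return 0;
}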
argc incidentally equals the number of descriptors we need to write to. That's why we declare an int array (or rather, a descriptor array) of length argc.
STDIN_FILENO is a built-in constant for the number 0, which is the descriptor normally linked to standard input. STDOUT_FILENO is a constant for the number 1, which is the default descriptor bound to standard output.
fork
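The following program exercises three system calls: fork, getpid, and getppid.
#include <cassert>     // assert
#include <iostream>    // std::cout, std::endl
#include <unistd.h>    // fork, getpid, getppid

int main(int argc, char *argv[]) {
    std::cout << "Greetings from process " << getpid()
              << "! (parent " << getppid() << ")" << std::endl;
    pid_t pid = fork();  // returns once in the parent and once in the child
    assert(pid >= 0);    // fork returns -1 only when no child could be created
    std::cout << "Bye-bye from process " << getpid()
              << "! (parent " << getppid() << ")" << std::endl;
    return 0;
}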
myth60$ ./basic-fork
Greetings from process 29686! (parent 29351)
Bye-bye from process 29686! (parent 29351)
Bye-bye from process 29687! (parent 29686)
myth60$ ./basic-fork
Greetings from process 29688! (parent 29351)
Bye-bye from process 29688! (parent 29351)
Bye-bye from process 29689! (parent 29688)
fork is called once, but it returns twice: once in the parent, where it returns the child's pid, and once in the child, where it returns 0.
getpid and getppid return the process id of the caller and the process id of the caller's parent, respectively.
fork knows how to clone the calling process, synthesize a nearly identical copy of it, and schedule the copy to run as if it's been running all along.
pid_t waitpid(pid_t pid, int *status, int options);
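#include <cassert>     // assert
#include <iostream>    // std::cout, std::endl
#include <sys/wait.h>  // waitpid, WIFEXITED, WEXITSTATUS
#include <unistd.h>    // fork

int main(int argc, char *argv[]) {
    std::cout << "Before." << std::endl;
    pid_t pid = fork();
    std::cout << "After." << std::endl;  // printed twice, once per process
    if (pid == 0) {                      // only the child sees a return value of 0
        std::cout << "I'm taking CS111!" << std::endl;
        return 111;
    }
    int status;
    waitpid(pid, &status, 0);            // block until the child exits
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 111);
    std::cout << "Student completed CS111 and aced it!"
              << std::endl;
    return 0;
}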
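The status integer that waitpid fills in carries more than an exit code, so it is worth seeing how the macros pull it apart. The sketch below is illustrative only: statusOfChild and describe are hypothetical helper names, and a negative code is simply this example's way of asking the child to kill itself with SIGKILL.

```cpp
#include <signal.h>    // raise, SIGKILL
#include <string>      // std::string, std::to_string
#include <sys/wait.h>  // waitpid and the W* macros
#include <unistd.h>    // fork, _exit

// Hypothetical helper: forks a child that exits with the given code
// (or kills itself with SIGKILL when code is negative) and returns the raw status.
int statusOfChild(int code) {
    pid_t pid = fork();
    if (pid == 0) {
        if (code < 0) raise(SIGKILL);
        _exit(code);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    return status;
}

// Hypothetical helper: turns a raw status into a human-readable summary.
std::string describe(int status) {
    if (WIFEXITED(status))
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return "killed by signal " + std::to_string(WTERMSIG(status));
    return "stopped or continued";
}
```

WEXITSTATUS is only meaningful when WIFEXITED is true, which is why the program above checks both in its assert. Only the low eight bits of an exit code survive the trip back to the parent.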
execvp effectively cannibalizes a process to run a different program from scratch.
path is the relative or absolute pathname of the executable to be invoked. If it contains no slash, execvp searches the directories listed in PATH for it.
argv is the argument vector that should be funneled through to the new executable's main function.
path and argv[0] generally end up being the same exact string.
If execvp fails to cannibalize the process and install a new executable image within it, it returns -1 to express failure.
If execvp succeeds, it 😱 never returns 😱 (to the original executable, anyway).
execvp has many variants (execle, execlp, and so forth). Type man execvp to see all of them.
int execvp(const char *path, char *argv[]);
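Because a successful execvp never returns, it is almost always paired with fork so that the original program survives. The following is a minimal sketch of that pairing, assuming a hypothetical helper named runAndWait. The exit code 127 on failure is borrowed from the shell's "command not found" convention and is not something execvp itself dictates.

```cpp
#include <sys/wait.h>  // waitpid, WIFEXITED, WEXITSTATUS
#include <unistd.h>    // fork, execvp, _exit

// Hypothetical helper: runs argv[0] with the given arguments in a child process
// and returns its exit code, or -1 if the child did not exit normally.
int runAndWait(char *argv[]) {
    pid_t pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);  // only reached when execvp fails
    }
    int status;
    if (waitpid(pid, &status, 0) == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Given char *argv[] = {(char *) "sleep", (char *) "3", NULL}, a call to runAndWait(argv) would block for about three seconds and then return sleep's exit code.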
timeout launches the provided command with all of the arguments that follow and allows it to run for up to n seconds before terminating it.
If command finishes before time is up, timeout itself returns the exit code of that process without waiting any additional time.
If command doesn't finish before time is up, timeout kills it and returns an exit code of 124.
myth62:~$ ./timeout <n> <command> [<arg1> [<arg2> [...]]]
myth62:~$ ./timeout 5 sleep 3
myth62:~$ echo $? # this prints return value of last command
0
myth62:~$ ./timeout 5 sleep 10
myth62:~$ echo $?
124
myth62:~$ ./timeout 1 factor 1234 2345 3456
1234: 2 617
2345: 5 7 67
3456: 2 2 2 2 2 2 2 3 3 3
myth62:~$ echo $?
0
myth62:~$ ./timeout 0 factor 3125250912230709951372256510
myth62:~$ echo $?
124
myth62:~$
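#include <signal.h>    // kill, SIGKILL
#include <stdlib.h>    // atoi, exit
#include <sys/wait.h>  // waitpid, WEXITSTATUS
#include <unistd.h>    // fork, execvp, sleep

int main(int argc, char *argv[]) {
    pid_t timed = fork();  // first child runs the command
    if (timed == 0) { execvp(argv[2], argv + 2); exit(0); }  // exit is reached only if execvp fails
    pid_t timer = fork();  // second child just sleeps for n seconds
    if (timer == 0) { sleep(atoi(argv[1])); return 0; }
    int status;
    pid_t gold = waitpid(-1, &status, 0);  // whichever child finishes first
    pid_t silver = gold == timed ? timer : timed;
    kill(silver, SIGKILL);                 // the other child is no longer needed
    waitpid(silver, NULL, 0);              // reap it so it doesn't linger as a zombie
    if (gold == timed) {
        return WEXITSTATUS(status);
    } else {
        return 124;
    }
}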
The pipe system call
int pipe(int fds[]);
The pipe system call takes an uninitialized array of two integers, which we'll call fds, and populates it with two file descriptors such that everything written to fds[1] can be read from fds[0].
pipe is particularly useful for allowing parent processes to communicate with spawned child processes.
How does pipe work?
To illustrate how pipe works and how messages can be passed from one process to a second, let's consider the following program:
#include <iostream>    // std::cout, std::endl
#include <sys/wait.h>  // waitpid
#include <unistd.h>    // pipe, fork, read, write, close, getpid

int main(int argc, char *argv[]) {
    int fds[2];
    pipe(fds);
    pid_t pid = fork();
    if (pid == 0) {
        close(fds[1]); // close is the fclose of descriptors
        char buffer[6];
        read(fds[0], buffer, sizeof(buffer)); // read is the scanf of descriptors
        std::cout << "Read the following from the pid " << getpid() << ": \""
                  << buffer << "\"." << std::endl;
        close(fds[0]);
        return 0;
    }
    close(fds[0]);
    std::cout << "Printing \"hello\" from pid " << getpid() << "." << std::endl;
    write(fds[1], "hello", 6); // write is the printf of descriptors; 6 includes the '\0'
    close(fds[1]);
    waitpid(pid, NULL, 0);
    return 0;
}
How do pipe and fork work together in this example?
The array fds is shared with the call to pipe.
pipe allocates two descriptors, setting the first to read from a resource and the second to write to that same resource. Think of this resource as an unnamed file that only the OS knows about.
pipe then plants copies of those two descriptors into indices 0 and 1 of the supplied array before it returns.
The fork call creates a child process, which itself inherits a shallow copy of the parent's fds array.
Immediately after the fork call, anything printed to fds[1] is readable from the parent's fds[0] and the child's fds[0].
Similarly, both the parent and the child can write to the pipe through their own copies of fds[1].
The parent closes fds[0] before it writes anything to fds[1] to emphasize the fact that the parent has no need to read anything from the pipe.
The child closes fds[1] before it reads from fds[0] to be clear it has zero interest in printing anything to the pipe.
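Closing unused pipe ends is more than tidiness: a reader only sees end-of-file once every copy of the write end has been closed. The sketch below reverses the roles from the program above (the child writes, the parent reads) to make that visible. The helper name relayThroughPipe and the deliberately tiny 8-byte buffer are assumptions for illustration.

```cpp
#include <string>      // std::string
#include <sys/wait.h>  // waitpid
#include <unistd.h>    // pipe, fork, read, write, close, _exit

// Hypothetical helper: a child writes message into a pipe one byte at a time,
// and the parent reads until end-of-file and returns everything it saw.
std::string relayThroughPipe(const std::string& message) {
    int fds[2];
    if (pipe(fds) == -1) return "";
    pid_t pid = fork();
    if (pid == -1) { close(fds[0]); close(fds[1]); return ""; }
    if (pid == 0) {
        close(fds[0]);  // the child only writes
        for (char ch : message) {
            if (write(fds[1], &ch, 1) != 1) break;
        }
        close(fds[1]);
        _exit(0);
    }
    close(fds[1]);  // without this, the parent's own write end would keep the pipe open
    std::string result;
    char buffer[8];
    while (true) {
        ssize_t numRead = read(fds[0], buffer, sizeof(buffer));
        if (numRead <= 0) break;  // 0 means every write end has been closed
        result.append(buffer, numRead);
    }
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return result;
}
```

If the parent forgot to close fds[1], its read loop would never see a return value of 0 and the function would hang after the child exits.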
int dup2(int source, int target);
dup2(fds[0], STDIN_FILENO); // STDIN_FILENO is a #define constant for 0
close(fds[0]);
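The same dup2-then-close idiom works for any descriptor, not only pipe ends. As a hedged illustration, the sketch below rewires a child's standard output to a file before calling execvp, which is roughly what a shell does for command > file. The helpers runWithOutputTo and readFile are hypothetical names, and the exit codes 126 and 127 are conventions chosen for this example.

```cpp
#include <fcntl.h>     // open and the O_* flags
#include <string>      // std::string
#include <sys/wait.h>  // waitpid, WIFEXITED, WEXITSTATUS
#include <unistd.h>    // fork, dup2, close, execvp, read, _exit

// Hypothetical helper: runs argv in a child whose standard output is rewired to path.
// Returns the child's exit code, or -1 if it did not exit normally.
int runWithOutputTo(char *argv[], const char *path) {
    pid_t pid = fork();
    if (pid == -1) return -1;
    if (pid == 0) {
        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd == -1) _exit(126);
        dup2(fd, STDOUT_FILENO);  // descriptor 1 now refers to the file
        close(fd);                // the duplicate is all the child needs
        execvp(argv[0], argv);
        _exit(127);               // only reached when execvp fails
    }
    int status;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// Hypothetical helper: reads an entire file through a descriptor.
std::string readFile(const char *path) {
    std::string contents;
    int fd = open(path, O_RDONLY);
    if (fd == -1) return contents;
    char buffer[256];
    ssize_t numRead;
    while ((numRead = read(fd, buffer, sizeof(buffer))) > 0)
        contents.append(buffer, numRead);
    close(fd);
    return contents;
}
```

The new program never learns that its output was redirected: it writes to descriptor 1 as usual, and the descriptor table set up before execvp decides where those bytes land.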
myth51:~$ cat /usr/include/tar.h | wc
112 600 3786
myth51:~$ echo -e "pear\ngrape\npeach\napricot\nbanana\napple" | sort | grep ap
apple
apricot
grape
myth51:~$ time sleep 5 | sleep 10
real 0m10.004s
user 0m0.006s
sys 0m0.001s
myth55:~$ curl -sL "http://cs111.stanford.edu/odyssey.txt" | sed 's/[^a-zA-Z ]/ /g' |
> tr 'A-Z ' 'a-z\n' | grep [a-z] | sort -u |
> comm -23 - <(sort /usr/share/dict/words) | less
myth55:~$
Consider a pipeline function that codes to the following interface:
void pipeline(char *argv1[], char *argv2[], pid_t pids[]);
pipeline accepts two argument vectors and, assuming both vectors are valid, spawns off twin processes with the added bonus that the standard output of the first is directed to the standard input of the second.
Assume all pipeline calls are well-formed and work as expected: argv1 and argv2 are each valid, NULL-terminated argument vectors, and pids is the base address of an array of length two.
Assume that pipe, dup2, close, execvp, and so forth succeed so that you needn't do any error checking whatsoever.
pipeline should return without waiting for either of the child processes to finish, and the pids of the two processes are dropped into pids[0] and pids[1].
char *argv1[] = {(char *) "sleep", (char *) "10", NULL};
char *argv2[] = {(char *) "sleep", (char *) "10", NULL};
pid_t pids[2];
pipeline(argv1, argv2, pids);
#include <unistd.h>    // pipe, fork, dup2, close, execvp

void pipeline(char *argv1[], char *argv2[], pid_t pids[]) {
    int fds[2];
    pipe(fds);
    pids[0] = fork();
    if (pids[0] == 0) {
        close(fds[0]);               // first child never reads from the pipe
        dup2(fds[1], STDOUT_FILENO); // second arg can be 1 instead of constant
        close(fds[1]);
        execvp(argv1[0], argv1);
    }
    close(fds[1]); // was only relevant to first child, so close before second fork
    pids[1] = fork();
    if (pids[1] == 0) {
        dup2(fds[0], STDIN_FILENO);  // second arg can be 0 instead of constant
        close(fds[0]);
        execvp(argv2[0], argv2);
    }
    close(fds[0]);
}
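Since pipeline returns immediately, the caller is responsible for reaping both children. The sketch below shows one way that might look, assuming a hypothetical wrapper named runPipeline. pipeline is repeated so the block compiles on its own, with one addition not in the version above: an _exit(127) after each execvp, so a failed exec cannot fall through into the caller's code.

```cpp
#include <sys/wait.h>  // waitpid, WIFEXITED, WEXITSTATUS
#include <unistd.h>    // pipe, fork, dup2, close, execvp, _exit

void pipeline(char *argv1[], char *argv2[], pid_t pids[]) {
    int fds[2];
    pipe(fds);
    pids[0] = fork();
    if (pids[0] == 0) {
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execvp(argv1[0], argv1);
        _exit(127);  // addition for this sketch: only reached when execvp fails
    }
    close(fds[1]);
    pids[1] = fork();
    if (pids[1] == 0) {
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        execvp(argv2[0], argv2);
        _exit(127);  // addition for this sketch: only reached when execvp fails
    }
    close(fds[0]);
}

// Hypothetical wrapper: runs argv1 | argv2, waits for both processes,
// and returns the exit code of the second (or -1 if it did not exit normally).
int runPipeline(char *argv1[], char *argv2[]) {
    pid_t pids[2];
    pipeline(argv1, argv2, pids);
    int status = 0;
    waitpid(pids[0], NULL, 0);
    waitpid(pids[1], &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Reporting the second process's exit code mirrors what a shell does by default for a two-stage pipeline, where $? reflects the last command.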
A file can be opened, and created if need be, with the open system call, and you can set the permissions at that time, as well. The open function comes with the following signatures:
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
There are many flags (see man 2 open for a list of them), and they can be bitwise or'd together. You must, however, include exactly one of the following flags:
O_RDONLY: read only
O_WRONLY: write only
O_RDWR: read and write (this one won't come up in Project 1)