Spring 2021
Instructors Roz Cyrus and Jerry Cain
System calls we've seen so far include open, read, write, close, stat, and lstat. We'll see many others in the coming weeks.
printf, malloc, fopen, and opendir are not syscalls. They're C library functions that themselves rely on syscalls to get their jobs done.
Unlike ordinary user functions (including libc and libstdc++ library functions), system calls need to execute in some privileged mode so they can access data structures, system information, and other OS resources intentionally and necessarily hidden from user code.
open, for instance, needs to consult the filesystem data structures to check that a file exists and that the caller has permission to access it. Filesystem implementation details should be hidden from the user, and permission information should be respected as private.
The data that open works with mustn't be visible to the user functions that call open. Restated, privileged information shouldn't be discoverable.
The heap supports malloc, realloc, free, and their C++ equivalents. It's initially very small, but grows as needed for processes requiring a good amount of dynamically allocated memory.
The shared library segment holds libraries like libc and libstdc++, with code for routines like C's printf, C's malloc, or C++'s getline. Shared libraries get their own segment so all processes can trampoline through some glue code (that is, the minimum amount of code necessary) to jump into the one copy of the library code that exists on behalf of all processes.
The user stack uses %rsp to track the address boundary between the in-use portion of the user stack and the portion that's on deck to be used should the currently executing function invoke a subroutine.
x86-64 uses the callq and retq instructions for user function call and return.
The first six integer or pointer arguments are passed in the registers %rdi, %rsi, %rdx, %rcx, %r8, and %r9.
Consider main calling loadFiles, as per the diagram below. Because loadFiles's stack frame is directly below that of its caller, it can use pointer arithmetic to advance beyond its frame and examine, or even update, the stack frame above it.
After loadFiles returns, main could use pointer arithmetic to descend into the ghost of loadFiles's stack frame and access whatever loadFiles left behind.
System calls like open and stat need access to OS implementation detail that should not be exposed or otherwise accessible to the user program.
callq is used for user function call, but for a system call callq would dereference a function pointer we're not permitted to dereference, since it resides in kernel space.
System calls therefore need a call and return model that doesn't rely on callq.
The system call's opcode is placed in %rax. Each system call has its own opcode (e.g. 0 for read, 1 for write, 2 for open, 3 for close, 4 for stat, and so forth).
Up to six arguments are placed in %rdi, %rsi, %rdx, %r10, %r8, and %r9. Note the fourth parameter is %r10, not %rcx.
The program then executes syscall, which prompts an interrupt handler to execute in superuser mode.
The interrupt handler carries out the request, places any return value in %rax, and then executes iretq to return from the interrupt handler, revert from superuser mode, and execute the instruction following the syscall.
If %rax is negative, errno is set to abs(%rax) and %rax is updated to contain a -1. If %rax is nonnegative, it's left as is. The value in %rax is then extracted by the caller.
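The program below uses the system calls fork, getpid, and getppid. The full program can be viewed right here.
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    printf("Greetings from process %d! (parent %d)\n", getpid(), getppid());
    // fork returns in both the original process and the new child
    pid_t pid = fork();
    assert(pid >= 0);
    printf("Bye-bye from process %d! (parent %d)\n", getpid(), getppid());
    return 0;
}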
myth60$ ./basic-fork
Greetings from process 29686! (parent 29351)
Bye-bye from process 29686! (parent 29351)
Bye-bye from process 29687! (parent 29686)
myth60$ ./basic-fork
Greetings from process 29688! (parent 29351)
Bye-bye from process 29688! (parent 29351)
Bye-bye from process 29689! (parent 29688)
fork is called once, but it returns twice.
getpid and getppid return the process id of the caller and the process id of the caller's parent, respectively.
fork knows how to clone the calling process, synthesize a nearly identical copy of it, and schedule the copy to run as if it's been running all along.
The original basic-fork processes, with pids of 29686 and 29688, are direct child processes of the terminal. The output tells us so: each reports 29351 as its parent.
Differences between the process that calls fork and the child generated by it:
fork's return value in the two processes
When fork returns in the parent process, it returns the pid of the new child.
When fork returns in the child process, it returns 0. That isn't to say the child's pid is 0, but rather that fork elects to return a 0 as a way of allowing the child to easily self-identify as the child.
gdb has built-in support for debugging multiple processes, as follows:
set detach-on-fork off: tells gdb to capture all fork'd processes, though it pauses each at fork.
info inferiors: lists the processes gdb has captured.
inferior X: switches gdb to the captured process numbered X.
detach inferior X: tells gdb to stop watching the process before continuing it.
See the basic-fork program right here.
What we know about fork so far:
fork is a system call that creates a near duplicate of the current process.
In the parent, the return value of fork is the child's pid, and in the child, the return value is 0. This enables both the parent and the child to determine which process they are.
After fork, there is virtually no difference in the two processes, and they both continue after fork as if they were the original process.
The parent can use wait (more next time) to wait for child processes to complete.
More on fork calls in discussion section.