Spring 2021
Instructors Roz Cyrus and Jerry Cain
System calls we've used so far include open, read, write, close, stat, and lstat. We'll see many others in the coming weeks.

Functions like printf, malloc, fopen, and opendir are not syscalls. They're C library functions that themselves rely on syscalls to get their jobs done.

Unlike ordinary user functions (including the libc and libstdc++ library functions), system calls need to execute in some privileged mode so they can access data structures, system information, and other OS resources intentionally and necessarily hidden from user code.

open, for instance, needs to access all of the filesystem data structures for existence and permissioning. Filesystem implementation details should be hidden from the user, and permission information should be respected as private.

The data that open works with mustn't be visible to the user functions that call open. Restated, privileged information shouldn't be discoverable.

The heap supports malloc, realloc, free, and their C++ equivalents. It's initially very small, but grows as needed for processes requiring a good amount of dynamically allocated memory.

The shared library segment holds libraries like libc and libstdc++ with code for routines like C's printf, C's malloc, or C++'s getline. Shared libraries get their own segment so all processes can trampoline through some glue code (that is, the minimum amount of code necessary) to jump into the one copy of the library code that exists on behalf of all processes.

User code relies on %rsp to track the address boundary between the in-use portion of the user stack and the portion that's on deck to be used should the currently executing function invoke a subroutine.

User code relies on the callq and retq instructions for user function call and return.

The first six arguments to a user function are passed in %rdi, %rsi, %rdx, %rcx, %r8, and %r9.

Suppose main calls loadFiles, as per the diagram below.

[Diagram: the stack frame of main, with the stack frame of loadFiles directly below it]
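To make the stack layout concrete, here is a minimal sketch that compares the frame address of a caller with that of a function it calls. The function names are hypothetical stand-ins (the lecture's loadFiles is not reproduced here), and `__builtin_frame_address` and `__attribute__((noinline))` are GCC/Clang extensions rather than standard C. It assumes a platform such as x86-64 Linux, where the stack grows toward lower addresses.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical callee standing in for loadFiles: it only records where its
 * own stack frame lives. noinline keeps it a real call with its own frame. */
__attribute__((noinline)) void recordCalleeFrame(uintptr_t *out) {
    *out = (uintptr_t)__builtin_frame_address(0);
}

/* Hypothetical caller: returns 1 when the callee's frame sits at a lower
 * address than the caller's frame, and 0 otherwise. */
__attribute__((noinline)) int calleeFrameIsBelowCaller(void) {
    uintptr_t callerFrame = (uintptr_t)__builtin_frame_address(0);
    uintptr_t calleeFrame = 0;
    recordCalleeFrame(&calleeFrame);
    printf("caller frame: %p\n", (void *)callerFrame);
    printf("callee frame: %p\n", (void *)calleeFrame);
    return calleeFrame < callerFrame;
}
```

Calling `calleeFrameIsBelowCaller()` from main prints the two frame addresses. The exact values vary from run to run, but on a downward-growing stack the callee's frame address should be the smaller of the two.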
Because loadFiles's stack frame is directly below that of its caller, it can use pointer arithmetic to advance beyond its frame and examine, or even update, the stack frame above it.

After loadFiles returns, main could use pointer arithmetic to descend into the ghost of loadFiles's stack frame and access whatever loadFiles left behind.

Functions like open and stat need access to OS implementation detail that should not be exposed or otherwise accessible to the user program.

callq is used for user function call, but callq would dereference a function pointer we're not permitted to dereference, since it resides in kernel space.

System calls therefore need a call and return mechanism that doesn't rely on callq:

The opcode for the system call is placed in %rax. Each system call has its own opcode (e.g. 0 for read, 1 for write, 2 for open, 3 for close, 4 for stat, and so forth).

The system call's arguments are placed in %rdi, %rsi, %rdx, %r10, %r8, and %r9. Note the fourth parameter is %r10, not %rcx.

The program executes syscall, which prompts an interrupt handler to execute in superuser mode.

The interrupt handler does its work, places any return value in %rax, and then executes iretq to return from the interrupt handler, revert from superuser mode, and execute the instruction following the syscall.

If %rax is negative, errno is set to abs(%rax) and %rax is updated to contain a -1. If %rax is nonnegative, it's left as is. The value in %rax is then extracted by the caller as the return value.

fork

The program below uses the system calls fork, getpid, and getppid. The full program can be viewed right here.

#include <assert.h>     // assert
#include <stdio.h>      // printf
#include <sys/types.h>  // pid_t
#include <unistd.h>     // fork, getpid, getppid

int main(int argc, char *argv[]) {
    printf("Greetings from process %d! (parent %d)\n", getpid(), getppid());
    pid_t pid = fork();
    assert(pid >= 0);
    printf("Bye-bye from process %d! (parent %d)\n", getpid(), getppid());
    return 0;
}
myth60$ ./basic-fork
Greetings from process 29686! (parent 29351)
Bye-bye from process 29686! (parent 29351)
Bye-bye from process 29687! (parent 29686)
myth60$ ./basic-fork
Greetings from process 29688! (parent 29351)
Bye-bye from process 29688! (parent 29351)
Bye-bye from process 29689! (parent 29688)

fork is called once, but it returns twice.

getpid and getppid return the process id of the caller and the process id of the caller's parent, respectively.

fork knows how to clone the calling process, synthesize a nearly identical copy of it, and schedule the copy to run as if it's been running all along.
The original basic-fork processes, with pids of 29686 and 29688, are direct child processes of the terminal. The output tells us so: both report 29351 as their parent.

Differences between the parent calling fork and the child generated by it:

fork's return value in the two processes

When fork returns in the parent process, it returns the pid of the new child.

When fork returns in the child process, it returns 0. That isn't to say the child's pid is 0, but rather that fork elects to return a 0 as a way of allowing the child to easily self-identify as the child.

gdb has built-in support for debugging multiple processes, as follows:
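set detach-on-fork off
    This tells gdb to capture all fork'd processes, though it pauses each at fork.
info inferiors
    This lists the processes that gdb has captured.
inferior X
    This switches gdb to the captured process numbered X.
detach inferior X
    This tells gdb to stop watching the process before continuing it.

An example using the basic-fork program is linked right here.

What we know about fork so far: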
fork is a system call that creates a near duplicate of the current process.

In the parent, the return value of fork is the child's pid, and in the child, the return value is 0. This enables both the parent and the child to determine which process they are.

Apart from the return value of fork, there is virtually no difference in the two processes, and they both continue after fork as if they were the original process.

The parent can wait (more next time) for child processes to complete.

A program can make multiple fork calls. More on that in discussion section.
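The points above can be sketched in a few lines of C. This is an illustration rather than code from the lecture: the helper name forkAndReport is hypothetical, and it reaches slightly ahead by using waitpid so that the parent collects the child's exit status. The child uses its zero return value to self-identify, checks that getppid matches the pid the parent recorded before forking, and reports the result through its exit status.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks once. In the parent, returns the child's exit
 * status (0 if the child's checks passed), or -1 if fork or waitpid failed.
 * The child never returns from this function; it terminates via _exit. */
int forkAndReport(void) {
    pid_t originalPid = getpid();
    fflush(stdout);  /* keep already-buffered output from being printed twice */

    pid_t pid = fork();
    if (pid < 0) return -1;  /* fork failed, so no child was created */

    if (pid == 0) {
        /* Child: fork returned 0, which is a marker, not the child's pid. */
        printf("Child %d: fork returned 0, parent is %d\n", getpid(), getppid());
        fflush(stdout);
        _exit(getppid() == originalPid ? 0 : 1);
    }

    /* Parent: fork returned the pid of the new child. */
    printf("Parent %d: fork returned %d\n", getpid(), pid);
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

Calling `forkAndReport()` from main prints one line from each process. The pids differ from run to run, and the relative order of the two lines is up to the scheduler.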