CS110: Principles of Computer Systems

Spring 2021
Instructors Roz Cyrus and Jerry Cain


Introduction to UNIX Filesystems

  • We have already discussed two file system API calls: open and umask. We will now look at other low-level operations that allow programmers to interact with the file system. We focus here on the direct system calls, but when writing production code, you'll generally use abstractions like FILE *, ifstream, and ofstream, whose implementations layer over file descriptors and calls to open, umask, close, and related system calls like read, write, and stat, which we'll discuss today.
  • Requests to open a file, read from a file, extend the heap, etc., all eventually go through system calls, which are the only functions trustworthy enough to interact with the OS on your behalf. The OS kernel executes the code of a system call, isolating all system-level interactions from your (potentially buggy and harmful) program.
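  • To make the layering concrete, here is a minimal sketch (not part of the lecture code) showing that a FILE * sits on top of a descriptor. fileno reports the descriptor underneath a FILE *, and fdopen builds a FILE * on top of a descriptor that came from open. The helper names descriptorBehind and countBytesViaStream are hypothetical, chosen only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L  // request the POSIX declarations of fileno and fdopen
#include <fcntl.h>   // open, O_RDONLY
#include <stdio.h>   // FILE, fopen, fdopen, fileno, fgetc, fclose
#include <unistd.h>  // close

// Hypothetical helper: returns the descriptor wrapped by a FILE * opened
// on the named file, or -1 if the file can't be opened.
static int descriptorBehind(const char *path) {
  FILE *fp = fopen(path, "r");
  if (fp == NULL) return -1;
  int fd = fileno(fp);  // the descriptor that fopen obtained on our behalf
  fclose(fp);           // closes the underlying descriptor as well
  return fd;
}

// Hypothetical helper: opens the file with the open system call, layers a
// FILE * over the descriptor, and counts bytes using the buffered fgetc.
// Returns -1 if the file can't be opened.
static long countBytesViaStream(const char *path) {
  int fd = open(path, O_RDONLY);
  if (fd == -1) return -1;
  FILE *fp = fdopen(fd, "r");
  if (fp == NULL) {
    close(fd);
    return -1;
  }
  long count = 0;
  while (fgetc(fp) != EOF) count++;
  fclose(fp);  // fclose releases fd too, so no separate close is needed
  return count;
}
```

  • Calling countBytesViaStream on a small text file should report the same size that ls -l does, even though every byte ultimately arrived through read calls made by the stdio library.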

Implementing copy to emulate cp

  • The implementation of copy (designed to mimic the behavior of cp) illustrates how to use open, read, write, and close, and what file descriptors are.
    man pages exist for all of these functions (e.g. man 2 open, man 2 read, etc.)
    Full implementation of our own copy, with exhaustive error checking, is right here.
    Simplified implementation, sans error checking, is on the next slide.

Back to file systems: Implementing copy

#include <fcntl.h>    // open and the O_* flags
#include <stdbool.h>  // true
#include <unistd.h>   // read, write, close

int main(int argc, char *argv[]) {
  int fdin = open(argv[1], O_RDONLY);
  int fdout = open(argv[2], O_WRONLY | O_CREAT | O_EXCL, 0644);
  char buffer[1024];
  while (true) {
    ssize_t bytesRead = read(fdin, buffer, sizeof(buffer));
    if (bytesRead == 0) break; // 0 means there's nothing left to read
    ssize_t bytesWritten = 0;
    while (bytesWritten < bytesRead) { // write may accept fewer bytes than requested
      bytesWritten += write(fdout, buffer + bytesWritten, bytesRead - bytesWritten);
    }
  }
  close(fdin);
  close(fdout);
  return 0;
}
  • The read system call blocks until at least one byte is available to be read. If read returns 0, there are no more bytes to read, presumably because you've reached the end of the file or, for a pipe or network connection, because the other end was closed.
  • If write returns a value less than the value supplied as the third argument, it means that the system couldn't write all bytes at once, hence the while loop and the need to keep track of bytesRead and bytesWritten.
  • You should close file descriptors as soon as you're done with them so that descriptors can be reused on behalf of future open calls and other syscalls—that's shorthand for system calls—that allocate descriptors. Some systems allow a surprisingly small number of descriptors to be open at any one time, so be sure to close them.
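  • The reuse of descriptors is easy to observe. The sketch below is an illustration under assumptions, not lecture code: descriptorIsReused and maxOpenDescriptors are hypothetical helper names. It relies on two standard facts: open returns the lowest-numbered descriptor not currently in use, and getrlimit with RLIMIT_NOFILE reports the per-process limit on open descriptors.

```c
#include <fcntl.h>         // open, O_RDONLY
#include <sys/resource.h>  // getrlimit, struct rlimit, RLIMIT_NOFILE
#include <unistd.h>        // close

// Hypothetical helper: opens path, closes it, and opens it again.
// Returns 1 if both opens produced the same descriptor number, 0 if they
// differed, and -1 if either open failed.
static int descriptorIsReused(const char *path) {
  int first = open(path, O_RDONLY);
  if (first == -1) return -1;
  close(first);  // the number becomes available again
  int second = open(path, O_RDONLY);
  if (second == -1) return -1;
  close(second);
  return first == second;
}

// Hypothetical helper: returns the current (soft) limit on the number of
// descriptors this process may have open at once, or 0 on failure.
static unsigned long long maxOpenDescriptors(void) {
  struct rlimit limit;
  if (getrlimit(RLIMIT_NOFILE, &limit) == -1) return 0;
  return (unsigned long long) limit.rlim_cur;
}
```

  • In a single-threaded program, descriptorIsReused should return 1 for any readable path, because nothing else can claim the freed number between the close and the second open. Printing maxOpenDescriptors() shows how quickly a program that never closes descriptors could run out.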

Pros and cons of file descriptors over alternatives

  • The file descriptor abstraction provides direct, low-level access to a stream of data without the fuss of higher-level data structures or classes. It certainly can't be slower, and depending on what you're doing, it might be faster.
  • FILE pointers and C++ iostreams work well when you know you're interacting with standard output, standard input, and local files.
    • They are less useful when the stream of bytes is associated with a network connection, which we'll soon learn is also supported via descriptors.
    • FILE pointers and C++ iostreams assume they can, in theory, rewind and move the file pointer back and forth freely, but that's not the case with file descriptors associated with network connections.
  • File descriptors, however, work with read and write, but little else of use to CS110.
  • C FILE pointers and C++ streams, on the other hand, provide automatic buffering and more elaborate formatting options.
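  • The point about rewinding can be checked directly with lseek, the system call that repositions a descriptor's file offset. The sketch below is an assumption-laden illustration: isSeekable is a hypothetical helper, and a pipe stands in for a network connection, since lseek fails on both with errno set to ESPIPE.

```c
#include <errno.h>   // errno, ESPIPE
#include <fcntl.h>   // open and the O_* flags
#include <unistd.h>  // lseek, SEEK_CUR, pipe, close, unlink

// Hypothetical helper: returns 1 if the descriptor supports repositioning,
// 0 if it refers to a pipe, FIFO, or socket (lseek fails with ESPIPE), and
// -1 on any other error, e.g. an invalid descriptor.
static int isSeekable(int fd) {
  // Asking to move 0 bytes from the current offset changes nothing, but it
  // still fails when the descriptor can't be repositioned at all.
  if (lseek(fd, 0, SEEK_CUR) != -1) return 1;
  return errno == ESPIPE ? 0 : -1;
}
```

  • A descriptor returned by open on a regular file should be reported as seekable, while either end of a pipe created with pipe should not. That is why abstractions that assume free movement through a file fit local files better than byte streams arriving from elsewhere.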

Implementing t to emulate tee

  • The tee program that ships with Linux copies everything from standard input to standard output, making zero or more extra copies in the named files supplied as user program arguments.
    • For example, if the file alphabet.txt contains 27 bytes (the 26 letters of the English alphabet followed by a newline character), then the following would print the alphabet to standard output and to three files named one.txt, two.txt, and three.txt.

$ cat alphabet.txt | tee one.txt two.txt three.txt
abcdefghijklmnopqrstuvwxyz
$ cat one.txt
abcdefghijklmnopqrstuvwxyz
$ cat two.txt
abcdefghijklmnopqrstuvwxyz
$ diff one.txt two.txt
$ diff one.txt three.txt
$
  • If the file vowels.txt contains the five vowels and the newline character, and tee is invoked as follows, one.txt would be rewritten to contain only the English vowels.

$ cat vowels.txt | ./tee one.txt
aeiou
$ cat one.txt
aeiou
  • Full implementation of our own t executable, with error checking, is right here.
  • Implementation replicates much of what copy.c does, but it illustrates how you can use low-level I/O to manage many sessions with multiple files at the same time. The implementation, but without error checking, is presented on the next slide.

Implementing t to emulate tee

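#include <fcntl.h>    // open and the O_* flags
#include <stdbool.h>  // true
#include <unistd.h>   // read, write, close, STDIN_FILENO, STDOUT_FILENO

static void writeall(int fd, const char buffer[], size_t len); // defined below main

int main(int argc, char *argv[]) {
  int fds[argc];
  fds[0] = STDOUT_FILENO;
  for (int i = 1; i < argc; i++)
    fds[i] = open(argv[i], O_WRONLY | O_CREAT | O_TRUNC, 0644);

  char buffer[2048];
  while (true) {
    ssize_t numRead = read(STDIN_FILENO, buffer, sizeof(buffer));
    if (numRead == 0) break;
    for (int i = 0; i < argc; i++) writeall(fds[i], buffer, numRead);
  }

  for (int i = 1; i < argc; i++) close(fds[i]); // fds[0] is standard output, so leave it open
  return 0;
}

// Keeps calling write until all len bytes have been accepted.
static void writeall(int fd, const char buffer[], size_t len) {
  size_t numWritten = 0;
  while (numWritten < len) {
    numWritten += write(fd, buffer + numWritten, len - numWritten);
  }
}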
  • Note that argc incidentally equals the number of descriptors we need to write to. That's why we declare an int array (or rather, a descriptor array) of length argc.
  • STDIN_FILENO is a constant (defined in unistd.h) for the number 0, which is the descriptor normally linked to standard input. STDOUT_FILENO is a constant for the number 1, which is the default descriptor bound to standard output.
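  • The slide version of writeall omits error checking. As a hedged sketch of what a checked variant might look like (the linked full implementation may differ), the hypothetical writeAllChecked below stops if write reports an error and retries only when the call was merely interrupted by a signal (errno set to EINTR).

```c
#include <errno.h>    // errno, EINTR
#include <stdbool.h>  // bool, true, false
#include <unistd.h>   // write, ssize_t

// Hypothetical variant of writeall: returns true once all len bytes have
// been written to fd, false as soon as write reports a real error.
static bool writeAllChecked(int fd, const char buffer[], size_t len) {
  size_t numWritten = 0;
  while (numWritten < len) {
    ssize_t count = write(fd, buffer + numWritten, len - numWritten);
    if (count == -1) {
      if (errno == EINTR) continue;  // interrupted before writing anything, so try again
      return false;                  // e.g. bad descriptor or disk full
    }
    numWritten += (size_t) count;    // count may be smaller than requested
  }
  return true;
}
```

  • Keeping the return value of write in a signed ssize_t before adding it to the running total matters: adding -1 directly to an unsigned size_t would silently corrupt the count.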

Using stat and lstat to extract file metadata

  • stat and lstat are system calls that populate a struct stat with information about some named file. The prototypes of the two are:
int stat(const char *pathname, struct stat *st);
int lstat(const char *pathname, struct stat *st);
  • stat and lstat operate exactly the same way, with one exception: when the named file is a symbolic link, stat returns information about the file the link ultimately references, whereas lstat returns information about the link itself.
  • The struct stat looks like:
struct stat {
  dev_t st_dev;        // id of device containing file
  ino_t st_ino;        // id of data structure on device
  mode_t st_mode;      // mode of file
  // many other fields (file size, create time, etc.)
};
  • The st_mode field—which is the only one we'll really pay much attention to—isn't so much a single value as it is a collection of bits encoding multiple pieces of information about file type and permissions. A collection of bit masks and macros can be used to extract information from this st_mode field.
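  • Here is a small sketch, under stated assumptions, of both ideas: masking st_mode to pull out the permission bits, and the difference between stat and lstat on a symbolic link. The helper names permissionBits and demoStatVersusLstat, and the gr-demo-* scratch names, are hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L  // request the POSIX declarations of lstat and symlink
#include <stdio.h>     // snprintf
#include <sys/stat.h>  // stat, lstat, mkdir, S_ISDIR, S_ISLNK
#include <unistd.h>    // symlink, unlink, rmdir

// Hypothetical helper: returns the nine rwxrwxrwx permission bits of the
// named file (e.g. 0644), or -1 if stat fails.
static int permissionBits(const char *path) {
  struct stat st;
  if (stat(path, &st) == -1) return -1;
  return (int) (st.st_mode & 0777);  // mask away the file-type and special bits
}

// Hypothetical demo: inside the scratch directory, creates a directory and
// a symbolic link to it, then asks stat and lstat about the link.
// Returns 1 if stat sees a directory while lstat sees a link, 0 if not,
// and -1 if the setup or either call fails.
static int demoStatVersusLstat(const char *scratch) {
  char target[1024], linkPath[1024];
  snprintf(target, sizeof(target), "%s/gr-demo-target", scratch);
  snprintf(linkPath, sizeof(linkPath), "%s/gr-demo-link", scratch);
  unlink(linkPath);  // clear leftovers from an earlier run, if any
  rmdir(target);
  if (mkdir(target, 0750) == -1) return -1;
  if (symlink(target, linkPath) == -1) {
    rmdir(target);
    return -1;
  }
  struct stat viaStat, viaLstat;
  int result = -1;
  if (stat(linkPath, &viaStat) == 0 && lstat(linkPath, &viaLstat) == 0) {
    result = S_ISDIR(viaStat.st_mode) && S_ISLNK(viaLstat.st_mode);
  }
  unlink(linkPath);
  rmdir(target);
  return result;
}
```

  • Calling demoStatVersusLstat("/tmp") should return 1 on a system where /tmp is writable: the same path is a directory according to stat and a link according to lstat.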
  • search is our own version of the find utility that ships with Linux.  Compare the outputs of the following to be clear how search is supposed to work.  In each of the two test runs below, an executable—one native to Linux, and a second we'll implement together—is invoked to find all files named stdio.h within /usr/include or any of its descendant directories.

Implementing search to emulate find

poohbear@myth53$ find /usr/include -name stdio.h -print
/usr/include/x86_64-linux-gnu/bits/stdio.h
/usr/include/stdio.h
/usr/include/bsd/stdio.h
/usr/include/c++/7/tr1/stdio.h
/usr/include/c++/10/tr1/stdio.h
/usr/include/c++/8/tr1/stdio.h
/usr/include/c++/9/tr1/stdio.h
poohbear@myth53$
poohbear@myth53$ ./search /usr/include stdio.h
/usr/include/x86_64-linux-gnu/bits/stdio.h
/usr/include/stdio.h
/usr/include/bsd/stdio.h
/usr/include/c++/7/tr1/stdio.h
/usr/include/c++/10/tr1/stdio.h
/usr/include/c++/8/tr1/stdio.h
/usr/include/c++/9/tr1/stdio.h
poohbear@myth53$
  • Nice! They match!
  • The following main relies on listMatches, which we'll implement in a second.
    The full program, complete with error checks we don't present below, is right here.

Implementing search to emulate find

int main(int argc, char *argv[]) {
  const char *directory = argv[1];
  struct stat st;
  stat(directory, &st);
  if (!S_ISDIR(st.st_mode)) return 0;
  size_t length = strlen(directory);
  const char *pattern = argv[2];
  char path[kMaxPath + 1]; 
  strcpy(path, directory); 
  // buffer overflow impossible, directory length <= kMaxPath else stat fails
  listMatches(path, length, pattern);
  return 0;
}
  • The search program is our first example that calls stat and lstat, each of which extracts information about the named file and populates the struct stat supplied by address.
  • You'll also note the use of the S_ISDIR macro, which examines the file-type bits of the st_mode field (those selected by the S_IFMT mask) to determine whether the named file is a directory.
  • S_ISDIR has a few cousins: S_ISREG decides whether a file is a regular file, and S_ISLNK decides whether the file is a symbolic link.
  • Most of what's algorithmically interesting falls under the jurisdiction of the listMatches function, which performs a depth-first tree traversal of the filesystem to determine which filenames match the name of interest.
  • The implementation of listMatches makes use of three library functions to iterate over all files within a directory.  Let's play with those before tackling listMatches.

Implementing search to emulate find

DIR *opendir(const char *dirname);
struct dirent *readdir(DIR *dirp);
int closedir(DIR *dirp);
  • Here's a relatively straightforward function (not listMatches, but something even simpler called listEntries) illustrating how the three functions above can be used to print all of the named entries within a supplied directory.
static void listEntries(const char *name) {
  struct stat st;
  stat(name, &st);
  if (!S_ISDIR(st.st_mode)) return;
  DIR *dir = opendir(name);
  while (true) {
    struct dirent *de = readdir(dir);
    if (de == NULL) break;
    printf("+ %s\n", de->d_name);
  }
  closedir(dir);
}
  • opendir accepts the name of a directory and returns the address of an opaque iterable surfacing a series of dirent records via a sequence of readdir calls.
    • If opendir gets anything other than an accessible directory, it returns NULL.
    • Once dir has surfaced all entries, readdir returns NULL.
  • The struct dirent is only guaranteed to contain a d_name field, which stores the entry's name as a C string. The entries . and .. are included in the sequence of named entries.
  • closedir gets called to dispose of the resources allocated by opendir.
  • We can now leverage everything we've learned to implement listMatches.
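  • As a stepping stone between listEntries and listMatches, the sketch below combines directory iteration with lstat. It is an illustration under assumptions rather than lecture code: countEntries is a hypothetical helper, and it builds each entry's path with snprintf into a fixed 4096-byte buffer instead of the shared path buffer the lecture uses.

```c
#define _POSIX_C_SOURCE 200809L  // request the POSIX declaration of lstat
#include <dirent.h>    // opendir, readdir, closedir
#include <stdio.h>     // snprintf
#include <string.h>    // strcmp
#include <sys/stat.h>  // lstat, S_ISDIR

// Hypothetical helper: returns the number of entries in the named directory,
// not counting . and .., and stores in *numDirs how many of them are
// themselves directories. Returns -1 if the directory can't be opened.
static int countEntries(const char *name, int *numDirs) {
  DIR *dir = opendir(name);
  if (dir == NULL) return -1;  // not a directory, or not accessible
  int total = 0;
  *numDirs = 0;
  struct dirent *de;
  while ((de = readdir(dir)) != NULL) {
    if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0) continue;
    total++;
    char path[4096];
    int n = snprintf(path, sizeof(path), "%s/%s", name, de->d_name);
    if (n < 0 || (size_t) n >= sizeof(path)) continue;  // path too long to examine
    struct stat st;
    if (lstat(path, &st) == 0 && S_ISDIR(st.st_mode)) (*numDirs)++;
  }
  closedir(dir);
  return total;
}
```

  • Note that d_name holds only the entry's own name, so it has to be joined to the directory's path before it can be handed to lstat. listMatches solves the same problem by appending to one shared buffer.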

Implementing search to emulate find

static void listMatches(char path[], size_t length, const char *name) {
  DIR *dir = opendir(path);
  if (dir == NULL) return; // it's a directory, but permission to open was denied
  strcpy(path + length++, "/");
  while (true) {
    struct dirent *de = readdir(dir);
    if (de == NULL) break; // we've iterated over every directory entry, so stop
    if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0) continue;
    if (length + strlen(de->d_name) > kMaxPath) continue;
    strcpy(path + length, de->d_name);
    struct stat st;
    lstat(path, &st);
    if (S_ISREG(st.st_mode)) {
      if (strcmp(de->d_name, name) == 0) printf("%s\n", path);
    } else if (S_ISDIR(st.st_mode)) {
      listMatches(path, length + strlen(de->d_name), name);
    }
  }
  closedir(dir);
}
  • Note we brute-force ignore . and .., else we're threatened with infinite recursion.
  • We use lstat instead of stat so we know whether an entry is a link. We ignore all links because, again, we want to avoid infinite recursion.
  • If the stat record identifies something as a regular file, we print the entire path if and only if the entry name matches the name of interest.
  • If the stat record identifies something as a directory, we recursively dip into it to see if any descendants match name.
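  • The same traversal rules can be packaged so that they are easy to test. The sketch below is a variation for illustration only, not the lecture's implementation: the hypothetical countMatches returns the number of matches instead of printing them, and it builds each path with snprintf into a per-call buffer rather than extending one shared buffer. It keeps the three rules above: skip . and .., use lstat so links are never followed, and recurse only into real directories.

```c
#define _POSIX_C_SOURCE 200809L  // request the POSIX declaration of lstat
#include <dirent.h>    // opendir, readdir, closedir
#include <stdio.h>     // snprintf
#include <string.h>    // strcmp
#include <sys/stat.h>  // lstat, S_ISREG, S_ISDIR

// Hypothetical helper: returns how many regular files named name live in
// directory or any of its descendant directories.
static int countMatches(const char *directory, const char *name) {
  DIR *dir = opendir(directory);
  if (dir == NULL) return 0;  // not a directory, or permission denied
  int count = 0;
  struct dirent *de;
  while ((de = readdir(dir)) != NULL) {
    // Skipping . and .. prevents the traversal from revisiting itself or its parent.
    if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0) continue;
    char path[4096];
    int n = snprintf(path, sizeof(path), "%s/%s", directory, de->d_name);
    if (n < 0 || (size_t) n >= sizeof(path)) continue;  // path too long, so skip it
    struct stat st;
    if (lstat(path, &st) == -1) continue;  // lstat never follows a link
    if (S_ISREG(st.st_mode)) {
      if (strcmp(de->d_name, name) == 0) count++;
    } else if (S_ISDIR(st.st_mode)) {
      count += countMatches(path, name);  // links are neither, so they're ignored
    }
  }
  closedir(dir);
  return count;
}
```

  • Because a symbolic link satisfies neither S_ISREG nor S_ISDIR when examined with lstat, a link that points back at an ancestor directory is simply skipped, and the recursion is bounded by the real directory tree.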

Lecture 02 Live: Filesystems and Filesystem APIs, Take I

By Jerry Cain
