Autumn 2021
Jerry Cain
The ls command lists the contents of a directory:
poohbear@myth53:~/cs110/lecture-examples/filesystems$ ls
alphabet.txt copy.c Makefile open.c search.c t.c umask.c vowels.txt
poohbear@myth62:~/cs110/lecture-examples/filesystems$ ls -la
total 16
-rw------- 1 poohbear operator 27 Sep 26 21:31 alphabet.txt
-rw------- 1 poohbear operator 1882 Sep 26 21:31 copy.c
-rw------- 1 poohbear operator 631 Sep 26 21:31 Makefile
-rw------- 1 poohbear operator 949 Sep 26 21:31 open.c
-rw------- 1 poohbear operator 2302 Sep 26 21:31 search.c
-rw------- 1 poohbear operator 1321 Sep 26 21:31 t.c
-rw------- 1 poohbear operator 286 Sep 26 21:31 umask.c
-rw------- 1 poohbear operator 6 Sep 26 21:31 vowels.txt
drwxr-xr-x 5 poohbear root 2048 Sep 26 21:29 ..
drwx------ 2 poohbear operator 2048 Sep 26 21:28 .
poohbear@myth53:~/cs110/lecture-examples/filesystems$ ls -la search
-rwxr-xr-x 1 poohbear operator 22328 Sep 26 21:32 search
rwx    r-x    r-x
owner  group  other
In this case, the owner has read, write, and execute permissions, the group has only read and execute permissions, and everyone else (other) also has only read and execute permissions.
Treating each granted permission as a 1 bit and each dash as a 0 bit gives:
111 101 101
So, the permissions for the file would be recorded internally as octal 755.
Files can be created with the open system call, and you can set the permissions at that time as well. We will discuss the idea of system calls soon, but for now, simply think of them as functions that can do systemsy stuff. The open function comes with the following signatures (and this works in C, even though C does not support function overloading, because open is really declared with a variable-length argument list):
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
There are many flags (see man 2 open for a list of them), and they can be bitwise or'd together. You must include exactly one of the following flags:
O_RDONLY: read only
O_WRONLY: write only
O_RDWR: read and write
The examples for this lecture reside in /usr/class/cs110/lecture-examples/filesystems.
The /usr/class/cs110/lecture-examples directory is a git repository that will be updated with additional examples as the quarter progresses.
Type git clone /usr/class/cs110/lecture-examples lecture-examples at the command prompt to create your own local copy.
To pick up new examples later, descend into your local copy and type git pull. Doing so will update your local copy to match whatever the primary has become.
#include <fcntl.h> // for open
#include <unistd.h> // for read, close
#include <stdio.h>
#include <sys/types.h> // for umask
#include <sys/stat.h> // for umask
#include <errno.h>

static const char * const kFilename = "empty";

int main() {
  int fd = open(kFilename, O_WRONLY | O_CREAT | O_EXCL, 0664);
  if (fd == -1) {
    printf("There was a problem creating \"%s\"!\n", kFilename);
    if (errno == EEXIST) {
      printf("The file already exists.\n");
    } else {
      printf("Unknown errno: %d\n", errno);
    }
    return -1;
  }
  printf("Successfully opened the file called \"%s\", and about to close it.\n", kFilename);
  close(fd); // companion system call to open and releases the provided file descriptor
  return 0;
}
poohbear@myth62:/usr/class/cs110/lecture-examples/filesystems$ ./open
Successfully opened the file called "empty", and about to close it.
poohbear@myth62:/usr/class/cs110/lecture-examples/filesystems$ ./open
There was a problem creating "empty"!
The file already exists.
poohbear@myth62:/usr/class/cs110/lecture-examples/filesystems$ ls -la empty
-rw-rw-r-- 1 poohbear operator 0 Sep 26 21:39 empty
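#include <fcntl.h>   // for open
#include <stdbool.h> // for true
#include <unistd.h>  // for read, write, close

int main(int argc, char *argv[]) {
  // checks on argc and on both open calls are omitted for brevity
  int fdin = open(argv[1], O_RDONLY);
  int fdout = open(argv[2], O_WRONLY | O_CREAT | O_EXCL, 0644);
  char buffer[1024];
  while (true) {
    ssize_t bytesRead = read(fdin, buffer, sizeof(buffer));
    if (bytesRead <= 0) break; // 0 means end of file, -1 means error
    ssize_t bytesWritten = 0;
    while (bytesWritten < bytesRead) {
      // write may accept fewer bytes than requested, so loop until all are out
      bytesWritten += write(fdout, buffer + bytesWritten, bytesRead - bytesWritten);
    }
  }
  close(fdin);
  close(fdout);
  return 0;
}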
The tee program that ships with Linux copies everything from standard input to standard output, making zero or more extra copies in the named files supplied as user program arguments (for example, one.txt, two.txt, and three.txt).
If the file vowels.txt contains the five vowels and the newline character, and tee is invoked as follows, one.txt would be rewritten to contain only the English vowels.
$ cat vowels.txt | ./tee one.txt
aeiou
$ cat one.txt
aeiou
The t executable, with error checking, is part of the lecture examples. With several file arguments, tee produces one copy per file:
$ cat alphabet.txt | tee one.txt two.txt three.txt
abcdefghijklmnopqrstuvwxyz
$ cat one.txt
abcdefghijklmnopqrstuvwxyz
$ cat two.txt
abcdefghijklmnopqrstuvwxyz
$ diff one.txt two.txt
$ diff one.txt three.txt
$
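#include <fcntl.h>   // for open
#include <stdbool.h> // for true
#include <unistd.h>  // for read, write, close

static void writeall(int fd, const char buffer[], size_t len); // defined below

int main(int argc, char *argv[]) {
  int fds[argc];
  fds[0] = STDOUT_FILENO;
  for (size_t i = 1; i < argc; i++)
    fds[i] = open(argv[i], O_WRONLY | O_CREAT | O_TRUNC, 0644);
  char buffer[2048];
  while (true) {
    ssize_t numRead = read(STDIN_FILENO, buffer, sizeof(buffer));
    if (numRead <= 0) break; // 0 means end of input, -1 means error
    for (size_t i = 0; i < argc; i++) writeall(fds[i], buffer, numRead);
  }
  for (size_t i = 1; i < argc; i++) close(fds[i]);
  return 0;
}

// Keeps calling write until all len bytes are out, since a single call
// may write fewer bytes than requested. Error checking is omitted here.
static void writeall(int fd, const char buffer[], size_t len) {
  size_t numWritten = 0;
  while (numWritten < len) {
    numWritten += write(fd, buffer + numWritten, len - numWritten);
  }
}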
argc incidentally equals the number of descriptors we need to write to. That's why we declare an int array (or rather, a descriptor array) of length argc.
STDIN_FILENO is a built-in constant for the number 0, which is the descriptor normally linked to standard input. STDOUT_FILENO is a constant for the number 1, which is the default descriptor bound to standard output.
A file descriptor is the small integer passed to read, write, and close calls. Internally, that descriptor is an index into the descriptor table.
Each descriptor table entry leads to an open file table entry with several fields. mode tracks whether we're reading, writing, or both. cursor tracks a position within the file payload. refcount tracks the number of descriptors across all processes that refer to that entry. (We'll discuss the vnode field in a moment.)
A call to open(filename, O_RDONLY) from a process might result in an entry like this.
Entries can be shared across processes: when the bash shell calls make, which itself calls g++, each of them inserts text into the same terminal window.
None of these kernel-resident data structures are visible to users. Note the filesystem itself is a completely different component, and that filesystem inodes of open files are loaded into vnode table entries. The inode in the vnode (drawn in yellow on the original slide) is an in-memory replica of the corresponding sliver of the filesystem.
stat and lstat are system calls that populate a struct stat with information about some named file. The prototypes of the two are:
int stat(const char *pathname, struct stat *st);
int lstat(const char *pathname, struct stat *st);
stat and lstat operate exactly the same way, except when the named file is a link: stat returns information about the file the link ultimately references, and lstat returns information about the link itself.
struct stat {
  dev_t st_dev; // id of device containing file
  ino_t st_ino; // id of data structure on device
  mode_t st_mode; // mode of file
  // many other fields (file size, create time, etc.)
};
The st_mode field, which is the only one we'll really pay much attention to, isn't so much a single value as it is a collection of bits encoding multiple pieces of information about file type and permissions. A collection of bit masks and macros can be used to extract information from this st_mode field.
I won't be formally covering stat in lecture, but I will refer to these in future lectures when stat is needed. Still, cool stuff!
poohbear@myth53$ find /usr/include -name stdio.h -print
/usr/include/x86_64-linux-gnu/bits/stdio.h
/usr/include/stdio.h
/usr/include/bsd/stdio.h
/usr/include/c++/7/tr1/stdio.h
/usr/include/c++/10/tr1/stdio.h
/usr/include/c++/8/tr1/stdio.h
/usr/include/c++/9/tr1/stdio.h
poohbear@myth53$ ./search /usr/include stdio.h
/usr/include/x86_64-linux-gnu/bits/stdio.h
/usr/include/stdio.h
/usr/include/bsd/stdio.h
/usr/include/c++/7/tr1/stdio.h
/usr/include/c++/10/tr1/stdio.h
/usr/include/c++/8/tr1/stdio.h
/usr/include/c++/9/tr1/stdio.h
poohbear@myth53$
int main(int argc, char *argv[]) {
  const char *directory = argv[1];
  struct stat st;
  // stat leaves st unset on failure, so check its return value first
  if (stat(directory, &st) != 0 || !S_ISDIR(st.st_mode)) return 0;
  size_t length = strlen(directory);
  const char *pattern = argv[2];
  char path[kMaxPath + 1];
  strcpy(path, directory);
  // buffer overflow impossible, directory length <= kMaxPath else stat fails
  listMatches(path, length, pattern);
  return 0;
}
listMatches makes use of three library functions to iterate over all files within a directory. Let's play with those before tackling listMatches.
DIR *opendir(const char *dirname);
struct dirent *readdir(DIR *dirp);
int closedir(DIR *dirp);
static void listEntries(const char *name) {
  struct stat st;
  if (stat(name, &st) != 0 || !S_ISDIR(st.st_mode)) return;
  DIR *dir = opendir(name);
  if (dir == NULL) return; // a directory, but not one we're allowed to open
  while (true) {
    struct dirent *de = readdir(dir);
    if (de == NULL) break;
    printf("+ %s\n", de->d_name);
  }
  closedir(dir);
}
If opendir gets anything other than an accessible directory, it returns NULL.
Once the loop has surfaced all entries through de, readdir returns NULL.
The struct dirent is only guaranteed to contain a d_name field, which stores the entry's name as a C string.
. and .. are included in the sequence of named entries.
The implementation of listMatches:
static void listMatches(char path[], size_t length, const char *name) {
  DIR *dir = opendir(path);
  if (dir == NULL) return; // it's a directory, but permission to open was denied
  strcpy(path + length++, "/");
  while (true) {
    struct dirent *de = readdir(dir);
    if (de == NULL) break; // we've iterated over every directory entry, so stop
    if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0) continue;
    if (length + strlen(de->d_name) > kMaxPath) continue;
    strcpy(path + length, de->d_name);
    struct stat st;
    if (lstat(path, &st) != 0) continue; // entry vanished or is inaccessible, so skip it
    if (S_ISREG(st.st_mode)) {
      if (strcmp(de->d_name, name) == 0) printf("%s\n", path);
    } else if (S_ISDIR(st.st_mode)) {
      listMatches(path, length + strlen(de->d_name), name);
    }
  }
  closedir(dir);
}
We skip . and .., else we're threatened with infinite recursion.
We use lstat instead of stat so we know whether an entry is a link. We ignore all links because, again, we want to avoid infinite recursion.
If the stat record identifies something as a regular file, we print the entire path if and only if the entry name matches the name of interest.
If the stat record identifies something as a directory, we recursively dip into it to see if any descendants match name.