Principles of Computer Systems
Winter 2021
Stanford University
Computer Science Department
Instructors: Chris Gregg
and Nick Troccoli
cgregg@myth57:$ ./amazon_search "best thing since sliced bread"
Found 258 matching reviews out of 3093869 reviews in the database.
**********
Review index: 3092071
Product title: Ceiva Internet-Enabled Photo Frame
Product category: Electronics
Star rating: 5 stars
Review headline: Rave Review ! Perfect for mom and grandma !
Review body: This product is the best thing since sliced bread ! I bought 5 of
them for family and friends. The best application for a ceiva is my grandmothers
room, at the retirement home. She is too old to use AOL or WebTV. Ceiva is the
perfect bedside companion for her. Everyday, she gets a new set of 10 pictures
downloaded from the 250 I have "uploaded" to her album storage section.
She has no computer skills and doesn't know a jpeg from a gif, but she loves the
pictures I have sent to her via her Ceiva frame...
Date: 2000-5-10
**********
cgregg@myth57:/usr/class/cs110/staff/master_repos/assign1$ ./amazon_search '"almost killed me"' -k bodysize -r -n 1
Total number of keywords: 483621
Found 4 matching reviews out of 3093869 reviews in the database.
**********
Review index: 3043273
Product title: Segway Human Transporter (HT) p Series
Product category: Electronics
Star rating: 1 stars
Review headline: Segway Danger - almost Killed me.
Review body: I want to worn you before you get an the Segway and take off...I bought mine
here from Amazon and at that time took 4 months to get it...then I started riding it with
no problem everyday I could to get the mail up our 600 feet concrete driveway...when I
returned everytime I hooked it up to the power source. This one day while going up the driveway
which is on an incline...it powered down..and slammed me to the ground....without any notice..
I layed there until my brother came along and help me to the house....my back was brused...and
it knocked some bone spurs loose in my neck....I contacted Amazon/Segway ..and SEGWAY said
there was a recall to the computer system and I sent it back...and sold it ASAP...Segway never
offered and relief on my bills..etc...MY advise to riders is they nee some sort of fail safe
system for backway fails....I know how one could rig up a devise for the forward fall..but
short of some sort of backward fall...I don't...I have played sports in school but never have
been slammed a hard as that fall....I would say..be very careful and don't look for Segway to
offer any type of help if you are hurt. <br />RDM
Date: 2005-8-10
**********
Assignment 1 relies on the lower_bound function from the STL. The function is a bit subtle -- you need to take some time to understand how it works. For example, it takes an iterator, which in our case is just a pointer to the data. Also, when searching, it returns an "Iterator pointing to the first element that is not less than value, or last if no such element is found." (see the link above for details).
Consider the lower_bound function for a moment: part of the assignment says, "Remember, you must use the lower_bound function and a C++ lambda function to perform a binary search to find keywords."
void qsort(void *base, size_t nmemb, size_t size,
           int (*compar)(const void *, const void *));
The compar parameter is a function pointer: it takes two const void * arguments and has an int return value. qsort does not care about the details of how the comparison is done; it just relies on it to provide a legitimate result.
int add(int x, int y) { return x + y; }
int sub(int x, int y) { return x - y; }
void modifyVec(vector<int> &vec, int val, function<int(int, int)>op) {
for (int &v : vec) {
v = op(v,val);
}
}
int main(int argc, char *argv[]) {
string opStr = string(argv[1]);
int val = atoi(argv[2]);
vector<int> vec = {1, 2, 3, 4, 5, 10, 100, 1000};
printVec("Original",vec);
cout << "Performing " << opStr << " on vector with value " << val << endl;
if (opStr == "add") modifyVec(vec, val, add);
else if (opStr == "sub") modifyVec(vec, val, sub);
printVec("Result",vec);
return 0;
}
./fun_pointer add 12
Original: 1, 2, 3, 4, 5, 10, 100, 1000,
Performing add on vector with value 12
Result: 13, 14, 15, 16, 17, 22, 112, 1012,
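The qsort prototype shown earlier follows the same pattern as modifyVec: the caller supplies the behavior as a function. As a minimal sketch (the names compareInts and sortInts are made up for this illustration and are not part of the lecture code), a comparison function for ints might look like this:

```cpp
#include <cstdlib>  // qsort

// Comparison callback in the shape qsort expects: negative, zero, or
// positive depending on how the two pointed-to ints are ordered.
static int compareInts(const void *a, const void *b) {
  int x = *static_cast<const int *>(a);
  int y = *static_cast<const int *>(b);
  return (x > y) - (x < y);  // avoids the overflow risk of x - y
}

// Sorts count ints in ascending order by handing compareInts to qsort.
void sortInts(int *values, size_t count) {
  qsort(values, count, sizeof(int), compareInts);
}
```

qsort never looks inside the elements itself; it only moves blocks of sizeof(int) bytes around according to what compareInts reports.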
The function<int(int, int)> op parameter is the C++ way of accepting a function as an argument, much as a function pointer does in C.
void modifyVec(vector<int> &vec, int val, function<int(int, int)> op) {
for (int &v : vec) {
v = op(v,val);
}
}
int main(int argc, char *argv[]) {
string opStr = string(argv[1]);
int val = atoi(argv[2]);
vector<int> vec = {1, 2, 3, 4, 5, 10, 100, 1000};
printVec("Original", vec);
cout << "Performing " << opStr << " on vector with value " << val << endl;
if (opStr == "add") modifyVec(vec, val, [](int x, int y) {
return x + y;
});
else if (opStr == "sub") modifyVec(vec, val, [](int x, int y) {
return x - y;
});
printVec("Result", vec);
return 0;
}
[ captures ] ( params ) { body }
void modifyVec(vector<int> &vec, int val, function<int(int, int)>op) {
for (int &v : vec) {
v = op(v,val);
}
}
void modifyVec(vector<int> &vec, function<int(int)>op) {
for (int &v : vec) {
v = op(v);
}
}
To this:
void modifyVec(vector<int> &vec, std::function<int(int v)>op) {
for (int &v : vec) {
v = op(v);
}
}
int main(int argc, char *argv[]) {
string opStr = string(argv[1]);
int val = atoi(argv[2]);
vector<int> vec = {1, 2, 3, 4, 5, 10, 100, 1000};
printVec("Original", vec);
cout << "Performing " << opStr << " on vector with value " << val << endl;
if (opStr == "add") modifyVec(vec, [val](int x) {
return x + val;
});
else if (opStr == "sub") modifyVec(vec, [val](int x) {
return x - val;
});
printVec("Result", vec);
return 0;
}
Each lambda captures val using the bracket notation. This allows the lambda function, when it is called (remember, it isn't called immediately), to use val.
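To make the "it isn't called immediately" point concrete, here is a small sketch. The function names snapshotDemo and makeAdder are invented for this illustration. Capturing by value copies val into the lambda at the moment the lambda is created, so later changes to the original variable are not seen.

```cpp
#include <functional>

// The lambda copies val when it is created, not when it is called.
int snapshotDemo() {
  int val = 12;
  auto addVal = [val](int x) { return x + val; };
  val = 100;         // too late: the lambda already holds its own copy (12)
  return addVal(1);  // uses the captured 12
}

// Because the capture is a copy, the lambda can safely outlive the
// function that created it.
std::function<int(int)> makeAdder(int val) {
  return [val](int x) { return x + val; };
}
```

With these definitions, snapshotDemo() returns 13, and makeAdder(12)(1) also evaluates to 13.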
if (opStr == "add") modifyVec(vec, [&val](int x) {
return x + val;
});
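A reference capture is handy when the lambda needs access to something large or external, which is often the case when calling lower_bound. The sketch below is only an illustration: the KeywordEntry layout, the findKeyword name, and the idea of storing keywords as offsets into one text buffer are all assumptions, not the assignment's actual data format.

```cpp
#include <algorithm>  // lower_bound
#include <cstddef>
#include <cstring>    // strcmp
#include <vector>

// Hypothetical index entry: the keyword lives at base + offset.
struct KeywordEntry {
  size_t offset;
};

// Binary-searches a sorted index for target. Returns the position of the
// matching entry, or -1 if the keyword is not present.
long findKeyword(const std::vector<KeywordEntry> &index, const char *base,
                 const char *target) {
  auto it = std::lower_bound(
      index.begin(), index.end(), target,
      // The comparator answers "is this element less than the value?"
      [&base](const KeywordEntry &entry, const char *value) {
        return strcmp(base + entry.offset, value) < 0;
      });
  // lower_bound returns the first element NOT less than target, so we
  // still have to check that it is an exact match.
  if (it == index.end() || strcmp(base + it->offset, target) != 0) return -1;
  return it - index.begin();
}
```

Note that lower_bound alone does not say whether the keyword was found; the equality check after the call does that.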
lower_bound function for assignment 1!)
copy (designed to mimic the behavior of cp) illustrates how to use open, read, write, and close. It also introduces the notion of a file descriptor.
man pages exist for all of these functions (e.g. man 2 open, man 2 read, etc.)
copy, with exhaustive error checking, is right here.
copy to emulate cp
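#include <fcntl.h>    // open and the O_* flags
#include <stdbool.h>  // true (needed when compiled as C)
#include <unistd.h>   // read, write, close

int main(int argc, char *argv[]) {
  int fdin = open(argv[1], O_RDONLY);
  // O_EXCL makes open fail rather than overwrite an existing destination
  int fdout = open(argv[2], O_WRONLY | O_CREAT | O_EXCL, 0644);
  char buffer[1024];
  while (true) {
    ssize_t bytesRead = read(fdin, buffer, sizeof(buffer));
    if (bytesRead <= 0) break; // 0 means end of file, -1 means error
    ssize_t bytesWritten = 0;
    while (bytesWritten < bytesRead) { // write may accept fewer bytes than requested
      bytesWritten += write(fdout, buffer + bytesWritten, bytesRead - bytesWritten);
    }
  }
  close(fdin);
  close(fdout);
  return 0;
}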
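The inner loop above assumes that write never fails. As a hedged sketch of what a more defensive loop could look like (this is not the course's error-checked copy, and the name writeFully is invented here), the helper below reports failure to the caller and retries when a signal interrupts the call:

```cpp
#include <cerrno>    // errno, EINTR
#include <cstddef>
#include <unistd.h>  // write, ssize_t

// Writes all len bytes of buffer to fd. Returns true on success and
// false as soon as write reports an error other than an interruption.
bool writeFully(int fd, const char *buffer, size_t len) {
  size_t written = 0;
  while (written < len) {
    ssize_t count = write(fd, buffer + written, len - written);
    if (count < 0) {
      if (errno == EINTR) continue;  // interrupted by a signal: try again
      return false;                  // real error: let the caller decide
    }
    written += static_cast<size_t>(count);
  }
  return true;
}
```

A caller such as copy could then stop and print a message when writeFully returns false, instead of looping on a descriptor that can no longer be written.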
FILE pointers and C++ iostreams
FILE pointers and C++ iostreams work well when you know you're interacting with standard output, standard input, and local files.
FILE pointers and C++ iostreams assume they can rewind and move the file pointer back and forth freely, but that's not the case with file descriptors associated with network connections.
File descriptors, by contrast, are driven by read and write and little else used in this course.
FILE pointers and C++ streams, on the other hand, provide automatic buffering and more elaborate formatting options.
t to emulate tee
The tee program that ships with Linux copies everything from standard input to standard output, making zero or more extra copies in the named files supplied as user program arguments. For example, if the file alphabet.txt contains 27 bytes (the 26 letters of the English alphabet followed by a newline character), then the following would print the alphabet to standard output and to three files named one.txt, two.txt, and three.txt.
$ cat alphabet.txt | tee one.txt two.txt three.txt
abcdefghijklmnopqrstuvwxyz
$ cat one.txt
abcdefghijklmnopqrstuvwxyz
$ cat two.txt
abcdefghijklmnopqrstuvwxyz
$ diff one.txt two.txt
$ diff one.txt three.txt
$
If the file vowels.txt contains the five vowels and the newline character, and tee is invoked as follows, one.txt would be rewritten to contain only the English vowels.
$ cat vowels.txt | ./tee one.txt
aeiou
$ cat one.txt
aeiou
The t executable, with error checking, is right here.
The implementation repeats much of what copy.c does, but it illustrates how you can use low-level I/O to manage many sessions with multiple files. The implementation inlined across the next two slides omits error checking.
Source: https://commons.wikimedia.org/wiki/File:Tee.svg
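#include <fcntl.h>    // open and the O_* flags
#include <stdbool.h>  // true (needed when compiled as C)
#include <unistd.h>   // read, write, close, STDIN_FILENO, STDOUT_FILENO

// Keeps calling write until all len bytes have been accepted by fd.
static void writeall(int fd, const char buffer[], size_t len) {
  size_t numWritten = 0;
  while (numWritten < len) {
    ssize_t count = write(fd, buffer + numWritten, len - numWritten);
    if (count < 0) return; // give up on this descriptor; no real error handling here
    numWritten += count;
  }
}

int main(int argc, char *argv[]) {
  int fds[argc]; // fds[0] is standard output, fds[1..argc-1] are the named files
  fds[0] = STDOUT_FILENO;
  for (int i = 1; i < argc; i++)
    fds[i] = open(argv[i], O_WRONLY | O_CREAT | O_TRUNC, 0644);
  char buffer[2048];
  while (true) {
    ssize_t numRead = read(STDIN_FILENO, buffer, sizeof(buffer));
    if (numRead <= 0) break; // 0 means end of input, -1 means error
    for (int i = 0; i < argc; i++) writeall(fds[i], buffer, numRead);
  }
  for (int i = 1; i < argc; i++) close(fds[i]);
  return 0;
}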
argc incidentally provides a count of the number of descriptors that we write to. That's why we declare an integer array (or rather, a file descriptor array) of length argc.
STDIN_FILENO is a built-in constant for the number 0, which is the descriptor normally attached to standard input. STDOUT_FILENO is a constant for the number 1, which is the default descriptor bound to standard output.
The versions on the myth machines include real error checking.
stat and lstat
stat and lstat are functions (system calls, actually) that populate a struct stat with information about some named file (e.g. a regular file, a directory, a symbolic link, etc).
int stat(const char *pathname, struct stat *st);
int lstat(const char *pathname, struct stat *st);
stat and lstat operate exactly the same way, except when the named file is a link: stat returns information about the file the link references, and lstat returns information about the link itself.
man pages exist for both of these functions (e.g. man 2 stat, man 2 lstat, etc.)
struct stat {
dev_t st_dev; // ID of device containing file
ino_t st_ino; // file serial number
mode_t st_mode; // mode of file
// many other fields (file size, access and modification times, etc)
};
The st_mode field, which is the only one we'll really pay much attention to, isn't so much a single value as it is a collection of bits encoding multiple pieces of information about file type and permissions.
Macros and bit masks are used to extract that information from the st_mode field.
The examples that follow show how the stat and lstat functions can be used to navigate and otherwise manipulate a tree of files within the file system.
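As a small illustration of the difference between stat and lstat, the sketch below classifies a path using the standard S_ISLNK, S_ISDIR, and S_ISREG macros. The function name fileKind and its followLinks parameter are invented for this example.

```cpp
#include <string>
#include <sys/stat.h>  // stat, lstat, struct stat, S_IS* macros

// Describes what path refers to. With followLinks set, stat is used and a
// symbolic link is reported as whatever it points to; otherwise lstat is
// used and the link itself is reported.
std::string fileKind(const std::string &path, bool followLinks) {
  struct stat st;
  int result = followLinks ? stat(path.c_str(), &st)
                           : lstat(path.c_str(), &st);
  if (result != 0) return "error";  // e.g. the path does not exist
  if (S_ISLNK(st.st_mode)) return "link";
  if (S_ISDIR(st.st_mode)) return "directory";
  if (S_ISREG(st.st_mode)) return "regular file";
  return "other";
}
```

For a symbolic link that points at a directory, fileKind with followLinks set to false would report "link", while the same call with followLinks set to true would report "directory".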
search is our own imitation of the find program that comes with Linux.
The following output shows how search is supposed to work: both commands list every file named stdio.h in /usr/include or within any descendant subdirectories.
myth60$ find /usr/include -name stdio.h -print
/usr/include/stdio.h
/usr/include/x86_64-linux-gnu/bits/stdio.h
/usr/include/c++/5/tr1/stdio.h
/usr/include/bsd/stdio.h
myth60$ ./search /usr/include stdio.h
/usr/include/stdio.h
/usr/include/x86_64-linux-gnu/bits/stdio.h
/usr/include/c++/5/tr1/stdio.h
/usr/include/bsd/stdio.h
myth60$
main relies on listMatches, which we'll implement a little later.
int main(int argc, char *argv[]) {
assert(argc == 3);
const char *directory = argv[1];
struct stat st;
lstat(directory, &st);
assert(S_ISDIR(st.st_mode));
size_t length = strlen(directory);
if (length > kMaxPath) return 0; // assume kMaxPath is some #define
const char *pattern = argv[2];
char path[kMaxPath + 1];
strcpy(path, directory); // buffer overflow impossible
listMatches(path, length, pattern);
return 0;
}
main calls lstat, which extracts information about the named file and populates the struct stat named st with that information.
It then applies the S_ISDIR macro, which examines the upper four bits of the st_mode field to determine whether the named file is a directory.
S_ISDIR has a few cousins: S_ISREG decides whether a file is a regular file, and S_ISLNK decides whether the file is a link. We'll use all of these in our next example.
Most of the work is done by the listMatches function, which does a depth-first traversal of the filesystem to see what files just happen to match the name of interest.
listMatches, which appears on the next slide, makes use of these three library functions to iterate over all of the files within a named directory.
DIR *opendir(const char *dirname);
struct dirent *readdir(DIR *dirp);
int closedir(DIR *dirp);
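Before reading the full traversal, it may help to see the three calls in isolation. The following sketch lists the entry names of a single directory without recursing. The helper name listEntries is an assumption made for this example and does not appear in the lecture code.

```cpp
#include <cstring>   // strcmp
#include <dirent.h>  // opendir, readdir, closedir
#include <string>
#include <vector>

// Returns the names of the entries in dirname, skipping "." and "..".
// An unreadable or nonexistent directory yields an empty vector.
std::vector<std::string> listEntries(const std::string &dirname) {
  std::vector<std::string> names;
  DIR *dir = opendir(dirname.c_str());
  if (dir == NULL) return names;
  while (true) {
    struct dirent *de = readdir(dir);
    if (de == NULL) break;  // no more entries
    if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0) continue;
    names.push_back(de->d_name);
  }
  closedir(dir);  // release the resources opendir allocated
  return names;
}
```

The names come back in whatever order the file system surfaces them, so a caller that wants sorted output has to sort the vector itself.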
The implementation of listMatches:
static void listMatches(char path[], size_t length, const char *name) {
  if (length + 1 > kMaxPath) return; // no room left to append "/" safely
  DIR *dir = opendir(path);
  if (dir == NULL) return; // it's a directory, but permission to open was denied
  strcpy(path + length++, "/");
  while (true) {
    struct dirent *de = readdir(dir);
    if (de == NULL) break; // we've iterated over every directory entry, so stop looping
    if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0) continue;
    if (length + strlen(de->d_name) > kMaxPath) continue; // entry would not fit in path
    strcpy(path + length, de->d_name);
    struct stat st;
    lstat(path, &st); // lstat, not stat, so links are reported as links
    if (S_ISREG(st.st_mode)) {
      if (strcmp(de->d_name, name) == 0) printf("%s\n", path);
    } else if (S_ISDIR(st.st_mode)) {
      listMatches(path, length + strlen(de->d_name), name);
    }
  }
  closedir(dir);
}
Our implementation relies on opendir, which accepts what is presumably a directory. It returns a pointer to an opaque iterable that surfaces a series of struct dirents via a sequence of readdir calls.
If opendir accepts anything other than an accessible directory, it'll return NULL.
Once the DIR has surfaced all of its entries, readdir returns NULL.
POSIX only guarantees that a struct dirent contains the d_ino and d_name fields. We only use d_name, which is the directory entry's name, captured as a C string.
. and .. are among the sequence of named entries, but we ignore them to avoid cycles and infinite recursion.
We use lstat instead of stat so we know whether an entry is really a link. We ignore links, again because we want to avoid infinite recursion and cycles.
If the stat record identifies an entry as a regular file, we print the entire path if and only if the entry name matches the name of interest.
If the stat record identifies an entry as a directory, we recursively descend into it to see if any of its named entries match the name of interest.
opendir returns access to a record that eventually must be released via a call to closedir. That's why our implementation ends with it.
The next example is list, which emulates the functionality of ls (in particular, ls -lUa). The implementations of list and search have much in common, but the implementation of list is much longer.
Sample output of list is presented right here:
myth60$ ./list /usr/class/cs110/WWW
drwxr-xr-x 8 70296 root 2048 Jan 08 17:16 .
drwxr-xr-x 9 root root 2048 Jan 08 17:02 ..
drwxr-xr-x 2 70296 root 2048 Jan 08 15:45 restricted
drwxr-xr-x 4 cgregg operator 2048 Jan 08 17:03 examples
-rw------- 1 cgregg operator 2395 Jan 08 15:51 index.html
// others omitted for brevity
myth60$
The full implementation, list.c, is right here.
We focus on one piece: how to print the permissions string (e.g. drwxr-xr-x) for an arbitrary entry.
Here is list's listPermissions function, which prints out the permission string consistent with the supplied stat information:
static inline void updatePermissionsBit(bool flag, char permissions[],
size_t column, char ch) {
if (flag) permissions[column] = ch;
}
static const size_t kNumPermissionColumns = 10;
static const char kPermissionChars[] = {'r', 'w', 'x'};
static const size_t kNumPermissionChars = sizeof(kPermissionChars);
static const mode_t kPermissionFlags[] = {
S_IRUSR, S_IWUSR, S_IXUSR, // user flags
S_IRGRP, S_IWGRP, S_IXGRP, // group flags
S_IROTH, S_IWOTH, S_IXOTH // everyone (other) flags
};
static const size_t kNumPermissionFlags =
sizeof(kPermissionFlags)/sizeof(kPermissionFlags[0]);
static void listPermissions(mode_t mode) {
char permissions[kNumPermissionColumns + 1];
memset(permissions, '-', sizeof(permissions));
permissions[kNumPermissionColumns] = '\0';
updatePermissionsBit(S_ISDIR(mode), permissions, 0, 'd');
updatePermissionsBit(S_ISLNK(mode), permissions, 0, 'l');
for (size_t i = 0; i < kNumPermissionFlags; i++) {
updatePermissionsBit(mode & kPermissionFlags[i], permissions, i + 1,
kPermissionChars[i % kNumPermissionChars]);
}
printf("%s ", permissions);
}
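The same table-driven idea can be written so that it returns the string instead of printing it, which makes the logic easy to check in isolation. This is a sketch under that assumption. The name permissionString is invented here and is not part of list.c.

```cpp
#include <cstddef>
#include <string>
#include <sys/stat.h>  // mode_t, S_IS* macros, S_I* permission flags

// Builds the ten-character permission string (e.g. "drwxr-xr-x") for mode.
std::string permissionString(mode_t mode) {
  static const mode_t flags[] = {
    S_IRUSR, S_IWUSR, S_IXUSR,  // user flags
    S_IRGRP, S_IWGRP, S_IXGRP,  // group flags
    S_IROTH, S_IWOTH, S_IXOTH   // everyone (other) flags
  };
  static const char chars[] = {'r', 'w', 'x'};
  std::string result(10, '-');  // start with every column turned off
  if (S_ISDIR(mode)) result[0] = 'd';
  if (S_ISLNK(mode)) result[0] = 'l';
  for (size_t i = 0; i < 9; i++) {
    // column i + 1 shows r, w, or x, cycling through chars every 3 flags
    if (mode & flags[i]) result[i + 1] = chars[i % 3];
  }
  return result;
}
```

For a directory with mode 0755 this yields drwxr-xr-x, and for a regular file with mode 0600 it yields -rw-------, matching the shape of the sample list output shown earlier.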