# CS110 Lecture 11: Semaphores and Multithreading Patterns

Principles of Computer Systems

Winter 2021

Stanford University

Computer Science Department

Instructors: Chris Gregg and Nick Troccoli

# CS110 Topic 3: How can we have concurrency within a single process?

Condition Variables and Semaphores

# Learning Goals

• Learn how a semaphore generalizes the "permits pattern" we previously saw
• Learn how to apply semaphores to coordinate threads in different ways

# Lecture Plan

• Recap: Dining With Philosophers
• Example: Mythbusters

# Recap: Dining With Philosophers

• This is a canonical multithreading example of the potential for deadlock and how to avoid it.
• Five philosophers sit around a circular table, eating spaghetti
• There is one fork for each of them
• Each philosopher thinks, then eats, and repeats this three times for their three daily meals.
• To eat, a philosopher must grab the fork on their left and the fork on their right. With two forks in hand, they chow on spaghetti to nourish their big, philosophizing brain. When they're full, they put down the forks in the same order they picked them up and return to thinking for a while.
• To think, a philosopher keeps to themselves for some amount of time. Sometimes they think for a long time, and sometimes they barely think at all.
• The full program is right here.

# Dining Philosophers Problem

`https://commons.wikimedia.org/wiki/File:An_illustration_of_the_dining_philosophers_problem.png`

When coding with threads, you need to ensure that:

• there are never any race conditions
• there's zero chance of deadlock; otherwise a subset of threads are forever starved
• Race conditions can generally be solved with mutexes.
• We use them to mark the boundaries of critical regions and limit the number of threads present within them to be at most one.
• Deadlock can be programmatically prevented by implanting directives to limit the number of threads competing for a shared resource.
• Our general goal is to determine what constraints must be added to eliminate race conditions.

Goal: we must encode constraints into our program.

Example: how many philosophers can hold a given fork at the same time? One.

How can we encode this into our program?  Let's make a mutex for each fork.

• Each philosopher either holds a fork or doesn't.
• A philosopher grabs a fork by locking that mutex.  If the fork is available, the philosopher continues.  Otherwise, it blocks until the fork becomes available and it can have it.
• A philosopher puts down a fork by unlocking that mutex.
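
The fork-as-mutex idea above can be sketched as follows. This is a minimal illustration, not the course's program: `Fork`, `eat`, `shareOneFork` and `timesUsed` are names made up for this example. Only philosophers 0 and 1 run here, so the two threads contend for fork 1 without any risk of deadlock.

```cpp
#include <array>
#include <cstddef>
#include <mutex>
#include <thread>

constexpr std::size_t kNumForks = 5;

// Hypothetical fork type: the mutex is locked while a philosopher holds the fork.
struct Fork {
  std::mutex m;
  int timesUsed = 0;  // only modified while m is held
};

// Philosopher `id` grabs the left fork, then the right fork, "eats",
// and puts the forks down in the same order they were picked up.
void eat(std::size_t id, std::array<Fork, kNumForks>& forks) {
  Fork& left = forks[id];
  Fork& right = forks[(id + 1) % kNumForks];
  left.m.lock();   // blocks until the left fork is available
  right.m.lock();  // blocks until the right fork is available
  left.timesUsed++;
  right.timesUsed++;
  left.m.unlock();
  right.m.unlock();
}

// Runs philosophers 0 and 1, who share fork 1, for `meals` meals each.
// Returns how many times fork 1 was used.
int shareOneFork(int meals) {
  std::array<Fork, kNumForks> forks;
  auto philosopher = [&forks, meals](std::size_t id) {
    for (int i = 0; i < meals; i++) eat(id, forks);
  };
  std::thread p0(philosopher, std::size_t{0});
  std::thread p1(philosopher, std::size_t{1});
  p0.join();
  p1.join();
  return forks[1].timesUsed;
}
```

Because every increment of `timesUsed` happens while the fork's mutex is held, no update is lost: `shareOneFork(1000)` returns 2000.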

# Constraints: Forks

Goal: we must encode constraints into our program.

Example: how many philosophers can try to eat at the same time? Four.

• Alternative: how many philosophers can eat at the same time? Two.
• Why might the first one be better? It imposes less of a bottleneck while still preventing deadlock.

How can we encode this into our program?

• Let's have a count of "permits" or "tickets" available.
• In order to try to eat (aka grab forks at all), a philosopher must get a permit.
• Once done eating, a philosopher must return their permit.

What does this look like in code?

• Use a semaphore initialized with the number of permits we want
• Before grabbing forks, get a permit
• When done eating, return a permit.
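
The three steps above can be sketched as follows. This is an illustration under stated assumptions rather than the course's program: the `semaphore` class below is a minimal stand-in for the course's class (same `wait`/`signal` interface), and `dine` and `mealsEaten` are names invented for this example.

```cpp
#include <array>
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Minimal stand-in for the course's semaphore class (an assumption of this sketch).
class semaphore {
 public:
  explicit semaphore(int value = 0) : value(value) {}
  void wait() {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [this] { return value > 0; });
    value--;
  }
  void signal() {
    std::lock_guard<std::mutex> lock(m);
    value++;
    if (value == 1) cv.notify_all();
  }
 private:
  int value;
  std::mutex m;
  std::condition_variable_any cv;
};

constexpr int kNumPhilosophers = 5;

// Runs all five philosophers for `mealsEach` meals; returns the total meals eaten.
int dine(int mealsEach) {
  std::array<std::mutex, kNumPhilosophers> forks;
  semaphore permits(kNumPhilosophers - 1);  // at most four may try to eat at once
  std::atomic<int> mealsEaten{0};
  auto philosopher = [&](int id) {
    for (int meal = 0; meal < mealsEach; meal++) {
      permits.wait();                              // get a permit before grabbing forks
      forks[id].lock();                            // left fork
      forks[(id + 1) % kNumPhilosophers].lock();   // right fork
      mealsEaten++;                                // "eat"
      forks[id].unlock();
      forks[(id + 1) % kNumPhilosophers].unlock();
      permits.signal();                            // return the permit
    }
  };
  std::vector<std::thread> threads;
  for (int id = 0; id < kNumPhilosophers; id++) threads.emplace_back(philosopher, id);
  for (std::thread& t : threads) t.join();
  return mealsEaten;
}
```

With only four permits, at least one permit holder can always pick up both forks, so every philosopher eventually finishes and `dine(3)` returns 15.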

# More on Semaphores

A semaphore is a variable type that represents a count of finite resources.

• "Permits" pattern with a counter, mutex and condition_variable_any
• Thread-safe way to grant permission and to wait for permission (aka sleep)
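
```cpp
class semaphore {
 public:
  semaphore(int value = 0);
  void wait();
  void signal();

 private:
  int value;                       // number of permits currently available
  std::mutex m;                    // protects value
  std::condition_variable_any cv;  // lets waiting threads sleep until a permit is returned
};
```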
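
One plausible way to implement that interface is sketched below. It assumes the usual permits-pattern logic (wait sleeps until the count is positive, then takes one; signal adds one and wakes sleepers) and is not necessarily how the course library is written. `handoffDemo` is a made-up helper that shows one thread waiting for another to publish data.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

class semaphore {
 public:
  semaphore(int value = 0);
  void wait();
  void signal();

 private:
  int value;
  std::mutex m;
  std::condition_variable_any cv;
};

semaphore::semaphore(int value) : value(value) {}

// Sleep until at least one permit is available, then take it.
void semaphore::wait() {
  std::unique_lock<std::mutex> lock(m);
  cv.wait(lock, [this] { return value > 0; });
  value--;
}

// Return one permit; if the count just became positive, wake any sleepers.
void semaphore::signal() {
  std::lock_guard<std::mutex> lock(m);
  value++;
  if (value == 1) cv.notify_all();
}

// Hypothetical usage: the calling thread waits until the producer has published data.
int handoffDemo() {
  semaphore ready;  // starts at 0, so wait() blocks until a signal arrives
  int data = 0;
  std::thread producer([&] {
    data = 42;
    ready.signal();
  });
  ready.wait();
  int seen = data;  // safe to read: the signal happened after the write
  producer.join();
  return seen;
}
```

Because `wait` re-checks `value > 0` every time it wakes up, the same code works for positive, zero, and negative starting values.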

What does a `semaphore` initialized with a positive number mean?

``semaphore permits(3);``

# Positive Semaphores

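• We start with a fixed number of permits (here, 3), so up to that many threads can call permits.wait() without blocking
• Once those permits are taken, further threads must wait for permits to be returned before continuing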
• Example: Dining Philosophers

What does a `semaphore` initialized with 0 mean?

``semaphore permits(0);``

# Zero Semaphores

• We don't have any permits!
• The first call to permits.wait() always blocks until another thread calls permits.signal(). E.g. you want one thread to wait until another thread finishes something before it continues.
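
```cpp
void create(int creationCount, semaphore &s) {
  for (int i = 0; i < creationCount; i++) {
    cout << oslock << "Now creating " << i << endl << osunlock;
    s.signal();
  }
}

void consume_after_create(int consumeCount, semaphore &s) {
  for (int i = 0; i < consumeCount; i++) {
    s.wait();
    cout << oslock << "Now consuming " << i << endl << osunlock;
  }
}

int main(int argc, const char *argv[]) {
  semaphore zeroSemaphore(0); // can omit (0), since default initializes to 0
  int numIterations = 5;
  // Reconstructed: the lines that spawn and join the two threads are missing
  // from the extracted slide; the thread names here are assumed.
  thread creator(create, numIterations, ref(zeroSemaphore));
  thread consumer(consume_after_create, numIterations, ref(zeroSemaphore));
  creator.join();
  consumer.join();
  return 0;
}
```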

# Zero Semaphores

```
$ ./zeroSemaphore
Now creating 0
Now creating 1
Now creating 2
Now creating 3
Now creating 4
Now consuming 0
Now consuming 1
Now consuming 2
Now consuming 3
Now consuming 4
```

# Negative Semaphores

What does a `semaphore` initialized with a negative number mean?

``semaphore permits(-9);``

The semaphore must reach 1 before the initial wait would end.  E.g. you want to wait until other threads finish before a final thread continues.

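```cpp
void writer(int i, semaphore &s) {
  cout << oslock << "Sending signal " << i << endl << osunlock;
  s.signal();
}

// Function header reconstructed; the name `reader` is assumed.
void reader(semaphore &s) {
  s.wait();
  cout << oslock << "Got enough signals to continue!" << endl << osunlock;
}

int main(int argc, const char *argv[]) {
  semaphore negSemaphore(-9);
  // Thread declarations and loop body reconstructed from the joins below.
  thread writers[10];
  for (size_t i = 0; i < 10; i++) {
    writers[i] = thread(writer, i, ref(negSemaphore));
  }
  thread r(reader, ref(negSemaphore));
  for (thread &t : writers) t.join();
  r.join();
  return 0;
}
```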

# Negative Semaphores

```
$ ./negativeSemaphores
Sending signal 0
Sending signal 1
Sending signal 2
Sending signal 3
Sending signal 5
Sending signal 7
Sending signal 8
Sending signal 9
Sending signal 6
Sending signal 4
Got enough signals to continue!
```

Semaphores can be used to support thread coordination.

• One thread can stall (via semaphore::wait) until other thread(s) call semaphore::signal, e.g. the signaling thread prepared some data that the waiting thread needs to continue.

Let's implement a program that requires thread coordination with semaphores.  First, we'll look at a version without semaphores to see why they are necessary.

• The reader-writer pattern/program spawns 2 threads: one writer (publishes content to a shared buffer) and one reader (reads from shared buffer when content is available)
• Common pattern! E.g. web server publishes content over a dedicated communication channel, and the web browser consumes that content.
• More complex version: multiple readers, similar to how a web server handles many incoming requests (puts request in buffer, readers each read and process requests)
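
```cpp
int main(int argc, const char *argv[]) {
  // Create an empty buffer
  char buffer[kNumBufferSlots];
  memset(buffer, ' ', sizeof(buffer));

  // Reconstructed: spawn the writer and reader threads on the shared buffer.
  // The thread-creation lines are missing from the extracted slide, and
  // kNumIterations is an assumed constant name.
  thread writer(writeToBuffer, buffer, sizeof(buffer), kNumIterations);
  thread reader(readFromBuffer, buffer, sizeof(buffer), kNumIterations);
  writer.join();
  reader.join();
  return 0;
}
```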

`confused-reader-writer.cc`

```cpp
static void readFromBuffer(char buffer[], size_t bufferSize, size_t iterations) {
  for (size_t i = 0; i < iterations * bufferSize; i++) {

    // Read and process the data
    char ch = buffer[i % bufferSize];
    processData(ch); // sleep to simulate work
    buffer[i % bufferSize] = ' ';

    cout << oslock << "Reader: consumed data packet "
         << "with character '" << ch << "'.\t\t" << osunlock;
    printBuffer(buffer, bufferSize);
  }
}
```

`confused-reader-writer.cc`

```cpp
static void writeToBuffer(char buffer[], size_t bufferSize, size_t iterations) {
  cout << oslock << "Writer: ready to write." << endl << osunlock;
  for (size_t i = 0; i < iterations * bufferSize; i++) {

    char ch = prepareData();
    buffer[i % bufferSize] = ch;

    cout << oslock << "Writer: published data packet with character '"
         << ch << "'.\t\t" << osunlock;
    printBuffer(buffer, bufferSize);
  }
}
```

`confused-reader-writer.cc`

• Both threads share the same buffer, so they agree where content is stored (think of buffer like state for a pipe or a connection between client and server)
• The writer publishes content to the circular buffer, and the reader consumes that content as it's written. Each thread cycles through the buffer the same number of times, and they both agree that i % 8 identifies the next slot of interest.
• Problem: each thread runs independently, without knowing how much progress the other has made.
• Example: no way for the reader to know that the slot it wants to read from has meaningful data in it. It's possible the writer hasn't gotten that far yet.
• Example: the writer could loop around and overwrite content that the reader has not yet consumed.

Goal: we must encode constraints into our program.

What constraint(s) should we add to our program?

• A writer should not write until there is space available to write
• A reader should not read until there is something available to read

How can we model these constraint(s)?

• One semaphore to manage empty slots
• One semaphore to manage full slots

What might this look like in code?

• The writer thread waits until at least one buffer slot is empty before writing. Once it writes, it increments the full buffer count by one.
• The reader thread waits until at least one buffer slot is full before reading. Once it reads, it increments the empty buffer count by one.
• Let's try it!

`reader-writer.cc`
• We have two semaphores to permit bidirectional thread coordination
• reader can communicate with writer, and writer can communicate with reader
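
The two-semaphore scheme described above can be sketched as follows. This is an illustration, not the contents of `reader-writer.cc`: the `semaphore` class is a minimal stand-in for the course's class, and `transfer`, `emptySlots` and `fullSlots` are names chosen for this example.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>

// Minimal stand-in for the course's semaphore class (an assumption of this sketch).
class semaphore {
 public:
  explicit semaphore(int value = 0) : value(value) {}
  void wait() {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [this] { return value > 0; });
    value--;
  }
  void signal() {
    std::lock_guard<std::mutex> lock(m);
    value++;
    if (value == 1) cv.notify_all();
  }
 private:
  int value;
  std::mutex m;
  std::condition_variable_any cv;
};

constexpr int kNumBufferSlots = 8;

// Sends `message` one character at a time from a writer thread to a reader
// thread through a small circular buffer, and returns what the reader received.
std::string transfer(const std::string& message) {
  char buffer[kNumBufferSlots];
  semaphore emptySlots(kNumBufferSlots);  // every slot starts out empty
  semaphore fullSlots(0);                 // nothing to read yet
  std::string received;

  std::thread writer([&] {
    for (std::size_t i = 0; i < message.size(); i++) {
      emptySlots.wait();                  // wait for space to write
      buffer[i % kNumBufferSlots] = message[i];
      fullSlots.signal();                 // one more slot is full
    }
  });
  std::thread reader([&] {
    for (std::size_t i = 0; i < message.size(); i++) {
      fullSlots.wait();                   // wait for something to read
      received += buffer[i % kNumBufferSlots];
      emptySlots.signal();                // one more slot is empty
    }
  });
  writer.join();
  reader.join();
  return received;
}
```

The writer can get at most eight slots ahead of the reader, and the reader can never pass the writer, so the message arrives intact even when it is longer than the buffer.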

`reader-writer.cc`

# Mythbusters

Let's implement a program called myth-buster that prints out how many CS110 student processes are running on each myth machine right now.

• representative of load balancers (e.g. myth.stanford.edu or www.netflix.com) determining which internal server your request should be forwarded to.

```
myth51 has this many CS110-student processes: 59
myth52 has this many CS110-student processes: 135
myth53 has this many CS110-student processes: 112
myth54 has this many CS110-student processes: 89
myth55 has this many CS110-student processes: 107
myth56 has this many CS110-student processes: 58
myth57 has this many CS110-student processes: 70
myth58 has this many CS110-student processes: 93
myth59 has this many CS110-student processes: 107
myth60 has this many CS110-student processes: 145
myth61 has this many CS110-student processes: 105
myth62 has this many CS110-student processes: 126
myth63 has this many CS110-student processes: 314
myth64 has this many CS110-student processes: 119
myth65 has this many CS110-student processes: 156
myth66 has this many CS110-student processes: 144
Machine least loaded by CS110 students: myth56
Number of CS110 processes on least loaded machine: 58
```


We'll use the following pre-implemented function, which does some networking to fetch process counts. It connects to the specified myth machine and blocks until done.

``int getNumProcesses(int mythNum, const std::unordered_set<std::string>& sunetIDs);``


```cpp
int main(int argc, char *argv[]) {
  // Create a set of student SUNETs
  unordered_set<string> cs110SUNETs;

  // Create a map from myth number -> CS110 process count and print its info
  map<int, int> processCountMap;
  createCS110ProcessCountMap(cs110SUNETs, processCountMap);
  printMythMachineWithFewestProcesses(processCountMap);

  return 0;
}
```

We'll implement createCS110ProcessCountMap sequentially and concurrently.

# Mythbusters: Sequential

```cpp
static void createCS110ProcessCountMap(const unordered_set<string>& sunetIDs,
                                       map<int, int>& processCountMap) {

  for (int mythNum = kMinMythMachine; mythNum <= kMaxMythMachine; mythNum++) {
    int numProcesses = getNumProcesses(mythNum, sunetIDs);

    // If successful, add to the map and print out
    if (numProcesses >= 0) {
      processCountMap[mythNum] = numProcesses;
      cout << "myth" << mythNum << " has this many CS110-student processes: "
           << numProcesses << endl;
    }
  }
}
```

This implementation fetches the count for each myth machine one after the other.  This means we have to wait for 16 sequential connections to be started and completed.

`myth-buster-sequential.cc`

# Mythbusters: Sequential

Why is this implementation slow?

We wait 16 times, because we idle while waiting for each connection to come back.

How can we improve its performance?

Each call to getNumProcesses is independent. We should call it multiple times concurrently to overlap this "dead time".

# Mythbusters: Concurrent

`myth-buster-concurrent.cc`

Implementation: spawn multiple threads, each responsible for connecting to a different myth machine and updating the map. We'll cap the number of active threads to avoid overloading the myth machines.

What might this look like in code?

• For each myth machine number, we'll spawn a new thread if there are permits available.
• That thread will fetch the count for that myth machine. It must acquire a lock before modifying the map.
• When the thread finishes, it returns its permit.
• Let's try it!
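
The capped-thread approach described above might look roughly like the sketch below. It is not the contents of `myth-buster-concurrent.cc`: the `semaphore` class is a minimal stand-in for the course's class, `countProcessesConcurrently` is a made-up name, and the networking call is replaced by a caller-supplied `fetchCount` function so the example stays self-contained. It also calls plain `signal()` at the end of the thread routine, since `signal(on_thread_exit)` belongs to the course's class.

```cpp
#include <condition_variable>
#include <functional>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Minimal stand-in for the course's semaphore class (an assumption of this sketch).
class semaphore {
 public:
  explicit semaphore(int value = 0) : value(value) {}
  void wait() {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [this] { return value > 0; });
    value--;
  }
  void signal() {
    std::lock_guard<std::mutex> lock(m);
    value++;
    if (value == 1) cv.notify_all();
  }
 private:
  int value;
  std::mutex m;
  std::condition_variable_any cv;
};

// Fetches a count for each machine in [minMyth, maxMyth], with at most
// maxThreads fetches in flight at once. fetchCount stands in for the blocking
// getNumProcesses call and returns a negative number on failure.
std::map<int, int> countProcessesConcurrently(int minMyth, int maxMyth, int maxThreads,
                                              const std::function<int(int)>& fetchCount) {
  std::map<int, int> counts;
  std::mutex countsLock;          // protects counts
  semaphore permits(maxThreads);  // caps the number of active worker threads
  std::vector<std::thread> threads;

  for (int mythNum = minMyth; mythNum <= maxMyth; mythNum++) {
    permits.wait();  // block until a permit is available
    threads.emplace_back([&, mythNum] {
      int numProcesses = fetchCount(mythNum);  // slow call; no lock held here
      if (numProcesses >= 0) {
        std::lock_guard<std::mutex> guard(countsLock);
        counts[mythNum] = numProcesses;
      }
      permits.signal();  // return the permit so another thread can be spawned
    });
  }
  for (std::thread& t : threads) t.join();
  return counts;
}
```

The lock is held only while the map is updated, never during the slow fetch, so the fetches for different machines still overlap.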

# Mythbusters Takeaways

`myth-buster-concurrent.cc`
• We parallelized an independent operation to speed up runtime
• One call to getNumProcesses isn't dependent on another
• To share the map for updating, we need a lock
• We use signal(on_thread_exit) to signal only once the thread has terminated.  This more accurately reflects permits as a cap on spawned threads.

# Recap

• Recap: Dining With Philosophers
• Example: Mythbusters

Next time: a trip to the ice cream store

# Extra Practice Problems

For each of the scenarios below, what multithreading patterns might we use to apply appropriate constraints and coordinate threads?

• same as reader/writer from before, but semaphores initialized to 0 or 1 ("1 slot")
• multiple workers periodically need approval from an "approval thread" before continuing.  Approval may take some time, and gives back result (approve or deny).
• see next slide
• multiple threads wait in line to be processed by a "processor" thread.  The processor wakes up just one individual thread when it's their turn to be processed.
• stay tuned for next lecture!

Challenge: multiple workers periodically need approval from an "approval thread" before continuing.  Approval may take some time, and gives back result (approve or deny).

```cpp
// global struct, with one shared instance named `approval`
struct approval {
  mutex available;      // held by the one worker currently making a request
  int workerData;       // the request, written by that worker
  semaphore requested;  // worker -> approver: a request is ready
  bool approved;        // the verdict, written by the approver
  semaphore finished;   // approver -> worker: the verdict is ready
} approval;

// all N workers
// spend time creating data, then...
approval.available.lock();
approval.workerData = ....
approval.requested.signal();
approval.finished.wait();
bool success = approval.approved;
approval.available.unlock();

// approver
while (true) {
  approval.requested.wait();
  // we are the only one accessing the struct here
  approval.approved = someCalculation(approval.workerData);
  approval.finished.signal();
}
```
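
A runnable variant of the pattern above is sketched below, under a few assumptions made only for illustration: the `semaphore` class is a minimal stand-in for the course's class, `someCalculation` approves even numbers, each worker sends its own id as data, and the approver handles a fixed number of requests instead of looping forever so that the program terminates.

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Minimal stand-in for the course's semaphore class (an assumption of this sketch).
class semaphore {
 public:
  explicit semaphore(int value = 0) : value(value) {}
  void wait() {
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [this] { return value > 0; });
    value--;
  }
  void signal() {
    std::lock_guard<std::mutex> lock(m);
    value++;
    if (value == 1) cv.notify_all();
  }
 private:
  int value;
  std::mutex m;
  std::condition_variable_any cv;
};

struct Approval {
  std::mutex available;  // one worker at a time may use the struct
  int workerData = 0;
  semaphore requested;   // worker -> approver
  bool approved = false;
  semaphore finished;    // approver -> worker
};

// Hypothetical approval rule for this example: approve even numbers.
bool someCalculation(int data) { return data % 2 == 0; }

// Called by a worker: submit data, then block until the verdict is ready.
bool requestApproval(Approval& a, int data) {
  std::lock_guard<std::mutex> turn(a.available);
  a.workerData = data;
  a.requested.signal();
  a.finished.wait();
  return a.approved;
}

// The approver handles exactly numRequests requests, then returns.
void approver(Approval& a, int numRequests) {
  for (int i = 0; i < numRequests; i++) {
    a.requested.wait();
    a.approved = someCalculation(a.workerData);
    a.finished.signal();
  }
}

// Spawns numWorkers workers (worker i submits i) and returns how many were approved.
int countApproved(int numWorkers) {
  Approval a;
  std::atomic<int> approvedCount{0};
  std::thread approverThread(approver, std::ref(a), numWorkers);
  std::vector<std::thread> workers;
  for (int id = 0; id < numWorkers; id++) {
    workers.emplace_back([&a, &approvedCount, id] {
      if (requestApproval(a, id)) approvedCount++;
    });
  }
  for (std::thread& t : workers) t.join();
  approverThread.join();
  return approvedCount;
}
```

The mutex serializes the workers, and the two zero-initialized semaphores pass control from worker to approver and back, so each worker reads the verdict for its own request: `countApproved(10)` returns 5.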
