CS110 Lecture 19: Thread Pool and Ice Cream Store
CS110: Principles of Computer Systems
Winter 2021-2022
Stanford University
Instructors: Nick Troccoli and Jerry Cain
CS110 Topic 3: How can we have concurrency within a single process?
Learning About Multithreading
- Introduction to Threads (Lecture 13)
- Mutexes and Condition Variables (Lectures 14/15)
- Semaphores (Lecture 16)
- Multithreading Patterns (Lectures 17/18/this lecture)
assign5: implement your own multithreaded news aggregator to quickly fetch news from the web!
Learning Goals
- Practice applying our toolbox of concurrency directives (mutexes, condition variables and semaphores) to coordinate threads in different ways
- Understand the larger ice cream store example as a case study in multithreading and threads doing different tasks
Plan For Today
- Recap: Mythbuster
- Example: Ice Cream Store
Mythbuster
Let's implement a program called myth-buster that prints out how many CS110 student processes are running on each myth machine right now.
This is representative of how load balancers (e.g. myth.stanford.edu or www.netflix.com) determine which internal server your request should be forwarded to.
myth51 has this many CS110-student processes: 59
myth52 has this many CS110-student processes: 135
myth53 has this many CS110-student processes: 112
myth54 has this many CS110-student processes: 89
myth55 has this many CS110-student processes: 107
myth56 has this many CS110-student processes: 58
myth57 has this many CS110-student processes: 70
myth58 has this many CS110-student processes: 93
myth59 has this many CS110-student processes: 107
myth60 has this many CS110-student processes: 145
myth61 has this many CS110-student processes: 105
myth62 has this many CS110-student processes: 126
myth63 has this many CS110-student processes: 314
myth64 has this many CS110-student processes: 119
myth65 has this many CS110-student processes: 156
myth66 has this many CS110-student processes: 144
Machine least loaded by CS110 students: myth56
Number of CS110 processes on least loaded machine: 58
I/O-Bound vs. CPU-Bound Programs
CPU-bound tasks: the time to complete them is dictated by how long it takes us to do the CPU computation.
- heavy computations
- data processing
I/O-bound tasks: the time to complete them is dictated by how long it takes for some external mechanism to complete its work.
- reading from an external device (e.g. disk)
- reading data from the network
Even a single-core CPU can see performance improvements by parallelizing I/O-bound tasks. But parallelizing CPU-bound tasks will likely show minimal gains unless we have a multi-core CPU.
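A minimal sketch of this effect, assuming a hypothetical simulatedFetch helper that stands in for a network request by sleeping. The names and delays are illustrative and not part of the lecture code. Because each thread spends its time waiting rather than computing, the waits overlap even on a single core.

```cpp
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical stand-in for an I/O-bound task: the thread only waits,
// as it would while a network reply is in flight.
static void simulatedFetch(int delayMS) {
  std::this_thread::sleep_for(std::chrono::milliseconds(delayMS));
}

// Milliseconds elapsed since the given start time.
static long long elapsedMS(std::chrono::steady_clock::time_point start) {
  return std::chrono::duration_cast<std::chrono::milliseconds>(
      std::chrono::steady_clock::now() - start).count();
}

// Runs the simulated fetches one after another; returns the elapsed time.
static long long runSequentially(int numTasks, int delayMS) {
  assert(numTasks > 0);
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < numTasks; i++) simulatedFetch(delayMS);
  return elapsedMS(start);
}

// Runs each simulated fetch in its own thread; returns the elapsed time.
static long long runConcurrently(int numTasks, int delayMS) {
  assert(numTasks > 0);
  auto start = std::chrono::steady_clock::now();
  std::vector<std::thread> threads;
  for (int i = 0; i < numTasks; i++) {
    threads.push_back(std::thread(simulatedFetch, delayMS));
  }
  for (std::thread& t : threads) t.join();
  return elapsedMS(start);
}
```

Calling runSequentially(4, 50) takes at least 200 ms, while runConcurrently(4, 50) should take roughly the time of a single wait, since the four sleeps overlap.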
Parallelizing Mythbuster
For mythbuster, the primary task is fetching the number of running CS110 processes over the network. Is this an I/O-bound or CPU-bound task?
I/O-bound!
This means we should see large gains from multithreading, even on a single-core machine.
Mythbusters: Concurrent
Implementation: spawn multiple threads, each responsible for connecting to a different myth machine and updating the map.
static void countCS110ProcessesForMyth(int mythNum, const unordered_set<string>& sunetIDs,
map<int, int>& processCountMap, mutex& processCountMapLock) {
int numProcesses = getNumProcesses(mythNum, sunetIDs);
// If successful, add to the map and print out
if (numProcesses >= 0) {
processCountMapLock.lock();
processCountMap[mythNum] = numProcesses;
processCountMapLock.unlock();
cout << oslock << "myth" << mythNum << " has this many CS110-student processes: " << numProcesses << endl << osunlock;
}
}
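The thread-spawning side of this version might look like the sketch below. It is self-contained so it can run without the network: fakeGetNumProcesses is a hypothetical stand-in for getNumProcesses, and buildCountMap is an assumed name for the spawning function. It also uses std::lock_guard instead of explicit lock()/unlock() calls, which is a substitution, not what the lecture code does.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for getNumProcesses: no network, just a made-up count.
static int fakeGetNumProcesses(int mythNum) { return mythNum * 2; }

// One thread runs this per machine. The slow "fetch" happens without the
// lock held; only the map update is inside the critical section.
static void countForMyth(int mythNum, std::map<int, int>& counts,
                         std::mutex& countsLock) {
  int numProcesses = fakeGetNumProcesses(mythNum);
  std::lock_guard<std::mutex> lg(countsLock);  // unlocks at end of scope
  counts[mythNum] = numProcesses;
}

// Spawns one thread per machine number, then joins them all.
static std::map<int, int> buildCountMap(int minMyth, int maxMyth) {
  assert(minMyth <= maxMyth);
  std::map<int, int> counts;
  std::mutex countsLock;
  std::vector<std::thread> threads;
  for (int mythNum = minMyth; mythNum <= maxMyth; mythNum++) {
    threads.push_back(std::thread(countForMyth, mythNum,
                                  std::ref(counts), std::ref(countsLock)));
  }
  for (std::thread& t : threads) t.join();
  return counts;
}
```

Keeping the fetch outside the lock matters: if the lock were held during the fetch, the threads would take turns and the program would behave like the sequential version.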
Mythbusters: Capped
When spawning threads, we don't want to spawn too many, because we might overwhelm the OS and diminish the performance gains of our multithreaded implementation.
A common approach is to limit the number of simultaneous threads with a cap. E.g. we can only have 16 spawned threads at a time. Once one finishes, then we can spawn another.
Mythbusters: Capped
- For each myth machine number, we'll spawn a new thread if there are permits available. That thread will fetch the count for that myth machine.
- When the thread finishes, it returns its permit.
static void createCS110ProcessCountMap(const unordered_set<string>& sunetIDs, map<int, int>& processCountMap) {
vector<thread> threads;
mutex processCountMapLock;
semaphore permits(kMaxNumSimultaneousThreads);
for (int mythNum = kMinMythMachine; mythNum <= kMaxMythMachine; mythNum++) {
permits.wait();
threads.push_back(thread(countCS110ProcessesForMyth, mythNum, ref(sunetIDs),
ref(processCountMap), ref(processCountMapLock), ref(permits)));
}
for (thread& threadToJoin : threads) threadToJoin.join();
}
To return its permit, each thread can use a special version of signal() that specifies the semaphore should be signaled only once the thread exits.
static void countCS110ProcessesForMyth(int mythNum, const unordered_set<string>& sunetIDs,
map<int, int>& processCountMap, mutex& processCountMapLock, semaphore& permits) {
permits.signal(on_thread_exit);
int numProcesses = getNumProcesses(mythNum, sunetIDs);
if (numProcesses >= 0) {
processCountMapLock.lock();
processCountMap[mythNum] = numProcesses;
processCountMapLock.unlock();
cout << oslock << "myth" << mythNum << " has this many CS110-student processes: " << numProcesses << endl << osunlock;
}
}
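The permit pattern can be sketched in standard C++ as follows. The Semaphore class here is a minimal stand-in for the course's semaphore, and since signal(on_thread_exit) is specific to that class, this sketch substitutes a small RAII helper (the hypothetical PermitReturner) that signals when the worker function returns, which is slightly earlier than true thread exit. The Stats struct exists only so the cap can be checked.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore (stand-in for the course's semaphore class).
class Semaphore {
 public:
  explicit Semaphore(int initial = 0) : value(initial) {}
  void wait() {
    std::unique_lock<std::mutex> ul(m);
    cv.wait(ul, [this] { return value > 0; });
    value--;
  }
  void signal() {
    std::lock_guard<std::mutex> lg(m);
    value++;
    cv.notify_one();
  }
 private:
  int value;
  std::mutex m;
  std::condition_variable cv;
};

// Returns a permit when destroyed, i.e. as the worker function returns.
struct PermitReturner {
  explicit PermitReturner(Semaphore& s) : permits(s) {}
  ~PermitReturner() { permits.signal(); }
  Semaphore& permits;
};

struct Stats {
  std::atomic<int> running{0};
  std::atomic<int> maxRunning{0};
};

static void worker(Semaphore& permits, Stats& stats) {
  PermitReturner returner(permits);  // destroyed last, after the work below
  int now = ++stats.running;
  int prev = stats.maxRunning.load();
  // Record the high-water mark of simultaneously running workers.
  while (prev < now && !stats.maxRunning.compare_exchange_weak(prev, now)) {}
  std::this_thread::sleep_for(std::chrono::milliseconds(5));  // pretend to work
  --stats.running;
}

// Spawns numTasks workers, at most cap at a time; returns the high-water mark.
static int runCapped(int numTasks, int cap) {
  assert(cap > 0);
  Semaphore permits(cap);
  Stats stats;
  std::vector<std::thread> threads;
  for (int i = 0; i < numTasks; i++) {
    permits.wait();  // blocks until some earlier worker returns its permit
    threads.push_back(std::thread(worker, std::ref(permits), std::ref(stats)));
  }
  for (std::thread& t : threads) t.join();
  return stats.maxRunning;
}
```

With runCapped(12, 3), twelve threads are spawned in total, but the returned high-water mark never exceeds 3.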
Mythbusters: Thread Pool
Even though we limit the number of simultaneous threads, we still spawn one thread per task in total. It would be nice if we could reuse the same threads to complete all the tasks.
A common approach is to use a thread pool: a type that maintains a pool of worker threads that can complete assigned tasks.
- You initialize the thread pool and specify the number of workers
- You can call schedule and pass in a function you want it to execute. It will assign it to the next available worker.
- You can call wait to block until all currently-assigned tasks have been completed.
class ThreadPool {
public:
ThreadPool(size_t numThreads);
void schedule(const std::function<void(void)>& thunk);
void wait();
~ThreadPool();
};
What might this look like in code?
- In myth buster, instead of spawning threads, we can schedule a "thunk" for each task of fetching a myth machine's count of CS110 processes. It must be a function that has no parameters or return value.
- After we add all the tasks to the thread pool, we wait on the thread pool to finish all the tasks.
We can only schedule a task represented by a function with no parameters and no return value. Therefore, to give the task access to external data, we must capture that data in a lambda function.
static void createCS110ProcessCountMap(const unordered_set<string>& sunetIDs,
map<int, int>& processCountMap) {
ThreadPool pool(kMaxNumSimultaneousThreads);
mutex processCountMapLock;
for (int mythNum = kMinMythMachine; mythNum <= kMaxMythMachine; mythNum++) {
pool.schedule([mythNum, &sunetIDs, &processCountMap, &processCountMapLock]() {
countCS110ProcessesForMyth(mythNum, sunetIDs, processCountMap, processCountMapLock);
});
}
pool.wait();
}
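For intuition about what happens behind schedule and wait, here is one possible sketch of a pool with the same interface, built from a task queue, a mutex and two condition variables. SimpleThreadPool and sumWithPool are hypothetical names, and this is only one of many reasonable designs; it is not the course's ThreadPool and not necessarily the design assignment 5 asks for.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class SimpleThreadPool {
 public:
  explicit SimpleThreadPool(std::size_t numThreads) {
    assert(numThreads > 0);
    for (std::size_t i = 0; i < numThreads; i++) {
      workers.push_back(std::thread([this] { workerLoop(); }));
    }
  }
  // Adds a task to the queue; some idle worker will pick it up.
  void schedule(const std::function<void(void)>& thunk) {
    {
      std::lock_guard<std::mutex> lg(m);
      tasks.push(thunk);
      numUnfinished++;
    }
    taskAvailable.notify_one();
  }
  // Blocks until every scheduled task has finished running.
  void wait() {
    std::unique_lock<std::mutex> ul(m);
    allDone.wait(ul, [this] { return numUnfinished == 0; });
  }
  ~SimpleThreadPool() {
    wait();
    {
      std::lock_guard<std::mutex> lg(m);
      shuttingDown = true;
    }
    taskAvailable.notify_all();
    for (std::thread& w : workers) w.join();
  }
 private:
  void workerLoop() {
    while (true) {
      std::function<void(void)> thunk;
      {
        std::unique_lock<std::mutex> ul(m);
        taskAvailable.wait(ul, [this] { return shuttingDown || !tasks.empty(); });
        if (tasks.empty()) return;  // woken only because we are shutting down
        thunk = tasks.front();
        tasks.pop();
      }
      thunk();  // run the task without holding the lock
      std::lock_guard<std::mutex> lg(m);
      if (--numUnfinished == 0) allDone.notify_all();
    }
  }
  std::mutex m;
  std::condition_variable taskAvailable, allDone;
  std::queue<std::function<void(void)>> tasks;
  std::size_t numUnfinished = 0;
  bool shuttingDown = false;
  std::vector<std::thread> workers;
};

// Usage: each thunk captures its own i by value and the shared data by reference.
static int sumWithPool(int numTasks, std::size_t numThreads) {
  int total = 0;
  std::mutex totalLock;
  SimpleThreadPool pool(numThreads);
  for (int i = 1; i <= numTasks; i++) {
    pool.schedule([i, &total, &totalLock]() {
      std::lock_guard<std::mutex> lg(totalLock);
      total += i;
    });
  }
  pool.wait();
  return total;
}
```

Note that the loop variable is captured by value, as mythNum is in the lecture code: it changes on every iteration, so capturing it by reference would let tasks observe a later value.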
Thread Pools
Thread Pools are very useful abstractions that let a client spread tasks across several threads without having to deal with the complexities of threads.
- You will have a chance to implement your own ThreadPool class on assignment 5!
Visiting The Ice Cream Store
- Now, let's use our multithreading knowledge to understand an in-depth multithreading program simulating an ice cream store!
- There are customers, clerks, a manager and a cashier, coordinating in various ways.
Visiting The Ice Cream Store
- Each customer wants to order some number of ice cream cones.
- A customer spawns a new clerk to make each ice cream cone.
- A clerk makes a single cone, and must have it approved by the manager.
- The single manager approves or rejects cones made by clerks.
- Once a customer's order is made, they must get in line with the cashier to check out.
- The cashier helps customers check out in the order in which they got in line.
Ice Cream Store: scaffolding
static mutex rgenLock;
static RandomGenerator rgen;
...
void browse() {
cout << oslock << "Customer starts to kill time." << endl << osunlock;
size_t browseTimeMS = getBrowseTimeMS();
sleep_for(browseTimeMS);
cout << oslock << "Customer just killed " << double(browseTimeMS) / 1000
<< " seconds." << endl << osunlock;
}
void makeCone(size_t coneID, size_t customerID) {
cout << oslock << " Clerk starts to make ice cream cone #" << coneID
<< " for customer #" << customerID << "." << endl << osunlock;
size_t prepTimeMS = getPrepTimeMS();
sleep_for(prepTimeMS);
cout << oslock << " Clerk just spent " << double(prepTimeMS) / 1000
<< " seconds making ice cream cone #" << coneID
<< " for customer #" << customerID << "." << endl << osunlock;
}
...
To model a "real" ice cream store, we want to randomize how long different events take throughout the program. We use helper functions like these to do that.
Ice Cream Store: main
int main(int argc, const char *argv[]) {
// Make an array of customer threads, and add up how many cones they order
size_t totalConesOrdered = 0;
thread customers[kNumCustomers];
/* The structs to package up variables needed for cone inspection and
* customer checkout
*/
inspection_t inspection;
checkout_t checkout;
for (size_t i = 0; i < kNumCustomers; i++) {
// utility function, random (see ice-cream-store-utils.h)
size_t numConesWanted = getNumCones();
customers[i] = thread(customer, i, numConesWanted,
ref(inspection), ref(checkout));
totalConesOrdered += numConesWanted;
}
/* Make the manager and cashier threads to approve cones / checkout customers.
* Tell the manager how many cones will be ordered in total. */
thread managerThread(manager, totalConesOrdered, ref(inspection));
thread cashierThread(cashier, ref(checkout));
for (thread& customer: customers) customer.join();
cashierThread.join();
managerThread.join();
return 0;
}
In main, we spawn all of the customers, the manager (telling it the total number of cones ordered), and the cashier. Why not clerks? Each customer spawns its own clerks.
Then, we wait for the threads to finish.
Ice Cream Store: customer
A customer does the following:
- spawns a clerk for each cone
- browses and waits for clerks to finish
- gets its number in the checkout line
- tells the cashier it is ready to check out
- waits for the cashier to ring it up
Three of these steps need coordination:
- "gets its number in the checkout line" - shared counter, needs a binary lock (or an atomic counter)
- "tells the cashier it is ready to check out" - one generalized coordination semaphore
- "waits for the cashier to ring it up" - binary coordination semaphore per customer
Ice Cream Store: customer
struct checkout_t {
atomic<size_t> nextPlaceInLine{0};
semaphore customers[kNumCustomers];
semaphore waitingCustomers;
};
Struct passed by reference to all customers and the cashier.
- nextPlaceInLine is an atomic counter, so ++ on it is atomic without a separate lock!
- waitingCustomers is a generalized coordination semaphore that the cashier waits on
- customers stores a binary coordination semaphore per customer, customers wait on them
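To see why the atomic counter is enough for handing out places in line, consider the sketch below, where several threads each take a place with ++ and no mutex. The function names (takePlace, assignPlaces, everyPlaceTakenOnce) are illustrative assumptions, not part of the ice cream store code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Each "customer" takes the next place. Post-increment on an atomic is a
// single atomic read-modify-write that returns the old value.
static void takePlace(std::atomic<std::size_t>& nextPlaceInLine,
                      std::size_t& myPlace) {
  myPlace = nextPlaceInLine++;
}

// Spawns one thread per customer; returns the place each customer received.
static std::vector<std::size_t> assignPlaces(std::size_t numCustomers) {
  std::atomic<std::size_t> nextPlaceInLine{0};
  std::vector<std::size_t> places(numCustomers);
  std::vector<std::thread> threads;
  for (std::size_t i = 0; i < numCustomers; i++) {
    // Each thread writes only its own slot of the vector, so no lock is needed.
    threads.push_back(std::thread(takePlace, std::ref(nextPlaceInLine),
                                  std::ref(places[i])));
  }
  for (std::thread& t : threads) t.join();
  assert(nextPlaceInLine == numCustomers);
  return places;
}

// True if the places are exactly 0..n-1, each handed out once.
static bool everyPlaceTakenOnce(const std::vector<std::size_t>& places) {
  std::vector<bool> seen(places.size(), false);
  for (std::size_t place : places) {
    if (place >= places.size() || seen[place]) return false;
    seen[place] = true;
  }
  return true;
}
```

Which customer gets which place depends on scheduling, but no two customers can ever receive the same place. With a plain size_t counter and no lock, two threads could read the same value before either writes back the increment.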
Ice Cream Store: customer
static void customer(size_t id, size_t numConesWanted,
inspection_t& inspection, checkout_t& checkout) {
// Make a vector of clerk threads, one per cone to be ordered
vector<thread> clerks(numConesWanted);
for (size_t i = 0; i < clerks.size(); i++) {
clerks[i] = thread(clerk, i, id, ref(inspection));
}
// The customer browses for some amount of time, then joins the clerks.
browse();
for (thread& clerk: clerks) clerk.join();
size_t place = checkout.nextPlaceInLine++;
cout << oslock << "Customer " << id << " assumes position #" << place
<< " at the checkout counter." << endl << osunlock;
// Tell the cashier that we are ready to check out
checkout.waitingCustomers.signal();
// Wait on our unique semaphore so we know when it is our turn
checkout.customers[place].wait();
cout << oslock << "Customer " << id
<< " has checked out and leaves the ice cream store."
<< endl << osunlock;
}
Ice Cream Store: clerk
A clerk does the following:
1) makes a cone
2) attempts to get exclusive access to the manager
3) tells the manager it needs approval
4) waits for the manager to decide whether to approve or reject
5) checks the manager's decision
6) forfeits exclusive access to the manager
7) if the cone was rejected, goes back to step 1
Three of these steps need coordination:
- "attempts to get exclusive access to the manager" - binary lock
- "tells the manager it needs approval" - binary coordination semaphore
- "waits for the manager to decide..." - binary coordination semaphore
Ice Cream Store: clerk
struct inspection_t {
mutex available;
semaphore requested;
semaphore finished;
bool passed;
};
Struct passed by reference to all clerks and the manager.
- available is a lock that a clerk must hold in order to interact with the manager.
- requested is a binary coordination semaphore that the manager waits on
- finished is a binary coordination semaphore that a clerk waits on
- passed stores the result of the most recent inspection; only the clerk currently holding the lock should read it.
Ice Cream Store: clerk
static void clerk(size_t coneID, size_t customerID,
inspection_t& inspection) {
bool success = false;
while (!success) {
makeCone(coneID, customerID);
// We must be the only one requesting approval
inspection.available.lock();
// Let the manager know we are requesting approval
inspection.requested.signal();
// Wait for the manager to finish
inspection.finished.wait();
/* If the manager is finished, it has put
* its approval decision into "passed"
*/
success = inspection.passed;
// We're done requesting approval, so unlock for someone else
inspection.available.unlock();
}
}
Ice Cream Store: manager
The single manager does the following while there are more cones needed:
1) waits for a clerk to request an inspection
2) inspects the cone and records its decision to approve or reject
3) tells the clerk that it is done
4) updates its cone counts
5) if more cones are needed, goes back to step 1
Two of these steps need coordination:
- "waits for a clerk to request an inspection" - binary coordination semaphore
- "tells the clerk that it is done" - binary coordination semaphore
Ice Cream Store: manager
static void manager(size_t numConesNeeded,
inspection_t& inspection) {
size_t numConesAttempted = 0;
size_t numConesApproved = 0;
while (numConesApproved < numConesNeeded) {
// Wait for someone to request an inspection
inspection.requested.wait();
inspection.passed = inspectCone();
// Let them know we have finished inspecting
inspection.finished.signal();
numConesAttempted++;
if (inspection.passed) numConesApproved++;
}
cout << oslock << " Manager inspected a total of "
<< numConesAttempted
<< " ice cream cones before approving a total of "
<< numConesNeeded
<< "." << endl << " Manager leaves the ice cream store."
<< endl << osunlock;
}
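The clerk/manager handshake can be tried out in isolation with the self-contained sketch below. Two things are assumptions made for the sake of a reproducible run: Semaphore is a minimal stand-in for the course's semaphore class, and inspectCone is replaced by a hypothetical deterministic version that rejects every third inspection (the lecture's version is random). The clerk here also uses std::lock_guard in place of explicit lock()/unlock().

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore (stand-in for the course's semaphore class).
class Semaphore {
 public:
  explicit Semaphore(int initial = 0) : value(initial) {}
  void wait() {
    std::unique_lock<std::mutex> ul(m);
    cv.wait(ul, [this] { return value > 0; });
    value--;
  }
  void signal() {
    std::lock_guard<std::mutex> lg(m);
    value++;
    cv.notify_one();
  }
 private:
  int value;
  std::mutex m;
  std::condition_variable cv;
};

struct Inspection {
  std::mutex available;
  Semaphore requested;
  Semaphore finished;
  bool passed = false;
};

// Hypothetical deterministic inspector: inspections 0, 3, 6, ... are rejected.
static bool inspectCone(std::size_t attemptNumber) {
  return attemptNumber % 3 != 0;
}

static void clerk(Inspection& inspection) {
  bool success = false;
  while (!success) {
    // (Cone making is omitted.) Hold the lock for the whole exchange.
    std::lock_guard<std::mutex> lg(inspection.available);
    inspection.requested.signal();  // ask the manager for an inspection
    inspection.finished.wait();     // wait for the manager's decision
    success = inspection.passed;    // read it while still holding the lock
  }
}

static void manager(std::size_t numConesNeeded, Inspection& inspection,
                    std::size_t& numConesAttempted) {
  std::size_t numConesApproved = 0;
  numConesAttempted = 0;
  while (numConesApproved < numConesNeeded) {
    inspection.requested.wait();
    inspection.passed = inspectCone(numConesAttempted);
    inspection.finished.signal();
    numConesAttempted++;
    if (inspection.passed) numConesApproved++;
  }
}

// One clerk per cone plus one manager; returns how many inspections occurred.
static std::size_t runInspections(std::size_t numCones) {
  assert(numCones > 0);
  Inspection inspection;
  std::size_t numConesAttempted = 0;
  std::thread managerThread(manager, numCones, std::ref(inspection),
                            std::ref(numConesAttempted));
  std::vector<std::thread> clerks;
  for (std::size_t i = 0; i < numCones; i++) {
    clerks.push_back(std::thread(clerk, std::ref(inspection)));
  }
  for (std::thread& c : clerks) c.join();
  managerThread.join();
  return numConesAttempted;
}
```

Because the lock serializes inspections, the sequence of decisions is the same no matter which clerk goes first: with 4 cones, inspections 0 and 3 are rejected and the manager finishes after 6 inspections.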
Ice Cream Store: cashier
The single cashier does the following while there are more customers to ring up:
1) waits for a customer to be ready to check out
2) tells the i-th customer that it has checked out
3) if there are more customers to ring up, goes back to step 1
Two of these steps need coordination:
- "waits for a customer to be ready to check out" - generalized coordination semaphore
- "tells the i-th customer that it has checked out" - binary coordination semaphore per customer
Ice Cream Store: cashier
static void cashier(checkout_t& checkout) {
cout << oslock
<< " Cashier is ready to help customers check out."
<< endl << osunlock;
// We check out all customers
for (size_t i = 0; i < kNumCustomers; i++) {
// Wait for someone to let us know they are ready to check out
checkout.waitingCustomers.wait();
cout << oslock << " Cashier rings up customer " << i << "."
<< endl << osunlock;
// Let the ith customer know that they can leave.
checkout.customers[i].signal();
}
cout << oslock << " Cashier is all done and can go home."
<< endl << osunlock;
}
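The customer/cashier checkout handshake can also be run on its own. In the sketch below, Semaphore is again a minimal stand-in for the course's class, kNumCustomers is given an arbitrary value of 5, printing and browsing are omitted, and the numCheckedOut counter is an addition used only to confirm that every customer gets through.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore (stand-in for the course's semaphore class).
class Semaphore {
 public:
  explicit Semaphore(int initial = 0) : value(initial) {}
  void wait() {
    std::unique_lock<std::mutex> ul(m);
    cv.wait(ul, [this] { return value > 0; });
    value--;
  }
  void signal() {
    std::lock_guard<std::mutex> lg(m);
    value++;
    cv.notify_one();
  }
 private:
  int value;
  std::mutex m;
  std::condition_variable cv;
};

constexpr std::size_t kNumCustomers = 5;  // arbitrary size for this sketch

struct Checkout {
  std::atomic<std::size_t> nextPlaceInLine{0};
  Semaphore customers[kNumCustomers];
  Semaphore waitingCustomers;
};

static void customer(Checkout& checkout, std::atomic<std::size_t>& numCheckedOut) {
  std::size_t place = checkout.nextPlaceInLine++;  // take a unique place in line
  assert(place < kNumCustomers);
  checkout.waitingCustomers.signal();   // tell the cashier someone is waiting
  checkout.customers[place].wait();     // wait for our place to be rung up
  numCheckedOut++;
}

static void cashier(Checkout& checkout) {
  for (std::size_t i = 0; i < kNumCustomers; i++) {
    checkout.waitingCustomers.wait();   // wait until some customer is ready
    checkout.customers[i].signal();     // ring up whoever holds place i
  }
}

// Runs one cashier and kNumCustomers customers; returns how many checked out.
static std::size_t runCheckout() {
  Checkout checkout;
  std::atomic<std::size_t> numCheckedOut{0};
  std::thread cashierThread(cashier, std::ref(checkout));
  std::vector<std::thread> customers;
  for (std::size_t i = 0; i < kNumCustomers; i++) {
    customers.push_back(std::thread(customer, std::ref(checkout),
                                    std::ref(numCheckedOut)));
  }
  for (std::thread& c : customers) c.join();
  cashierThread.join();
  return numCheckedOut;
}
```

One subtlety worth noticing: the cashier may signal customers[i] before the customer holding place i has started waiting on it. That is fine, because a semaphore remembers the signal and the later wait returns immediately.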
Ice Cream Store Takeaways
- There's a lot going on in this simulation!
- Managing all of the threads, locking, waiting, etc., takes planning and foresight.
- This isn't the only way to model the ice cream store
  - How would you modify the model?
  - What would we have to do if we wanted more than one manager?
  - Could we create multiple clerks in main, as well? (sure)
- Example of different threads doing different tasks
- Layered construction - combination of multithreading patterns
- Role playing helps to visualize!
Multithreading Wrap-Up
- Multithreading allows one process to execute multiple tasks at the same time.
- We can spawn threads, which all share the same address space, and each of them can execute a function.
- Race conditions are common when accessing shared data
- We can use concurrency directives like mutexes, condition variables and semaphores to coordinate between threads and prevent race conditions.
- Depending on what tasks a program performs, it may see varying benefits from adding multithreading - e.g. I/O-bound vs. CPU-bound tasks.
Recap
- Recap: Mythbuster
- Example: Ice Cream Store
Next time: Introduction to networking