Lecture 14: API Servers, Threads, Processes
- An application programming interface (or API) is a set of library functions one can use in order to build a larger piece of software.
- You're familiar with some APIs: #include files, system calls, and ad hoc protocols for driving and communicating with child processes using pipes and signals.
- Very often these libraries reside on other machines, and we interface with them over the Internet.
- I want to implement an API server that's architecturally in line with the way Google, Twitter, Facebook, and LinkedIn architect their own API services.
- This example is inspired by a website called Lexical Word Finder.
- Our implementation assumes we have a standard Unix executable called scrabble-word-finder. The source code for this executable (completely unaware it'll be used in a larger networked application) can be found right here.
- Here are two abbreviated sample runs:
poohbear@myth61:$ ./scrabble-word-finder lexical
ace
// many lines omitted for brevity
lexical
li
lice
lie
lilac
xi
poohbear@myth61:$
poohbear@myth61:$ ./scrabble-word-finder network
en
// many lines omitted for brevity
work
worn
wort
wot
wren
wrote
poohbear@myth61:$
- I want to implement an API service using HTTP to replicate what scrabble-word-finder is capable of.
- We'll expect the API call to come in the form of a URL, and we'll expect that URL to include the rack of letters.
- Assuming our API server is running on myth54:13133, we expect http://myth54:13133/lexical and http://myth54:13133/network to generate the following payloads:
{
time: 0.223399,
cached: false,
possibilities: [
'ace',
// several words omitted
'lexical',
'li',
'lice',
'lie',
'lilac',
'xi'
]
}
{
time: 0.223399,
cached: false,
possibilities: [
'en',
// several words omitted
'work',
'worn',
'wort',
'wot',
'wren',
'wrote'
]
}
- One might think to cannibalize the code within scrabble-word-finder.cc to build the core of scrabble-word-finder-server.cc.
- Reimplementing from scratch is wasteful, time-consuming, and unnecessary. scrabble-word-finder already outputs the primary content we need for our payload. We're packaging the payload as JSON instead of plain text, but we can still tap scrabble-word-finder to generate the collection of formable words.
- Can we implement a server that leverages existing functionality? Of course we can!
- We can just leverage our subprocess_t type and subprocess function from Assignment 3.
struct subprocess_t {
  pid_t pid;
  int supplyfd;
  int ingestfd;
};
subprocess_t subprocess(char *argv[], bool supplyChildInput, bool ingestChildOutput);
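- The Assignment 3 implementation of subprocess isn't reproduced on these slides. As a reminder of the pipe, fork, and execvp choreography behind it, here is a minimal sketch. The names subprocessSketch and drain are hypothetical, and the real assignment code may differ in its error handling and details.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct subprocess_t {
  pid_t pid;
  int supplyfd;  // write end feeding the child's stdin, or -1 if unused
  int ingestfd;  // read end draining the child's stdout, or -1 if unused
};

// Hypothetical stand-in for the Assignment 3 subprocess function.
subprocess_t subprocessSketch(char *argv[], bool supplyChildInput, bool ingestChildOutput) {
  assert(argv != nullptr && argv[0] != nullptr);
  int supply[2] = {-1, -1};  // parent writes supply[1], child reads supply[0]
  int ingest[2] = {-1, -1};  // child writes ingest[1], parent reads ingest[0]
  if (supplyChildInput && pipe(supply) != 0) throw std::runtime_error("pipe failed");
  if (ingestChildOutput && pipe(ingest) != 0) throw std::runtime_error("pipe failed");
  pid_t pid = fork();
  if (pid < 0) throw std::runtime_error("fork failed");
  if (pid == 0) {
    // Child: rewire stdin/stdout to the pipes, then become the requested program.
    if (supplyChildInput) {
      close(supply[1]);
      dup2(supply[0], STDIN_FILENO);
      close(supply[0]);
    }
    if (ingestChildOutput) {
      close(ingest[0]);
      dup2(ingest[1], STDOUT_FILENO);
      close(ingest[1]);
    }
    execvp(argv[0], argv);
    _exit(127);  // only reached if execvp failed
  }
  // Parent: close the ends that belong to the child so EOF propagates correctly.
  if (supplyChildInput) close(supply[0]);
  if (ingestChildOutput) close(ingest[1]);
  return {pid, supply[1], ingest[0]};
}

// Hypothetical helper: reads everything the child prints, then closes the descriptor.
std::string drain(int fd) {
  std::string out;
  char buf[4096];
  ssize_t n;
  while ((n = read(fd, buf, sizeof(buf))) > 0) out.append(buf, n);
  close(fd);
  return out;
}
```

- With this sketch, the server's use case is the (false, true) combination: nothing is fed to the child, and the parent reads the child's standard output until end of file before calling waitpid.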
- Here is the core of the main function implementing our server:
int main(int argc, char *argv[]) {
  unsigned short port = extractPort(argv[1]);
  int server = createServerSocket(port);
  cout << "Server listening on port " << port << "." << endl;
  ThreadPool pool(16);
  map<string, vector<string>> cache;
  mutex cacheLock;
  while (true) {
    struct sockaddr_in address; // used to surface IP address of client
    socklen_t size = sizeof(address); // also used to surface client IP address
    bzero(&address, size);
    int client = accept(server, (struct sockaddr *) &address, &size);
    char str[INET_ADDRSTRLEN];
    cout << "Received a connection request from "
         << inet_ntop(AF_INET, &address.sin_addr, str, INET_ADDRSTRLEN) << "." << endl;
    pool.schedule([client, &cache, &cacheLock] {
      publishScrabbleWords(client, cache, cacheLock);
    });
  }
  return 0; // server never gets here, but not all compilers can tell
}
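- For the curious, the address-printing lines in the loop above can be exercised without a live connection. The sketch below fills in a sockaddr_in by hand, the way accept would on our behalf, and renders it with inet_ntop. The helper names clientAddress and makeAddress are hypothetical, and 192.0.2.7 is just a documentation-range address chosen for illustration.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstring>
#include <netinet/in.h>
#include <string>

// Hypothetical helper: renders the IPv4 address stored in a sockaddr_in as dotted-quad text.
std::string clientAddress(const struct sockaddr_in& address) {
  assert(address.sin_family == AF_INET);
  char str[INET_ADDRSTRLEN];
  const char *text = inet_ntop(AF_INET, &address.sin_addr, str, INET_ADDRSTRLEN);
  return text == nullptr ? "" : text;
}

// Hypothetical helper: builds a sockaddr_in by hand to mimic what accept fills in.
struct sockaddr_in makeAddress(const char *dotted, unsigned short port) {
  struct sockaddr_in address;
  memset(&address, 0, sizeof(address));
  address.sin_family = AF_INET;
  address.sin_port = htons(port);  // ports travel in network byte order
  int rc = inet_pton(AF_INET, dotted, &address.sin_addr);
  assert(rc == 1);  // 1 means the text was a valid IPv4 address
  (void) rc;
  return address;
}
```

- Calling clientAddress(makeAddress("192.0.2.7", 13133)) round-trips the text form through the binary form that the sockets API works with.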
- The second and third arguments to accept are used to surface the IP address of the client.
- Ignore the details around how I use address, size, and the inet_ntop function until next Wednesday, when we'll talk more about them. Right now, it's a neat-to-see!
- Each request is handled by a dedicated worker thread within a ThreadPool of size 16.
- The thread routine called publishScrabbleWords will rely on our subprocess function to marshal the plain text output of scrabble-word-finder into JSON and publish that JSON as the payload of the HTTP response.
- The next several slides include the full implementation of publishScrabbleWords and some of its helper functions.
- Most of the complexity comes from the fact that I've elected to maintain a cache of previously processed letter racks and that I absolutely need to maintain a set of open ingestfds so overlapping calls to subprocess (that is, parallel calls to subprocess) work properly and without race conditions.
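- The ThreadPool class itself comes from the course libraries and isn't shown here. To make the "fixed set of workers pulling thunks off a queue" idea concrete, here is a minimal sketch. MiniThreadPool is a hypothetical name, and the course's ThreadPool may well be structured differently (this version has no dispatcher thread, for example).

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the course's ThreadPool: schedule and wait only.
class MiniThreadPool {
 public:
  explicit MiniThreadPool(size_t numThreads) {
    assert(numThreads > 0);
    for (size_t i = 0; i < numThreads; i++) workers.emplace_back([this] { work(); });
  }
  ~MiniThreadPool() {
    wait();  // let scheduled thunks finish before shutting down
    {
      std::lock_guard<std::mutex> lg(m);
      done = true;
    }
    workAvailable.notify_all();
    for (std::thread& t : workers) t.join();
  }
  // Queues a thunk; some idle worker will pick it up.
  void schedule(std::function<void()> thunk) {
    {
      std::lock_guard<std::mutex> lg(m);
      thunks.push(std::move(thunk));
      outstanding++;
    }
    workAvailable.notify_one();
  }
  // Blocks until every thunk scheduled so far has run to completion.
  void wait() {
    std::unique_lock<std::mutex> ul(m);
    allDone.wait(ul, [this] { return outstanding == 0; });
  }

 private:
  void work() {
    while (true) {
      std::function<void()> thunk;
      {
        std::unique_lock<std::mutex> ul(m);
        workAvailable.wait(ul, [this] { return done || !thunks.empty(); });
        if (thunks.empty()) return;  // shutting down and nothing left to run
        thunk = std::move(thunks.front());
        thunks.pop();
      }
      thunk();  // run outside the lock so workers execute in parallel
      std::lock_guard<std::mutex> lg(m);
      if (--outstanding == 0) allDone.notify_all();
    }
  }
  std::mutex m;
  std::condition_variable workAvailable, allDone;
  std::queue<std::function<void()>> thunks;
  size_t outstanding = 0;
  bool done = false;
  std::vector<std::thread> workers;
};
```

- The important property for our server is that accept never waits on request processing: the main thread only enqueues a thunk, and at most 16 requests are serviced at any one moment.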
- Here is publishScrabbleWords:
static void publishScrabbleWords(int client, map<string, vector<string>>& cache,
                                 mutex& cacheLock) {
  sockbuf sb(client);
  iosockstream ss(&sb);
  string letters = getLetters(ss);
  sort(letters.begin(), letters.end()); // anagrams of the same rack share one cache entry
  skipHeaders(ss);
  struct timeval start;
  gettimeofday(&start, NULL); // start the clock
  bool cached = false;
  vector<string> formableWords;
  {
    // hold the lock while copying, so a competing store for the same rack can't race with this read
    lock_guard<mutex> lg(cacheLock);
    auto found = cache.find(letters);
    cached = found != cache.end();
    if (cached) formableWords = found->second;
  }
  if (!cached) {
    const char *command[] = {"./scrabble-word-finder", letters.c_str(), NULL};
    subprocess_t sp = subprocess(const_cast<char **>(command), false, true);
    pullFormableWords(formableWords, sp.ingestfd); // returns once the child's output reaches EOF
    waitpid(sp.pid, NULL, 0);
    lock_guard<mutex> lg(cacheLock);
    cache[letters] = formableWords;
  }
  struct timeval end, duration;
  gettimeofday(&end, NULL); // stop the clock, server-computation of formableWords is complete
  timersub(&end, &start, &duration);
  double time = duration.tv_sec + duration.tv_usec/1000000.0;
  ostringstream payload;
  constructPayload(formableWords, cached, time, payload);
  sendResponse(ss, payload.str());
}
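- The cache logic in publishScrabbleWords can also be packaged on its own. Here is a minimal sketch, assuming a hypothetical WordCache class and cacheKey helper (neither is part of the server above). Every read and write of the map happens while the mutex is held, and a lookup hands back a copy so callers never touch shared state after the lock is released.

```cpp
#include <algorithm>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical wrapper around the map and mutex pair that main creates.
class WordCache {
 public:
  // On a hit, copies the cached words into `words` and returns true.
  bool lookup(const std::string& key, std::vector<std::string>& words) {
    std::lock_guard<std::mutex> lg(lock);
    auto found = cache.find(key);
    if (found == cache.end()) return false;
    words = found->second;  // copy while the lock is still held
    return true;
  }
  // Records (or overwrites) the words for a key.
  void store(const std::string& key, const std::vector<std::string>& words) {
    std::lock_guard<std::mutex> lg(lock);
    cache[key] = words;
  }

 private:
  std::mutex lock;
  std::map<std::string, std::vector<std::string>> cache;
};

// Hypothetical helper: sorting the rack makes all anagrams map to the same key.
std::string cacheKey(std::string letters) {
  std::sort(letters.begin(), letters.end());
  return letters;
}
```

- Two threads that miss on the same rack at the same time will both run scrabble-word-finder and both store the same result. That costs some redundant work but is harmless, and it keeps the lock from being held while a child process runs.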
- Here are the pullFormableWords and sendResponse helper functions.
static void pullFormableWords(vector<string>& formableWords, int ingestfd) {
  stdio_filebuf<char> inbuf(ingestfd, ios::in);
  istream is(&inbuf);
  while (true) {
    string word;
    getline(is, word);
    if (is.fail()) break;
    formableWords.push_back(word);
  }
}
static void sendResponse(iosockstream& ss, const string& payload) {
  ss << "HTTP/1.1 200 OK\r\n";
  ss << "Content-Type: text/javascript; charset=UTF-8\r\n";
  ss << "Content-Length: " << payload.size() << "\r\n";
  ss << "\r\n";
  ss << payload << flush;
}
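- Both helpers are ordinary stream code at heart, so their logic can be tried out without a child process or a socket. The sketch below restates them over generic streams. The names readWords and writeResponse are hypothetical, and the string streams stand in for the file descriptor and the client connection.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of pullFormableWords: one word per line until end of input.
std::vector<std::string> readWords(std::istream& is) {
  std::vector<std::string> words;
  std::string word;
  while (std::getline(is, word)) words.push_back(word);
  return words;
}

// Hypothetical variant of sendResponse: status line, headers, blank line, payload.
void writeResponse(std::ostream& os, const std::string& payload) {
  os << "HTTP/1.1 200 OK\r\n";
  os << "Content-Type: text/javascript; charset=UTF-8\r\n";
  os << "Content-Length: " << payload.size() << "\r\n";  // byte count of the body only
  os << "\r\n";                                           // blank line ends the headers
  os << payload << std::flush;
}
```

- Writing into an ostringstream and inspecting the result is a quick way to confirm that Content-Length matches the payload size and that exactly one blank line separates the headers from the body.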
- Finally, here are the getLetters and the constructPayload helper functions. I omit the implementation of skipHeaders (you saw it with web-get) and constructJSONArray, which you're welcome to view right here.
static string getLetters(iosockstream& ss) {
  string method, path, protocol;
  ss >> method >> path >> protocol;
  string rest;
  getline(ss, rest);
  size_t pos = path.rfind("/");
  return pos == string::npos ? path : path.substr(pos + 1);
}
static void constructPayload(const vector<string>& formableWords, bool cached,
                             double time, ostringstream& payload) {
  payload << "{" << endl;
  payload << " time: " << time << "," << endl;
  payload << " cached: " << boolalpha << cached << "," << endl;
  payload << " possibilities: " << constructJSONArray(formableWords, 2) << endl;
  payload << "}" << endl;
}
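- Since the two omitted helpers aren't shown, here is one plausible shape for each. The names skipHeadersSketch and jsonArraySketch are hypothetical, the second parameter is assumed to be an indentation width, and the exact whitespace produced by the real constructJSONArray is a guess. Words are assumed to contain no quote characters, so nothing is escaped.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for skipHeaders: consumes header lines up to and
// including the blank line that ends the request header.
void skipHeadersSketch(std::istream& is) {
  std::string line;
  while (std::getline(is, line)) {
    if (line.empty() || line == "\r") break;  // HTTP lines end in \r\n
  }
}

// Hypothetical stand-in for constructJSONArray: one single-quoted word per
// line, with the closing bracket indented by `indent` spaces.
std::string jsonArraySketch(const std::vector<std::string>& words, size_t indent) {
  std::string outer(indent, ' ');
  std::string inner(indent + 2, ' ');
  std::ostringstream oss;
  oss << "[\n";
  for (size_t i = 0; i < words.size(); i++) {
    bool last = (i + 1 == words.size());
    oss << inner << "'" << words[i] << "'" << (last ? "" : ",") << "\n";
  }
  oss << outer << "]";
  return oss.str();
}
```

- Note the order of operations on the request: getLetters consumes the request line first, and only then does the header-skipping loop run, leaving the stream positioned after the blank line.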
- Our scrabble-word-finder-server provides a single API call that resembles the types of API calls afforded by Google, Twitter, or Facebook to access search, tweet, or friend-graph data.
By Jerry Cain