createClientSocket. For now, view it as a built-in that sets up a bidirectional pipe between a client and a server running on the specified host (e.g. myth64) and bound to the specified port number (e.g. 12345).

int main(int argc, char *argv[]) {
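  int clientSocket = createClientSocket("myth64.stanford.edu", 12345);
  assert(clientSocket >= 0);
  sockbuf sb(clientSocket); // wrap the descriptor in a stream buffer
  iosockstream ss(&sb);     // layer a C++ stream on top of that buffer
  string timeline;
  getline(ss, timeline);    // the server publishes a single line of text
  cout << timeline << endl;
  return 0;
}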
wget is a command line utility that, given a URL, downloads a single document (HTML document, image, video, etc.) and saves a copy of it to the current working directory.

We'll emulate wget's most basic functionality, starting with the main and parseURL functions. parseURL dissects the supplied URL to surface the host and pathname components.

static const string kProtocolPrefix = "http://";
static const string kDefaultPath = "/";
static pair<string, string> parseURL(string url) {
  if (startsWith(url, kProtocolPrefix))
    url = url.substr(kProtocolPrefix.size());
  size_t found = url.find('/');
  if (found == string::npos)
    return make_pair(url, kDefaultPath);
  string host = url.substr(0, found);
  string path = url.substr(found);
  return make_pair(host, path);
}

int main(int argc, char *argv[]) {
  pullContent(parseURL(argv[1]));
  return 0;
}
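
To make the splitting concrete, here is a minimal, standalone sketch of parseURL together with a few checks. The startsWith helper is used but not defined in the listing, so the version below is a hypothetical stand-in, and checkParseURL and www.example.com are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

static const std::string kProtocolPrefix = "http://";
static const std::string kDefaultPath = "/";

// Hypothetical stand-in for the startsWith helper, which the listing uses
// but does not define.
static bool startsWith(const std::string& str, const std::string& prefix) {
  return str.compare(0, prefix.size(), prefix) == 0;
}

// Same logic as the listing's parseURL, with std:: spelled out.
static std::pair<std::string, std::string> parseURL(std::string url) {
  if (startsWith(url, kProtocolPrefix))
    url = url.substr(kProtocolPrefix.size());
  std::size_t found = url.find('/');
  if (found == std::string::npos)
    return std::make_pair(url, kDefaultPath);
  return std::make_pair(url.substr(0, found), url.substr(found));
}

// Exercises three cases: a full URL, a URL with no path, and a URL with
// no protocol prefix.
void checkParseURL() {
  std::pair<std::string, std::string> full = parseURL("http://www.example.com/a/b.html");
  assert(full.first == "www.example.com" && full.second == "/a/b.html");

  std::pair<std::string, std::string> noPath = parseURL("http://www.example.com");
  assert(noPath.first == "www.example.com" && noPath.second == "/");

  std::pair<std::string, std::string> noPrefix = parseURL("www.example.com/docs/");
  assert(noPrefix.first == "www.example.com" && noPrefix.second == "/docs/");
}
```

Only the http:// prefix is recognized, so a URL with any other scheme is split at its first slash and the scheme ends up in the host component.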
Emulation of wget (continued)

pullContent, of course, needs to manage everything, including the networking.

static const unsigned short kDefaultHTTPPort = 80;
static void pullContent(const pair<string, string>& components) {
  int client = createClientSocket(components.first, kDefaultHTTPPort);
  if (client == kClientSocketError) {
    cerr << "Could not connect to host named \"" << components.first << "\"." << endl;
    return;
  }
  sockbuf sb(client);
  iosockstream ss(&sb);
  issueRequest(ss, components.first, components.second);
  skipHeader(ss);
  savePayload(ss, getFileName(components.second));
}
We used the same createClientSocket function for our time-client. This time, we're connecting to real but arbitrary web servers that speak HTTP. issueRequest, skipHeader, and savePayload subdivide the client-server conversation into manageable chunks.

Emulation of wget (continued)

Here is issueRequest, which generates the smallest legitimate HTTP request possible and sends it over to the server.

static void issueRequest(iosockstream& ss, const string& host, const string& path) {
  ss << "GET " << path << " HTTP/1.0\r\n";
  ss << "Host: " << host << "\r\n";
  ss << "\r\n";
  ss.flush();
}
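
To see exactly which bytes issueRequest puts on the wire, the same statements can be pointed at an in-memory stream instead of a socket. This is a sketch only: issueRequestTo, requestText, and checkRequestText are hypothetical names, and it assumes nothing about iosockstream beyond its support for operator<< and flush.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Same three writes as issueRequest, but against any std::ostream so the
// output can be captured and inspected.
void issueRequestTo(std::ostream& os, const std::string& host, const std::string& path) {
  os << "GET " << path << " HTTP/1.0\r\n";
  os << "Host: " << host << "\r\n";
  os << "\r\n";
  os.flush();
}

// Hypothetical helper: returns the full request as a string.
std::string requestText(const std::string& host, const std::string& path) {
  std::ostringstream oss;
  issueRequestTo(oss, host, path);
  return oss.str();
}

// The request is a request line, one header line, and a blank line.
void checkRequestText() {
  assert(requestText("www.example.com", "/index.html") ==
         "GET /index.html HTTP/1.0\r\nHost: www.example.com\r\n\r\n");
}
```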
Each line of the request, including the blank line that ends it, is terminated by '\r' followed by '\n'. The flush is necessary to ensure all character data is pressed over the wire and consumable at the other end. After the flush, the client transitions from supply to ingest mode. Remember, the iosockstream is read/write, because the socket descriptor backing it is bidirectional.

skipHeader reads through and discards all of the HTTP response header lines until it encounters either a blank line or one that contains nothing other than a '\r'. The blank line is, indeed, supposed to be "\r\n", but some servers (often hand-rolled ones) are sloppy, so we treat the '\r' as optional. Recall that getline chews up the '\n', but it won't chew up the '\r'.

static void skipHeader(iosockstream& ss) {
  string line;
  do {
    getline(ss, line);
  } while (!line.empty() && line != "\r");
}
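
A client that wants to keep the header rather than discard it could collect the lines in a map as it reads them. The sketch below is one possible shape, not part of the program above: ResponseHeader, stripCR, readHeader, and checkReadHeader are hypothetical names, and it reads from a generic std::istream so it can be tried on canned text. It treats the trailing '\r' as optional in the same way skipHeader does.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

struct ResponseHeader {
  std::string statusLine;                     // e.g. "HTTP/1.0 200 OK"
  std::map<std::string, std::string> fields;  // header name -> value, as sent
};

// getline consumes the '\n' but leaves any '\r' at the end of the line.
static void stripCR(std::string& line) {
  if (!line.empty() && line.back() == '\r') line.pop_back();
}

// Reads the status line and header fields, stopping after the blank line
// so that the stream is positioned at the first byte of the payload.
ResponseHeader readHeader(std::istream& is) {
  ResponseHeader header;
  std::getline(is, header.statusLine);
  stripCR(header.statusLine);
  std::string line;
  while (std::getline(is, line)) {
    stripCR(line);
    if (line.empty()) break;                          // end of header
    std::size_t colon = line.find(':');
    if (colon == std::string::npos) continue;         // skip malformed lines
    std::size_t start = line.find_first_not_of(" \t", colon + 1);
    header.fields[line.substr(0, colon)] =
        start == std::string::npos ? "" : line.substr(start);
  }
  return header;
}

void checkReadHeader() {
  std::istringstream in("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\nhello");
  ResponseHeader header = readHeader(in);
  assert(header.statusLine == "HTTP/1.0 200 OK");
  assert(header.fields["Content-Type"] == "text/html");
  std::ostringstream rest;
  rest << in.rdbuf();                                 // whatever is left is payload
  assert(rest.str() == "hello");
}
```

HTTP header names are case-insensitive, so a fuller version would likely normalize the case of each name before storing it; this sketch keeps names exactly as the server sent them.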
A fully compliant HTTP client, like the wget we're imitating, would ingest all of the lines of the response header into a data structure and allow it to influence how it treats the payload.

static string getFileName(const string& path) {
  if (path.empty() || path[path.size() - 1] == '/') return "index.html";
  size_t found = path.rfind('/');
  return path.substr(found + 1);
}
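
A few sample paths show how getFileName chooses the name of the saved file. This is a standalone copy of the function with std:: spelled out; checkGetFileName and the sample paths are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Same logic as getFileName in the listing.
static std::string getFileName(const std::string& path) {
  if (path.empty() || path[path.size() - 1] == '/') return "index.html";
  std::size_t found = path.rfind('/');
  return path.substr(found + 1);
}

void checkGetFileName() {
  assert(getFileName("/") == "index.html");              // directory-style paths
  assert(getFileName("/images/") == "index.html");
  assert(getFileName("") == "index.html");
  assert(getFileName("/images/logo.png") == "logo.png"); // last path component
  assert(getFileName("/search?q=cats") == "search?q=cats");
}
```

The last case shows that a query string is kept as part of the file name, since the function only looks for the final '/'.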
static void savePayload(iosockstream& ss, const string& filename) {
  ofstream output(filename, ios::binary); // don't assume it's text
  size_t totalBytes = 0;
  while (!ss.fail()) {
    char buffer[2014] = {'\0'};
    ss.read(buffer, sizeof(buffer));
    totalBytes += ss.gcount();
    output.write(buffer, ss.gcount());
  }
  cout << "Total number of bytes fetched: " << totalBytes << endl;
}
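The loop reads until the stream hits EOF, which happens once the server closes its end of the connection, and we write everything we read.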
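
The read loop relies on gcount to report how many bytes the most recent read actually delivered, which matters for the final, partial chunk. Here is a minimal sketch of the same loop over in-memory streams, with a deliberately tiny buffer so the chunking is easy to follow. copyPayload, checkCopyPayload, and kBufferSize are hypothetical names, not part of the program above.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

static const std::size_t kBufferSize = 8; // tiny on purpose, for illustration

// Same loop shape as savePayload: read a chunk, then write exactly as many
// bytes as the read delivered. A short or empty read sets failbit, which
// ends the loop after the partial chunk has been written.
std::size_t copyPayload(std::istream& in, std::ostream& out) {
  std::size_t totalBytes = 0;
  while (!in.fail()) {
    char buffer[kBufferSize];
    in.read(buffer, sizeof(buffer));
    totalBytes += static_cast<std::size_t>(in.gcount());
    out.write(buffer, in.gcount());
  }
  return totalBytes;
}

void checkCopyPayload() {
  std::string data(20, 'x'); // 20 bytes: two full chunks plus a partial one
  data[3] = '\0';            // embedded zero byte, since payloads may be binary
  std::istringstream in(data);
  std::ostringstream out;
  assert(copyPayload(in, out) == 20);
  assert(out.str() == data);
}
```

When the payload length is an exact multiple of the buffer size, the loop makes one extra pass in which read delivers zero bytes, so nothing extra is written.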