Principles of Computer Systems
Winter 2021
Stanford University
Computer Science Department
Instructors: Chris Gregg and
Nick Troccoli
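static bool checkHeadersForGzip(iosockstream& ss);  // defined below

static void returnWebpage(iosockstream& ss) {
    // Serve the pre-compressed copy only if the client advertised gzip support.
    bool foundGzip = checkHeadersForGzip(ss);
    string filename = foundGzip ? "words.gz" : "/usr/share/dict/words";

    // Read the file byte for byte: reading line by line and re-adding "\n"
    // can append a stray newline to the binary gzip data.
    ifstream fileToRead(filename, ios::binary);
    string fullText((istreambuf_iterator<char>(fileToRead)), istreambuf_iterator<char>());
    fileToRead.close();

    ss << "HTTP/1.1 200 OK\r\n";
    ss << "Content-Type: text/plain; charset=UTF-8\r\n";
    if (foundGzip) {
        ss << "Content-Encoding: gzip\r\n";
    }
    ss << "Content-Length: " << fullText.size() << "\r\n\r\n";
    // Send exactly Content-Length bytes; a trailing endl would add an extra newline.
    ss << fullText << flush;
}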
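static bool checkHeadersForGzip(iosockstream& ss) {
    string line;
    bool foundGzip = false;
    // Read header lines until the blank line that ends the request headers.
    do {
        getline(ss, line);
        // Note: this match is case-sensitive, although HTTP header names are not.
        if ((line.find("Accept-Encoding:") != string::npos) && (line.find("gzip") != string::npos)) {
            foundGzip = true;
        }
    } while (!line.empty() && line != "\r");
    return foundGzip;
}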
We pre-compressed our file:
gzip -c /usr/share/dict/words > words.gz
$ telnet web.stanford.edu 80
Trying 171.67.215.200...
Connected to web.Stanford.EDU.
Escape character is '^]'.
HEAD /class/cs110/ HTTP/1.1
Host: web.stanford.edu
HTTP/1.1 200 OK
Date: Sun, 07 Mar 2021 20:10:11 GMT
Server: Apache
Accept-Ranges: bytes
Content-Length: 41776
Content-Type: text/html
Connection closed by foreign host.
<!DOCTYPE html>
<html>
<body>
<h1>The form method="post" attribute</h1>
<form action="/action_page.cgi" method="post" target="_blank">
<label for="fname">First name:</label>
<input type="text" id="fname" name="fname"><br><br>
<label for="lname">Last name:</label>
<input type="text" id="lname" name="lname"><br><br>
<input type="submit" value="Submit">
</form>
<p>Click on the submit button, and the form
will be submitted using the POST method.</p>
</body>
</html>
fname=Chris&lname=Gregg
Host: myth57.stanford.edu:13133
Connection: keep-alive
Content-Length: 23
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 11_2_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.192 Safari/537.36
Origin: http://myth57.stanford.edu:13133
Content-Type: application/x-www-form-urlencoded
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Referer: http://myth57.stanford.edu:13133/
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
Cookie: _ga=GA1.2.1964281459.1545270074; _mkto_trk=id:194-OCQ-487&token:_mch-stanford.edu-1554081070487-72175; _gid=GA1.2.61160621.1615039886
fname=Chris&lname=Gregg
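The request body above is a single line of key=value pairs joined by &, and its length (23 bytes) is what the Content-Length header reports. A minimal sketch of how a server might split such a body, assuming the values contain no percent-encoded or '+'-escaped characters; the name parseFormBody is hypothetical and not part of the lecture code:

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: splits "k1=v1&k2=v2" into a key/value map.
// Assumption: no percent-decoding is needed for the keys or values.
std::map<std::string, std::string> parseFormBody(const std::string& body) {
    std::map<std::string, std::string> fields;
    std::istringstream in(body);
    std::string pair;
    while (std::getline(in, pair, '&')) {
        std::string::size_type eq = pair.find('=');
        if (eq == std::string::npos) {
            fields[pair] = "";  // key with no value
        } else {
            fields[pair.substr(0, eq)] = pair.substr(eq + 1);
        }
    }
    return fields;
}
```

For the body shown above, parseFormBody would yield two entries, fname and lname.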
<form action="http://localhost:8000" method="post" enctype="multipart/form-data">
<p><input type="text" name="text" value="text default">
<p><input type="file" name="file1">
<p><input type="file" name="file2">
<p><button type="submit">Submit</button>
</form>
nc -l localhost 8000
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryIVpTQH4nQEVWt9pR
POST / HTTP/1.1
Host: localhost:8000
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryIVpTQH4nQEVWt9pR
Origin: null
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.3 Safari/605.1.15
Content-Length: 493
Accept-Language: en-us
Accept-Encoding: gzip, deflate
------WebKitFormBoundaryIVpTQH4nQEVWt9pR
Content-Disposition: form-data; name="text"
Some text
------WebKitFormBoundaryIVpTQH4nQEVWt9pR
Content-Disposition: form-data; name="file1"; filename="a.txt"
Content-Type: text/plain
This is the a.txt file.
------WebKitFormBoundaryIVpTQH4nQEVWt9pR
Content-Disposition: form-data; name="file2"; filename="file1.html"
Content-Type: text/html
<!DOCTYPE html><title>Content of a.html.</title>
------WebKitFormBoundaryIVpTQH4nQEVWt9pR--
$ ./proxy
Listening for all incoming traffic on port 19419.
telnet, instead:
myth51:$ telnet myth65.stanford.edu 19419
Trying 171.64.15.30...
Connected to myth65.stanford.edu.
Escape character is '^]'.
GET http://api.ipify.org/?format=json HTTP/1.1
Host: api.ipify.org
HTTP/1.0 200 OK
content-length: 23
You're writing a proxy!Connection closed by foreign host.
myth51:$
myth51:$ telnet myth65.stanford.edu 19419
Trying 171.64.15.30...
Connected to myth65.stanford.edu.
Escape character is '^]'.
GET http://api.ipify.org/?format=json HTTP/1.1
Host: api.ipify.org
HTTP/1.1 200 OK
connection: keep-alive
content-length: 21
content-type: application/json
date: Wed, 22 May 2019 16:56:33 GMT
server: Cowboy
vary: Origin
via: 1.1 vegur
{"ip":"172.27.64.82"}Connection closed by foreign host.
myth51:$
ThreadPool to your program, but first, write a sequential version.
GET http://www.cornell.edu/research/ HTTP/1.1
GET /research/ HTTP/1.1
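The two request lines above show the same request in two forms: the first carries the absolute URL that a client sends to a proxy, and the second carries only the path, which is the form an origin server expects. A minimal sketch of that rewrite, assuming a lowercase http:// scheme and no special port handling; toOriginForm is a hypothetical name, not a function from the assignment's HTTPRequest class:

```cpp
#include <string>

// Hypothetical helper: converts "GET http://host/path HTTP/1.1"
// into "GET /path HTTP/1.1". Lines that do not contain an absolute
// http:// URL are returned unchanged.
std::string toOriginForm(const std::string& requestLine) {
    const std::string scheme = "http://";
    std::string::size_type start = requestLine.find(scheme);
    if (start == std::string::npos) return requestLine;
    std::string::size_type hostStart = start + scheme.size();
    // The URL ends at the next space; the path begins at the first '/' after the host.
    std::string::size_type urlEnd = requestLine.find(' ', hostStart);
    std::string::size_type pathStart = requestLine.find('/', hostStart);
    if (pathStart == std::string::npos ||
        (urlEnd != std::string::npos && pathStart > urlEnd)) {
        // The URL has no path component, so use "/" as the path.
        std::string rest = (urlEnd == std::string::npos) ? "" : requestLine.substr(urlEnd);
        return requestLine.substr(0, start) + "/" + rest;
    }
    return requestLine.substr(0, start) + requestLine.substr(pathStart);
}
```

The host name dropped from the first line is not lost, since it still travels in the Host header.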
HTTPRequest class, although you will have to update the operator<< function at a later stage.
x-forwarded-proto and set its value to be http. If x-forwarded-proto is already included in the request header, then simply add it again.
x-forwarded-for and set its value to be the IP address of the requesting client. If x-forwarded-for is already present, then you should extend its value into a comma-separated chain of IP addresses the request has passed through before arriving at your proxy. (The IP address of the machine you’re directly hearing from would be appended to the end.)
header.h/cc files to utilize the functions, e.g., string xForwardedForStr = requestHeader.getValueAsString("x-forwarded-for");
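The x-forwarded-for rule above can be sketched as a small string operation. This is only an illustration: extendForwardedFor is a hypothetical name, and how the existing value is read from and written back to the request header is left to the assignment's own header class.

```cpp
#include <string>

// Hypothetical helper: returns the new x-forwarded-for value.
// `existing` is the current header value ("" if the header is absent);
// `clientIP` is the address of the machine the proxy is directly hearing from.
std::string extendForwardedFor(const std::string& existing, const std::string& clientIP) {
    if (existing.empty()) return clientIP;  // first proxy in the chain
    return existing + "," + clientIP;       // append to the end of the chain
}
```

Appending rather than replacing keeps the full path of the request visible to every later proxy.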
request-handler.h/cc, and some in request.h/cc.
HTTP:// sites you can find!
blocked-domains.txt file that lists domains that should not be let through your proxy. When a server on the blocked list is requested, you should return to the client a status code of 403 and a payload of "Forbidden Content":
strikeset.cc file, e.g.,
if (!strikeset.serverIsAllowed(request.getServer())) { ...
"HTTP/1.0" as the protocol.
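A rough sketch of the blocked-domain check and the 403 reply described above. It does not use the assignment's strike set class: the free function serverIsAllowed below is a hypothetical stand-in, and treating each blocked entry as a regular expression is an assumption about the file format, which is not shown here.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical stand-in for the strike set: one regular expression per
// blocked domain (the real file format is an assumption).
bool serverIsAllowed(const std::vector<std::regex>& blocked, const std::string& server) {
    for (const std::regex& pattern : blocked) {
        if (std::regex_match(server, pattern)) return false;
    }
    return true;
}

// Builds the raw text of a 403 reply with the "Forbidden Content" payload,
// using HTTP/1.0 as the protocol.
std::string forbiddenResponse() {
    const std::string payload = "Forbidden Content";
    return "HTTP/1.0 403 Forbidden\r\n"
           "Content-Length: " + std::to_string(payload.size()) + "\r\n\r\n" + payload;
}
```

A handler would call serverIsAllowed before forwarding and send forbiddenResponse() back to the client when the check fails.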
HTTPRequestHandler to check to see if you've already stored a copy of a request -- if you have, just return it instead of forwarding on! You can use the HTTPCache class to do this check (and to add sites, as well).
cache.shouldCache(request, response)), then you cache it for later.
myth51:$ ./proxy --clear-cache
Clearing the cache... wait for it.... done!
Listening for all incoming traffic on port 19419.
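The check-then-forward flow described above can be sketched with a small in-memory cache. This is not the assignment's HTTPCache: SimpleCache, handle, and the fetch callback are hypothetical names, requests and responses are plain strings, and the shouldCache decision is omitted for brevity.

```cpp
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical in-memory stand-in for a response cache.
class SimpleCache {
public:
    // Returns true and fills `response` if `request` was stored earlier.
    bool lookup(const std::string& request, std::string& response) const {
        auto it = entries.find(request);
        if (it == entries.end()) return false;
        response = it->second;
        return true;
    }
    void store(const std::string& request, const std::string& response) {
        entries[request] = response;
    }
    void clear() { entries.clear(); }  // analogous in spirit to --clear-cache
private:
    std::unordered_map<std::string, std::string> entries;
};

// Serves from the cache when possible; otherwise calls `fetch` (a stand-in
// for forwarding the request to the origin server) and stores the result.
std::string handle(SimpleCache& cache, const std::string& request,
                   const std::function<std::string(const std::string&)>& fetch) {
    std::string response;
    if (cache.lookup(request, response)) return response;
    response = fetch(request);
    cache.store(request, response);
    return response;
}
```

With this structure, a repeated request reaches the origin server only once until the cache is cleared.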
ThreadPool class (we give you a working version in case yours still has bugs)
HTTPProxyScheduler class.
HTTPRequestHandler, which already has a single HTTPStrikeSet and a single HTTPCache. You will need to go back and add synchronization directives (e.g., mutexes) to your prior code to ensure that you don't have race conditions.
mutexes.
size_t requestHash = hashRequest(request);
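One way a request hash can be combined with mutexes is to use the hash to pick one lock out of a fixed array, so that identical requests always contend for the same lock while unrelated requests usually do not. A minimal sketch under stated assumptions: RequestLocks and kNumLocks are hypothetical, the lock count of 16 is arbitrary, and std::hash stands in for the hashRequest function named above.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>

// Assumption: the number of locks is arbitrary and chosen for illustration.
constexpr std::size_t kNumLocks = 16;

class RequestLocks {
public:
    // Maps a request to one of the mutexes. Equal requests always map to
    // the same mutex; different requests may or may not share one.
    std::mutex& lockFor(const std::string& request) {
        std::size_t requestHash = std::hash<std::string>{}(request);
        return locks[requestHash % kNumLocks];
    }
private:
    std::array<std::mutex, kNumLocks> locks;
};
```

A handler thread could then wrap its cache check, fetch, and store in `std::lock_guard<std::mutex> guard(locks.lockFor(request));` so that two threads handling the same request do not both fetch it.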
client-socket.h/cc files have been updated to include thread-safe versions of their functions, so no need to worry about that.
myth63:$ samples/proxy_soln --port 12345
Listening for all incoming traffic on port 12345.
myth65:$ samples/proxy_soln --proxy-server myth63.stanford.edu --proxy-port 12345
Listening for all incoming traffic on port 19419.
Requests will be directed toward another proxy at myth63.stanford.edu:12345.
"x-forwarded-for"
header! You analyze that list to see if you are about to create a cycle in the proxy chain.
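One way to analyze that list: if this proxy's own IP address already appears in the x-forwarded-for value, the request has been here before, and forwarding it again would send it around a loop. A minimal sketch, assuming the value is a comma-separated list of IP addresses with optional spaces; wouldCreateCycle is a hypothetical name:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `myIP` already appears as an entry
// in the comma-separated x-forwarded-for value.
bool wouldCreateCycle(const std::string& forwardedFor, const std::string& myIP) {
    std::istringstream in(forwardedFor);
    std::string entry;
    while (std::getline(in, entry, ',')) {
        // Trim surrounding spaces, since both "a, b" and "a,b" may appear.
        std::string::size_type first = entry.find_first_not_of(' ');
        if (first == std::string::npos) continue;  // skip empty entries
        std::string::size_type last = entry.find_last_not_of(' ');
        if (entry.substr(first, last - first + 1) == myIP) return true;
    }
    return false;
}
```

Comparing whole entries, rather than searching for a substring, avoids confusing 171.64.15.3 with 171.64.15.30.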
"x-forwarded-proto"
and "x-forwarded-for" headers.
run-proxy-farm.py program that can manage a chain of proxies (but it doesn't check for cycles -- you would need to modify the Python code to do that).
https:// sites, you have the option to implement the CONNECT HTTP method, which is actually not that much more work to add. The assignment gives you information about how to support CONNECT.

| file | changes |
|---|---|
| cache.cc | (very minor) |
| cache.h | (very minor) |
| proxy.cc | (very minor) |
| request.cc | (minor) |
| request.h | (minor) |
| request-handler.cc | (major) |
| request-handler.h | (major) |
| scheduler.cc | (minor) |
| scheduler.h | (very minor) |