Simple HTTP Server
Due: March 14
Objective
To gain experience writing a socket server and implementing a real protocol.
Task
For this project, you will write a simple HTTP server which is capable of serving HTML pages to web browsers. You will implement a subset of the HTTP protocol that the web is built on.
HTTP Overview
HTTP is built on requests. A client sends a request to a server and the server handles it. HTTP supports several different types of requests, but the most common is GET, which is the only one we will handle.
A GET request sent from my browser looks like this:
    GET / HTTP/1.1
    Host: 127.0.0.1:8080
    User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Accept-Language: en-US,en;q=0.5
    Accept-Encoding: gzip, deflate
    Connection: keep-alive
    Upgrade-Insecure-Requests: 1
The first line is the most important. It starts with the request method, "GET". Next is the file being requested, in this case "/", which means the root file of the website. Last is the protocol version, "HTTP/1.1".
The rest of the lines are extra information sent to the server. For our purposes, we can ignore all these lines. The request ends with a blank line.
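As a rough illustration of the request format described above, the sketch below pulls the three words out of the first line of a request. The function name parse_request_line and the shortened sample request are made up for this example; they are not part of the assignment.

```python
def parse_request_line(request_text):
    """Return the list of words on the first line of an HTTP request."""
    # splitlines() copes with the "\r\n" line endings that browsers send
    lines = request_text.splitlines()
    first_line = lines[0] if lines else ""
    # split() with no arguments breaks the line apart on whitespace
    return first_line.split()


# A shortened version of the browser request shown earlier (illustrative only)
sample = "GET / HTTP/1.1\r\nHost: 127.0.0.1:8080\r\n\r\n"
method, path, version = parse_request_line(sample)
print(method, path, version)
```

Returning a plain list rather than a fixed tuple means the caller can still inspect malformed requests that have too few or too many words.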
After sending the request, the client waits for a response. The response can look like this:
    HTTP/1.1 200 OK
    Date: Mon, 25 Feb 2019 17:31:23 GMT
    Server: Apache/2.4.29 (Ubuntu)
    Content-Type: text/html; charset=UTF-8
    Content-Length: 6339

    <file contents>
HTTP allows more header fields that the server can send, but these are the most important ones, which we will implement. Note the blank line between the header and file! More information for each one follows:
HTTP Response
The first line begins with "HTTP/1.1". Next is the response code. There are many available, but we will stick with these:
| Number | Response | Meaning |
|--------|----------|---------|
| 200 | OK | The file was returned successfully. |
| 400 | Bad Request | The server could not make sense of the client's request. |
| 403 | Forbidden | The file the client wants exists, but we are not able to read it. |
| 404 | Not Found | The file the client wants does not exist. |
| 405 | Method Not Allowed | The client is using an unsupported method. We will return this if they use anything other than "GET". |
| 500 | Internal Server Error | There was some other sort of error. |
| 505 | HTTP Version Not Supported | The client's HTTP version is not supported by this server. |

Note that in case of errors, you should still return all of the other response fields, and a "file" as well. The file contents are normally a simple HTML file which repeats the error to the user.
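One possible way to keep the response codes listed above in one place is a dictionary, together with a helper that builds the small HTML "file" sent back on errors. The names REASONS and error_page, and the exact HTML, are assumptions made for this sketch; your error page can look however you like.

```python
# Response codes and their names, copied from the list above
REASONS = {
    200: "OK",
    400: "Bad Request",
    403: "Forbidden",
    404: "Not Found",
    405: "Method Not Allowed",
    500: "Internal Server Error",
    505: "HTTP Version Not Supported",
}


def error_page(code):
    """Build a small HTML page that repeats the error to the user."""
    reason = REASONS[code]
    return (
        "<html><head><title>{0} {1}</title></head>"
        "<body><h1>{0} {1}</h1></body></html>".format(code, reason)
    )


print(error_page(404))
```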
Date
The date should be a GMT time and be formatted in this specific way. In Python, we can get a string in the right format like this:
    import datetime
    now = datetime.datetime.now(datetime.timezone.utc)
    date = now.strftime("%a, %d %b %Y %H:%M:%S GMT")
Server
This is just the name of the server. You can call yours whatever you like.
Content-Type
This tells the client what sort of data is coming and how it is encoded. We will stick with the format shown here.
Content-Length
This is the number of bytes in the file that will come. It does not include the header.
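To show how the header fields above fit together, here is a minimal sketch that assembles a complete response. The function name build_response and the default server name "ExampleServer/0.1" are placeholders invented for this example. It relies on two facts: HTTP ends each header line with "\r\n", and Content-Length counts bytes of the encoded file, which can differ from the number of characters.

```python
import datetime


def build_response(status, body_text, server_name="ExampleServer/0.1"):
    """Return the full response (header, blank line, file) as bytes."""
    # Encode first so Content-Length counts bytes, not characters
    body = body_text.encode("utf-8")
    now = datetime.datetime.now(datetime.timezone.utc)
    date = now.strftime("%a, %d %b %Y %H:%M:%S GMT")
    header_lines = [
        "HTTP/1.1 " + status,
        "Date: " + date,
        "Server: " + server_name,
        "Content-Type: text/html; charset=UTF-8",
        "Content-Length: " + str(len(body)),
    ]
    # The extra "\r\n" produces the blank line between header and file
    header = "\r\n".join(header_lines) + "\r\n\r\n"
    return header.encode("utf-8") + body


response = build_response("200 OK", "<p>caf\u00e9</p>")
print(response.decode("utf-8"))
```

In this example the body has 11 characters but 12 bytes, because the accented letter takes two bytes in UTF-8.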
Program Details
The major things your program will need to do are outlined below:
- When running this on your Google cloud VM, make sure you have an open port, and make a note of your internal and external IPs.
- Start by creating a server socket. Bind the internal address at the port you selected, and listen on it for clients to connect.
- The rest of the program is in an infinite loop. Each iteration of the loop, accept one connection and handle it.
- Once you have a connection, read the request from the socket. The whole request will come in as one recv, so use a large size, like 4096.
- Next get the first line of the request. Python's string .split() method is handy for this. The first line is all we care about.
- If the first line does not have three "words" in it, return 400. If the first word of the line is not "GET", return 405. If the last word is not "HTTP/1.1", return 505.
- Look at the file name that the user requested. If it is something like "/page.html", then try to open up "page.html" (removing the /) from the current directory. If it is "/", then try to open up "index.html" as the default site page.
- If the file is not found, return 404. If the file exists but can't be read, return 403. The easiest way to do this is to put the file code in a try block and catch the Python FileNotFoundError and PermissionError exceptions.
- Then read the contents of the file into a string.
- Next, send all of the header information, with a 200 code. The length of the string after it's encoded can be used for the Content-Length. Then send a blank line, and finally the file contents.
- If your program encounters an exception doing any of the above (that has not already been caught), you can return 500 to the client.
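The steps above can be sketched as a few small helpers plus an accept loop. This is only one possible decomposition, not the required design: the names check_request_line, path_to_filename, load_file, and serve are invented here, and serve expects a hypothetical handle function that turns the request text into response bytes. The catch-all 500 case is left out to keep the sketch short.

```python
import socket


def check_request_line(words):
    """Return an error code for a bad first line, or None if it is fine."""
    if len(words) != 3:
        return 400
    if words[0] != "GET":
        return 405
    if words[2] != "HTTP/1.1":
        return 505
    return None


def path_to_filename(path):
    """Map "/" to index.html and "/page.html" to page.html."""
    if path == "/":
        return "index.html"
    return path.lstrip("/")


def load_file(filename):
    """Return (response code, file contents or None)."""
    try:
        with open(filename, encoding="utf-8") as f:
            return 200, f.read()
    except FileNotFoundError:
        return 404, None
    except PermissionError:
        return 403, None


def serve(address, port, handle):
    """Accept one connection per loop iteration and answer it with handle()."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((address, port))
    server.listen()
    while True:
        connection, _ = server.accept()
        try:
            # Assumes the whole request arrives in a single recv
            request = connection.recv(4096).decode("utf-8", errors="replace")
            connection.sendall(handle(request))
        finally:
            connection.close()
```

Keeping the checks in separate functions makes each one easy to try out in the Python interpreter before wiring them into the loop. Note that serve is only defined here, not called, since it never returns.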
Testing
In order to test your server, you can run it on your VM and then connect to it with a regular web browser. Put your external IP address in as the URL. The default HTTP port is 80. If you are using something else (as we probably are), then you can put a colon followed by the port number after the host in the URL bar. This way you can connect with your HTTP server:

Connecting to your server in the URL bar of a browser
If you leave off a file name, then your server should give an index.html file by default. If there is no file with that name in the same directory, it should return 404.
You can use the following index.html file for testing purposes. You can download it and unpack it with the following commands:
    $ wget http://ianfinlayson.net/class/cpsc414/assignments/index.html.gz
    $ gunzip index.html.gz
You should also test specifying another file. To do this, just append the file name in the URL bar, like this:

Specifying a file name in the URL bar of a browser
Make another HTML file or two to test this functionality.
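A browser is the main way to test, but a few lines of Python can also send a request by hand, which makes it easier to try error cases such as a missing file. The sketch below is an optional aid, not part of the assignment; the function names build_get_request and fetch_status_line are invented for it, and the host and port you pass in are whatever your own server uses.

```python
import socket


def build_get_request(host, port, path="/"):
    """Return the text of a minimal GET request, ending with a blank line."""
    return "GET {} HTTP/1.1\r\nHost: {}:{}\r\n\r\n".format(path, host, port)


def fetch_status_line(host, port, path="/"):
    """Send a GET request and return the first line of the response."""
    request = build_get_request(host, port, path)
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(request.encode("utf-8"))
        response = client.recv(4096).decode("utf-8", errors="replace")
    lines = response.splitlines()
    return lines[0] if lines else ""


print(build_get_request("127.0.0.1", 8080, "/page.html"))
```

With your server running, calling fetch_status_line with its address, its port, and the name of a file that does not exist should give back the 404 line.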
General Requirements
When writing your program, also be sure to:
- Put a comment at the top of your program with your name, the name of the program, the purpose of the program and the honor pledge.
- Put descriptive comments in your code to explain how it is working.
- Test your program thoroughly before turning it in. This will involve running multiple instances of it to see that the instances can communicate!
Submitting
To submit your program, email the program file to ifinlay@umw.edu.