To gain experience writing a socket server and implementing a real protocol.
For this project, you will write a simple HTTP server which is capable of serving HTML pages to web browsers. You will implement a subset of the HTTP protocol that the web is built on.
HTTP is built on requests. A client sends a request to a server and the server handles it. HTTP supports several different types of requests, but the most common is GET, which is the only one we will handle.
A GET request sent from my browser looks like this:
GET / HTTP/1.1
Host: 127.0.0.1:8080
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:65.0) Gecko/20100101 Firefox/65.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1
The first line is the most important. It starts with the request method, "GET". Next is the file that is being requested, in this case "/", which means the root file of the website. Last is the protocol version, "HTTP/1.1".
The rest of the lines are extra information sent to the server. For our purposes, we can ignore all these lines. The request ends with a blank line.
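As a rough illustration of pulling the three fields out of the first line, here is a minimal sketch. The function name `parse_request_line` is hypothetical, and raising `ValueError` for a malformed line is just one possible design; your server may prefer to handle that case differently.

```python
def parse_request_line(request_text):
    """Split the first line of a request into (method, path, version).

    Raises ValueError if the first line does not have exactly three
    space-separated parts. All remaining header lines are ignored.
    """
    # splitlines() copes with the "\r\n" line endings that HTTP uses.
    lines = request_text.splitlines()
    if not lines:
        raise ValueError("empty request")

    parts = lines[0].split()
    if len(parts) != 3:
        raise ValueError("malformed request line")

    method, path, version = parts
    return method, path, version
```

For the request shown above, this would give the method "GET", the path "/", and the version "HTTP/1.1".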
After sending the request, the client waits for a response. The response can look like this:
HTTP/1.1 200 OK
Date: Mon, 25 Feb 2019 17:31:23 GMT
Server: Apache/2.4.29 (Ubuntu)
Content-Type: text/html; charset=UTF-8
Content-Length: 6339

<file contents>
HTTP allows more header fields that the server can send, but these are the most important ones, which we will implement. Note the blank line between the header and file! More information for each one follows:
HTTP Response
The first line begins with "HTTP/1.1". Next is the response code. Many response codes exist, but we will stick with these:
Number | Response | Meaning |
---|---|---|
200 | OK | The file was returned successfully |
400 | Bad Request | The server could not make sense of the client's request. |
403 | Forbidden | The file the client wants exists, but we are not able to read it. |
404 | Not Found | The file the client wants does not exist. |
405 | Method Not Allowed | The client is using an unsupported method. We will return this if they use anything other than "GET". |
500 | Internal Server Error | There was some other sort of error. |
505 | HTTP Version Not Supported | The client's HTTP version is not supported by this server. |
Note that in case of errors, you should still return all of the other response fields, and a "file" as well. The file contents are normally a simple HTML file which repeats the error to the user.
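One way to keep the codes and their text together, and to produce the small HTML "file" sent with an error, is sketched below. The names `REASONS`, `status_line`, and `error_page` are hypothetical, and the exact HTML is only an example of a page that repeats the error.

```python
# Response codes from the table above, mapped to their response text.
REASONS = {
    200: "OK",
    400: "Bad Request",
    403: "Forbidden",
    404: "Not Found",
    405: "Method Not Allowed",
    500: "Internal Server Error",
    505: "HTTP Version Not Supported",
}


def status_line(code):
    """Return the first line of a response, without the line ending."""
    return f"HTTP/1.1 {code} {REASONS[code]}"


def error_page(code):
    """Return a simple HTML page that repeats the error to the user."""
    reason = REASONS[code]
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{code} {reason}</title></head>\n"
        f"<body><h1>{code} {reason}</h1></body></html>\n"
    )
```

The page returned by `error_page` is a string, so it would still need to be encoded to bytes before being sent.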
Date
The date should be a GMT time, formatted in this specific way. In Python, we can get a string in the right format like this:
import datetime
now = datetime.datetime.now(datetime.timezone.utc)
date = now.strftime("%a, %d %b %Y %H:%M:%S GMT")
Server
This is just the name of the server. You can call yours whatever you like.
Content-Type
This tells the client what sort of data is coming and how it is encoded. We will stick with the format shown here.
Content-Length
This is the number of bytes in the file that will come. It does not include the header.
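Putting the four header fields together might look like the sketch below. The function name `build_response` and the server name `ExampleServer/0.1` are placeholders, not part of the assignment. The sketch assumes the body is already a `bytes` object, so that `len(body)` counts bytes rather than characters.

```python
import datetime


def build_response(code, reason, body, server_name="ExampleServer/0.1"):
    """Return a complete response (header, blank line, body) as bytes."""
    now = datetime.datetime.now(datetime.timezone.utc)
    date = now.strftime("%a, %d %b %Y %H:%M:%S GMT")

    header_lines = [
        f"HTTP/1.1 {code} {reason}",
        f"Date: {date}",
        f"Server: {server_name}",
        "Content-Type: text/html; charset=UTF-8",
        # Length of the body only; the header is not counted.
        f"Content-Length: {len(body)}",
    ]
    # Each header line ends with "\r\n", and an extra "\r\n" makes the
    # blank line that separates the header from the file contents.
    header = "\r\n".join(header_lines) + "\r\n\r\n"
    return header.encode("ascii") + body
```

A call such as `build_response(200, "OK", b"<h1>Hello</h1>")` returns bytes that could be passed to the socket's `sendall` method.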
The major things your program will need to do are outlined below:
... recv, so use a large size, like 4096.
... try block and catch the Python FileNotFoundError and PermissionError exceptions.
In order to test your server, you can run it on your VM and then connect to it with a regular web browser. Put your external IP address in as the URL. The default HTTP port is 80. If you are using something else (as we probably are), then you can put a colon and the port number after the host in the URL bar. This way you can connect with your HTTP server:
Connecting to your server in the URL bar of a browser
If you leave off a file name, then your server should give an index.html file by default. If there is no file with that name in the same directory, it should return 404.
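The default-file rule and the error cases can be sketched as follows. The function name `load_file` and its `root` parameter are hypothetical. The sketch does not guard against paths containing "..", and it maps any other operating-system error to 500, which is one reasonable reading of the status table rather than a requirement.

```python
import os


def load_file(request_path, root="."):
    """Return (status_code, file_bytes) for the requested path."""
    # "/" means no file name was given, so fall back to index.html.
    name = request_path.lstrip("/") or "index.html"
    full_path = os.path.join(root, name)

    try:
        # Read as bytes so the length matches what is sent on the wire.
        with open(full_path, "rb") as f:
            return 200, f.read()
    except FileNotFoundError:
        return 404, b""
    except PermissionError:
        return 403, b""
    except OSError:
        # Any other failure, for example the path being a directory.
        return 500, b""
```

When the status is not 200, the empty body would be replaced by an HTML error page before the response is sent.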
You can use the following index.html file for testing purposes. You can download it and unpack it with the following commands:
$ wget http://ianfinlayson.net/class/cpsc414/assignments/index.html.gz
$ gunzip index.html.gz
You should also test specifying another file. To do this, just append the file name in the URL bar, like this:
Specifying a file name in the URL bar of a browser
Make another HTML file or two to test this functionality.
When writing your program, also be sure to:
To submit your program, email the program file to ifinlay@umw.edu.
Copyright © 2024 Ian Finlayson | Licensed under a Creative Commons BY-NC-SA 4.0 License.