Lab 6: UDP Reliability

Objective

To experiment with the UDP protocol and measure its reliability.

UDP Client/Server Communication

Client/server communication works a bit differently with UDP than with TCP. Instead of calling accept and connect to establish a long-lived communication channel, UDP programs specify the destination address each time they send data. Likewise, they receive from any machine that sends data to them. There is no fixed connection.
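As a rough illustration of the paragraph above, the following sketch sends one datagram between two sockets on the same machine. It is only a local demonstration: the function name udp_round_trip, the loopback address 127.0.0.1, and the 2-second timeout are assumptions chosen for this example and are not part of the lab files.

```python
import socket


def udp_round_trip(message=b"hello"):
    # "server" socket: bind to loopback and let the OS pick a free port
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server_address = server.getsockname()

    # "client" socket: no connect() call, the destination is given on each send
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    # avoid blocking forever if a datagram goes missing
    server.settimeout(2.0)
    client.settimeout(2.0)

    try:
        client.sendto(message, server_address)

        # recvfrom returns the payload and the address of whoever sent it
        data, client_address = server.recvfrom(1024)

        # reply to that address; no accept() was ever needed
        server.sendto(data.upper(), client_address)
        reply, _ = client.recvfrom(1024)
    finally:
        client.close()
        server.close()

    return data, reply


print(udp_round_trip())
```

Because both sockets live on the same machine, this says nothing about reliability over a real network; it only shows the calling pattern.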

The server program below waits for a client to send it a message. It then sends 10,000 digits of $\pi$ to that address, one character per datagram, and finally "-" characters to mark the end of the transmission (repeated many times in case some are dropped).


#!/usr/bin/python3

import socket

# host (internal) IP address and port
HOST = '10.142.0.3'
PORT = 4040

# open the file, read our value of Pi, and close the file again
with open("pi.txt", "r") as f:
    pi = f.read()

# create our socket
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# allow us to reuse an address for restarts
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

# set the socket host and port number up
sock.bind((HOST, PORT))

# wait for a client to send us a request (UDP has no connections to accept)
print("Waiting for a connection...")
request, address = sock.recvfrom(1024)
print("Sending pi to", address)

# send the digits of pi one by one
for digit in pi:
    sock.sendto(digit.encode(), address)

# send lots of the ending character (in case they are dropped)
for i in range(1000):
    sock.sendto(b"-", address)

print("Done!")

# close the socket
sock.close()

Notice that there is no accept call, and that sendto and recvfrom are used in place of the usual send and recv calls.
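The receiving side follows the same pattern. The sketch below is not the provided pi-client.py; it is a minimal guess at what a receive loop for this server could look like. The function name collect_until_marker and the 2-second timeout are assumptions for illustration. The timeout matters because, if every "-" datagram were lost, a plain recvfrom would wait forever.

```python
import socket


def collect_until_marker(sock, end_marker="-", timeout=2.0):
    """Read one-character datagrams until the end marker arrives
    or nothing has been received for `timeout` seconds."""
    sock.settimeout(timeout)
    received = []
    while True:
        try:
            data, _ = sock.recvfrom(1024)
        except socket.timeout:
            # the socket went quiet; assume whatever is missing was lost
            break
        text = data.decode()
        if text == end_marker:
            break
        received.append(text)
    return "".join(received)
```

A client would first send any short request to the server, for example sock.sendto(b"pi", (HOST, PORT)) with HOST and PORT set to the server's address, and then call collect_until_marker(sock) to build up the received string.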

Task

For this lab, you will test how reliable UDP actually is: in a typical UDP transmission, how many packets are lost?

To do this, you will need to run the server above, as well as a client (provided below). You will then write a little code to compare the value of Pi received from the server to the actual value.

You will then run the programs a few times to see how many packets are lost and/or rearranged. You will turn in the results of this analysis.

Details

Start by answering the following question: out of the 10,000 digits of Pi we are sending, how many do you think will be lost or sent out of order? Write down your answer. This is just a hypothesis, so no worries if you end up being way off :)
Download the pi-server.py and pi.txt files onto your VM. You can use wget to download them directly.
Edit the server to use your internal IP address and a port that you have open in the firewall. You may need to open a new port; if so, be sure to specify that it allows UDP traffic. You can review these directions to open a port.
Download the pi-client.py and pi.txt files to your own machine, or the lab computer. Do not run this program on your VM (or another Google Cloud VM), because then it will be on the same machine (or same LAN) as the server. For this test to be meaningful, there must be some distance between the two programs.
Edit the IP address and port of the client to be those of your VM.
You can now start the server, and then run the client. It should be able to receive a value of $\pi$ from the server.
The next step is to measure how close the version we receive is to the real version. We will do this two ways. First, print out the length of the version we receive (called "received" in the code) and the length of the real version from the file (called "pi"). The real length should be 10,002.

The length alone is not enough to tell whether the value is correct, because some digits may have arrived out of order. To measure how close two sequences are, we can use the Levenshtein distance, which counts the edits (insertions, deletions, and substitutions) needed to turn one sequence into the other. The algorithm is difficult to implement, so you can use this Python function to compute it (taken from Stack Overflow):


def MED_character(str1, str2):
    len1 = len(str1)
    len2 = len(str2)

    # if either string is empty, the distance is the length of the other one
    if len1 == 0:
        return len2
    if len2 == 0:
        return len1

    # accumulator[i][j] holds the distance between the first i characters
    # of str1 and the first j characters of str2 (hence the extra row/column)
    accumulator = [[0 for x in range(len2 + 1)] for y in range(len1 + 1)]

    # base cases: turning a prefix into the empty string, and vice versa
    for i in range(len1 + 1):
        accumulator[i][0] = i
    for j in range(len2 + 1):
        accumulator[0][j] = j

    # fill in the accumulator row by row
    for i in range(1, len1 + 1):
        char1 = str1[i - 1]
        for j in range(1, len2 + 1):
            char2 = str2[j - 1]
            cost1 = 0
            if char1 != char2:
                # cost for substitution, counted as a delete plus an insert
                # (the standard Levenshtein distance uses 1 here)
                cost1 = 2
            accumulator[i][j] = min(accumulator[i - 1][j] + 1,
                                    accumulator[i][j - 1] + 1,
                                    accumulator[i - 1][j - 1] + cost1)

    return accumulator[len1][len2]

Call this function on both versions of Pi and print the result. The function takes a while on strings this large (about a minute on my machine).
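If memory is a concern (the full matrix for two strings of about 10,002 characters holds roughly 100 million entries), the same kind of distance can be computed while keeping only two rows at a time. The following is a sketch of that variant, not the function from Stack Overflow. The name edit_distance and the sub_cost parameter are assumptions for illustration: sub_cost=1 gives the standard Levenshtein distance, while sub_cost=2 matches the substitution cost used in the function above.

```python
def edit_distance(a, b, sub_cost=1):
    # previous[j] = distance between the prefix of `a` processed so far
    # and the first j characters of `b`; start with the empty prefix of `a`
    previous = list(range(len(b) + 1))

    for i, char_a in enumerate(a, start=1):
        # distance between the first i characters of `a` and the empty string
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else sub_cost
            current.append(min(previous[j] + 1,         # delete char_a
                               current[j - 1] + 1,      # insert char_b
                               previous[j - 1] + cost))  # match or substitute
        previous = current

    return previous[len(b)]


# small sanity checks on strings short enough to verify by hand
print(edit_distance("3.14159", "3.1459"))   # one digit dropped
print(edit_distance("3.14159", "3.14519"))  # two adjacent digits swapped
```

Trying the function on short strings like these first is a cheap way to confirm it behaves as expected before waiting on the full 10,002-character comparison.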

Now run your server and client a total of three times. Record the length of Pi you received with UDP each time, as well as the Levenshtein distance, which is the number of edits needed to change one string into the other.
Is UDP more or less reliable than you hypothesized?
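For the length comparison in the steps above, a small helper along these lines could collect the numbers to record for each run. This is only a sketch: the name summarize_run and the dictionary keys are assumptions for illustration, and the short strings stand in for the "received" and "pi" variables from the client. Note that a difference in length only counts missing characters; it cannot detect reordering, which is why the distance is recorded separately.

```python
def summarize_run(received, expected):
    # characters that never arrived, judged by length alone
    missing = len(expected) - len(received)
    return {
        "received_length": len(received),
        "expected_length": len(expected),
        "missing": missing,
        "loss_percent": 100.0 * missing / len(expected),
    }


# toy data: one character of the expected value was dropped
summary = summarize_run("3.1459", "3.14159")
for name, value in summary.items():
    print(name, value)
```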

Submitting

When you're finished, email your analysis to ifinlay@umw.edu. It should include your initial hypothesis, the results of the three runs of the client and server, and your answer to number 10.