The network layer allows two machines to communicate even when they are not directly connected. It does this by forwarding packets across a series of routers until they reach the destination.
However, the network layer makes no guarantee that packets actually arrive at the destination. It is possible that packets will be dropped by a router at some point. It is also possible that packets will arrive in a different order than they were sent.
The job of the transport layer is to provide applications with logical communication, giving them the impression that they are directly connected and can send data reliably back and forth:
A logical connection (gold) on top of the network layer.
Of course the transport layer is built on top of the network layer. It gives applications this illusion so they can simply assume that they can communicate without needing to worry about the complexities beneath.
The transport layer can provide the following services:
Unlike the network layer, where IP is the dominant protocol, there are two widely used transport layer protocols: TCP and UDP.
TCP
The Transmission Control Protocol is the more widely used of the two. It provides all of the services of the above list including reliability, and flow and congestion control.
UDP
The User Datagram Protocol is a much more basic protocol which only provides the bare minimum that the transport layer must provide. It only adds multiplexing and demultiplexing, and light error checking. It does not perform any checking for lost or out of order packets, or flow or congestion control.
Since UDP provides so little, you may wonder why anyone would use it. There are a couple of reasons:
When we talked about socket programming, we talked about how sockets can be created in Python with this code:
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
Here socket.AF_INET specifies the network layer protocol of IP, and socket.SOCK_STREAM specifies the transport layer protocol of TCP.
We can create a UDP socket instead with this code:
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
TCP is a connection-oriented protocol. With TCP sockets, the server and client establish a connection with accept and connect respectively. Then they call sendall and recv to communicate.
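As a minimal sketch of the accept, connect, sendall, and recv calls, the following runs both ends of a TCP connection over the loopback interface in a single process. The address 127.0.0.1, the OS-chosen port, and the message are assumptions for illustration; a real server and client would normally be separate programs.

```python
import socket

# Server side: bind to loopback and let the OS pick a free port (port 0).
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_address = server.getsockname()

# Client side: connect establishes the TCP connection with the server.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server_address)

# accept returns a new socket dedicated to this one client.
conn, client_address = server.accept()

# sendall keeps sending until every byte has been handed to TCP.
client.sendall(b"hello over TCP")

# recv may return fewer bytes than requested; a short message like this
# one normally arrives in a single piece.
tcp_received = conn.recv(1024)
print(tcp_received)

conn.close()
client.close()
server.close()
```

Note that neither sendall nor recv takes an address: once the connection exists, each socket already knows who is on the other end.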
With UDP, there is no fixed connection. To send, we use sendto, which takes the destination address in its parameter list. To receive, we use recvfrom, which returns the data along with the address of whoever sent it.
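A minimal sketch of sendto and recvfrom, again using the loopback interface within one process. The address, the OS-chosen port, the two-second timeout, and the messages are assumptions for illustration.

```python
import socket

# Receiver: bind to loopback and let the OS pick a free port (port 0).
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # datagrams can be lost, so never wait forever
receiver_address = receiver.getsockname()

# Sender: no connect call; the destination is given on every sendto.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.settimeout(2.0)
sender.sendto(b"hello over UDP", receiver_address)

# recvfrom returns the data and the address it came from.
data, sender_address = receiver.recvfrom(1024)
print(data, sender_address)

# The returned address is all the receiver needs in order to reply.
receiver.sendto(data.upper(), sender_address)
reply, _ = sender.recvfrom(1024)
print(reply)

sender.close()
receiver.close()
```

The sender never binds explicitly; the operating system assigns it a port on the first sendto, and that is the port the receiver sees in sender_address.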
Also, of course a UDP socket will not perform any of TCP's reliability or flow control functions.
When packets arrive at the transport layer from the network layer, they must be delivered to some application. However, there are usually multiple applications running at once.
For example, say you have an SSH session open, are using a music streaming service, and have a web browser open. When packets come in to your network card, the network layer recognizes that they are addressed to your machine by IP address, but how does it know which application to deliver them to?
The problem of delivering packets to the correct application is called demultiplexing. The problem of sending data with header information to identify the target application is called multiplexing.
Below is a diagram of a UDP packet (also called a datagram):
A UDP Datagram
The length field is the number of bytes in the datagram, counting both the header and the data. The checksum is used for error checking of packets. Other layers provide their own error checking, but UDP (and TCP) does as well. The main reason is that the layers of the Internet are independent of each other. UDP could be used with different network or link layer protocols which don't do error checking. It's also possible that data was corrupted in a router's memory before being forwarded, which would not be caught any other way.
When UDP receives a packet with a bad checksum, it just discards it. UDP does not promise any reliability, but it does not deliver data it knows to be corrupted.
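The header fields and the checksum test can be sketched in a few lines. This is a simplified illustration, not what an operating system actually runs: the port numbers 5000 and 6000 and the payload are made up, and the checksum here covers only the UDP header and data, whereas the real UDP checksum also covers a pseudo-header containing the IP addresses.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's complement of the one's complement sum."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # wrap the carry around
    return ~total & 0xFFFF

payload = b"hi"
length = 8 + len(payload)  # the UDP header is 8 bytes long

# Four 16-bit fields: source port, destination port, length, checksum.
# The checksum is computed with its own field set to zero.
header = struct.pack("!HHHH", 5000, 6000, length, 0)
checksum = internet_checksum(header + payload)
datagram = struct.pack("!HHHH", 5000, 6000, length, checksum) + payload

# Receiving side: pull the fields back out of the first 8 bytes.
src_port, dst_port, parsed_length, parsed_checksum = struct.unpack(
    "!HHHH", datagram[:8]
)

# Summing a datagram that includes a correct checksum yields zero.
is_valid = internet_checksum(datagram) == 0

# Flip one bit to simulate corruption; the check should now fail.
corrupted = bytes([datagram[0] ^ 0x01]) + datagram[1:]
corrupted_is_valid = internet_checksum(corrupted) == 0

print(src_port, dst_port, parsed_length, is_valid, corrupted_is_valid)
```

A receiver following the rule above would deliver datagram to the application and silently discard corrupted.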
The port numbers are used for multiplexing. Whenever sockets are connected, they are assigned ports by the transport layer to uniquely identify them within the host. When packets arrive from the network layer, UDP simply checks their destination port number, and delivers them to the socket which has bound that port.
Multiplexing in TCP works much the same way.
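To see demultiplexing by port number in action, the sketch below binds two UDP sockets on the same host, standing in for two applications, and sends one datagram to each. The helper name make_bound_socket, the loopback address, and the messages are assumptions for illustration.

```python
import socket

def make_bound_socket() -> socket.socket:
    """Create a UDP socket bound to a free loopback port (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    sock.settimeout(2.0)
    return sock

# Two "applications" on the same machine, each with its own port.
app_a = make_bound_socket()
app_b = make_bound_socket()
port_a = app_a.getsockname()[1]
port_b = app_b.getsockname()[1]

# Both datagrams go to the same IP address; only the port differs.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"for A", ("127.0.0.1", port_a))
sender.sendto(b"for B", ("127.0.0.1", port_b))

# Each socket receives only the datagram addressed to its own port.
data_a, _ = app_a.recvfrom(1024)
data_b, _ = app_b.recvfrom(1024)
print(port_a, data_a)
print(port_b, data_b)

for sock in (app_a, app_b, sender):
    sock.close()
```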
While UDP does not provide any reliability, TCP does. Before talking about TCP in detail, we will consider the problem TCP has in building reliable communication on top of an inherently unreliable network.
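What happens if a packet is corrupted? TCP must handle this problem by somehow telling the sender that it must retransmit the packet. A reliable protocol can do this by sending ACK messages to acknowledge that a packet was received successfully, or NAK messages to indicate an error. (TCP itself sends only ACKs; the sender treats a missing or repeated ACK as the sign that something went wrong.)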
However, it is possible that the ACK or NAK packets themselves are corrupted. What should happen then?
This problem is related to something called the "Two Generals Problem". Consider two generals who have armies which are separated by enemy soldiers:
The two armies can only defeat the enemy if they coordinate their attacks. If only one army attacks, it will be defeated. So they must communicate the time of the attack, but the messages may be intercepted by the enemy in the middle.
If general 1 sends an attack time, how can he be sure that general 2 receives it? Well, general 2 can send an acknowledgement back, but how does general 2 know that general 1 received his message?
There is no general solution to this problem, so TCP can't ever really guarantee that both programs communicating are in sync. It is impossible for both parties to be completely sure of a mutual communication like this.
The result is that TCP does guarantee correctness: it will never give an application bad or missing data. However, it does not guarantee progress, meaning that sends or receives will eventually succeed. It's also possible that the other program will crash or go offline, in which case the communication won't work anyway.
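The difference between correctness and progress can be illustrated with a toy stop-and-wait simulation. This is a sketch under simplifying assumptions, not how TCP is implemented: the names send_reliably and DeliveryFailed, the scripted event strings, and the limit of three attempts are all made up for the example, and a lost packet or lost ACK stands in for a timeout at the sender.

```python
class DeliveryFailed(Exception):
    """Raised when the sender gives up waiting for an acknowledgement."""

def send_reliably(messages, events, max_attempts=3):
    """Send messages one at a time, resending each until it is ACKed.

    events scripts the channel, one entry per transmission attempt:
    "ok", "lose_packet", or "lose_ack". Returns what the receiving
    application was given.
    """
    delivered = []    # data handed to the receiving application
    expected_seq = 0  # receiver state: next sequence number it wants
    events = iter(events)
    for seq, message in enumerate(messages):
        for _ in range(max_attempts):
            event = next(events, "ok")
            if event == "lose_packet":
                continue  # nothing arrives; sender times out and resends
            # The packet arrived. Deliver it only if it is not a duplicate.
            if seq == expected_seq:
                delivered.append(message)
                expected_seq += 1
            if event == "lose_ack":
                continue  # sender never hears back, so it resends
            break  # ACK received; move on to the next message
        else:
            raise DeliveryFailed(f"message {seq} was never acknowledged")
    return delivered

# Losses are repaired by retransmission, with no gaps or duplicates.
result = send_reliably(
    ["a", "b", "c"],
    ["lose_packet", "ok", "lose_ack", "ok", "ok"],
)
print(result)

# If the channel never recovers, the sender gives up: no progress.
try:
    send_reliably(["a"], ["lose_packet"] * 3)
    gave_up = False
except DeliveryFailed:
    gave_up = True
print(gave_up)
```

If every ACK is lost instead of every packet, the receiver does get the message but the sender still gives up, which is the Two Generals Problem showing through: the sender cannot tell the two situations apart.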
We have seen what the job of the transport layer is, and looked at the simpler UDP transport layer protocol. UDP provides logical communication and multiplexing, but does not provide reliability or flow control.
Next time we will look at how TCP provides these services.
Copyright © 2024 Ian Finlayson | Licensed under a Creative Commons BY-NC-SA 4.0 License.