Transport Layer
1. Introduction
Let’s start by positioning the transport layer within the broader context of computer networks. As outlined in the materials, we’re focusing on the transport layer’s role in moving data between applications running on different hosts.
Think of the network layers as a stack: the application layer (where apps like email or the web live) relies on the transport layer to get their data across the network. The transport layer, typically running in the OS kernel, then uses the network layer (like IP) to actually move the data through the network.
Today, we’ll dive into three core topics from the week 4 content: multiplexing-demultiplexing, checksums, and reliable data transfer. By the end of this tutorial, you’ll understand how the transport layer differentiates between multiple applications on a single host, how it detects errors in data, and how it ensures data is delivered correctly—even when the network is unreliable.
The reading guide points to Chapter 3, Sections 3.1 to 3.5.2, which aligns perfectly with what we’ll cover. Let’s get started.
2. Multiplexing & Demultiplexing
First, let’s tackle multiplexing and demultiplexing. These are two sides of the same coin: multiplexing is how the transport layer collects data from multiple applications and packages it for sending, while demultiplexing is how it delivers incoming data to the correct application on the receiving end.
Let’s start with UDP, the connectionless protocol. For UDP demultiplexing, the key is the destination IP address and destination port number. Imagine a server with three applications: one using port 6428, another 5775, and a third 9157. When a UDP segment arrives with a destination port of 5775, the transport layer knows to send that data to the application using port 5775. This is why, if 100 clients communicate with a UDP server, the server only needs one socket—all segments are sorted by their destination port.
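As a rough sketch of what that single server socket looks like in code (the port number and echo-style reply are just illustrative; this uses Python’s standard socket module):

```python
import socket

# One UDP socket, bound to port 5775, serves every client: the OS hands this
# socket any datagram whose destination port is 5775, no matter who sent it.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("", 5775))

while True:
    data, (client_ip, client_port) = server.recvfrom(2048)
    # The source address only tells us where to send the reply;
    # it played no part in selecting this socket.
    server.sendto(b"echo: " + data, (client_ip, client_port))
```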
Now, TCP, the connection-oriented protocol, is more complex. TCP uses a 4-tuple for demultiplexing: source IP, source port, destination IP, and destination port. Why? Because a TCP server (like a web server on port 80) can have multiple simultaneous connections with different clients. For example, if two clients connect to the server’s port 80, their segments will have different source IPs and ports, allowing the server to create a separate socket for each connection. So, with 100 clients, a TCP server has 101 active sockets: 1 “welcoming socket” (on port 80) and 100 connection-specific sockets.
To illustrate, consider a web server (IP B, port 80) receiving segments from two browsers: one on host A (port 9157) and another on host C (port 5775). The server uses the 4-tuples (A, 9157, B, 80) and (C, 5775, B, 80) to direct each segment to the socket for the correct connection, and thus to the server-side process handling that browser.
Let me emphasize: UDP uses destination IP and port because it’s connectionless—no ongoing relationships between senders and receivers. TCP needs the full 4-tuple because it maintains persistent connections, and the extra details are needed to tell those connections apart.
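For contrast, here is a minimal sketch of the TCP side in the same style: one welcoming socket on the well-known port, plus one new socket per accepted connection. The variable names are mine, and binding to port 80 normally requires elevated privileges, so treat the port as illustrative.

```python
import socket

# One "welcoming" socket listens on the well-known port; each accepted
# connection gets its own socket, identified by the full 4-tuple.
welcoming = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
welcoming.bind(("", 80))
welcoming.listen()

connection_sockets = []
while True:
    conn, (client_ip, client_port) = welcoming.accept()
    # Every conn shares the server's port 80, yet the 4-tuple
    # (client_ip, client_port, server_ip, 80) makes each one unique.
    connection_sockets.append(conn)
```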
3. UDP Protocol
UDP, or User Datagram Protocol, is described in the materials as a “no frills,” “bare bones” protocol. Unlike TCP, it offers no guarantees: segments can be lost, delivered out of order, or corrupted. But don’t dismiss it—its simplicity is its strength.
First, why use UDP? It has no handshake, so there’s no delay from setting up a connection (unlike TCP’s three-way handshake). It has a small header, which reduces overhead, and no congestion control, meaning it can send data as fast as the application demands—critical for latency-sensitive apps like video streaming or online gaming.
The UDP header is compact: four 16-bit fields, 8 bytes in total, covering the source port, destination port, length (of the entire segment, including the header), and checksum. The source and destination ports handle demultiplexing, as we discussed, while the checksum is for error detection.
Let’s focus on the UDP checksum. Its goal is to detect flipped bits in the transmitted segment. Here’s how it works: the sender treats the entire segment (the UDP header, the data, and a “pseudo-header” borrowed from the IP layer containing the source and destination IPs) as a sequence of 16-bit integers. It computes the one’s complement sum of these integers, takes the one’s complement of that sum, and stores the result in the checksum field. The receiver adds up the same 16-bit words together with the received checksum; if every bit of the result is a one, the segment is very likely uncorrupted.
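As a concrete sketch, the calculation looks roughly like this; the function name is mine, and in real UDP the bytes passed in would be the concatenated pseudo-header, UDP header (with a zeroed checksum field), and data:

```python
def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:                             # pad odd-length input with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)  # wrap any carry back around
    return ~total & 0xFFFF

# The sender puts this value in the checksum field. The receiver sums the same
# 16-bit words plus the received checksum; an uncorrupted segment yields 0xFFFF.
print(hex(internet_checksum(b"\x0e\x34\x98\x76\x12\x34")))  # illustrative bytes only
```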
But the checksum isn’t perfect. As the materials note, it’s possible for bit errors to cancel each other out, leaving the checksum unchanged. Still, it’s a simple and effective way to catch most errors.
Applications using UDP include streaming multimedia (which can tolerate some loss), DNS (quick request/response), SNMP (network management), and even HTTP/3 (which adds reliability at the application layer). For these apps, speed and low overhead matter more than perfect reliability.
4. Reliable Data Transfer
Reliable data transfer is about ensuring data sent from a sender reaches the receiver correctly, even over an unreliable network that may corrupt, lose, or reorder packets. The materials walk through several hypothetical protocols (rdt1.0 to rdt3.0) and then more efficient pipelined protocols like Go-Back-N and Selective Repeat.
Let’s start with the basics. In a perfectly reliable channel (rdt1.0), no special mechanisms are needed—data is sent and received without errors or loss. But real networks aren’t perfect, so we need protocols that handle problems.
rdt2.0 deals with bit errors. It uses checksums to detect errors and adds feedback from the receiver: ACKs (acknowledgments) for good packets and NAKs (negative acknowledgments) for corrupted ones. If the sender gets a NAK, it retransmits the packet. However, if an ACK or NAK is corrupted, the sender can’t tell if the receiver got the packet—leading to duplicates.
rdt2.1 fixes this by adding sequence numbers (0 and 1) to packets. Now, the receiver can detect duplicates and discard them, while the sender knows to retransmit if it gets a corrupted ACK/NAK. rdt2.2 simplifies further by using only ACKs (no NAKs); the receiver sends an ACK with the sequence number of the last good packet, so a duplicate ACK tells the sender to retransmit. This is the approach TCP uses.
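To see how the alternating 0/1 sequence number and ACK-only feedback work together, here is a small self-contained sketch of the rdt2.2 receiver; the class and method names are mine, and corruption detection is reduced to a boolean flag:

```python
class Rdt22Receiver:
    """rdt2.2 receiver: ACK-only feedback with an alternating 0/1 sequence number."""

    def __init__(self):
        self.expected_seq = 0
        self.delivered = []                 # data handed up to the application

    def receive(self, seq: int, data: bytes, corrupt: bool) -> int:
        """Process one incoming packet and return the sequence number to ACK."""
        if corrupt or seq != self.expected_seq:
            # Corrupted or duplicate packet: re-ACK the last in-order packet,
            # which tells the sender to retransmit.
            return 1 - self.expected_seq
        self.delivered.append(data)         # in-order packet: deliver it
        ack = self.expected_seq
        self.expected_seq ^= 1              # flip between 0 and 1
        return ack


rx = Rdt22Receiver()
print(rx.receive(0, b"hello", corrupt=False))  # -> 0 (delivered, ACK 0)
print(rx.receive(0, b"hello", corrupt=False))  # -> 0 (duplicate: re-ACK 0, not delivered)
print(rx.receive(1, b"world", corrupt=False))  # -> 1 (delivered, ACK 1)
```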
Next, rdt3.0 handles packet loss. The sender starts a timer when it sends a packet; if no ACK arrives before the timer expires, it retransmits the packet. This handles loss, but rdt3.0 is a “stop-and-wait” protocol—it sends one packet, then waits for an ACK before sending the next. This is inefficient: on a 1 Gbps link with a 15 ms one-way propagation delay, sending 1,000-byte (8,000-bit) packets, the sender is busy only about 0.027% of the time.
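The arithmetic behind that utilization figure is short enough to show; the 8,000-bit packet size is the usual textbook assumption:

```python
# Stop-and-wait sender utilization with the numbers above.
L = 8_000        # packet size in bits (a 1,000-byte packet)
R = 1e9          # link rate: 1 Gbps
RTT = 0.030      # round-trip time: 2 * 15 ms, in seconds

transmit_time = L / R                             # 8 microseconds on the wire
utilization = transmit_time / (RTT + transmit_time)
print(f"{utilization:.5%}")                       # ~0.02666%, i.e. about 0.027%
```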
Pipelined protocols solve this by allowing multiple “in-flight” packets (sent but not yet acknowledged).
- Go-Back-N (GBN): The sender has a window of up to N unacknowledged packets. It uses cumulative ACKs (ACK(n) means all packets up to n are received). If a packet is lost, the sender retransmits it and all subsequent packets in the window. The receiver discards out-of-order packets and re-ACKs the last in-order packet.
- Selective Repeat (SR): The sender maintains a timer for each unacknowledged packet and retransmits only lost ones. The receiver buffers out-of-order packets and sends individual ACKs for each good packet. To avoid confusion between new and duplicate packets, the window size must be at most half the sequence number space.
Both improve efficiency, but SR is more bandwidth-friendly (only retransmits lost packets) while GBN is simpler to implement.
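To make GBN’s sender-side rules concrete, here is a minimal bookkeeping sketch; the names are mine, actual transmission is reduced to a log, the timer is an explicit on_timeout call, and sequence-number wrap-around is ignored:

```python
class GbnSender:
    """Go-Back-N sender bookkeeping: one window, one timer, cumulative ACKs."""

    def __init__(self, window_size: int):
        self.N = window_size
        self.base = 0              # oldest unacknowledged sequence number
        self.next_seq = 0          # next sequence number to assign
        self.unacked = {}          # seq -> data, kept until acknowledged
        self.transmissions = []    # every (re)transmission, for illustration

    def send(self, data: bytes) -> bool:
        """Transmit data if the window has room; otherwise refuse it."""
        if self.next_seq >= self.base + self.N:
            return False
        self.unacked[self.next_seq] = data
        self.transmissions.append(self.next_seq)   # stand-in for putting it on the wire
        self.next_seq += 1
        return True

    def on_ack(self, ack: int) -> None:
        """Cumulative ACK: everything up to and including `ack` is confirmed."""
        for seq in range(self.base, ack + 1):
            self.unacked.pop(seq, None)
        self.base = max(self.base, ack + 1)

    def on_timeout(self) -> None:
        """One timer for the oldest unACKed packet: on expiry, resend all in flight."""
        for seq in range(self.base, self.next_seq):
            self.transmissions.append(seq)


sender = GbnSender(window_size=4)
for word in (b"a", b"b", b"c"):
    sender.send(word)
sender.on_ack(0)              # packets 1 and 2 are still outstanding
sender.on_timeout()           # retransmits 1 and 2, but not packet 0
print(sender.transmissions)   # [0, 1, 2, 1, 2]
```

The single retransmit loop in on_timeout is exactly what makes GBN simple and, on lossy links, wasteful compared with SR.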
5. Quiz & Discussion
- UDP Sockets: 100 clients communicate with a UDP web server. How many active sockets are at the server and each client?
  Answer: Server = 1, each client = 1. UDP uses one server socket, with demultiplexing via port numbers.
- TCP Sockets: 100 clients communicate with a TCP web server. Active sockets?
  Answer: Server = 101 (1 welcoming socket + 100 connection sockets), each client = 1. TCP needs a separate socket per connection.
- TCP Server Ports: Do all TCP server sockets for 100 clients use the same port?
  Answer: Yes. They all use the server’s well-known port (e.g., 80); the 4-tuple (source IP/port, destination IP/port) distinguishes clients.
- RDT with Corruption Only: What’s needed?
  Answer: Checksums (detect errors), ACKs/NAKs (feedback), sequence numbers (detect duplicates).
- RDT with Loss + Corruption: What’s needed?
  Answer: Checksums, ACKs, sequence numbers, and timeouts (to handle loss).
- GBN vs. SR: Which is false?
  Answer: “GBN maintains a separate timer for each outstanding packet.” GBN uses one timer for the oldest unacknowledged packet.
- GBN ACK Behavior: The receiver has ACKed up to 24, then gets 27 and 28. ACKs sent?
  Answer: GBN re-sends ACK(24) for each out-of-order arrival (cumulative); SR sends ACK(27) and ACK(28) (individual) and buffers the packets.
- UDP Checksum: What does it include?
  Answer: UDP header, data, and the IP pseudo-header (source/destination IPs, protocol, length).
- Pipelining Benefit: Why is it better than stop-and-wait?
  Answer: It allows multiple in-flight packets, increasing sender utilization.
- Selective Repeat Window: To avoid ambiguity, window size must be?
  Answer: ≤ ½ the sequence number space.
Let’s discuss any questions you have—these concepts are foundational for understanding how TCP works, which we’ll cover next week.
Wrap-Up
Today, we covered how the transport layer multiplexes and demultiplexes data (UDP with ports, TCP with 4-tuples), the simplicity and use cases of UDP, and how reliable data transfer protocols evolve from basic error handling to efficient pipelined approaches like GBN and SR. These concepts are key to understanding how networks ensure data gets where it needs to go—whether with speed (UDP) or reliability (TCP).