Introduction to TCP Connection Establishment for Software Developers | by Rishabh Rawat | Mar, 2022

Transmission Control Protocol (TCP) provides a reliable, connection-oriented, byte-stream, transport layer service. And its implementation is quite interesting.

In this article, we’ll explore how TCP connection establishment works, how it ensures reliability by maintaining a connection state, and whether it fits every use case.

Before getting into the internals of the handshake, let’s have a look at TCP.

A TCP connection is defined to be a four-tuple consisting of two IP addresses and two port numbers. Each pair of IP address and port number represents an endpoint.

This means a single server port can serve many clients at once, as long as each client's combination of IP address and port number is unique.
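To see why the four-tuple is enough to tell connections apart, here is a minimal sketch. The `ConnectionKey` type and the addresses are made up for illustration; a real TCP stack keeps a similar lookup table inside the kernel.

```python
from typing import NamedTuple


class ConnectionKey(NamedTuple):
    """The four-tuple that identifies one TCP connection."""
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int


# Hypothetical server listening on 203.0.113.10:443 with three clients.
connections = {}
for remote_ip, remote_port in [
    ("198.51.100.7", 50000),
    ("198.51.100.7", 50001),  # same client IP, different port
    ("198.51.100.8", 50000),  # same client port, different IP
]:
    key = ConnectionKey("203.0.113.10", 443, remote_ip, remote_port)
    connections[key] = "ESTABLISHED"

# All three share the server endpoint but remain distinct connections.
print(len(connections))  # 3
```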

Transmission Control Protocol (TCP) is one of the transport layer protocols available to us and it is widely used, for good reasons.

To understand why it is even needed, let’s take a look at the protocol stack in the TCP/IP model:

TCP/IP model

The HTTP request coming from the application layer (e.g., your browser) goes through all the layers to get sent across the internet. The internet layer handles sending out the little chunks of data that are also known as IP datagrams. The datagrams act as an envelope for the TCP segments, and the job of the IP layer is to send them across the internet.

Since the IP layer is not aware of the TCP connection, two packets corresponding to the same connection may get sent over different routes. This makes the data transfer over the internet unreliable and gives rise to various issues like duplicate packets, out-of-order packets, packet loss, etc.

TCP handles all these scenarios and delivers the data to the receiving application reliably, without loss, and in order.

Note: The reliability aspect of TCP applies only at the two “ends” of a connection. Packets get shuffled, lost, and duplicated in transit all the time.

A TCP segment carries all the meta-information about the connection in a header. The basic TCP header is 20 bytes (without options); this means at least 20 bytes of overhead on every segment sent.

Let’s understand what constitutes a TCP header:

TCP header

The Acknowledgement Number, Window Size, ECE, and ACK bits carry information about the data flowing in the opposite direction relative to the sender of the segment.

1. Source and Destination port.

2. Sequence Number: This identifies the first byte in the segment sent to receiving TCP.

3. Acknowledgement Number: This contains the next Sequence Number that the sender of the acknowledgement expects to receive; i.e.,

Acknowledgement Number = Sequence Number of the last byte received in order + 1

4. Window Size: This is the number of bytes the receiving TCP is willing to receive. It is a 16-bit field, limiting the window size to 65,535 bytes. We use Window Scaling as a workaround for this bottleneck.

5. TCP Checksum: This is mandatorily sent by the sending TCP and verified by the receiving end in order to detect data corruption.

6. Urgent Pointer: This mechanism in TCP is used to send some specific and urgent data to the other end. It is valid only if the URG bit is set.

7. Other Bit Fields: Mainly two flag bits are used during the connection establishment process:

  • SYN: This bit is turned on in the first segment each end sends, at the start of the connection establishment phase.
  • ACK: It is set whenever the Acknowledgement Number field is valid. In practice, it is on in every segment except the very first SYN (reset segments aside), including the segments sent during connection teardown.

All this information about the connection is stored in the TCP header. Combining this header with the application data gives us the TCP segment, shown below:

TCP segment consists of header and application data
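To make the header layout concrete, here is a minimal sketch that unpacks the fixed 20-byte header with Python's `struct` module. The function name `parse_tcp_header` and the sample values are assumptions for illustration; options and checksum verification are left out.

```python
import struct

# Flag bit masks in the low byte of the 16-bit offset/flags field.
FIN, SYN, RST, PSH, ACK, URG, ECE, CWR = (1 << i for i in range(8))


def parse_tcp_header(segment: bytes) -> dict:
    """Parse the fixed 20-byte part of a TCP header (options are ignored)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # Data offset is the top 4 bits, counted in 32-bit words.
        "header_len": (offset_flags >> 12) * 4,
        "syn": bool(offset_flags & SYN),
        "ack_flag": bool(offset_flags & ACK),
        "urg": bool(offset_flags & URG),
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


# A hand-built SYN header: 5 words long (20 bytes), only the SYN bit set.
sample = struct.pack("!HHIIHHHH", 50000, 443, 1000, 0, (5 << 12) | SYN, 65535, 0, 0)
header = parse_tcp_header(sample)
print(header["src_port"], header["dst_port"], header["syn"], header["ack_flag"])
```

Note that nothing in the parsed result mentions an IP address; the header only knows about ports.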

But up to this stage, we are only aware of the source and destination ports. We also need the source and destination IP addresses in order to uniquely identify a TCP connection (remember?). And that happens in the next layer (i.e., IP layer) during transmission.

TCP/IP datagram contains the IP header on top of the TCP header

The IP layer simply prepends its own header to the TCP segment it receives, making it an IP datagram. These headers are progressively stripped off at the receiving end, in reverse order.

So, the TCP and IP layers together identify a unique TCP connection. And we get the TCP/IP Protocol Suite.

I’d recommend opening this packet trace file by Chris Greer alongside. All the fields discussed below can be found in the TCP layer of each packet. Only the first three packets correspond to the handshake process.

Connection establishment is started by an active opener (usually the client) who wants to connect to a passive opener (usually a server) and a total of three TCP segments are transferred during the process.

The goal of this exercise is to let each end of the connection know that a connection is starting, share some important configurations (aka TCP options), and exchange the Initial Sequence Number (ISN).

Each end picks its own ISN: the active opener when it sends its SYN, and the passive opener when it replies. The unpredictability of the ISN is crucial to the security of the connection. An outsider who can predict the ISN can fool the receiving host and pretend to be the actual sender.
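One published approach to keeping the ISN hard to guess is described in RFC 6528: add a slowly ticking clock to a keyed hash of the four-tuple. The sketch below follows that idea loosely. The choice of SHA-256, the way the secret is handled, and the function name are all assumptions; real TCP stacks use their own hash functions and timers.

```python
import hashlib
import os
import time
from typing import Optional

# Hypothetical per-boot secret; a real stack manages its own key.
SECRET_KEY = os.urandom(16)


def initial_sequence_number(local_ip: str, local_port: int,
                            remote_ip: str, remote_port: int,
                            secret: bytes = SECRET_KEY,
                            now: Optional[float] = None) -> int:
    """Sketch of ISN = clock + hash(four-tuple, secret), modulo 2**32."""
    if now is None:
        now = time.monotonic()
    # A counter that advances once every 4 microseconds.
    clock = int(now * 1_000_000) // 4
    material = f"{local_ip}|{local_port}|{remote_ip}|{remote_port}".encode() + secret
    # Take 32 bits of the keyed hash as a per-connection offset.
    offset = int.from_bytes(hashlib.sha256(material).digest()[:4], "big")
    return (clock + offset) % 2**32


isn = initial_sequence_number("198.51.100.7", 50000, "203.0.113.10", 443)
print(isn)
```

Without the secret, an outsider cannot compute the offset, so observing one connection's ISN says little about another's.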

Let’s take a look at each step more closely:

TCP Three-Way Handshake

[Segment 1]: The client sends a SYN segment

The first TCP segment sent by the active opener (or client) contains the following:

1. Server’s port stored in Destination Port

2. SYN bit set in the TCP Flags

3. ISN of the client stored in Sequence Number

NOTE: The trace file shows a relative value for the Sequence Number to make it human-readable. Its real value is shown on the right in hexadecimal.

4. Some configuration options stored in TCP options (we’ll tackle them next)

[Segment 2]: The server responds with a SYN-ACK segment

The server sends its own SYN segment. It also acknowledges the segment received from the client. It sends a segment with:

1. SYN bit turned on

2. Sequence Number = ISN(server)

3. ACK bit turned on (to acknowledge the segment received from the peer)

4. Acknowledgement Number = ISN(client) + 1

[Segment 3]: The client sends a final ACK segment

Finally, the client acknowledges the SYN received from the server with an ACK. Essentially:

1. It sets the ACK bit to acknowledge the server’s SYN segment

2. Sequence Number = ISN(client) + 1

3. Acknowledgement Number = ISN(server) + 1

If the SYN segment is lost, the sender retransmits it, typically backing off between attempts, until an ACK for it arrives or it gives up after a limited number of tries.
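The sequence and acknowledgement numbers in the three segments above can be traced with a small simulation. The `Segment` class and the ISN values are made up for illustration; no packets are actually sent.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

SEQ_SPACE = 2**32  # sequence numbers are 32 bits and wrap around


@dataclass
class Segment:
    flags: FrozenSet[str]
    seq: int
    ack: Optional[int]  # None when the ACK bit is not set


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the SYN, SYN-ACK and ACK segments of a handshake."""
    syn = Segment(frozenset({"SYN"}), seq=client_isn, ack=None)
    # The SYN consumes one sequence number, so the server acks ISN(client) + 1.
    syn_ack = Segment(frozenset({"SYN", "ACK"}), seq=server_isn,
                      ack=(syn.seq + 1) % SEQ_SPACE)
    # The client continues from the number the server said it expects next.
    ack = Segment(frozenset({"ACK"}), seq=syn_ack.ack,
                  ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(sorted(segment.flags), segment.seq, segment.ack)
```

With these ISNs, the SYN-ACK acknowledges 1001 and the final ACK carries Sequence Number 1001 and Acknowledgement Number 5001.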

There are some additional configuration settings that help in an efficient flow of data in a TCP connection. Some of these options can only be set once during the connection establishment process while others can be used at any point in time during the connection lifespan.

Let’s take a look at some of the most commonly used TCP options.

Maximum Segment Size (MSS)

It is the largest segment that a TCP is willing to receive from its peer and, consequently, the largest size its peer should ever use when sending.

Maximum Segment Size

The important thing to note here is that MSS only counts the application data and not the TCP and IP headers. Maximum Transmission Unit (MTU), on the other hand, looks at the whole packet including the TCP and IP headers.

The MTU, and consequently the MSS, is configurable, but it should stay within the largest payload that the Ethernet frame carrying those packets can hold. If a packet is larger than what a link on its path can carry, it has to go through fragmentation to get delivered.
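The relationship between MTU and MSS is simple arithmetic. Here is a small sketch, assuming headers without options (20 bytes each for IPv4 and TCP, 40 bytes for IPv6); the function name is made up.

```python
def mss_for_mtu(mtu: int, ip_header: int = 20, tcp_header: int = 20) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - ip_header - tcp_header


# Standard Ethernet MTU of 1500 bytes.
print(mss_for_mtu(1500))                # 1460 over IPv4
print(mss_for_mtu(1500, ip_header=40))  # 1440 over IPv6
```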

Window Scaling

Window size tells the peer how much receive buffer is left for that particular connection, and it is updated throughout the connection. The Window Scale option, on the other hand, is set during the connection establishment phase and cannot be changed during the connection lifetime.

The Window Size field in the TCP header is 16 bits, which caps its value at 65,535 bytes (2¹⁶ − 1). On high-latency networks, a window of about 64 KB limits throughput: the sender can have at most one window of unacknowledged data in flight per round trip, so it spends much of each Round Trip Time (RTT) idle, waiting for acknowledgements.

The Window Scale TCP option carries a shift count between 0 and 14. The Window Size value is left-shifted by that count, making the effective window significantly larger, up to about 1 GB (65,535 bytes × 2¹⁴). This is most useful on high-latency, high-bandwidth links.
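The scaling itself is just a left shift of the 16-bit Window Size value by a shift count of at most 14. A quick sketch of the arithmetic (the function name is made up):

```python
MAX_SHIFT = 14  # largest shift count the Window Scale option allows


def effective_window(window_field: int, shift: int) -> int:
    """Receive window in bytes after applying the negotiated shift count."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("Window Size is a 16-bit field")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return window_field << shift


print(effective_window(65535, 0))   # 65535 bytes, no scaling
print(effective_window(65535, 14))  # 1073725440 bytes, about 1 GB
```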

Let’s understand with the help of an illustration here:

Original window size before scaling

Here, the maximum amount of data that the sender can send before receiving any acknowledgement is 64KB. We can observe that the sender is idle after it sends the maximum possible bytes of data and is waiting for an acknowledgement so that it can send more data.

Now, let’s look at the packet transmission after window scaling is introduced:

Increased window size after scaling

After the window scaling is set, the sender is able to send twice the amount of data, which reduces the idle time of the transmitting end and results in better utilization of the link.

Likewise, using a bigger window scaling factor will further increase the effective window size. The bigger the window size, the more data the sending TCP can send without receiving any acknowledgements.
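A rough way to quantify this is that a sender can push at most one window of data per round trip, so throughput is bounded by window size divided by RTT. The sketch below assumes the receive window is the only limit (it ignores congestion control and link capacity), and the 100 ms RTT is a made-up figure.

```python
def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput in bits per second: one window per RTT."""
    return window_bytes * 8 / rtt_seconds


rtt = 0.1  # hypothetical 100 ms round trip
unscaled = max_throughput_bps(65535, rtt)
scaled = max_throughput_bps(65535 << 1, rtt)  # shift count of 1 doubles the window

print(f"{unscaled / 1e6:.2f} Mbit/s")  # 5.24 Mbit/s
print(f"{scaled / 1e6:.2f} Mbit/s")    # 10.49 Mbit/s
```

Doubling the window doubles the bound, which matches the halved idle time in the illustration.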

Selective Acknowledgements (SACK)

Packets sent over the network can get lost, resulting in sudden jumps in the sequence numbers that the receiver sees, and this makes the byte-stream non-continuous. It creates holes in the received data, and without more information the sending TCP doesn’t know exactly which packets need retransmission.

With SACK supported at both ends (negotiated during connection establishment), a receiver is able to communicate the packets it received after the gap. Two fields help in figuring out the missing packets:

1. Acknowledgement Number set to the next byte it expects, i.e., the offset where the gap starts.

2. A SACK block in the TCP options containing the block of data it received after the gap.

The sending TCP compares the Acknowledgement Number (the start of the gap) with the left edge of the SACK block (the first byte received after the gap). This makes it easy for the sending TCP to recognize what block of data it needs to retransmit.

So, for example, if a receiving TCP sends a (duplicate) acknowledgement of 1,000 and the SACK block contains a range of 1100–1500, it is clear that the sending TCP needs to retransmit only bytes 1,000 through 1,099.
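The same reasoning can be written as a small function. This is a sketch only: the name `missing_ranges` is made up, ranges are half-open `(start, end)` byte offsets, and sequence number wraparound is ignored.

```python
from typing import List, Tuple


def missing_ranges(ack: int, sack_blocks: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Byte ranges [start, end) that lie between the ACK and the SACKed data."""
    holes = []
    next_expected = ack
    for left, right in sorted(sack_blocks):
        if left > next_expected:
            # Nothing between next_expected and left has been received.
            holes.append((next_expected, left))
        next_expected = max(next_expected, right)
    return holes


print(missing_ranges(1000, [(1100, 1500)]))                # [(1000, 1100)]
print(missing_ranges(1000, [(1700, 2000), (1100, 1500)]))  # [(1000, 1100), (1500, 1700)]
```

The first call reproduces the example above: only bytes 1,000 up to (but not including) 1,100 need to be sent again.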

NOP

The No-Operation (NOP) option lets TCP pad the options to a multiple of four bytes when their actual size doesn’t line up on its own.

EOL

It indicates the “end of the options” list and shows that no further processing of the options list is required.

If we closely observe what happens in the handshake process, the two parties are keeping track of an offset value (sequence number) that they use to send and receive data. Both ends maintain a connection state.

Keeping track of an offset value allows both parties in the connection to determine if there are any issues with the packets being transmitted and received. It helps in detecting duplicate packets, correcting out-of-order packets, and retransmitting in case of packet loss.

These issues occur because of how the IP layer works. It has no context of a TCP connection. Each router forwards packets based on its own path computations, which means that two packets corresponding to the same TCP connection may take different routes to reach the same destination. This is why TCP has to handle those scenarios.

Here’s an example of mild packet reordering:

Correcting out-of-order packets

As packets corresponding to the same TCP connection can travel over different routes, they may reach the receiving TCP out of order. Since TCP guarantees in-order delivery, it stores the out-of-order packets in its receive buffer and waits for the missing packets to fill the “holes” in the byte stream.

Mild packet reordering can be observed
  1. The receiving end got the P4 packet before it could receive P3. As a result, it keeps P4 in its receive buffer and sends a (duplicate) acknowledgement for the last packet it received in order, P2. Then, it waits for P3 to come in.
  2. Once P3 arrives, it sends the acknowledgement corresponding to the last packet it has successfully received, which is P4.

Note: Since TCP acknowledgements are cumulative in nature, sending an acknowledgement of P4 implies the successful reception of P3 as well.
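The receiver's behaviour in this example can be sketched in a few lines. The `Receiver` class is made up for illustration and assumes segments never partially overlap; each call returns the cumulative acknowledgement number the receiver would send back.

```python
class Receiver:
    """Toy receiver that buffers out-of-order segments and acks cumulatively."""

    def __init__(self, initial_seq: int = 0) -> None:
        self.next_expected = initial_seq
        self.buffer = {}              # seq -> payload waiting for a gap to fill
        self.delivered = bytearray()  # in-order bytes handed to the application

    def receive(self, seq: int, payload: bytes) -> int:
        if seq >= self.next_expected:
            # New data: hold it until everything before it has arrived.
            self.buffer.setdefault(seq, payload)
        # Deliver as much contiguous data as the buffer now allows.
        while self.next_expected in self.buffer:
            data = self.buffer.pop(self.next_expected)
            self.delivered += data
            self.next_expected += len(data)
        return self.next_expected


receiver = Receiver()
# P1, P2, then P4 arrives before P3 (two bytes per packet).
arrivals = [(0, b"AA"), (2, b"BB"), (6, b"DD"), (4, b"CC")]
acks = [receiver.receive(seq, payload) for seq, payload in arrivals]
print(acks)                       # [2, 4, 4, 8]
print(bytes(receiver.delivered))  # b'AABBCCDD'
```

The repeated 4 is the duplicate acknowledgement sent while P3 is missing, and the jump to 8 acknowledges P3 and P4 together.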

Thus, the Sequence Number plays a crucial role in keeping track of lost packets, out-of-order packets, and even duplicate packets. This is why TCP needs connection establishment and a connection state maintained at both ends.

There are several cases where UDP is preferred over TCP. Some factors that contribute to it are:

1. A service can’t afford the overhead of TCP handshakes or the handshake cost is fairly significant relative to the actual data being sent.

2. Occasional packet loss is acceptable (depends on the use-case).

Some examples where UDP is preferred over TCP are multiplayer games, weather data, video streaming, etc.

We’ve learned about the need for a TCP connection, how that helps, and looked at the various configuration options to suit a wide range of requirements.

It’s fascinating to see how much abstraction TCP provides. A developer working on the application level never has to think about it.
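Here is what that abstraction looks like from the application side, as a minimal sketch using Python's standard `socket` module on the loopback interface. The function name `loopback_demo` is made up. The handshake happens inside `create_connection`; the code never touches SYN or ACK bits.

```python
import socket


def loopback_demo():
    """Open a TCP connection to ourselves and send a few bytes over it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        listener.listen(1)
        server_addr = listener.getsockname()

        # The three-way handshake is performed by the operating system here.
        with socket.create_connection(server_addr, timeout=5) as client:
            client_addr = client.getsockname()
            conn, peer = listener.accept()
            with conn:
                conn.settimeout(5)
                client.sendall(b"hello")
                received = conn.recv(1024)
    return client_addr, server_addr, peer, received


client_addr, server_addr, peer, received = loopback_demo()
# The four-tuple of this connection: client endpoint plus server endpoint.
print(client_addr + server_addr)
print(received)
```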

If you want to explore more, a good starting point would be Chris Greer’s playlist on TCP.

If you wanna dive deep into TCP/IP, I’d highly recommend the TCP/IP Illustrated Vol. 1 book. It covers the topic in great depth.

This article was originally published on rrawat.com.

