What does TCP do?
The TCP/IP Protocol Suite is the set of communications protocols on which the Internet and
similar networks are built. As its name suggests, Transmission Control Protocol
(TCP) and Internet Protocol (IP) are the most important protocols within the
suite.
IP
in TCP/IP Protocol Suite deals with classic network-layer tasks such as
addressing, datagram packaging and routing, which provide basic internetworking
capabilities. TCP, on the other hand, can be thought of as a user-friendly interface to the capabilities of IP: it fills in the capabilities that IP does not provide. IP is connectionless, unreliable and unacknowledged. Data is sent over an IP internetwork in a "best effort" manner: no connection is established, there is no guarantee that the data reaches the destination, and the sender never knows whether the data got there.
Many applications, however, need to know that the data they send will get to its destination without loss or error. They also want the connection between the two devices to be managed, with problems such as congestion and flow control taken care of automatically by the mechanism managing the connection. Without a dedicated mechanism, each application would need to carry out these tasks itself, which, as you can imagine, would be a serious waste of effort. Fortunately, the OSI Reference Model defines the Transport Layer to handle all these important issues, and TCP is a full-featured transport protocol.
In short, TCP handles connections and provides reliability and flow control. Reliability can be defined as ensuring that the data which is sent actually arrives at its destination and, if it does not, detecting this and resending the data. Data
Flow Control is managing the rate at which data is sent so that it does not
overwhelm the device that is receiving it. TCP is a rather complex protocol
that includes a number of sophisticated functions to ensure that applications
function in the potentially difficult environment of a large internetwork.
Now
let's hit the road and see some important functions of TCP:
TCP Addressing with Ports
TCP/IP is designed to allow
different applications to send and receive data simultaneously by using the
same Internet Protocol (IP) Software on a host device. It is necessary to
multiplex transmitted data from many applications as it is passed down to the
IP Layer in order to achieve simultaneous data sending. Accordingly, as data is received, it is demultiplexed and passed to the appropriate application on the receiving host.
Transport Layer
Protocols TCP and UDP represent a transition point between OSI model hardware-related
layers (Layers 1-3) and the software-related layers (Layers 5-7). In everyday Internet use, most of us run several different applications simultaneously, so a typical TCP/IP host has many processes that want to send data to and receive data from remote hosts. All of this data must be sent through the same interface to the internetwork, the IP Layer. This means that the data from all applications that need TCP/IP communication is aggregated for the IP (Network) Layer. The stream of data to be sent is packaged into segments in the Transport Layer, and the segments are then passed to the IP Layer, where they are packaged as IP datagrams and sent out to the internetwork to be delivered to their destinations. This process is called multiplexing.
A complementary mechanism is responsible for the receipt of datagrams. While the IP Layer multiplexes datagrams from several different application processes on the way out, it also receives many datagrams that come from different remote hosts and are intended for different local processes. Sorting these back out to the right processes is the reverse of multiplexing, and it is called demultiplexing.
Here
comes the question: Think of a series of IP datagrams that are received by a
host. How are they demultiplexed to different application processes in the receiving
host device? Does the Destination Internet Address field in the IP header help? Not at all in this case, because all the datagrams received at the IP Layer are expected to have the same Destination Internet Address (the receiving host's own IP).
So, how can we manage that? Demultiplexing the received data to the application processes is actually achieved in two steps. In the IP Layer, the Protocol field in the IP header indicates where to send the decapsulated data from the received datagram; this is most probably TCP or UDP. Then TCP or UDP must figure out which process to send the data to. To make this possible, an additional addressing element is necessary, one that identifies a more specific location, a software process, within a particular IP address. In TCP/IP, this transport layer address is called a port: TCP/IP Transport Layer addressing is accomplished using ports, and each port number within a particular IP device identifies a software process.
The TCP header (and the UDP header as well) has two addressing fields: Source Port and Destination Port. These are analogous to the Source Internet Address and Destination Internet Address fields in the IP header, but at a higher level of detail. They identify the originating process on the source machine and the destination process on the destination machine, and they are filled in by the TCP (or UDP) software before transmission. The ports are used to direct the data to the correct process on the destination device.
*(Figure: how TCP and UDP ports are used to achieve software multiplexing and demultiplexing.)*
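To make ports concrete, here is a minimal Python sketch (mine, not from any standard) that opens a TCP connection and prints the two port numbers involved; the destination host and port (example.com, 80) are just illustrative.

```python
# Sketch: every TCP connection is identified by a source and a destination port.
# The host and port below are illustrative; any reachable TCP server would do.
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect(("example.com", 80))        # destination port 80 identifies the web server process
    src_ip, src_port = s.getsockname()    # ephemeral source port picked by the local OS
    dst_ip, dst_port = s.getpeername()
    print(f"source      {src_ip}:{src_port}")
    print(f"destination {dst_ip}:{dst_port}")
```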
TCP Data Handling
Looking at the picture from
OSI Reference Model Perspective, when an application process wants to send data
out to the internetwork, the data is grouped into messages. A message can be regarded as a letter in an envelope, containing a piece of information. As the message is passed down to the lower layers, it is encapsulated in each lower layer's header until it is sent out by the Physical Layer.
TCP Segments
TCP, as a handy protocol, is capable of accepting application data of any size and is responsible for dividing large streams of data into segments that the Internet Protocol can handle. This is why we describe TCP as a stream-oriented protocol. IP, on the other hand, is a message-oriented protocol and, truth be told, it badly needs TCP to handle the large streams of data sent by an application process. TCP's stream orientation gives applications serious flexibility, since they don't need to worry about data packaging and can send files or messages of any size. TCP takes care of packaging these bytes into messages called segments.
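As a rough illustration of stream orientation, here is a toy Python sketch (not real TCP code) that cuts an arbitrary application byte stream into MSS-sized pieces; the 1460-byte MSS is just a typical Ethernet value assumed for the example.

```python
# Toy illustration: the application hands over one arbitrary-sized byte stream,
# and "TCP" cuts it into segments no larger than the MSS.
MSS = 1460  # typical Ethernet value, used here only for illustration

def segment_stream(data: bytes, mss: int = MSS) -> list[bytes]:
    """Split an application byte stream into segments of at most `mss` bytes."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]

stream = b"x" * 5000                    # the application doesn't care about packet sizes
segments = segment_stream(stream)
print([len(seg) for seg in segments])   # [1460, 1460, 1460, 620]
```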
Sequence
Numbers and Message Identification
TCP is a reliable transport protocol. This means that TCP needs to keep track of all the data it receives from an application in order to make sure that all of it is received by the destination. Moreover, TCP must make sure the data is received in the order it was sent, and must retransmit any data that gets lost.
Sequence numbers in TCP headers help TCP handle reliability. The data segments built by TCP travel in IP datagrams, and they can be lost or delivered out of order along the way. To detect data loss, the sender numbers every byte it sends and fills the Sequence Number field of the TCP header with the sequence number of the first byte carried in the segment. The receiver acknowledges the data it has received by reporting the sequence number it expects next. If the sender does not see an acknowledgement for some bytes within a given timeout, it assumes that data was lost and retransmits the segments carrying those bytes. To put the arriving segments in order, the receiver collects the data from them, looks at their Sequence Number fields, and reconstructs an exact copy of the stream.
As mentioned above, the segments carry message identifiers, which are in fact sequence numbers, and the receiver uses these identifiers in its acknowledgements. Message identification is important for preserving data integrity and preventing data loss.
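The following toy Python sketch (illustrative only, with simplified "segments" represented as (sequence number, payload) pairs) shows how sequence numbers let a receiver rebuild the original stream even when segments arrive out of order.

```python
# Toy receiver: sequence numbers restore the original byte order.
def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """segments: (sequence number of the first byte, payload), in arrival order."""
    ordered = sorted(segments, key=lambda seg: seg[0])   # put segments back in order
    return b"".join(payload for _, payload in ordered)

# Segments arrive out of order; the sequence numbers tell us how to reorder them.
arrivals = [(0, b"Hel"), (6, b"orld"), (3, b"lo W")]
print(reassemble(arrivals))   # b'Hello World'
```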
Jumping ahead to the acknowledgement figure discussed at the end of this post: at this point Host B sends data segment 5 and waits for Host A to send an "ACK" so it can continue sending the rest of the data. Host A receives the 5th data segment and sends "ACK 7", which means 'I received the previous data segments, now please send me the next 3'. The next step is not shown on the diagram, but it would be Host B sending data segments 7, 8 and 9.
TCP
Windowing
We already know that the sender has to wait some time for an acknowledgement after it sends data to the receiver. Now, let us consider different cases. Suppose the sender has to wait for an acknowledgement after each byte it sends. That would be an awful waste of performance; the communication would be interrupted far too often. OK, suppose the sender has to wait for an acknowledgement after each segment it sends. This seems better, but why should I wait after every single segment I send? My data throughput would still not be as good as I want. Isn't there a better option? What if we get acknowledged only after sending many segments? That sounds much better: data sending would be interrupted less and throughput would improve.
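A quick back-of-the-envelope calculation (with made-up numbers: 1460-byte segments and a 100 ms round-trip time) shows why acknowledging many segments at once pays off:

```python
# Back-of-the-envelope comparison (illustrative numbers only).
segment_size = 1460          # bytes per segment
rtt = 0.100                  # round-trip time in seconds

stop_and_wait = segment_size / rtt          # one segment per round trip: ~14.6 KB/s
windowed = 10 * segment_size / rtt          # ten segments per round trip: ~146 KB/s

print(f"stop-and-wait: {stop_and_wait / 1000:.1f} KB/s")
print(f"window of 10 segments: {windowed / 1000:.1f} KB/s")
```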
A TCP Window is the amount of unacknowledged data a sender can send on a particular connection before it gets an acknowledgement back from the receiver. In other words, it is the amount of data the transmitting machine is allowed to send without receiving an acknowledgement for it. The window size is expressed in bytes, is determined by the receiving device when the connection is established, and can vary later on.
The sending device can send all segments within the TCP window size (as specified in the TCP header) without receiving an ACK, and should start a timer for each of them. The receiving device should acknowledge each segment it receives, indicating the sequence number of the last well-received packet. After receiving the ACK from the receiver, the sender slides the window to the right.
TCP
basically places a memory buffer between the application and the network data
flow. The buffer allows TCP to receive and process data independently of the
upper application. The main purpose of the sliding window is to keep the sender from sending so many packets/segments that it overflows the network or the receiver's buffer.
Window announcements are sent by the receiver to the sender when it acknowledges data receipt; they simply inform the sender of the current window size. If a window size of zero is reported, the sender must wait for a further acknowledgement before sending the next segment of data. If the
receiver reports that the buffer size is larger than the size of a single data
packet/segment, the sender figures out that it can send multiple segments
before waiting for an acknowledgement. Transmitting multiple segments between
acknowledgements allows data to be transferred faster and more
efficiently.
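Here is a toy Python sketch of the idea (a simplified model, not real TCP): the sender keeps transmitting while unacknowledged data fits in the advertised window, then waits for an ACK and slides the window forward.

```python
# Toy sliding-window sender: at most `window` bytes may be unacknowledged at once.
def send_with_window(segments: list[bytes], window: int) -> None:
    assert window >= max(len(seg) for seg in segments), "toy model: window must fit one segment"
    i = 0
    while i < len(segments):
        in_flight = 0
        # Send as many segments as the advertised window allows.
        while i < len(segments) and in_flight + len(segments[i]) <= window:
            print(f"send segment {i} ({len(segments[i])} bytes)")
            in_flight += len(segments[i])
            i += 1
        # Wait for the receiver's ACK, then slide the window forward.
        print("waiting for ACK, sliding window forward")

send_with_window([b"x" * 500] * 5, window=1000)   # sends in bursts of two segments
```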
TCP
Window Size
One important concept worth discussing in detail is the Window Size in the TCP header, which keeps the receiver from being overwhelmed by more data than it can handle at a time. Think of a web server that has to service thousands of clients simultaneously. Such a server would want to tell each client that establishes a connection: "I want to handle only the following number of messages from you at a time". The client would use this send limit to restrict the rate at which it sends messages to the server. The server could adjust the window size depending on its current load and other factors to maximize performance in its communication session with the client. This enhanced system thus provides reliability, efficiency and basic data flow control.
Please note that the TCP Window Size and the Maximum Segment Size (MSS) are different concepts. The MSS is a parameter that specifies, in bytes, how much data a single TCP segment can carry. The TCP Window Size is the parameter that specifies the size of the TCP window, which can span many segments.
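If you want to see both values on a live socket, the sketch below queries them (an assumption: it relies on the TCP_MAXSEG socket option, which is available on Linux but not on every platform, and example.com:80 is just an arbitrary reachable server).

```python
# Probe a live connection: MSS (per-segment data size) vs. receive buffer,
# which bounds the window the receiver can advertise. Linux-oriented sketch.
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect(("example.com", 80))
    mss = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG)   # bytes of data per segment
    rcvbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)  # receive buffer size in bytes
    print(f"MSS: {mss} bytes, receive buffer: {rcvbuf} bytes")
```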
Positive Acknowledgement with Retransmission
As mentioned earlier, IP is unreliable. It works in a "send and forget" manner; from another perspective, it is an open loop system. There is no feedback from the receiver, so the sender never knows whether the transmitted datagram gets to the destination. TCP, as a complementary protocol to IP, provides a closed loop system through the acknowledgement feedback mechanism it introduces. Since IP is unreliable, the message may in fact never reach its destination, or the acknowledgement from the receiver may get lost on its way back to the sender. In such cases, the sender would wait for the acknowledgement forever. To prevent this from happening, when the sender first sends the message, it starts a timer. This timer allows sufficient time for the message to get to the receiver and the acknowledgement to travel back, plus some additional time to allow for possible delays. If the timer expires before the acknowledgement is received, the sender assumes there was a problem and retransmits its original message. This method is called "Positive Acknowledgement with Retransmission" (PAR).
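The sketch below is a toy model of PAR in Python (the lossy `unreliable_send` function is a made-up stand-in for the network, and the fixed timeout is arbitrary): send, start the timer, and retransmit if no ACK arrives before it expires.

```python
# Toy PAR loop: retransmit until an ACK comes back or we give up.
import random
import time

def unreliable_send(segment: bytes) -> bool:
    """Pretend network: the segment or its ACK is lost 30% of the time."""
    return random.random() > 0.3

def send_with_par(segment: bytes, timeout: float = 1.0, max_tries: int = 5) -> bool:
    for attempt in range(1, max_tries + 1):
        if unreliable_send(segment):            # ACK came back before the timer expired
            print(f"attempt {attempt}: ACK received")
            return True
        time.sleep(timeout)                     # retransmission timer expires with no ACK
        print(f"attempt {attempt}: no ACK before timeout, retransmitting")
    return False                                # gave up after max_tries attempts

send_with_par(b"hello")
```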
Acknowledgements, as well as sequence numbers, play an important role in achieving reliability in a TCP connection. Here, I want to show you a figure which depicts how acknowledgement works:
Looking at the figure, we see that the window size is 3 data segments. Host B sends 3 data segments to Host A and they are received in perfect condition, so Host A sends an "ACK 4", acknowledging the 3 data segments and requesting the next 3, which will be 4, 5 and 6. Host B then sends data segments 4, 5 and 6, but 5 gets lost somewhere along the way and Host A doesn't receive it. After a bit of waiting, Host A realises that 5 got lost and sends an "ACK 5" to Host B, indicating that it would like data segment 5 retransmitted. Now you see why this method is called "Positive Acknowledgement with Retransmission".