Science Fair Project Encyclopedia
Transmission Control Protocol
The Transmission Control Protocol (TCP) is one of the core protocols of the Internet protocol suite. Using TCP, programs on networked computers can create connections to one another, over which they can send data. The protocol guarantees that data sent by one endpoint will be received in the same order by the other, and without any pieces missing. It also distinguishes data for different applications (such as a Web server and an email server) on the same computer.
In the Internet protocol suite, TCP is the intermediate layer between the Internet Protocol below it and an application above it. Applications most often need reliable pipe-like connections to each other, whereas the Internet Protocol does not provide such streams, but rather only unreliable packets. TCP performs the task of the transport layer in the simplified OSI model of computer networks.
Applications send streams of 8-bit bytes to TCP for delivery through the network, and TCP divides the byte stream into appropriately sized segments (usually delineated by the maximum transmission unit (MTU) size of the data link layer of the network the computer is attached to). TCP then passes the resulting segments to the Internet Protocol for delivery through an internet to the TCP module of the entity at the other end. TCP makes sure that no data is lost by giving each byte a sequence number, which is also used to make sure that the data is delivered to the entity at the other end in the correct order. The TCP module at the far end sends back an acknowledgement for bytes that have been successfully received; a timer at the sending TCP causes a timeout if an acknowledgement is not received within a reasonable round-trip time, and the (presumably lost) data is then retransmitted. TCP checks that no bytes are damaged by using a checksum; one is computed at the sender for each segment before it is sent, and checked at the receiver.
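The segmentation and ordering described above can be illustrated with a minimal sketch. The function names, the `mss` parameter and the choice to number the first data byte with the ISN itself are assumptions made for illustration; a real TCP stack also consumes one sequence number for the SYN and handles retransmission, which is omitted here.

```python
def segment_stream(data: bytes, isn: int, mss: int):
    """Split a byte stream into (sequence_number, payload) segments.

    Hypothetical helper: each segment is labelled with the sequence
    number of its first byte, counted from the initial sequence number.
    """
    segments = []
    for offset in range(0, len(data), mss):
        seq = (isn + offset) % 2**32  # sequence numbers are 32-bit and wrap
        segments.append((seq, data[offset:offset + mss]))
    return segments


def reassemble(segments, isn: int) -> bytes:
    """Put segments back in stream order using their distance from the ISN."""
    ordered = sorted(segments, key=lambda s: (s[0] - isn) % 2**32)
    return b"".join(payload for _, payload in ordered)
```

Because every segment carries the position of its first byte, the receiver can rebuild the stream even when the network delivers the segments out of order.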
Protocol operation in detail
TCP connections have three phases: connection establishment, data transfer and connection termination. A 3-way handshake is used to establish a connection, and a four-way handshake is used to tear it down. During connection establishment, parameters such as sequence numbers are initialized to help ensure ordered delivery and robustness.
Connection establishment (3-way handshake)
While it is possible for a pair of end hosts to initiate a connection between themselves simultaneously, typically one end opens a socket and listens passively for a connection from the other. This is commonly referred to as a passive open, and it designates the server-side of a connection. The client-side of a connection initiates an active open by sending an initial SYN segment to the server as part of the 3-way handshake. The server-side should respond to a valid SYN request with a SYN/ACK. Finally, the client-side should respond to the server with an ACK, completing the 3-way handshake and connection establishment phase.
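From an application's point of view the handshake is carried out by the operating system. The following sketch, which assumes Python's standard `socket` module and a loopback address, shows a passive open and an active open; the function names are illustrative only.

```python
import socket
import threading


def run_echo_once(listener: socket.socket) -> None:
    """Accept one connection and echo back what the client sends."""
    # accept() returns a connection whose 3-way handshake the kernel
    # has already completed.
    conn, _addr = listener.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def demo_handshake() -> bytes:
    # Passive open: bind to an OS-chosen loopback port and listen.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    server = threading.Thread(target=run_echo_once, args=(listener,))
    server.start()

    # Active open: connecting makes the kernel send the SYN, wait for
    # the SYN/ACK and reply with the final ACK.
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(b"hello")
        reply = client.recv(1024)

    server.join()
    listener.close()
    return reply
```

The SYN, SYN/ACK and ACK segments themselves are not visible to this code; they can be observed with a packet capture tool while the sketch runs.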
During the data transfer phase, a number of key mechanisms determine TCP's reliability and robustness. These include using sequence numbers for ordering received TCP segments and detecting duplicate data, checksums for segment error detection, and acknowledgements and timers for detecting and adjusting to loss or delay.
During the TCP connection establishment phase, initial sequence numbers (ISNs) are exchanged between the two TCP speakers. These sequence numbers identify (and count) the application data bytes in the byte stream. Every TCP segment carries a pair of sequence numbers, referred to as the sequence number and the acknowledgement number. A TCP sender refers to its own sequence number simply as the sequence number, and to the receiver's sequence number as the acknowledgement number. To maintain reliability, a receiver acknowledges TCP segment data by indicating that it has received all contiguous bytes up to some location in the stream. An enhancement to TCP, called selective acknowledgement (SACK), allows a TCP receiver to acknowledge out-of-order blocks.
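The cumulative acknowledgement rule can be sketched as follows. The function name and the (sequence number, length) representation of received segments are assumptions for illustration, and sequence number wraparound is ignored for brevity.

```python
def cumulative_ack(next_expected, received):
    """Return the acknowledgement number after receiving some segments.

    `received` is a list of (seq, length) pairs. The acknowledgement
    number is the next byte the receiver expects, so it advances only
    over contiguous data and stops at the first gap.
    """
    for seq, length in sorted(received):
        if seq > next_expected:
            break  # gap: the bytes before `seq` are still missing
        next_expected = max(next_expected, seq + length)
    return next_expected
```

With a segment missing in the middle, the acknowledgement number stays at the start of the gap however much later data arrives; reporting those later blocks is the job of SACK.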
Through the use of sequence and acknowledgement numbers, TCP can properly deliver received segments in the correct byte stream order to a receiving application. Sequence numbers are 32-bit, unsigned numbers, which wrap to zero on the next byte in the stream after 2^32 - 1. One key to maintaining robustness and security for TCP connections is in the selection of the ISN.
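Because sequence numbers wrap, they have to be advanced and compared modulo 2^32. A minimal sketch, with helper names chosen here for illustration and with "earlier" taken to mean a forward distance of less than half the number space:

```python
MOD = 2**32  # size of the 32-bit sequence number space


def seq_add(seq: int, n: int) -> int:
    """Advance a sequence number by n bytes, wrapping past 2**32 - 1."""
    return (seq + n) % MOD


def seq_lt(a: int, b: int) -> bool:
    """True if a comes before b, allowing for wraparound.

    a is treated as earlier when the forward distance from a to b is
    non-zero and less than half the sequence space.
    """
    return a != b and (b - a) % MOD < 2**31
```

A plain integer comparison would wrongly place 5 before 4,294,967,286 even when the stream has just wrapped and 5 is in fact the later byte.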
A 16-bit checksum, consisting of the one's complement of the one's complement sum of the contents of the TCP segment header and data, is computed by a sender, and included in a segment transmission. (The one's complement sum is used because the end-around carry of that method means that it can be computed in any multiple of that length - 16-bit, 32-bit, 64-bit, etc - and the result, once folded, will be the same.) The TCP receiver recomputes the checksum on the received TCP header and data. The complement was used (above) so that the receiver does not have to zero the checksum field, after saving the checksum value elsewhere; instead, the receiver simply computes the one's complement sum with the checksum in situ, and the result should be -0. If so, the segment is assumed to have arrived intact and without error.
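The checksum arithmetic can be sketched in a few lines. The function names are illustrative, the input is padded with a zero byte when its length is odd, and the pseudo header is left out of this example.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum of data, with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data: bytes) -> int:
    """One's complement of the one's complement sum."""
    return ~ones_complement_sum(data) & 0xFFFF
```

A receiver that sums the data together with the transmitted checksum obtains 0xFFFF, the all-ones pattern that represents -0 in one's complement arithmetic, when the segment is intact.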
Note that the TCP checksum also covers a 96-bit pseudo header containing the Source Address, the Destination Address, the Protocol, and the TCP length. This provides protection against misrouted segments.
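For IPv4, the pseudo header can be built as in the sketch below. The function name is hypothetical; the layout assumed is the one in RFC 793, where a zero byte pads the 8-bit Protocol field so that the four listed fields fill 96 bits.

```python
import socket
import struct


def tcp_pseudo_header(src_ip: str, dst_ip: str, tcp_length: int) -> bytes:
    """Build the 12-byte (96-bit) IPv4 pseudo header for the checksum.

    Layout: source address (32 bits), destination address (32 bits),
    zero (8 bits), protocol (8 bits), TCP length (16 bits).
    """
    return struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        0,                   # padding byte, always zero
        socket.IPPROTO_TCP,  # protocol number 6
        tcp_length,          # TCP header plus data, in bytes
    )
```

The pseudo header is prepended to the segment only for the purpose of computing the checksum; it is never transmitted. Since the addresses are part of the sum, a segment delivered to the wrong host fails verification.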
The TCP checksum is a quite weak check by modern standards. Data link layers with high bit error rates may require additional link error correction/detection capabilities. If TCP were to be redesigned today, it would most probably have a 32-bit cyclic redundancy check specified as an error check instead of the current checksum. The weak checksum is partially compensated for by the common use of a CRC or better integrity check at layer 2, below both TCP and IP, such as is used in PPP or the Ethernet frame. However, this does not mean that the 16-bit TCP checksum is redundant: remarkably, surveys of Internet traffic have shown that software and hardware errors that corrupt packets between CRC-protected hops are common, and that the end-to-end 16-bit TCP checksum catches most of these simple errors. This is the end-to-end principle at work.
Acknowledgements for data sent, or the lack of acknowledgements, are used by senders to implicitly interpret network conditions between the TCP sender and receiver. Coupled with timers, TCP senders and receivers can alter the behavior of the flow of data. This is more generally referred to as flow control, congestion control and/or congestion avoidance. TCP uses a number of mechanisms to achieve high performance and avoid congesting the network (that is, to avoid sending data faster than either the network or the host on the other end can use it). These mechanisms include the use of a sliding window, the slow-start algorithm, the congestion avoidance algorithm, the fast retransmit and fast recovery algorithms, and more. Enhancing TCP to reliably handle loss, minimize errors, manage congestion and go fast in very high-speed environments is an ongoing area of research and standards development.
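The interplay of slow start and congestion avoidance can be illustrated with a deliberately simplified simulation. The function, its parameters and the per-round-trip granularity are assumptions made for illustration: the window is counted in whole segments, a loss is treated as a timeout that resets the window to one segment, and fast retransmit and fast recovery are not modelled.

```python
def simulate_cwnd(rounds, ssthresh, loss_rounds=()):
    """Return the congestion window (in segments) at each round trip."""
    cwnd = 1
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in loss_rounds:
            # Loss detected: halve the threshold and restart slow start.
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif cwnd < ssthresh:
            # Slow start: the window doubles every round trip.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: the window grows by one segment.
            cwnd += 1
    return history
```

Under these assumptions, `simulate_cwnd(8, 8)` yields `[1, 2, 4, 8, 9, 10, 11, 12]`: exponential growth up to the threshold, then linear growth.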
TCP window size
The TCP receive window size is the amount of received data (in bytes) that can be buffered during a connection. The sending host can send only that amount of data before it must wait for an acknowledgment and window update from the receiving host. The Windows TCP/IP stack is designed to tune itself in most environments, and uses larger default window sizes than earlier versions.
For more efficient use of high bandwidth networks, a larger TCP window size may be used. The TCP window size field controls the flow of data and is limited to 2 bytes, or a window size of 65,535 bytes.
Since the size field cannot be expanded, a scaling factor is used. TCP window scale is an option used to increase the maximum window size from 65,535 bytes to approximately 1 gigabyte.
The window scale option is used only during the TCP 3-way handshake. The window scale value represents the number of bits to left-shift the 16-bit window size field. The window scale value can be set from 0 (no shift) to 14.
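The effect of the shift can be worked through in a short sketch. The function names are illustrative, and the throughput bound assumes that a sender can have at most one full window in flight per round trip, ignoring every other limit.

```python
def effective_window(window_field: int, shift: int) -> int:
    """Receive window in bytes after applying the window scale shift."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= 14:
        raise ValueError("window scale must be between 0 and 14")
    return window_field << shift


def max_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput: one window per round trip, in bits/s."""
    return window_bytes * 8 / rtt_seconds
```

With the largest field value and the largest shift, `effective_window(65535, 14)` gives 1,073,725,440 bytes, just under 2^30. Without scaling, a 65,535-byte window on a path with a half-second round trip cannot carry more than about 1 megabit per second, which is why larger windows matter on high-bandwidth networks.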
The connection termination phase uses a four-way handshake, with each side of the connection terminating independently. Therefore, a typical teardown requires a pair of FIN and ACK segments from each TCP endpoint.
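The independence of the two directions can be observed from an application through a half-close. The sketch below assumes Python's standard `socket` module and a loopback address; the function name is illustrative.

```python
import socket
import threading


def demo_half_close() -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    def serve_once() -> None:
        conn, _addr = listener.accept()
        with conn:
            chunks = []
            while True:
                data = conn.recv(1024)
                if not data:  # the client's FIN appears as end-of-stream
                    break
                chunks.append(data)
            # The server's own direction is still open, so it can reply.
            conn.sendall(b"got " + b"".join(chunks))
        # Leaving the with-block closes the socket: the server's FIN.

    server = threading.Thread(target=serve_once)
    server.start()

    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    client.sendall(b"bye")
    client.shutdown(socket.SHUT_WR)  # send FIN, but keep reading
    reply = b""
    while True:
        data = client.recv(1024)
        if not data:  # the server's FIN
            break
        reply += data
    client.close()
    server.join()
    listener.close()
    return reply
```

The client finishes sending first, yet still receives the server's reply; only when the server closes its side is the second FIN and ACK pair exchanged.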
TCP uses the notion of port numbers to identify sending and receiving applications. Each side of a TCP connection has an associated 16-bit unsigned port number assigned to the sending or receiving application. Ports are categorized into three basic categories: well known (0 through 1023), registered (1024 through 49151) and dynamic/private (49152 through 65535). The well known ports are assigned by the Internet Assigned Numbers Authority (IANA) and are typically used by system-level or root processes. Well known applications running as servers and passively listening for connections typically use these ports. Some examples include: FTP (21), TELNET (23), SMTP (25) and HTTP (80). Registered ports are typically used by end user applications as ephemeral source ports when contacting servers, but they can also identify named services that have been registered by a third party. Dynamic/private ports can also be used by end user applications, but this is less common. Dynamic/private ports do not carry any meaning outside of a particular TCP connection. A 16-bit port field allows 65,536 values (0 through 65535); port 0 is reserved, leaving 65,535 usable port numbers.
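The three categories can be expressed as a small lookup. The function name is illustrative, and the boundaries assumed are the IANA ranges 0 through 1023, 1024 through 49151 and 49152 through 65535.

```python
def port_category(port: int) -> str:
    """Classify a TCP port number into its IANA category."""
    if not 0 <= port <= 65535:
        raise ValueError("port must fit in 16 bits")
    if port <= 1023:
        return "well known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"
```

For example, HTTP's port 80 falls in the well known range, while 8080 is a registered port.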
TCP is both a complex and evolving protocol. However, while significant enhancements have been made and proposed over the years, its most basic operation has not changed significantly since RFC 793, published in 1981. RFC 1122, Requirements for Internet Hosts - Communication Layers, clarified a number of TCP protocol implementation requirements. RFC 2581, TCP Congestion Control, one of the most important TCP related RFCs in recent years, describes updated algorithms to be used in order to avoid undue congestion. In 2001, RFC 3168 was written to describe explicit congestion notification (ECN), a congestion avoidance signalling mechanism. In the early 21st century, TCP accounts for approximately 95% of all Internet packets. Common applications that use TCP include HTTP/HTTPS (world wide web), SMTP/POP3/IMAP (email) and FTP (file transfer). Its widespread use is a testament to the quality of the original design.
Recently, scientists at Caltech have developed a new protocol called FAST TCP (Fast Active queue management Scalable Transmission Control Protocol). It is similar to TCP Vegas in that both use end-to-end packet queueing delay as the primary congestion signal. (There is still much debate over whether queueing delay is an appropriate signal for congestion control.)
TCP software limitations
Network speeds have increased steadily over the years. What started as a protocol built for unreliable, low-speed networks (a few kilobytes per second) is now required to run at 1 gigabit per second. Software TCP implementations on host systems require extensive computing power: gigabit TCP communication alone can consume 100% of a 2.4 GHz Pentium processor.
Hardware TCP Implementations
One way to overcome the processing power requirements of TCP is to build hardware implementations of it, widely known as TCP Offload Engines (TOEs). The main problem with TOEs is that they are hard to integrate into computing systems, requiring extensive changes in the operating system of the computer or device. A better solution is the Network Traffic Accelerator (NTA), or TCP accelerator. NTAs are hardware coprocessors that assist the main CPU in a system in processing TCP packets. NTAs generally do not require changes in the operating system and are relatively simple to implement.
Alternatives to TCP
TCP is not appropriate for many applications, and newer transport layer protocols are being designed and deployed to address some of its inherent weaknesses. For example, real-time applications often do not need TCP's reliable delivery mechanisms and will suffer from them. In those types of applications it is often better to deal with some loss, errors or congestion than to try to adjust for them. Example applications that do not typically use TCP include real-time streaming multimedia (such as Internet radio), real-time multiplayer games and voice over IP (VoIP). Any application that does not require reliability, or that wants to minimize functionality, may choose to avoid using TCP. In many cases, the User Datagram Protocol (UDP) may be used in place of TCP when only application multiplexing services are required.
The contents of this article are licensed from www.wikipedia.org under the GNU Free Documentation License.