summaryrefslogtreecommitdiff
path: root/doc/rfc/rfc2018.txt
diff options
context:
space:
mode:
Diffstat (limited to 'doc/rfc/rfc2018.txt')
-rw-r--r--doc/rfc/rfc2018.txt675
1 files changed, 675 insertions, 0 deletions
diff --git a/doc/rfc/rfc2018.txt b/doc/rfc/rfc2018.txt
new file mode 100644
index 0000000..1d84811
--- /dev/null
+++ b/doc/rfc/rfc2018.txt
@@ -0,0 +1,675 @@
+
+
+
+
+
+
+Network Working Group M. Mathis
+Request for Comments: 2018 J. Mahdavi
+Category: Standards Track PSC
+ S. Floyd
+ LBNL
+ A. Romanow
+ Sun Microsystems
+ October 1996
+
+
+ TCP Selective Acknowledgment Options
+
+Status of this Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Abstract
+
+ TCP may experience poor performance when multiple packets are lost
+ from one window of data. With the limited information available
+ from cumulative acknowledgments, a TCP sender can only learn about a
+ single lost packet per round trip time. An aggressive sender could
+ choose to retransmit packets early, but such retransmitted segments
+ may have already been successfully received.
+
+ A Selective Acknowledgment (SACK) mechanism, combined with a
+ selective repeat retransmission policy, can help to overcome these
+ limitations. The receiving TCP sends back SACK packets to the sender
+ informing the sender of data that has been received. The sender can
+ then retransmit only the missing data segments.
+
+ This memo proposes an implementation of SACK and discusses its
+ performance and related issues.
+
+Acknowledgements
+
+ Much of the text in this document is taken directly from RFC1072 "TCP
+ Extensions for Long-Delay Paths" by Bob Braden and Van Jacobson. The
+ authors would like to thank Kevin Fall (LBNL), Christian Huitema
+ (INRIA), Van Jacobson (LBNL), Greg Miller (MITRE), Greg Minshall
+ (Ipsilon), Lixia Zhang (XEROX PARC and UCLA), Dave Borman (BSDI),
+ Allison Mankin (ISI) and others for their review and constructive
+ comments.
+
+
+
+
+Mathis, et. al. Standards Track [Page 1]
+
+RFC 2018 TCP Selective Acknowledgement Options October 1996
+
+
+1. Introduction
+
+ Multiple packet losses from a window of data can have a catastrophic
+ effect on TCP throughput. TCP [Postel81] uses a cumulative
+ acknowledgment scheme in which received segments that are not at the
+ left edge of the receive window are not acknowledged. This forces
+ the sender to either wait a roundtrip time to find out about each
+ lost packet, or to unnecessarily retransmit segments which have been
+ correctly received [Fall95]. With the cumulative acknowledgment
+ scheme, multiple dropped segments generally cause TCP to lose its
+ ACK-based clock, reducing overall throughput.
+
+ Selective Acknowledgment (SACK) is a strategy which corrects this
+ behavior in the face of multiple dropped segments. With selective
+ acknowledgments, the data receiver can inform the sender about all
+ segments that have arrived successfully, so the sender need
+ retransmit only the segments that have actually been lost.
+
+ Several transport protocols, including NETBLT [Clark87], XTP
+ [Strayer92], RDP [Velten84], NADIR [Huitema81], and VMTP [Cheriton88]
+ have used selective acknowledgment. There is some empirical evidence
+ in favor of selective acknowledgments -- simple experiments with RDP
+ have shown that disabling the selective acknowledgment facility
+ greatly increases the number of retransmitted segments over a lossy,
+ high-delay Internet path [Partridge87]. A recent simulation study by
+ Kevin Fall and Sally Floyd [Fall95], demonstrates the strength of TCP
+ with SACK over the non-SACK Tahoe and Reno TCP implementations.
+
+ RFC1072 [VJ88] describes one possible implementation of SACK options
+ for TCP. Unfortunately, it has never been deployed in the Internet,
+ as there was disagreement about how SACK options should be used in
+ conjunction with the TCP window shift option (initially described
+ RFC1072 and revised in [Jacobson92]).
+
+ We propose slight modifications to the SACK options as proposed in
+ RFC1072. Specifically, sending a selective acknowledgment for the
+ most recently received data reduces the need for long SACK options
+ [Keshav94, Mathis95]. In addition, the SACK option now carries full
+ 32 bit sequence numbers. These two modifications represent the only
+ changes to the proposal in RFC1072. They make SACK easier to
+ implement and address concerns about robustness.
+
+ The selective acknowledgment extension uses two TCP options. The
+ first is an enabling option, "SACK-permitted", which may be sent in a
+ SYN segment to indicate that the SACK option can be used once the
+ connection is established. The other is the SACK option itself,
+ which may be sent over an established connection once permission has
+ been given by SACK-permitted.
+
+
+
+Mathis, et. al. Standards Track [Page 2]
+
+RFC 2018 TCP Selective Acknowledgement Options October 1996
+
+
+ The SACK option is to be included in a segment sent from a TCP that
+ is receiving data to the TCP that is sending that data; we will refer
+ to these TCP's as the data receiver and the data sender,
+ respectively. We will consider a particular simplex data flow; any
+ data flowing in the reverse direction over the same connection can be
+ treated independently.
+
+2. Sack-Permitted Option
+
+ This two-byte option may be sent in a SYN by a TCP that has been
+ extended to receive (and presumably process) the SACK option once the
+ connection has opened. It MUST NOT be sent on non-SYN segments.
+
+ TCP Sack-Permitted Option:
+
+ Kind: 4
+
+ +---------+---------+
+ | Kind=4 | Length=2|
+ +---------+---------+
+
+3. Sack Option Format
+
+ The SACK option is to be used to convey extended acknowledgment
+ information from the receiver to the sender over an established TCP
+ connection.
+
+ TCP SACK Option:
+
+ Kind: 5
+
+ Length: Variable
+
+ +--------+--------+
+ | Kind=5 | Length |
+ +--------+--------+--------+--------+
+ | Left Edge of 1st Block |
+ +--------+--------+--------+--------+
+ | Right Edge of 1st Block |
+ +--------+--------+--------+--------+
+ | |
+ / . . . /
+ | |
+ +--------+--------+--------+--------+
+ | Left Edge of nth Block |
+ +--------+--------+--------+--------+
+ | Right Edge of nth Block |
+ +--------+--------+--------+--------+
+
+
+
+Mathis, et. al. Standards Track [Page 3]
+
+RFC 2018 TCP Selective Acknowledgement Options October 1996
+
+
+ The SACK option is to be sent by a data receiver to inform the data
+ sender of non-contiguous blocks of data that have been received and
+ queued. The data receiver awaits the receipt of data (perhaps by
+ means of retransmissions) to fill the gaps in sequence space between
+ received blocks. When missing segments are received, the data
+ receiver acknowledges the data normally by advancing the left window
+ edge in the Acknowledgement Number Field of the TCP header. The SACK
+ option does not change the meaning of the Acknowledgement Number
+ field.
+
+ This option contains a list of some of the blocks of contiguous
+ sequence space occupied by data that has been received and queued
+ within the window.
+
+ Each contiguous block of data queued at the data receiver is defined
+ in the SACK option by two 32-bit unsigned integers in network byte
+ order:
+
+ * Left Edge of Block
+
+ This is the first sequence number of this block.
+
+ * Right Edge of Block
+
+ This is the sequence number immediately following the last
+ sequence number of this block.
+
+ Each block represents received bytes of data that are contiguous and
+ isolated; that is, the bytes just below the block, (Left Edge of
+ Block - 1), and just above the block, (Right Edge of Block), have not
+ been received.
+
+ A SACK option that specifies n blocks will have a length of 8*n+2
+ bytes, so the 40 bytes available for TCP options can specify a
+ maximum of 4 blocks. It is expected that SACK will often be used in
+ conjunction with the Timestamp option used for RTTM [Jacobson92],
+ which takes an additional 10 bytes (plus two bytes of padding); thus
+ a maximum of 3 SACK blocks will be allowed in this case.
+
+ The SACK option is advisory, in that, while it notifies the data
+ sender that the data receiver has received the indicated segments,
+ the data receiver is permitted to later discard data which have been
+ reported in a SACK option. A discussion appears below in Section 8
+ of the consequences of advisory SACK, in particular that the data
+ receiver may renege, or drop already SACKed data.
+
+
+
+
+
+
+Mathis, et. al. Standards Track [Page 4]
+
+RFC 2018 TCP Selective Acknowledgement Options October 1996
+
+
+4. Generating Sack Options: Data Receiver Behavior
+
+ If the data receiver has received a SACK-Permitted option on the SYN
+ for this connection, the data receiver MAY elect to generate SACK
+ options as described below. If the data receiver generates SACK
+ options under any circumstance, it SHOULD generate them under all
+ permitted circumstances. If the data receiver has not received a
+ SACK-Permitted option for a given connection, it MUST NOT send SACK
+ options on that connection.
+
+ If sent at all, SACK options SHOULD be included in all ACKs which do
+ not ACK the highest sequence number in the data receiver's queue. In
+ this situation the network has lost or mis-ordered data, such that
+ the receiver holds non-contiguous data in its queue. RFC 1122,
+ Section 4.2.2.21, discusses the reasons for the receiver to send ACKs
+ in response to additional segments received in this state. The
+ receiver SHOULD send an ACK for every valid segment that arrives
+ containing new data, and each of these "duplicate" ACKs SHOULD bear a
+ SACK option.
+
+ If the data receiver chooses to send a SACK option, the following
+ rules apply:
+
+ * The first SACK block (i.e., the one immediately following the
+ kind and length fields in the option) MUST specify the contiguous
+ block of data containing the segment which triggered this ACK,
+ unless that segment advanced the Acknowledgment Number field in
+ the header. This assures that the ACK with the SACK option
+ reflects the most recent change in the data receiver's buffer
+ queue.
+
+ * The data receiver SHOULD include as many distinct SACK blocks as
+ possible in the SACK option. Note that the maximum available
+ option space may not be sufficient to report all blocks present in
+ the receiver's queue.
+
+ * The SACK option SHOULD be filled out by repeating the most
+ recently reported SACK blocks (based on first SACK blocks in
+ previous SACK options) that are not subsets of a SACK block
+ already included in the SACK option being constructed. This
+ assures that in normal operation, any segment remaining part of a
+ non-contiguous block of data held by the data receiver is reported
+ in at least three successive SACK options, even for large-window
+ TCP implementations [RFC1323]). After the first SACK block, the
+ following SACK blocks in the SACK option may be listed in
+ arbitrary order.
+
+
+
+
+
+Mathis, et. al. Standards Track [Page 5]
+
+RFC 2018 TCP Selective Acknowledgement Options October 1996
+
+
+ It is very important that the SACK option always reports the block
+ containing the most recently received segment, because this provides
+ the sender with the most up-to-date information about the state of
+ the network and the data receiver's queue.
+
+5. Interpreting the Sack Option and Retransmission Strategy: Data
+ Sender Behavior
+
+ When receiving an ACK containing a SACK option, the data sender
+ SHOULD record the selective acknowledgment for future reference. The
+ data sender is assumed to have a retransmission queue that contains
+ the segments that have been transmitted but not yet acknowledged, in
+ sequence-number order. If the data sender performs re-packetization
+ before retransmission, the block boundaries in a SACK option that it
+ receives may not fall on boundaries of segments in the retransmission
+ queue; however, this does not pose a serious difficulty for the
+ sender.
+
+ One possible implementation of the sender's behavior is as follows.
+ Let us suppose that for each segment in the retransmission queue
+ there is a (new) flag bit "SACKed", to be used to indicate that this
+ particular segment has been reported in a SACK option.
+
+ When an acknowledgment segment arrives containing a SACK option, the
+ data sender will turn on the SACKed bits for segments that have been
+ selectively acknowledged. More specifically, for each block in the
+ SACK option, the data sender will turn on the SACKed flags for all
+ segments in the retransmission queue that are wholly contained within
+ that block. This requires straightforward sequence number
+ comparisons.
+
+ After the SACKed bit is turned on (as the result of processing a
+ received SACK option), the data sender will skip that segment during
+ any later retransmission. Any segment that has the SACKed bit turned
+ off and is less than the highest SACKed segment is available for
+ retransmission.
+
+ After a retransmit timeout the data sender SHOULD turn off all of the
+ SACKed bits, since the timeout might indicate that the data receiver
+ has reneged. The data sender MUST retransmit the segment at the left
+ edge of the window after a retransmit timeout, whether or not the
+ SACKed bit is on for that segment. A segment will not be dequeued
+ and its buffer freed until the left window edge is advanced over it.
+
+
+
+
+
+
+
+
+Mathis, et. al. Standards Track [Page 6]
+
+RFC 2018 TCP Selective Acknowledgement Options October 1996
+
+
+5.1 Congestion Control Issues
+
+ This document does not attempt to specify in detail the congestion
+ control algorithms for implementations of TCP with SACK. However,
+ the congestion control algorithms present in the de facto standard
+ TCP implementations MUST be preserved [Stevens94]. In particular, to
+ preserve robustness in the presence of packets reordered by the
+ network, recovery is not triggered by a single ACK reporting out-of-
+ order packets at the receiver. Further, during recovery the data
+ sender limits the number of segments sent in response to each ACK.
+ Existing implementations limit the data sender to sending one segment
+ during Reno-style fast recovery, or to two segments during slow-start
+ [Jacobson88]. Other aspects of congestion control, such as reducing
+ the congestion window in response to congestion, must similarly be
+ preserved.
+
+ The use of time-outs as a fall-back mechanism for detecting dropped
+ packets is unchanged by the SACK option. Because the data receiver
+ is allowed to discard SACKed data, when a retransmit timeout occurs
+ the data sender MUST ignore prior SACK information in determining
+ which data to retransmit.
+
+ Future research into congestion control algorithms may take advantage
+ of the additional information provided by SACK. One such area for
+ future research concerns modifications to TCP for a wireless or
+ satellite environment where packet loss is not necessarily an
+ indication of congestion.
+
+6. Efficiency and Worst Case Behavior
+
+ If the return path carrying ACKs and SACK options were lossless, one
+ block per SACK option packet would always be sufficient. Every
+ segment arriving while the data receiver holds discontinuous data
+ would cause the data receiver to send an ACK with a SACK option
+ containing the one altered block in the receiver's queue. The data
+ sender is thus able to construct a precise replica of the receiver's
+ queue by taking the union of all the first SACK blocks.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Mathis, et. al. Standards Track [Page 7]
+
+RFC 2018 TCP Selective Acknowledgement Options October 1996
+
+
+ Since the return path is not lossless, the SACK option is defined to
+ include more than one SACK block in a single packet. The redundant
+ blocks in the SACK option packet increase the robustness of SACK
+ delivery in the presence of lost ACKs. For a receiver that is also
+ using the time stamp option [Jacobson92], the SACK option has room to
+ include three SACK blocks. Thus each SACK block will generally be
+ repeated at least three times, if necessary, once in each of three
+ successive ACK packets. However, if all of the ACK packets reporting
+ a particular SACK block are dropped, then the sender might assume
+ that the data in that SACK block has not been received, and
+ unnecessarily retransmit those segments.
+
+ The deployment of other TCP options may reduce the number of
+ available SACK blocks to 2 or even to 1. This will reduce the
+ redundancy of SACK delivery in the presence of lost ACKs. Even so,
+ the exposure of TCP SACK in regard to the unnecessary retransmission
+ of packets is strictly less than the exposure of current
+ implementations of TCP. The worst-case conditions necessary for the
+ sender to needlessly retransmit data is discussed in more detail in a
+ separate document [Floyd96].
+
+ Older TCP implementations which do not have the SACK option will not
+ be unfairly disadvantaged when competing against SACK-capable TCPs.
+ This issue is discussed in more detail in [Floyd96].
+
+7. Sack Option Examples
+
+ The following examples attempt to demonstrate the proper behavior of
+ SACK generation by the data receiver.
+
+ Assume the left window edge is 5000 and that the data transmitter
+ sends a burst of 8 segments, each containing 500 data bytes.
+
+ Case 1: The first 4 segments are received but the last 4 are
+ dropped.
+
+ The data receiver will return a normal TCP ACK segment
+ acknowledging sequence number 7000, with no SACK option.
+
+
+
+
+
+
+
+
+
+
+
+
+
+Mathis, et. al. Standards Track [Page 8]
+
+RFC 2018 TCP Selective Acknowledgement Options October 1996
+
+
+ Case 2: The first segment is dropped but the remaining 7 are
+ received.
+
+ Upon receiving each of the last seven packets, the data
+ receiver will return a TCP ACK segment that acknowledges
+ sequence number 5000 and contains a SACK option specifying
+ one block of queued data:
+
+ Triggering ACK Left Edge Right Edge
+ Segment
+
+ 5000 (lost)
+ 5500 5000 5500 6000
+ 6000 5000 5500 6500
+ 6500 5000 5500 7000
+ 7000 5000 5500 7500
+ 7500 5000 5500 8000
+ 8000 5000 5500 8500
+ 8500 5000 5500 9000
+
+
+ Case 3: The 2nd, 4th, 6th, and 8th (last) segments are
+ dropped.
+
+ The data receiver ACKs the first packet normally. The
+ third, fifth, and seventh packets trigger SACK options as
+ follows:
+
+ Triggering ACK First Block 2nd Block 3rd Block
+ Segment Left Right Left Right Left Right
+ Edge Edge Edge Edge Edge Edge
+
+ 5000 5500
+ 5500 (lost)
+ 6000 5500 6000 6500
+ 6500 (lost)
+ 7000 5500 7000 7500 6000 6500
+ 7500 (lost)
+ 8000 5500 8000 8500 7000 7500 6000 6500
+ 8500 (lost)
+
+
+
+
+
+
+
+
+
+
+
+Mathis, et. al. Standards Track [Page 9]
+
+RFC 2018 TCP Selective Acknowledgement Options October 1996
+
+
+ Suppose at this point, the 4th packet is received out of order.
+ (This could either be because the data was badly misordered in the
+ network, or because the 2nd packet was retransmitted and lost, and
+ then the 4th packet was retransmitted). At this point the data
+ receiver has only two SACK blocks to report. The data receiver
+ replies with the following Selective Acknowledgment:
+
+ Triggering ACK First Block 2nd Block 3rd Block
+ Segment Left Right Left Right Left Right
+ Edge Edge Edge Edge Edge Edge
+
+ 6500 5500 6000 7500 8000 8500
+
+ Suppose at this point, the 2nd segment is received. The data
+ receiver then replies with the following Selective Acknowledgment:
+
+ Triggering ACK First Block 2nd Block 3rd Block
+ Segment Left Right Left Right Left Right
+ Edge Edge Edge Edge Edge Edge
+
+ 5500 7500 8000 8500
+
+8. Data Receiver Reneging
+
+ Note that the data receiver is permitted to discard data in its queue
+ that has not been acknowledged to the data sender, even if the data
+ has already been reported in a SACK option. Such discarding of
+ SACKed packets is discouraged, but may be used if the receiver runs
+ out of buffer space.
+
+ The data receiver MAY elect not to keep data which it has reported in
+ a SACK option. In this case, the receiver SACK generation is
+ additionally qualified:
+
+ * The first SACK block MUST reflect the newest segment. Even if
+ the newest segment is going to be discarded and the receiver has
+ already discarded adjacent segments, the first SACK block MUST
+ report, at a minimum, the left and right edges of the newest
+ segment.
+
+ * Except for the newest segment, all SACK blocks MUST NOT report
+ any old data which is no longer actually held by the receiver.
+
+ Since the data receiver may later discard data reported in a SACK
+ option, the sender MUST NOT discard data before it is acknowledged by
+ the Acknowledgment Number field in the TCP header.
+
+
+
+
+
+Mathis, et. al. Standards Track [Page 10]
+
+RFC 2018 TCP Selective Acknowledgement Options October 1996
+
+
+9. Security Considerations
+
+ This document neither strengthens nor weakens TCP's current security
+ properties.
+
+10. References
+
+ [Cheriton88] Cheriton, D., "VMTP: Versatile Message Transaction
+ Protocol", RFC 1045, Stanford University, February 1988.
+
+ [Clark87] Clark, D., Lambert, M., and L. Zhang, "NETBLT: A Bulk Data
+ Transfer Protocol", RFC 998, MIT, March 1987.
+
+ [Fall95] Fall, K. and Floyd, S., "Comparisons of Tahoe, Reno, and
+ Sack TCP", ftp://ftp.ee.lbl.gov/papers/sacks.ps.Z, December 1995.
+
+ [Floyd96] Floyd, S., "Issues of TCP with SACK",
+ ftp://ftp.ee.lbl.gov/papers/issues_sa.ps.Z, January 1996.
+
+ [Huitema81] Huitema, C., and Valet, I., An Experiment on High Speed
+ File Transfer using Satellite Links, 7th Data Communication
+ Symposium, Mexico, October 1981.
+
+ [Jacobson88] Jacobson, V., "Congestion Avoidance and Control",
+ Proceedings of SIGCOMM '88, Stanford, CA., August 1988.
+
+ [Jacobson88}, Jacobson, V. and R. Braden, "TCP Extensions for Long-
+ Delay Paths", RFC 1072, October 1988.
+
+ [Jacobson92] Jacobson, V., Braden, R., and D. Borman, "TCP Extensions
+ for High Performance", RFC 1323, May 1992.
+
+ [Keshav94] Keshav, presentation to the Internet End-to-End Research
+ Group, November 1994.
+
+ [Mathis95] Mathis, M., and Mahdavi, J., TCP Forward Acknowledgment
+ Option, presentation to the Internet End-to-End Research Group, June
+ 1995.
+
+ [Partridge87] Partridge, C., "Private Communication", February 1987.
+
+ [Postel81] Postel, J., "Transmission Control Protocol - DARPA
+ Internet Program Protocol Specification", RFC 793, DARPA, September
+ 1981.
+
+ [Stevens94] Stevens, W., TCP/IP Illustrated, Volume 1: The Protocols,
+ Addison-Wesley, 1994.
+
+
+
+
+Mathis, et. al. Standards Track [Page 11]
+
+RFC 2018 TCP Selective Acknowledgement Options October 1996
+
+
+ [Strayer92] Strayer, T., Dempsey, B., and Weaver, A., XTP -- the
+ xpress transfer protocol. Addison-Wesley Publishing Company, 1992.
+
+ [Velten84] Velten, D., Hinden, R., and J. Sax, "Reliable Data
+ Protocol", RFC 908, BBN, July 1984.
+
+11. Authors' Addresses
+
+ Matt Mathis and Jamshid Mahdavi
+ Pittsburgh Supercomputing Center
+ 4400 Fifth Ave
+ Pittsburgh, PA 15213
+ mathis@psc.edu
+ mahdavi@psc.edu
+
+ Sally Floyd
+ Lawrence Berkeley National Laboratory
+ One Cyclotron Road
+ Berkeley, CA 94720
+ floyd@ee.lbl.gov
+
+ Allyn Romanow
+ Sun Microsystems, Inc.
+ 2550 Garcia Ave., MPK17-202
+ Mountain View, CA 94043
+ allyn@eng.sun.com
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Mathis, et. al. Standards Track [Page 12]
+