author | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
---|---|---|
committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
commit | 4bfd864f10b68b71482b35c818559068ef8d5797 (patch) | |
tree | e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc1072.txt | |
parent | ea76e11061bda059ae9f9ad130a9895cc85607db (diff) |
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc1072.txt')
-rw-r--r-- | doc/rfc/rfc1072.txt | 893 |
1 files changed, 893 insertions, 0 deletions
diff --git a/doc/rfc/rfc1072.txt b/doc/rfc/rfc1072.txt new file mode 100644 index 0000000..6ed8d5b --- /dev/null +++ b/doc/rfc/rfc1072.txt @@ -0,0 +1,893 @@ +Network Working Group V. Jacobson +Request for Comments: 1072 LBL + R. Braden + ISI + October 1988 + + + TCP Extensions for Long-Delay Paths + + +Status of This Memo + + This memo proposes a set of extensions to the TCP protocol to provide + efficient operation over a path with a high bandwidth*delay product. + These extensions are not proposed as an Internet standard at this + time. Instead, they are intended as a basis for further + experimentation and research on transport protocol performance. + Distribution of this memo is unlimited. + +1. INTRODUCTION + + Recent work on TCP performance has shown that TCP can work well over + a variety of Internet paths, ranging from 800 Mbit/sec I/O channels + to 300 bit/sec dial-up modems [Jacobson88]. However, there is still + a fundamental TCP performance bottleneck for one transmission regime: + paths with high bandwidth and long round-trip delays. The + significant parameter is the product of bandwidth (bits per second) + and round-trip delay (RTT in seconds); this product is the number of + bits it takes to "fill the pipe", i.e., the amount of unacknowledged + data that TCP must handle in order to keep the pipeline full. TCP + performance problems arise when this product is large, e.g., + significantly exceeds 10**5 bits. We will refer to an Internet path + operating in this region as a "long, fat pipe", and a network + containing this path as an "LFN" (pronounced "elephan(t)"). + + High-capacity packet satellite channels (e.g., DARPA's Wideband Net) + are LFN's. For example, a T1-speed satellite channel has a + bandwidth*delay product of 10**6 bits or more; this corresponds to + 100 outstanding TCP segments of 1200 bytes each! 
Proposed future + terrestrial fiber-optical paths will also fall into the LFN class; + for example, a cross-country delay of 30 ms at a DS3 bandwidth + (45Mbps) also exceeds 10**6 bits. + + Clever algorithms alone will not give us good TCP performance over + LFN's; it will be necessary to actually extend the protocol. This + RFC proposes a set of TCP extensions for this purpose. + + There are three fundamental problems with the current TCP over LFN + + + +Jacobson & Braden [Page 1] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + + paths: + + + (1) Window Size Limitation + + The TCP header uses a 16 bit field to report the receive window + size to the sender. Therefore, the largest window that can be + used is 2**16 = 65K bytes. (In practice, some TCP + implementations will "break" for windows exceeding 2**15, + because of their failure to do unsigned arithmetic). + + To circumvent this problem, we propose a new TCP option to allow + windows larger than 2**16. This option will define an implicit + scale factor, to be used to multiply the window size value found + in a TCP header to obtain the true window size. + + + (2) Cumulative Acknowledgments + + Any packet losses in an LFN can have a catastrophic effect on + throughput. This effect is exaggerated by the simple cumulative + acknowledgment of TCP. Whenever a segment is lost, the + transmitting TCP will (eventually) time out and retransmit the + missing segment. However, the sending TCP has no information + about segments that may have reached the receiver and been + queued because they were not at the left window edge, so it may + be forced to retransmit these segments unnecessarily. + + We propose a TCP extension to implement selective + acknowledgements. By sending selective acknowledgments, the + receiver of data can inform the sender about all segments that + have arrived successfully, so the sender need retransmit only + the segments that have actually been lost. 
+ + Selective acknowledgments have been included in a number of + experimental Internet protocols -- VMTP [Cheriton88], NETBLT + [Clark87], and RDP [Velten84]. There is some empirical evidence + in favor of selective acknowledgments -- simple experiments with + RDP have shown that disabling the selective acknowledgment + facility greatly increases the number of retransmitted segments + over a lossy, high-delay Internet path [Partridge87]. A + simulation study of a simple form of selective acknowledgments + added to the ISO transport protocol TP4 also showed promise of + performance improvement [NBS85]. + + + + + + + +Jacobson & Braden [Page 2] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + + (3) Round Trip Timing + + TCP implements reliable data delivery by measuring the RTT, + i.e., the time interval between sending a segment and receiving + an acknowledgment for it, and retransmitting any segments that + are not acknowledged within some small multiple of the average + RTT. Experience has shown that accurate, current RTT estimates + are necessary to adapt to changing traffic conditions and, + without them, a busy network is subject to an instability known + as "congestion collapse" [Nagle84]. + + In part because TCP segments may be repacketized upon + retransmission, and in part because of complications due to the + cumulative TCP acknowledgement, measuring a segment's RTT may + involve a non-trivial amount of computation in some + implementations. To minimize this computation, some + implementations time only one segment per window. While this + yields an adequate approximation to the RTT for small windows + (e.g., a 4 to 8 segment Arpanet window), for an LFN (e.g., 100 + segment Wideband Network windows) it results in an unacceptably + poor RTT estimate. + + In the presence of errors, the problem becomes worse.
Zhang + [Zhang86], Jain [Jain86] and Karn [Karn87] have shown that it is + not possible to accumulate reliable RTT estimates if + retransmitted segments are included in the estimate. Since a + full window of data will have been transmitted prior to a + retransmission, all of the segments in that window will have to + be ACKed before the next RTT sample can be taken. This means at + least an additional window's worth of time between RTT + measurements and, as the error rate approaches one per window of + data (e.g., 10**-6 errors per bit for the Wideband Net), it + becomes effectively impossible to obtain an RTT measurement. + + We propose a TCP "echo" option that allows each segment to carry + its own timestamp. This will allow every segment, including + retransmissions, to be timed at negligible computational cost. + + + In designing new TCP options, we must pay careful attention to + interoperability with existing implementations. The only TCP option + defined to date is an "initial option", i.e., it may appear only on a + SYN segment. It is likely that most implementations will properly + ignore any options in the SYN segment that they do not understand, so + new initial options should not cause a problem. On the other hand, + we fear that receiving unexpected non-initial options may cause some + TCP's to crash. + + + + +Jacobson & Braden [Page 3] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + + Therefore, in each of the extensions we propose, non-initial options + may be sent only if an exchange of initial options has indicated that + both sides understand the extension. This approach will also allow a + TCP to determine when the connection opens how big a TCP header it + will be sending. + +2. TCP WINDOW SCALE OPTION + + The obvious way to implement a window scale factor would be to define + a new TCP option that could be included in any segment specifying a + window. 
The receiver would include it in every acknowledgment + segment, and the sender would interpret it. Unfortunately, this + simple approach would not work. The sender must reliably know the + receiver's current scale factor, but a TCP option in an + acknowledgement segment will not be delivered reliably (unless the + ACK happens to be piggy-backed on data). + + However, SYN segments are always sent reliably, suggesting that each + side may communicate its window scale factor in an initial TCP + option. This approach has a disadvantage: the scale must be + established when the connection is opened, and cannot be changed + thereafter. However, other alternatives would be much more + complicated, and we therefore propose a new initial option called + Window Scale. + +2.1 Window Scale Option + + This three-byte option may be sent in a SYN segment by a TCP (1) + to indicate that it is prepared to do both send and receive window + scaling, and (2) to communicate a scale factor to be applied to + its receive window. The scale factor is encoded logarithmically, + as a power of 2 (presumably to be implemented by binary shifts). + + Note: the window in the SYN segment itself is never scaled. + + TCP Window Scale Option: + + Kind: 3 + + +---------+---------+---------+ + | Kind=3 |Length=3 |shift.cnt| + +---------+---------+---------+ + + Here shift.cnt is the number of bits by which the receiver right- + shifts the true receive-window value, to scale it into a 16-bit + value to be sent in TCP header (this scaling is explained below). + The value shift.cnt may be zero (offering to scale, while applying + a scale factor of 1 to the receive window). + + + +Jacobson & Braden [Page 4] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + + This option is an offer, not a promise; both sides must send + Window Scale options in their SYN segments to enable window + scaling in either direction. 
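The option layout and the "offer, not a promise" rule of Section 2.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and clamping the advertised value to 16 bits is an added safety check that the memo does not specify.

```python
import struct

KIND_WINDOW_SCALE = 3   # option kind given in Section 2.1
LEN_WINDOW_SCALE = 3    # kind byte + length byte + shift.cnt byte


def encode_window_scale(shift_cnt: int) -> bytes:
    """Build the three-byte Window Scale option for a SYN segment."""
    if not 0 <= shift_cnt <= 255:
        raise ValueError("shift.cnt must fit in one byte")
    return struct.pack("!BBB", KIND_WINDOW_SCALE, LEN_WINDOW_SCALE, shift_cnt)


def parse_window_scale(option: bytes):
    """Return shift.cnt, or None if this is not a Window Scale option."""
    if len(option) != LEN_WINDOW_SCALE:
        return None
    kind, length, shift_cnt = struct.unpack("!BBB", option)
    if kind != KIND_WINDOW_SCALE or length != LEN_WINDOW_SCALE:
        return None
    return shift_cnt


def scaling_enabled(sent_option: bool, received_shift) -> bool:
    """Scaling is used only if both SYN segments carried the option.

    A received shift.cnt of zero still counts as an offer to scale.
    """
    return sent_option and received_shift is not None


def advertised_window(true_window: int, shift_cnt: int) -> int:
    """Right-shift the true receive window into the 16-bit header field.

    The clamp to 0xFFFF is an assumption of this sketch, not part of the memo.
    """
    return min(true_window >> shift_cnt, 0xFFFF)
```

For example, a receiver with a true window of 200000 bytes and shift.cnt = 2 would place 50000 in the window field of its non-SYN segments.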
+ +2.2 Using the Window Scale Option + + A model implementation of window scaling is as follows, using the + notation of RFC-793 [Postel81]: + + * The send-window (SND.WND) and receive-window (RCV.WND) sizes + in the connection state block and in all sequence space + calculations are expanded from 16 to 32 bits. + + * Two window shift counts are added to the connection state: + snd.scale and rcv.scale. These are shift counts to be + applied to the incoming and outgoing windows, respectively. + The precise algorithm is shown below. + + * All outgoing SYN segments are sent with the Window Scale + option, containing a value shift.cnt = R that the TCP would + like to use for its receive window. + + * Snd.scale and rcv.scale are initialized to zero, and are + changed only during processing of a received SYN segment. If + the SYN segment contains a Window Scale option with shift.cnt + = S, set snd.scale to S and set rcv.scale to R; otherwise, + both snd.scale and rcv.scale are left at zero. + + * The window field (SEG.WND) in the header of every incoming + segment, with the exception of SYN segments, will be left- + shifted by snd.scale bits before updating SND.WND: + + SND.WND = SEG.WND << snd.scale + + (assuming the other conditions of RFC793 are met, and using + the "C" notation "<<" for left-shift). + + * The window field (SEG.WND) of every outgoing segment, with + the exception of SYN segments, will have been right-shifted + by rcv.scale bits: + + SEG.WND = RCV.WND >> rcv.scale. + + + TCP determines if a data segment is "old" or "new" by testing if + its sequence number is within 2**31 bytes of the left edge of the + window. If not, the data is "old" and discarded. To insure that + new data is never mistakenly considered old and vice-versa, the + + + +Jacobson & Braden [Page 5] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + + left edge of the sender's window has to be at least 2**31 away + from the right edge of the receiver's window. 
Similarly with the + sender's right edge and receiver's left edge. Since the right and + left edges of either the sender's or receiver's window differ by + the window size, and since the sender and receiver windows can be + out of phase by at most the window size, the above constraints + imply that 2 * the max window size must be less than 2**31, or + + max window < 2**30 + + Since the max window is 2**S (where S is the scaling shift count) + times at most 2**16 - 1 (the maximum unscaled window), the maximum + window is guaranteed to be < 2**30 if S <= 14. Thus, the shift + count must be limited to 14. (This allows windows of 2**30 = 1 + Gbyte.) If a Window Scale option is received with a shift.cnt + value exceeding 14, the TCP should log the error but use 14 + instead of the specified value. + + +3. TCP SELECTIVE ACKNOWLEDGMENT OPTIONS + + To minimize the impact on the TCP protocol, the selective + acknowledgment extension uses the form of two new TCP options. The + first is an enabling option, "SACK-permitted", that may be sent in a + SYN segment to indicate that the SACK option may be used once the + connection is established. The other is the SACK option itself, + which may be sent over an established connection once permission has + been given by SACK-permitted. + + The SACK option is to be included in a segment sent from a TCP that + is receiving data to the TCP that is sending that data; we will refer + to these TCP's as the data receiver and the data sender, + respectively. We will consider a particular simplex data flow; any + data flowing in the reverse direction over the same connection can be + treated independently. + +3.1 SACK-Permitted Option + + This two-byte option may be sent in a SYN by a TCP that has been + extended to receive (and presumably process) the SACK option once + the connection has opened.
+ + + + + + + + + + +Jacobson & Braden [Page 6] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + + TCP Sack-Permitted Option: + + Kind: 4 + + +---------+---------+ + | Kind=4 | Length=2| + +---------+---------+ + +3.2 SACK Option + + The SACK option is to be used to convey extended acknowledgment + information over an established connection. Specifically, it is + to be sent by a data receiver to inform the data transmitter of + non-contiguous blocks of data that have been received and queued. + The data receiver is awaiting the receipt of data in later + retransmissions to fill the gaps in sequence space between these + blocks. At that time, the data receiver will acknowledge the data + normally by advancing the left window edge in the Acknowledgment + Number field of the TCP header. + + It is important to understand that the SACK option will not change + the meaning of the Acknowledgment Number field, whose value will + still specify the left window edge, i.e., one byte beyond the last + sequence number of fully-received data. The SACK option is + advisory; if it is ignored, TCP acknowledgments will continue to + function as specified in the protocol. + + However, SACK will provide additional information that the data + transmitter can use to optimize retransmissions. The TCP data + receiver may include the SACK option in an acknowledgment segment + whenever it has data that is queued and unacknowledged. Of + course, the SACK option may be sent only when the TCP has received + the SACK-permitted option in the SYN segment for that connection. 
+ + TCP SACK Option: + + Kind: 5 + + Length: Variable + + + +--------+--------+--------+--------+--------+--------+...---+ + | Kind=5 | Length | Relative Origin | Block Size | | + +--------+--------+--------+--------+--------+--------+...---+ + + + This option contains a list of the blocks of contiguous sequence + space occupied by data that has been received and queued within + + + +Jacobson & Braden [Page 7] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + + the window. Each block is contiguous and isolated; that is, the + octets just below the block, + + Acknowledgment Number + Relative Origin -1, + + and just above the block, + + Acknowledgment Number + Relative Origin + Block Size, + + have not been received. + + Each contiguous block of data queued at the receiver is defined in + the SACK option by two 16-bit integers: + + + * Relative Origin + + This is the first sequence number of this block, relative to + the Acknowledgment Number field in the TCP header (i.e., + relative to the data receiver's left window edge). + + + * Block Size + + This is the size in octets of this block of contiguous data. + + + A SACK option that specifies n blocks will have a length of 4*n+2 + octets, so the 44 bytes available for TCP options can specify a + maximum of 10 blocks. Of course, if other TCP options are + introduced, they will compete for the 44 bytes, and the limit of + 10 may be reduced in particular segments. + + There is no requirement on the order in which blocks can appear in + a single SACK option. + + Note: requiring that the blocks be ordered would allow a + slightly more efficient algorithm in the transmitter; however, + this does not seem to be an important optimization. + +3.3 SACK with Window Scaling + + If window scaling is in effect, then 16 bits may not be sufficient + for the SACK option fields that define the origin and length of a + block. 
There are two possible ways to handle this: + + (1) Expand the SACK origin and length fields to 24 or 32 bits. + + + + +Jacobson & Braden [Page 8] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + + (2) Scale the SACK fields by the same factor as the window. + + + The first alternative would significantly reduce the number of + blocks possible in a SACK option; therefore, we have chosen the + second alternative, scaling the SACK information as well as the + window. + + Scaling the SACK information introduces some loss of precision, + since a SACK option must report queued data blocks whose origins + and lengths are multiples of the window scale factor rcv.scale. + These reported blocks must be equal to or smaller than the actual + blocks of queued data. + + Specifically, suppose that the receiver has a contiguous block of + queued data that occupies sequence numbers L, L+1, ... L+N-1, and + that the window scale factor is S = rcv.scale. Then the + corresponding block that will be reported in a SACK option will + be: + + Relative Origin = int((L+S-1)/S) + + Block Size = int((L+N)/S) - (Relative Origin) + + where the function int(x) returns the greatest integer contained + in x. + + The resulting loss of precision is not a serious problem for the + sender. If the data-sending TCP keeps track of the boundaries of + all segments in its retransmission queue, it will generally be + able to infer from the imprecise SACK data which full segments + don't need to be retransmitted. This will fail only if S is + larger than the maximum segment size, in which case some segments + may be retransmitted unnecessarily. If the sending TCP does not + keep track of transmitted segment boundaries, the imprecision of + the scaled SACK quantities will only result in retransmitting a + small amount of unneeded sequence space. 
On the average, the data + sender will unnecessarily retransmit J*S bytes of the sequence + space for each SACK received; here J is the number of blocks + reported in the SACK, and S = snd.scale. + +3.4 SACK Option Examples + + Assume the left window edge is 5000 and that the data transmitter + sends a burst of 8 segments, each containing 500 data bytes. + Unless specified otherwise, we assume that the scale factor S = 1. + + + + + +Jacobson & Braden [Page 9] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + + Case 1: The first 4 segments are received but the last 4 are + dropped. + + The data receiver will return a normal TCP ACK segment + acknowledging sequence number 7000, with no SACK option. + + + Case 2: The first segment is dropped but the remaining 7 are + received. + + The data receiver will return a TCP ACK segment that + acknowledges sequence number 5000 and contains a SACK option + specifying one block of queued data: + + Relative Origin = 500; Block Size = 3500 + + + Case 3: The 2nd, 4th, 6th, and 8th (last) segments are + dropped. + + The data receiver will return a TCP ACK segment that + acknowledges sequence number 5500 and contains a SACK option + specifying the 3 blocks: + + Relative Origin = 500; Block Size = 500 + Relative Origin = 1500; Block Size = 500 + Relative Origin = 2500; Block Size = 500 + + + Case 4: Same as Case 3, except Scale Factor S = 16. + + The SACK option would specify the 3 scaled blocks: + + Relative Origin = 32; Block Size = 30 + Relative Origin = 94; Block Size = 31 + Relative Origin = 157; Block Size = 30 + + These three reported blocks have sequence numbers 512 through + 991, 1504 through 1999, and 2512 through 2991, respectively. + + +3.5 Generating the SACK Option + + Let us assume that the data receiver maintains a queue of valid + segments that it has neither passed to the user nor acknowledged + because of earlier missing data, and that this queue is ordered by + starting sequence number.
Computation of the SACK option can be + done with one pass down this queue. Segments that occupy + + + +Jacobson & Braden [Page 10] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + + contiguous sequence space are aggregated into a single SACK block, + and each gap in the sequence space (except a gap that is + terminated by the right window edge) triggers the start of a new + SACK block. If this algorithm defines more than 10 blocks, only + the first 10 can be included in the option. + +3.6 Interpreting the SACK Option + + The data transmitter is assumed to have a retransmission queue + that contains the segments that have been transmitted but not yet + acknowledged, in sequence-number order. If the data transmitter + performs re-packetization before retransmission, the block + boundaries in a SACK option that it receives may not fall on + boundaries of segments in the retransmission queue; however, this + does not pose a serious difficulty for the transmitter. + + Let us suppose that for each segment in the retransmission queue + there is a (new) flag bit "ACK'd", to be used to indicate that + this particular segment has been entirely acknowledged. When a + segment is first transmitted, it will be entered into the + retransmission queue with its ACK'd bit off. If the ACK'd bit is + subsequently turned on (as the result of processing a received + SACK option), the data transmitter will skip this segment during + any later retransmission. However, the segment will not be + dequeued and its buffer freed until the left window edge is + advanced over it. + + When an acknowledgment segment arrives containing a SACK option, + the data transmitter will turn on the ACK'd bits for segments that + have been selectively acknowledged. More specifically, for each + block in the SACK option, the data transmitter will turn on the + ACK'd flags for all segments in the retransmission queue that are + wholly contained within that block.
This requires straightforward + sequence number comparisons. + + +4. TCP ECHO OPTIONS + + A simple method for measuring the RTT of a segment would be: the + sender places a timestamp in the segment and the receiver returns + that timestamp in the corresponding ACK segment. When the ACK segment + arrives at the sender, the difference between the current time and + the timestamp is the RTT. To implement this timing method, the + receiver must simply reflect or echo selected data (the timestamp) + from the sender's segments. This idea is the basis of the "TCP Echo" + and "TCP Echo Reply" options. + + + + + +Jacobson & Braden [Page 11] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + +4.1 TCP Echo and TCP Echo Reply Options + + TCP Echo Option: + + Kind: 6 + + Length: 6 + + +--------+--------+--------+--------+--------+--------+ + | Kind=6 | Length | 4 bytes of info to be echoed | + +--------+--------+--------+--------+--------+--------+ + + This option carries four bytes of information that the receiving TCP + may send back in a subsequent TCP Echo Reply option (see below). A + TCP may send the TCP Echo option in any segment, but only if a TCP + Echo option was received in a SYN segment for the connection. + + When the TCP echo option is used for RTT measurement, it will be + included in data segments, and the four information bytes will define + the time at which the data segment was transmitted in any format + convenient to the sender. + + TCP Echo Reply Option: + + Kind: 7 + + Length: 6 + + +--------+--------+--------+--------+--------+--------+ + | Kind=7 | Length | 4 bytes of echoed info | + +--------+--------+--------+--------+--------+--------+ + + + A TCP that receives a TCP Echo option containing four information + bytes will return these same bytes in a TCP Echo Reply option. + + This TCP Echo Reply option must be returned in the next segment + (e.g., an ACK segment) that is sent. 
If more than one Echo option is + received before a reply segment is sent, the TCP must choose only one + of the options to echo, ignoring the others; specifically, it must + choose the newest segment with the oldest sequence number (see next + section.) + + To use the TCP Echo and Echo Reply options, a TCP must send a TCP + Echo option in its own SYN segment and receive a TCP Echo option in a + SYN segment from the other TCP. A TCP that does not implement the + TCP Echo or Echo Reply options must simply ignore any TCP Echo + options it receives. However, a TCP should not receive one of these + + + +Jacobson & Braden [Page 12] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + + options in a non-SYN segment unless it included a TCP Echo option in + its own SYN segment. + +4.2 Using the Echo Options + + If we wish to use the Echo/Echo Reply options for RTT measurement, we + have to define what the receiver does when there is not a one-to-one + correspondence between data and ACK segments. Assuming that we want + to minimize the state kept in the receiver (i.e., the number of + unprocessed Echo options), we can plan on a receiver remembering the + information value from at most one Echo between ACKs. There are + three situations to consider: + + (A) Delayed ACKs. + + Many TCP's acknowledge only every Kth segment out of a group of + segments arriving within a short time interval; this policy is + known generally as "delayed ACK's". The data-sender TCP must + measure the effective RTT, including the additional time due to + delayed ACK's, or else it will retransmit unnecessarily. Thus, + when delayed ACK's are in use, the receiver should reply with + the Echo option information from the earliest unacknowledged + segment. + + (B) A hole in the sequence space (segment(s) have been lost). 
+ + The sender will continue sending until the window is filled, and + we may be generating ACKs as these out-of-order segments arrive + (e.g., for the SACK information or to aid "fast retransmit"). + An Echo Reply option will tell the sender the RTT of some + recently sent segment (since the ACK can only contain the + sequence number of the hole, the sender may not be able to + determine which segment, but that doesn't matter). If the loss + was due to congestion, these RTTs may be particularly valuable + to the sender since they reflect the network characteristics + immediately after the congestion. + + (C) A filled hole in the sequence space. + + The segment that fills the hole represents the most recent + measurement of the network characteristics. On the other hand, + an RTT computed from an earlier segment would probably include + the sender's retransmit time-out, badly biasing the sender's + average RTT estimate. + + + Case (A) suggests the receiver should remember and return the Echo + option information from the oldest unacknowledged segment. Cases (B) + + + +Jacobson & Braden [Page 13] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + + and (C) suggest that the option should come from the most recent + unacknowledged segment. An algorithm that covers all three cases is + for the receiver to return the Echo option information from the + newest segment with the oldest sequence number, as specified earlier. + + A model implementation of these options is as follows. + + + (1) Receiver Implementation + + A 32-bit slot for Echo option data, rcv.echodata, is added to + the receiver connection state, together with a flag, + rcv.echopresent, that indicates whether there is anything in the + slot. When the receiver generates a segment, it checks + rcv.echopresent and, if it is set, adds an echo-reply option + containing rcv.echodata to the outgoing segment then clears + rcv.echopresent. 
+ + If an incoming segment is in the window and contains an echo + option, the receiver checks rcv.echopresent. If it isn't set, + the value of the echo option is copied to rcv.echodata and + rcv.echopresent is set. If rcv.echopresent is already set, the + receiver checks whether the segment is at the left edge of the + window. If so, the segment's echo option value is copied to + rcv.echodata (this is situation (C) above). Otherwise, the + segment's echo option is ignored. + + + (2) Sender Implementation + + The sender's connection state has a single flag bit, + snd.echoallowed, added. If snd.echoallowed is set or if the + segment contains a SYN, the sender is free to add a TCP Echo + option (presumably containing the current time in some units + convenient to the sender) to every outgoing segment. + + Snd.echoallowed should be set if a SYN is received with a TCP + Echo option (presumably, a host that implements the option will + attempt to use it to time the SYN segment). + + +5. CONCLUSIONS AND ACKNOWLEDGMENTS + +We have proposed five new TCP options for scaled windows, selective +acknowledgments, and round-trip timing, in order to provide efficient +operation over large-bandwidth*delay-product paths. These extensions +are designed to provide compatible interworking with TCP's that do not +implement the extensions. + + + +Jacobson & Braden [Page 14] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + +The Window Scale option was originally suggested by Mike St. Johns of +USAF/DCA. The present form of the option was suggested by Mike Karels +of UC Berkeley in response to a more cumbersome scheme proposed by Van +Jacobson. Gerd Beling of FGAN (West Germany) contributed the initial +definition of the SACK option. + +All three options have evolved through discussion with the End-to-End +Task Force, and the authors are grateful to the other members of the +Task Force for their advice and encouragement. + +6. 
REFERENCES + + [Cheriton88] Cheriton, D., "VMTP: Versatile Message Transaction + Protocol", RFC 1045, Stanford University, February 1988. + + [Jain86] Jain, R., "Divergence of Timeout Algorithms for Packet + Retransmissions", Proc. Fifth Phoenix Conf. on Comp. and Comm., + Scottsdale, Arizona, March 1986. + + [Karn87] Karn, P. and C. Partridge, "Estimating Round-Trip Times + in Reliable Transport Protocols", Proc. SIGCOMM '87, Stowe, VT, + August 1987. + + [Clark87] Clark, D., Lambert, M., and L. Zhang, "NETBLT: A Bulk + Data Transfer Protocol", RFC 998, MIT, March 1987. + + [Nagle84] Nagle, J., "Congestion Control in IP/TCP + Internetworks", RFC 896, FACC, January 1984. + + [NBS85] Colella, R., Aronoff, R., and K. Mills, "Performance + Improvements for ISO Transport", Ninth Data Comm Symposium, + published in ACM SIGCOMM Comp Comm Review, vol. 15, no. 5, + September 1985. + + [Partridge87] Partridge, C., "Private Communication", February + 1987. + + [Postel81] Postel, J., "Transmission Control Protocol - DARPA + Internet Program Protocol Specification", RFC 793, DARPA, + September 1981. + + [Velten84] Velten, D., Hinden, R., and J. Sax, "Reliable Data + Protocol", RFC 908, BBN, July 1984. + + [Jacobson88] Jacobson, V., "Congestion Avoidance and Control", to + be presented at SIGCOMM '88, Stanford, CA., August 1988. + + [Zhang86] Zhang, L., "Why TCP Timers Don't Work Well", Proc. + + + +Jacobson & Braden [Page 15] + +RFC 1072 TCP Extensions for Long-Delay Paths October 1988 + + + SIGCOMM '86, Stowe, Vt., August 1986. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Jacobson & Braden [Page 16] + |
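Editorial sketch: the scaled-block formulas of Section 3.3 can be checked against Case 4 of Section 3.4 with a short Python helper. The function names are hypothetical. The sketch assumes that L is measured relative to the Acknowledgment Number and that S is the multiplicative scale factor (16 in Case 4), which is how the worked example in the memo appears to use them.

```python
def scaled_sack_block(rel_start: int, size: int, scale: int):
    """Return (Relative Origin, Block Size) in units of `scale` octets.

    rel_start is the first queued octet relative to the Acknowledgment
    Number (L in Section 3.3), size is N, and scale is the factor S.
    """
    origin = (rel_start + scale - 1) // scale      # round the start up
    block = (rel_start + size) // scale - origin   # round the end down
    return origin, block


def covered_octets(origin: int, block: int, scale: int):
    """Relative (first, last) octets that a scaled block actually reports."""
    first = origin * scale
    return first, first + block * scale - 1


# Case 3 queue: three 500-octet blocks at relative offsets 500, 1500, 2500.
queued = [(500, 500), (1500, 500), (2500, 500)]
case4 = [scaled_sack_block(start, size, 16) for start, size in queued]
print(case4)
```

Run as written, this yields the three Relative Origin and Block Size pairs listed for Case 4. Because the start is rounded up and the end rounded down, each reported block is never larger than the queued data it describes; a block shorter than one scale unit may produce a size of zero or less and would simply not be reported.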