Diffstat (limited to 'doc/rfc/rfc4653.txt')
-rw-r--r-- | doc/rfc/rfc4653.txt | 1011 |
1 files changed, 1011 insertions, 0 deletions
diff --git a/doc/rfc/rfc4653.txt b/doc/rfc/rfc4653.txt
new file mode 100644
index 0000000..93747ba
--- /dev/null
+++ b/doc/rfc/rfc4653.txt
@@ -0,0 +1,1011 @@
+Network Working Group                                     S. Bhandarkar
+Request for Comments: 4653                              A. L. N. Reddy
+Category: Experimental                             Texas A&M University
+                                                              M. Allman
+                                                              ICIR/ICSI
+                                                             E. Blanton
+                                                      Purdue University
+                                                            August 2006
+
+
+        Improving the Robustness of TCP to Non-Congestion Events
+
+
+Status of This Memo
+
+   This memo defines an Experimental Protocol for the Internet
+   community. It does not specify an Internet standard of any kind.
+   Discussion and suggestions for improvement are requested.
+   Distribution of this memo is unlimited.
+
+Copyright Notice
+
+   Copyright (C) The Internet Society (2006).
+
+Abstract
+
+   This document specifies Non-Congestion Robustness (NCR) for TCP. In
+   the absence of explicit congestion notification from the network, TCP
+   uses loss as an indication of congestion. One of the ways TCP
+   detects loss is using the arrival of three duplicate acknowledgments.
+   However, this heuristic is not always correct, notably in the case
+   when network paths reorder segments (for whatever reason), resulting
+   in degraded performance. TCP-NCR is designed to mitigate this
+   degraded performance by increasing the number of duplicate
+   acknowledgments required to trigger loss recovery, based on the
+   current state of the connection, in an effort to better disambiguate
+   true segment loss from segment reordering. This document specifies
+   the changes to TCP, as well as the costs and benefits of these
+   modifications.
+
+Bhandarkar, et al.            Experimental                      [Page 1]
+
+RFC 4653            Improving the Robustness of TCP          August 2006
+
+Table of Contents
+
+   1. Introduction
+      1.1. Terminology
+   2. NCR Description
+   3. Algorithm
+      3.1. Initialization
+      3.2. Terminating Extended Limited Transmit and Preventing Bursts
+      3.3. Extended Limited Transmit
+      3.4. Entering Loss Recovery
+   4. Advantages
+   5. Disadvantages
+   6. Related Work
+   7. Security Considerations
+   8. Acknowledgments
+   9. References
+      9.1. Normative References
+      9.2. Informative References
+
+1. Introduction
+
+   One strength of TCP [RFC793] lies in its ability to adjust its
+   sending rate according to the perceived congestion in the network
+   [Jac88, RFC2581]. In the absence of explicit notification of
+   congestion from the network, TCP uses segment loss as an indication
+   of congestion (i.e., assuming queue overflow). TCP receivers send
+   cumulative acknowledgments (ACKs) indicating the next sequence number
+   expected from the sender for arriving segments [RFC793]. When
+   segments arrive out of order, duplicate ACKs are generated. As
+   specified in [RFC2581], a TCP sender uses the arrival of three
+   duplicate ACKs as an indication of segment loss. The TCP sender
+   retransmits the lost segment and reduces the load imposed on the
+   network, assuming the segment loss was caused by resource contention
+   within the network path. The TCP sender does not assume loss on the
+   first or second duplicate ACK, but waits for three duplicate ACKs to
+   account for minor packet reordering. However, the use of this
+   constant threshold of duplicate ACKs has several problems that can be
+   mitigated with a dynamic threshold.
+
+   The following is an example of TCP's behavior:
+
+      + TCP A is the data sender, and TCP B is the data receiver.
+
+      + TCP A sends 10 segments, each consisting of a single data byte
+        (i.e., transmits bytes 1-10 in segments 1-10).
+
+      + Assume segment 3 is dropped in the network.
+
+      + TCP B cumulatively acknowledges segments 1 and 2, making the
+        cumulative ACK transmitted to the sender 3 (the next expected
+        sequence number). (Note: TCP B may generate one or two ACKs,
+        depending on whether delayed ACKs [RFC1122, RFC2581] are
+        employed.)
+
+      + The arrival of segments 4-10 at TCP B will each trigger the
+        transmission of a cumulative ACK for sequence number 3. (Note:
+        [RFC2581] recommends that delayed ACKs not be used when the ACK
+        is triggered by an out-of-order segment.)
+
+      + When TCP A receives the third duplicate ACK (or fourth ACK
+        overall) for sequence number 3, TCP A will retransmit
+        segment 3 and reduce the sending rate by roughly half (see
+        [RFC2581] for specifics on the congestion control state
+        adjustments).
+
+   Alternatively, suppose segment 3 was not dropped by the network, but
+   rather delayed such that segment 3 arrives at TCP B after segment 10.
+   The above scenario will play out in precisely the same manner
+   insomuch as a retransmission of segment 3 will be triggered. In
+   other words, TCP is not capable of disambiguating this reordering
+   event from a segment loss, resulting in an unnecessary retransmission
+   and rate reduction.
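The ten-segment example above can be replayed with a small simulation. This is only an illustrative sketch, not part of the RFC: the helper names `cumulative_acks` and `fast_retransmit_triggered` are hypothetical, segments are numbered 1-10 as in the example, and the receiver is assumed to send one ACK per arriving segment (no delayed ACKs).

```python
def cumulative_acks(arrivals):
    """Return the cumulative ACK (next expected segment) sent for each arrival.

    Assumption: the receiver ACKs every arriving segment immediately.
    """
    received = set()
    next_expected = 1
    acks = []
    for seg in arrivals:
        received.add(seg)
        # Advance past every segment that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def fast_retransmit_triggered(acks, dup_thresh=3):
    """True once the same ACK value has been repeated dup_thresh times."""
    last, dups = None, 0
    for ack in acks:
        if ack == last:
            dups += 1
            if dups >= dup_thresh:
                return True
        else:
            last, dups = ack, 0
    return False


lost = [1, 2, 4, 5, 6, 7, 8, 9, 10]          # segment 3 dropped
reordered = [1, 2, 4, 5, 6, 7, 8, 9, 10, 3]  # segment 3 arrives after segment 10

acks_lost = cumulative_acks(lost)
acks_reordered = cumulative_acks(reordered)

print(fast_retransmit_triggered(acks_lost))
print(fast_retransmit_triggered(acks_reordered))
```

Both ACK streams begin with the same run of duplicate ACKs for sequence number 3, so a fixed threshold of three fires in both cases, which is the ambiguity the text describes.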
+
+   The following is the specific motivation behind making TCP robust to
+   reordered segments:
+
+   * A number of Internet measurement studies have shown that packet
+     reordering is not a rare phenomenon [Pax97, BPS99, JIDKT03,
+     GPL04]. Further, the reordering can be well beyond that required
+     for fast retransmit to be falsely triggered.
+
+   * [BA02, ZKFP03] show the negative performance implications that
+     packet reordering has on current TCP.
+
+   * The requirement imposed by TCP for almost in-order packet
+     delivery places a constraint on the design of future technology.
+     Novel routing algorithms, network components, link-layer
+     retransmission mechanisms, and applications could all be looked
+     at with a fresh perspective if TCP were to be more robust to
+     segment reordering. For instance, high-speed packet switches
+     could cause resequencing of packets if TCP were more robust.
+     There has been work proposed in the literature explicitly to
+     ensure that packet ordering is maintained in such switches (e.g.,
+     [KM02]). Also, link-layer mechanisms that attempt to recover
+     from packet corruption by retransmitting could be allowed to
+     reorder packets, and thus increase the chances of local loss
+     repair rather than rely on TCP to repair the loss (and,
+     needlessly reduce its sending rate). Additional examples include
+     multi-path routing, high-delay satellite links, and some of the
+     schemes proposed for a differentiated services architecture. By
+     making TCP more robust to non-congestion events, TCP-NCR may open
+     the design space of the future Internet components.
+
+   In this document, we specify a set of TCP sender modifications to
+   provide Non-Congestion Robustness (NCR) to TCP. In particular, these
+   changes are built on top of TCP with selective acknowledgments
+   (SACKs) [RFC2018] and the SACK-based loss recovery scheme given in
+   [RFC3517], since SACK is widely deployed at this point ([MAF05]
+   indicates that 68% of web servers and 88% of web clients utilize SACK
+   as of spring 2004).
+
+   Note that the TCP-NCR algorithm provided in this document could be
+   easily adapted to SCTP [RFC2960] since SCTP uses congestion control
+   algorithms similar to TCP's (and thus has the same reordering
+   robustness issues).
+
+   As noted in several places in the remainder of this document, we
+   consider TCP-NCR experimental in that more experience with the
+   techniques is required before TCP-NCR should be used on a large scale
+   on the Internet. We encourage implementation and experimentation
+   with TCP-NCR in the hopes of gaining an understanding of its
+   suitability for wide-scale deployment.
+
+   The remainder of this document is organized as follows. Section 2
+   provides a high-level description of the TCP-NCR mechanisms. In
+   Section 3, we specify the TCP-NCR algorithm. Section 4 provides a
+   brief overview of the benefits of TCP-NCR, while Section 5 discusses
+   the drawbacks. Section 6 discusses related work. Section 7
+   discusses security concerns.
+
+1.1. Terminology
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+   document are to be interpreted as described in [RFC2119].
+
+   Readers should be familiar with the TCP terminology (e.g.,
+   FlightSize, Pipe) given in [RFC2581] and [RFC3517].
+
+2. NCR Description
+
+   As discussed above, in the face of packet reordering, three duplicate
+   ACKs may not be enough to disambiguate loss from reordering. In this
+   section we provide a non-normative sketch of TCP-NCR. The detailed
+   algorithms for implementing Non-Congestion Robustness for TCP are
+   presented in the next section.
+
+   The general idea behind TCP-NCR is to increase the threshold used to
+   trigger a fast retransmission from the current fixed value of three
+   duplicate ACKs [RFC2581] to approximately a congestion window of data
+   having left the network (but not less than the currently standardized
+   value of three duplicate ACKs). Since cwnd represents the amount of
+   data a TCP flow can transmit in one round-trip time (RTT), waiting to
+   receive notice that cwnd bytes have left the network before deciding
+   whether the root cause is loss or reordering imposes a delay of
+   roughly one RTT on both the retransmission and the congestion control
+   response. The appropriate choice for a new value of the threshold is
+   essentially a trade-off between making the best decision regarding
+   the cause of the duplicate ACKs and responsiveness. The choice to
+   trigger a retransmission only after a cwnd's worth of data is known
+   to have left the network represents roughly the largest amount of
+   time a TCP can wait before the (often costly) retransmission timeout
+   may be triggered. Therefore, the algorithm described in this
+   document attempts to make the best decision possible at the expense
+   of timeliness.
+
+   Simply increasing the threshold before retransmitting a segment can
+   make TCP brittle to packet loss or ACK loss since such loss reduces
+   the number of duplicate ACKs that will arrive at the sender from the
+   receiver. For instance, if the cwnd is 10 segments and one segment
+   is lost, a duplicate ACK threshold of 10 will never be met because
+   duplicate ACKs corresponding to at most 9 segments will arrive at the
+   sender. To offset the issue of loss, we extend TCP's Limited
+   Transmit [RFC3042] scheme to allow for the sending of new data during
+   the period when the TCP sender is disambiguating loss and reordering.
+   This new data serves to increase the likelihood that enough duplicate
+   ACKs arrive at the sender to trigger loss recovery if it is
+   appropriate.
+
+   Note that TCP tightly couples reliability and congestion control:
+   when a segment is declared lost, a retransmission is triggered, and a
+   change to the sending rate is also made on the assumption that the
+   drop is due to resource contention [RFC2581]. Therefore, simply by
+   changing the retransmission trigger, the congestion control response
+   is also changed. However, we lack experience on the Internet as to
+   whether delaying the point that a rate reduction takes place is
+   appropriate for wide-scale deployment. Therefore, the Extended
+   Limited Transmit mechanism proposed in this document offers two
+   variants for experimentation.
+
+   The first Extended Limited Transmit variant, Careful Limited
+   Transmit, calls for the transmission of one previously unsent
+   segment, in response to duplicate acknowledgments, for every two
+   segments that are known to have left the network. This effectively
+   halves the sending rate, since normal TCP operation calls for the
+   sending of one segment for every segment that has left the network.
+   Further, the halving starts immediately and is not delayed until a
+   retransmission is triggered. In the case of packet reordering (i.e.,
+   not segment loss), the congestion control state is restored to its
+   previous state when reordering is determined.
+
+   The second variant, Aggressive Limited Transmit, calls for
+   transmitting one previously unsent data segment, in response to
+   duplicate acknowledgments, for every segment known to have left the
+   network. With this variant, while waiting to disambiguate the loss
+   from a reordering event, ACK-clocked transmission continues at
+   roughly the same rate as before the event started. Retransmission
+   and the sending rate reduction happen per [RFC2581, RFC3517], albeit
+   with the delayed threshold described above. Although this approach
+   delays legitimate rate reductions (possibly slightly and temporarily
+   aggravating overall congestion on the network), the scheme has the
+   advantage of not reducing the transmission rate in the face of
+   segment reordering.
+
+   Which of the two Extended Limited Transmit variants is best for use
+   on the Internet is an open question.
+
+3. Algorithm
+
+   The TCP-NCR modifications make two fundamental changes to the way
+   [RFC3517] currently operates, as follows.
+
+   First, the trigger for retransmitting a segment is changed from three
+   duplicate ACKs [RFC2581, RFC3517] to indications that a congestion
+   window's worth of data has left the network. Second, TCP-NCR
+   decouples initial congestion control decisions from retransmission
+   decisions, in some cases delaying congestion control changes relative
+   to TCP's current behavior as defined in [RFC2581]. The algorithm
+   provides two alternatives for extending Limited Transmit. The two
+   variants of extended Limited Transmit are:
+
+      Careful Limited Transmit
+
+         This variant calls for reducing the sending rate at
+         approximately the same time [RFC2581] implementations reduce
+         the congestion window, while at the same time withholding a
+         retransmission (and the final congestion determination) for
+         approximately one RTT.
+
+      Aggressive Limited Transmit
+
+         This variant calls for maintaining the sending rate in the
+         face of duplicate ACKs until TCP concludes that a segment is
+         lost and needs to be retransmitted (which TCP-NCR delays by
+         one RTT when compared with current loss recovery schemes).
+
+   A TCP-NCR implementation MUST use either Careful Limited Transmit or
+   Aggressive Limited Transmit.
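The difference between the two variants comes down to how many previously unsent segments may go out as segments are reported to have left the network. The following is a minimal sketch of that ratio only, not the normative procedure: the function name `new_segments_allowed` and the string labels for the variants are hypothetical, and counting is done in whole segments.

```python
def new_segments_allowed(segments_left, variant):
    """Previously unsent segments that may be sent after `segments_left`
    segments are known to have left the network (illustrative only)."""
    if variant == "careful":
        # One new segment for every two segments that left the network.
        return segments_left // 2
    if variant == "aggressive":
        # One new segment for every segment that left the network.
        return segments_left
    raise ValueError(f"unknown variant: {variant!r}")


for variant in ("careful", "aggressive"):
    print(variant, new_segments_allowed(10, variant))
```

With ten segments reported as having left the network, the Careful variant sends half as many new segments as the Aggressive variant, which is the immediate halving of the sending rate described in Section 2.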
+
+   A constant MUST be set, depending on which variant of extended
+   Limited Transmit is used, as follows:
+
+      Careful Limited Transmit
+
+         LT_F = 2/3
+
+      Aggressive Limited Transmit
+
+         LT_F = 1/2
+
+   This constant reflects the fraction of outstanding data (including
+   data sent during Extended Limited Transmit) that must be SACKed
+   before a retransmission is triggered. Since Aggressive Limited
+   Transmit sends a new segment for every segment known to have left the
+   network, a total of roughly cwnd segments will be sent during
+   Aggressive Limited Transmit, and therefore ideally a total of roughly
+   2*cwnd segments will be outstanding when a retransmission is
+   triggered. The duplicate ACK threshold is then set to LT_F = 1/2 of
+   2*cwnd (or about 1 RTT worth of data). The factor is different for
+   Careful Limited Transmit because the sender only transmits one new
+   segment for every two segments that are SACKed and therefore will
+   ideally have a total of 1.5*cwnd segments outstanding when the
+   retransmission is to be triggered. Hence, the required threshold is
+   LT_F=2/3 of 1.5*cwnd to delay the retransmission by roughly 1 RTT.
+
+   There are situations whereby the sender cannot transmit new data
+   during Extended Limited Transmit (e.g., lack of data from the
+   application, receiver's advertised window limit). These situations
+   can lead to the problems discussed in the last section when a TCP
+   does not employ Extended Limited Transmit and is starved for ACKs.
+   Therefore, TCP-NCR adapts the duplicate ACK threshold on each SACK
+   arrival to be as robust as possible given the actual amount of data
+   that has been transmitted, or roughly LT_F times the number of
+   outstanding segments.
+
+   The TCP-NCR modifications specified in this document lend themselves
+   to incremental deployment. Only the TCP implementation on the sender
+   side requires modification (assuming both hosts support SACK). The
+   changes themselves are modest. However, as will be discussed below,
+   availability of additional buffer space at the receiver will help
+   maximize the benefits of using TCP-NCR but is not strictly necessary.
+
+   The following algorithms depend on the notions provided by [RFC3517],
+   and we assume the reader is familiar with the terminology given in
+   [RFC3517]. The TCP-NCR algorithm can be adapted to alternate SACK-
+   based loss recovery schemes. [BR04, BSRV04] outline non-SACK-based
+   algorithms; however, we do not specify those algorithms in this
+   document and do not recommend them due to both the complexity and
+   security implications of having only a gross understanding of the
+   number of outstanding segments in the network.
+
+   A TCP connection using the Nagle algorithm [RFC896, RFC1122] MAY
+   employ the TCP-NCR algorithm. If a TCP implementation does implement
+   TCP-NCR, the implementation MUST follow the various specifications
+   provided in Sections 3.1 - 3.4. If the Nagle algorithm is not being
+   used, there is no way to accurately calculate the number of
+   outstanding segments in the network (and, therefore, no good way to
+   derive an appropriate duplicate ACK threshold) without adding state
+   to the TCP sender. A TCP connection that does not employ the Nagle
+   algorithm SHOULD NOT use TCP-NCR. We envision that NCR could be
+   adapted to an implementation that carefully tracks the sequence
+   numbers transmitted in each segment. However, we leave this as
+   future work.
+
+3.1. Initialization
+
+   When entering a period of loss/reordering detection and Extended
+   Limited Transmit, a TCP-NCR MUST initialize several state variables.
+   A TCP MUST enter Extended Limited Transmit upon receiving the first
+   ACK with a SACK block after the reception of an ACK that (a) did not
+   contain SACK information and (b) did increase the connection's
+   cumulative ACK point. The initializations are:
+
+   (I.1) The TCP MUST save the current FlightSize.
+
+            FlightSizePrev = FlightSize
+
+   (I.2) The TCP MUST set a variable for tracking the number of
+         segments for which an ACK does not trigger a transmission
+         during Careful Limited Transmit.
+
+            Skipped = 0
+
+         (Note: Skipped is not used during Aggressive Limited
+         Transmit.)
+
+   (I.3) The TCP MUST set DupThresh (from [RFC3517]) based on the
+         current FlightSize.
+
+            DupThresh = max (LT_F * (FlightSize / SMSS),3)
+
+         Note: We keep the lower bound of DupThresh = 3 from
+         [RFC2581, RFC3517].
+
+   In addition to the above steps, the incoming ACK MUST be processed
+   with the E series of steps in Section 3.3.
+
+3.2. Terminating Extended Limited Transmit and Preventing Bursts
+
+   Extended Limited Transmit MUST be terminated at the start of loss
+   recovery as outlined in Section 3.4.
+
+   The arrival of an ACK that advances the cumulative ACK point while in
+   Extended Limited Transmit, but before loss recovery is triggered,
+   signals that a series of duplicate ACKs was caused by reordering and
+   not congestion. Therefore, the receipt of an ACK that extends the
+   cumulative ACK point MUST terminate Extended Limited Transmit. As
+   described below (in (T.4)), an ACK that extends the cumulative ACK
+   point and *also* contains SACK information will also trigger the
+   beginning of a new Extended Limited Transmit phase.
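The initialization steps (I.1) to (I.3) in Section 3.1 and the LT_F constants can be sketched as below. This is a hedged illustration rather than an implementation of the specification: `NcrState`, `dup_thresh`, and `enter_extended_limited_transmit` are hypothetical names, FlightSize and SMSS are taken in bytes, and the result is kept as an exact fraction because the text does not say how a fractional DupThresh should be rounded.

```python
from dataclasses import dataclass
from fractions import Fraction

# LT_F per variant, as given in Section 3.
LT_F = {"careful": Fraction(2, 3), "aggressive": Fraction(1, 2)}


def dup_thresh(flight_size, smss, variant):
    """DupThresh = max(LT_F * (FlightSize / SMSS), 3), kept exact."""
    return max(LT_F[variant] * Fraction(flight_size, smss), Fraction(3))


@dataclass
class NcrState:
    flight_size_prev: int  # (I.1) FlightSize saved on entry, in bytes
    skipped: int           # (I.2) only used by Careful Limited Transmit
    dup_thresh: Fraction   # (I.3) threshold in segments


def enter_extended_limited_transmit(flight_size, smss, variant):
    """Apply (I.1)-(I.3) when Extended Limited Transmit begins."""
    return NcrState(
        flight_size_prev=flight_size,
        skipped=0,
        dup_thresh=dup_thresh(flight_size, smss, variant),
    )


state = enter_extended_limited_transmit(20000, 1000, "aggressive")
print(state)
```

For a cwnd of 10 segments, the ideal outstanding amounts quoted in Section 3 (2*cwnd for Aggressive, 1.5*cwnd for Careful) both give a threshold of 10 segments, and a small FlightSize falls back to the lower bound of 3.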
+
+   Upon the termination of Extended Limited Transmit, and especially
+   when using the Careful variant, TCP-NCR may be in a situation where
+   the entire cwnd is not being utilized, and therefore TCP-NCR will be
+   prone to transmitting a burst of segments into the network.
+   Therefore, to mitigate this bursting when a TCP-NCR in the Extended
+   Limited Transmit phase receives an ACK that updates the cumulative
+   ACK point (regardless of whether the ACK contains SACK information),
+   the following steps MUST be taken:
+
+   (T.1) A TCP MUST reset cwnd to:
+
+            cwnd = min (FlightSize + SMSS,FlightSizePrev)
+
+         This step ensures that cwnd is not grossly larger than the
+         amount of data outstanding, a situation that would cause a
+         line rate burst.
+
+   (T.2) A TCP MUST set ssthresh to:
+
+            ssthresh = FlightSizePrev
+
+         This step provides TCP-NCR with a sense of "history". If step
+         (T.1) reduces cwnd below FlightSizePrev, this step ensures that
+         TCP-NCR will slow start back to the operating point in effect
+         before Extended Limited Transmit.
+
+   (T.3) A TCP is now permitted to transmit previously unsent data as
+         allowed by cwnd, FlightSize, application data availability, and
+         the receiver's advertised window.
+
+   (T.4) When an incoming ACK extends the cumulative ACK point and also
+         contains SACK information, the initializations in steps (I.2)
+         and (I.3) from Section 3.1 MUST be taken (but step (I.1) MUST
+         NOT be executed) to re-start Extended Limited Transmit. In
+         addition, the series of steps in Section 3.3 (the "E" steps)
+         MUST be taken.
+
+3.3. Extended Limited Transmit
+
+   On each ACK containing SACK information that arrives after TCP-NCR
+   has entered the Extended Limited Transmit phase (as outlined in
+   Section 3.1) and before Extended Limited Transmit terminates, the
+   sender MUST use the following procedure.
+
+   (E.1) The SetPipe () procedure from [RFC3517] MUST be used to set
+         the "pipe" variable (which represents the number of bytes
+         still considered "in the network"). Note: the current value
+         of DupThresh MUST be used by SetPipe () to produce an accurate
+         assessment of the amount of data still considered in the
+         network.
+
+   (E.2) If the comparison in equation (1), below, holds and there are
+         SMSS bytes of previously unsent data available for
+         transmission, then the sender MUST transmit one segment of SMSS
+         bytes.
+
+            (pipe + Skipped) <= (FlightSizePrev - SMSS)           (1)
+
+         If the comparison in equation (1) does not hold or no new data
+         can be transmitted (due to lack of data from the application
+         or the advertised window limit), skip to step (E.6).
+
+   (E.3) Pipe MUST be incremented by SMSS bytes.
+
+   (E.4) If using Careful Limited Transmit, Skipped MUST be incremented
+         by SMSS bytes to ensure that the next SMSS bytes of SACKed data
+         processed does not trigger a Limited Transmit transmission
+         (since the goal of Careful Limited Transmit is to send upon
+         receipt of every second duplicate ACK).
+
+   (E.5) A TCP MUST return to step (E.2) to ensure that as many bytes
+         as are appropriate are transmitted. This provides robustness
+         to ACK loss that can be (largely) compensated for using SACK
+         information.
+
+   (E.6) DupThresh MUST be reset via:
+
+            DupThresh = max (LT_F * (FlightSize / SMSS),3)
+
+         where FlightSize is the total number of bytes that have not
+         been cumulatively acknowledged (which is different from
+         "pipe").
+
+3.4. Entering Loss Recovery
+
+   When a segment is deemed lost via the algorithms in [RFC3517],
+   Extended Limited Transmit MUST be terminated, leaving the algorithms
+   in [RFC3517] to govern TCP's behavior. One slight change to
+   [RFC3517] MUST be made, however. In Section 5, step (2) of [RFC3517]
+   MUST be changed to:
+
+      (2) ssthresh = cwnd = (FlightSizePrev / 2)
+
+   This ensures that the congestion control modifications are made with
+   respect to the amount of data in the network before FlightSize was
+   increased by Extended Limited Transmit.
+
+   Note: Once the algorithm in [RFC3517] takes over from Extended
+   Limited Transmit, the DupThresh value MUST be held constant until the
+   loss recovery phase is terminated.
+
+4. Advantages
+
+   The major advantages of TCP-NCR are twofold. As discussed in Section
+   1, TCP-NCR will open up the design space for network applications and
+   components that are currently constrained by TCP's lack of robustness
+   to packet reordering. The second advantage is in terms of an
+   increase in TCP performance.
+
+   [BR04] presents ns-2 [NS-2] simulations of a pre-cursor to the TCP-
+   NCR algorithm specified in this document, called TCP-DCR (Delayed
+   Congestion Response). The paper shows that TCP-DCR aids performance
+   in comparison to unmodified TCP in the presence of packet reordering.
+   In addition, the extended version of [BR04] presents results based on
+   emulations involving Linux (kernel 2.4.24). These results show that
+   the performance of TCP-DCR is similar to Linux's native
+   implementation that seeks to "undo" wrong decisions according to
+   duplicate-SACK (DSACK) [RFC2883] feedback (similar to the schemes
+   outlined in [ZKFP03]), when packets are reordered by less than one
+   RTT. The advantage of using TCP-DCR over the DSACK-based scheme is
+   that the DSACK-based scheme tries to estimate the exact amount of
+   reordering in the network using fairly complex algorithms, whereas
+   TCP-DCR achieves similar results with less complicated modifications.
+
+   In addition, [BR04,BSRV04] illustrate the ability of TCP-DCR to allow
+   for the improvement of other parts of the system. For example, these
+   papers show that increasing TCP's robustness to packet reordering
+   allows a novel wireless ARQ mechanism to be added at the link-layer.
+   The added robustness of the link-layer to channel errors, in turn,
+   increases TCP performance by not requiring TCP to retransmit packets
+   that were dropped due to corruption (and thus also prevents TCP from
+   needlessly reducing the sending rate when retransmitting these
+   segments).
+
+5. Disadvantages
+
+   Although all the changes outlined above are implemented in the
+   sender, the receiver also potentially has a part to play. In
+   particular, TCP-NCR increases the receiver's buffering requirement by
+   up to an extra cwnd -- in the case of the TCP sender using Aggressive
+   Limited Transmit and actual loss occurring in the network.
+   Therefore, to maximize the benefits from TCP-NCR, receivers should
+   advertise a large window to absorb the extra out-of-order traffic.
+   In the case that the additional buffer requirements are not met, the
+   use of the above algorithm takes into account the reduced advertised
+   window -- with a corresponding loss in robustness to packet
+   reordering.
+
+   In addition, using TCP-NCR could delay the delivery of data to the
+   application by up to one RTT because the fast retransmission point is
+   delayed by roughly one RTT in TCP-NCR. Applications that are
+   sensitive to such delays should turn off the TCP-NCR option. For
+   instance, a socket option could be introduced to allow applications
+   to control whether NCR would be used for a particular connection.
+
+   Finally, the use of TCP-NCR makes the recovery from congestion events
+   sluggish in comparison to the standard reaction in [RFC2581]. [BR04,
+   BSRV04] show (via simulation) that the delay in congestion response
+   has minimal impact on the connection itself and the traffic sharing a
+   bottleneck. [BBFS01] also indicates (again, via simulation) that
+   "slowly responsive" congestion control may be safe for deployment in
+   the Internet. These studies suggest that schemes that slightly delay
+   congestion control decisions may be reasonable; however, further
+   experimentation on the Internet is required to verify these results.
+
+6. Related Work
+
+   Over the past few years, several solutions have been proposed to
+   improve the performance of TCP in the face of segment reordering.
+   These schemes generally fall into one of two categories (with some
+   overlap): mechanisms that try to prevent spurious retransmits from
+   happening and mechanisms that try to detect spurious retransmits and
+   "undo" the needless congestion control state changes that have been
+   taken.
+
+   [BA02,ZKFP03] attempt to prevent segment reordering from triggering
+   spurious retransmits by using various algorithms to approximate the
+   duplicate ACK threshold required to disambiguate loss and reordering
+   over a given network path at a given time. TCP-NCR similarly tries
+   to prevent spurious retransmits. However, TCP-NCR takes a simplified
+   approach compared to those in [BA02, ZKFP03], in that TCP-NCR simply
+   delays retransmission by an amount based on the current cwnd (in
+   comparison to standard TCP), while the other schemes use relatively
+   complex algorithms in an attempt to derive a more precise value for
+   DupThresh that depends on the current patterns of packet reordering.
+   While TCP-NCR offers simplicity, the other schemes may offer more
+   precision such that applications would not be forced to wait as long
+   for their retransmissions. Future work could be undertaken to
+   achieve robustness without needless delay.
+ + On the other hand, several schemes have been developed to detect and + mitigate needless retransmissions after the fact. [RFC3522, RFC3708, + BA02, RFC4015, RFC4138] present algorithms to detect spurious + retransmits and mitigate the changes these events made to the + congestion control state. TCP-NCR could be used in conjunction with + these algorithms, with TCP-NCR attempting to prevent spurious + + + +Bhandarkar, et al. Experimental [Page 13] + +RFC 4653 Improving the Robustness of TCP August 2006 + + + retransmits and some other scheme kicking in if the prevention + failed. In addition, note that TCP-NCR is concentrated on preventing + spurious fast retransmits; some of the above algorithms also attempt + to detect and mitigate spurious timeout-based retransmits. + +7. Security Considerations + + General attacks against the congestion control of TCP are described + in [RFC2581]. SACK-based loss recovery for TCP [RFC3517] mitigates + some of the duplicate ACK attacks against TCP's congestion control. + This document builds upon that work, and the Extended Limited + Transmit algorithms specified in this document have been designed to + thwart the ACK division problems that are described in [RFC3465]. + +8. Acknowledgments + + Feedback from Lars Eggert, Ted Faber, Wesley Eddy, Gorry Fairhurst, + Sally Floyd, Sara Landstrom, Nauzad Sadry, Pasi Sarolahti, Joe Touch, + Nitin Vaidya, and the TCPM working group have contributed + significantly to this document. Our thanks to all! + +9. References + +9.1. Normative References + + [RFC793] Postel, J., "Transmission Control Protocol", STD 7, RFC + 793, September 1981. + + [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP + Selective Acknowledgement Options", RFC 2018, October 1996. + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC2581] Allman, M., Paxson, V., and W. Stevens, "TCP Congestion + Control", RFC 2581, April 1999. 
+ + [RFC3042] Allman, M., Balakrishnan, H., and S. Floyd, "Enhancing + TCP's Loss Recovery Using Limited Transmit", RFC 3042, + January 2001. + + [RFC3517] Blanton, E., Allman, M., Fall, K., and L. Wang, "A + Conservative Selective Acknowledgment (SACK)-based Loss + Recovery Algorithm for TCP", RFC 3517, April 2003. + + + + + + + +Bhandarkar, et al. Experimental [Page 14] + +RFC 4653 Improving the Robustness of TCP August 2006 + + +9.2. Informative References + + [BA02] E. Blanton and M. Allman, "On Making TCP More Robust to + Packet Reordering," ACM Computer Communication Review, + January 2002. + + [BBFS01] D. Bansal, H. Balakrishnan, S. Floyd and S. Shenker, + "Dynamic Behavior of Slowly Responsive Congestion Control + Algorithms", Proceedings of ACM SIGCOMM, Sep. 2001. + + [BPS99] J. Bennett, C. Partridge, and N. Shectman, "Packet + reordering is not pathological network behavior," IEEE/ACM + Transactions on Networking, December 1999. + + [BR04] Sumitha Bhandarkar and A. L. Narasimha Reddy, "TCP-DCR: + Making TCP Robust to Non-Congestion Events", In the + Proceedings of Networking 2004 conference, May 2004. + Extended version available as tech report TAMU-ECE-2003-04. + + [BSRV04] Sumitha Bhandarkar, Nauzad Sadry, A. L. Narasimha Reddy and + Nitin Vaidya, "TCP-DCR: A Novel Protocol for Tolerating + Wireless Channel Errors", to appear in IEEE Transactions on + Mobile Computing. + + [GPL04] Ladan Gharai, Colin Perkins and Tom Lehman, "Packet + Reordering, High Speed Networks and Transport Protocol + Performance", ICCCN 2004, October 2004. + + [Jac88] V. Jacobson, "Congestion Avoidance and Control", Computer + Communication Review, vol. 18, no. 4, pp. 314-329, Aug. + 1988. ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z. + + [JIDKT03] S. Jaiswal, G. Iannaccone, C. Diot, J. Kurose, and D. + Towsley, "Measurement and Classification of Out-of-Sequence + Packets in a Tier-1 IP Backbone," Proceedings of IEEE + INFOCOM, 2003. + + [KM02] I. Keslassy and N. 
McKeown, "Maintaining packet order in + two-stage switches," Proceedings of the IEEE Infocom, June + 2002. + + [MAF05] A. Medina, M. Allman, S. Floyd. Measuring the Evolution of + Transport Protocols in the Internet. ACM Computer + Communication Review, 35(2), April 2005. + + [NS-2] ns-2 Network Simulator. http://www.isi.edu/nsnam/ + + + + + +Bhandarkar, et al. Experimental [Page 15] + +RFC 4653 Improving the Robustness of TCP August 2006 + + + [Pax97] V. Paxson, "End-to-End Internet Packet Dynamics," + Proceedings of ACM SIGCOMM, September 1997. + + [RFC896] Nagle, J., "Congestion control in IP/TCP internetworks", + RFC 896, January 1984. + + [RFC1122] Braden, R., "Requirements for Internet Hosts - + Communication Layers", STD 3, RFC 1122, October 1989. + + [RFC2883] Floyd, S., Mahdavi, J., Mathis, M., and M. Podolsky, "An + Extension to the Selective Acknowledgement (SACK) Option + for TCP", RFC 2883, July 2000. + + [RFC2960] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. + Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang, V. + Paxson. Stream Control Transmission Protocol. RFC 2960, + October 2000. + + [RFC3465] Allman, M., "TCP Congestion Control with Appropriate Byte + Counting (ABC)", RFC 3465, February 2003. + + [RFC3522] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm for + TCP", RFC 3522, April 2003. + + [RFC3708] Blanton, E. and M. Allman, "Using TCP Duplicate Selective + Acknowledgement (DSACKs) and Stream Control Transmission + Protocol (SCTP) Duplicate Transmission Sequence Numbers + (TSNs) to Detect Spurious Retransmissions", RFC 3708, + February 2004. + + [RFC4015] Ludwig, R. and A. Gurtov, "The Eifel Response Algorithm for + TCP", RFC 4015, February 2005. + + [RFC4138] Sarolahti, P. and M. Kojo, "Forward RTO-Recovery (F-RTO): + An Algorithm for Detecting Spurious Retransmission Timeouts + with TCP and the Stream Control Transmission Protocol + (SCTP)", RFC 4138, August 2005. + + [ZKFP03] M. Zhang, B. Karp, S. Floyd, L. 
Peterson, "RR-TCP: A + Reordering-Robust TCP with DSACK", in Proceedings of the + Eleventh IEEE International Conference on Networking + Protocols (ICNP 2003), Atlanta, GA, November, 2003. + + + + + + + + + +Bhandarkar, et al. Experimental [Page 16] + +RFC 4653 Improving the Robustness of TCP August 2006 + + +Authors' Addresses + + Sumitha Bhandarkar + Dept. of Elec. Engg. + 214 ZACH + College Station, TX 77843-3128 + + Phone: (512) 468-8078 + EMail: sumitha@tamu.edu + URL: http://students.cs.tamu.edu/sumitha/ + + + A. L. Narasimha Reddy + Professor + Dept. of Elec. Engg. + 315C WERC + College Station, TX 77843-3128 + + Phone: (979) 845-7598 + EMail: reddy@ee.tamu.edu + URL: http://ee.tamu.edu/~reddy/ + + + Mark Allman + ICSI Center for Internet Research + 1947 Center Street, Suite 600 + Berkeley, CA 94704-1198 + + Phone: (440) 235-1792 + EMail: mallman@icir.org + URL: http://www.icir.org/mallman/ + + + Ethan Blanton + Purdue University Computer Science + 305 North University Street + West Lafayette, IN 47907 + + EMail: eblanton@cs.purdue.edu + + + + + + + + + + + + +Bhandarkar, et al. Experimental [Page 17] + +RFC 4653 Improving the Robustness of TCP August 2006 + + +Full Copyright Statement + + Copyright (C) The Internet Society (2006). + + This document is subject to the rights, licenses and restrictions + contained in BCP 78, and except as set forth therein, the authors + retain all their rights. + + This document and the information contained herein are provided on an + "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS + OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET + ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, + INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE + INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED + WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
+ +Intellectual Property + + The IETF takes no position regarding the validity or scope of any + Intellectual Property Rights or other rights that might be claimed to + pertain to the implementation or use of the technology described in + this document or the extent to which any license under such rights + might or might not be available; nor does it represent that it has + made any independent effort to identify any such rights. Information + on the procedures with respect to rights in RFC documents can be + found in BCP 78 and BCP 79. + + Copies of IPR disclosures made to the IETF Secretariat and any + assurances of licenses to be made available, or the result of an + attempt made to obtain a general license or permission for the use of + such proprietary rights by implementers or users of this + specification can be obtained from the IETF on-line IPR repository at + http://www.ietf.org/ipr. + + The IETF invites any interested party to bring to its attention any + copyrights, patents or patent applications, or other proprietary + rights that may cover technology that may be required to implement + this standard. Please address the information to the IETF at + ietf-ipr@ietf.org. + +Acknowledgement + + Funding for the RFC Editor function is provided by the IETF + Administrative Support Activity (IASA). + + + + + + + +Bhandarkar, et al. Experimental [Page 18] + |