summaryrefslogtreecommitdiff
path: root/doc/rfc/rfc3782.txt
diff options
context:
space:
mode:
Diffstat (limited to 'doc/rfc/rfc3782.txt')
-rw-r--r--doc/rfc/rfc3782.txt1067
1 files changed, 1067 insertions, 0 deletions
diff --git a/doc/rfc/rfc3782.txt b/doc/rfc/rfc3782.txt
new file mode 100644
index 0000000..7df0e7f
--- /dev/null
+++ b/doc/rfc/rfc3782.txt
@@ -0,0 +1,1067 @@
+
+
+
+
+
+
+Network Working Group S. Floyd
+Request for Comments: 3782 ICSI
+Obsoletes: 2582 T. Henderson
+Category: Standards Track Boeing
+ A. Gurtov
+ TeliaSonera
+ April 2004
+
+
+ The NewReno Modification to TCP's Fast Recovery Algorithm
+
+Status of this Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Copyright Notice
+
+ Copyright (C) The Internet Society (2004). All Rights Reserved.
+
+Abstract
+
+ The purpose of this document is to advance NewReno TCP's Fast
+ Retransmit and Fast Recovery algorithms in RFC 2582 from Experimental
+ to Standards Track status.
+
+ The main change in this document relative to RFC 2582 is to specify
+ the Careful variant of NewReno's Fast Retransmit and Fast Recovery
+ algorithms. The base algorithm described in RFC 2582 did not attempt
+ to avoid unnecessary multiple Fast Retransmits that can occur after a
+ timeout. However, RFC 2582 also defined "Careful" and "Less Careful"
+ variants that avoid these unnecessary Fast Retransmits, and
+ recommended the Careful variant. This document specifies the
+ previously-named "Careful" variant as the basic version of NewReno
+ TCP.
+
+
+
+
+
+
+
+
+
+
+
+
+
+Floyd, et al. Standards Track [Page 1]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+1. Introduction
+
+ For the typical implementation of the TCP Fast Recovery algorithm
+ described in [RFC2581] (first implemented in the 1990 BSD Reno
+ release, and referred to as the Reno algorithm in [FF96]), the TCP
+ data sender only retransmits a packet after a retransmit timeout has
+ occurred, or after three duplicate acknowledgements have arrived
+ triggering the Fast Retransmit algorithm. A single retransmit
+ timeout might result in the retransmission of several data packets,
+ but each invocation of the Fast Retransmit algorithm in RFC 2581
+ leads to the retransmission of only a single data packet.
+
+ Problems can arise, therefore, when multiple packets are dropped from
+ a single window of data and the Fast Retransmit and Fast Recovery
+ algorithms are invoked. In this case, if the SACK option is
+ available, the TCP sender has the information to make intelligent
+ decisions about which packets to retransmit and which packets not to
+ retransmit during Fast Recovery. This document applies only for TCP
+ connections that are unable to use the TCP Selective Acknowledgement
+ (SACK) option, either because the option is not locally supported or
+ because the TCP peer did not indicate a willingness to use SACK.
+
+ In the absence of SACK, there is little information available to the
+ TCP sender in making retransmission decisions during Fast Recovery.
+ From the three duplicate acknowledgements, the sender infers a packet
+ loss, and retransmits the indicated packet. After this, the data
+ sender could receive additional duplicate acknowledgements, as the
+ data receiver acknowledges additional data packets that were already
+ in flight when the sender entered Fast Retransmit.
+
+ In the case of multiple packets dropped from a single window of data,
+ the first new information available to the sender comes when the
+ sender receives an acknowledgement for the retransmitted packet (that
+ is, the packet retransmitted when Fast Retransmit was first entered).
+ If there is a single packet drop and no reordering, then the
+ acknowledgement for this packet will acknowledge all of the packets
+ transmitted before Fast Retransmit was entered. However, if there
+ are multiple packet drops, then the acknowledgement for the
+ retransmitted packet will acknowledge some but not all of the packets
+ transmitted before the Fast Retransmit. We call this acknowledgement
+ a partial acknowledgment.
+
+ Along with several other suggestions, [Hoe95] suggested that during
+ Fast Recovery the TCP data sender responds to a partial
+ acknowledgment by inferring that the next in-sequence packet has been
+ lost, and retransmitting that packet. This document describes a
+ modification to the Fast Recovery algorithm in RFC 2581 that
+ incorporates a response to partial acknowledgements received during
+
+
+
+Floyd, et al. Standards Track [Page 2]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ Fast Recovery. We call this modified Fast Recovery algorithm
+ NewReno, because it is a slight but significant variation of the
+ basic Reno algorithm in RFC 2581. This document does not discuss the
+ other suggestions in [Hoe95] and [Hoe96], such as a change to the
+ ssthresh parameter during Slow-Start, or the proposal to send a new
+ packet for every two duplicate acknowledgements during Fast Recovery.
+ The version of NewReno in this document also draws on other
+ discussions of NewReno in the literature [LM97, Hen98].
+
+ We do not claim that the NewReno version of Fast Recovery described
+ here is an optimal modification of Fast Recovery for responding to
+ partial acknowledgements, for TCP connections that are unable to use
+ SACK. Based on our experiences with the NewReno modification in the
+ NS simulator [NS] and with numerous implementations of NewReno, we
+ believe that this modification improves the performance of the Fast
+ Retransmit and Fast Recovery algorithms in a wide variety of
+ scenarios.
+
+2. Terminology and Definitions
+
+ In this document, the key words "MUST", "MUST NOT", "REQUIRED",
+ "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
+ and "OPTIONAL" are to be interpreted as described in BCP 14, RFC 2119
+ [RFC2119]. This RFC indicates requirement levels for compliant TCP
+ implementations implementing the NewReno Fast Retransmit and Fast
+ Recovery algorithms described in this document.
+
+ This document assumes that the reader is familiar with the terms
+ SENDER MAXIMUM SEGMENT SIZE (SMSS), CONGESTION WINDOW (cwnd), and
+ FLIGHT SIZE (FlightSize) defined in [RFC2581]. FLIGHT SIZE is
+ defined as in [RFC2581] as follows:
+
+ FLIGHT SIZE:
+ The amount of data that has been sent but not yet acknowledged.
+
+3. The Fast Retransmit and Fast Recovery Algorithms in NewReno
+
+ The standard implementation of the Fast Retransmit and Fast Recovery
+ algorithms is given in [RFC2581]. This section specifies the basic
+ NewReno algorithm. Sections 4 through 6 describe some optional
+ variants, and the motivations behind them, that an implementor may
+ want to consider when tuning performance for certain network
+ scenarios. Sections 7 and 8 provide some guidance to implementors
+ based on experience with NewReno implementations.
+
+ The NewReno modification concerns the Fast Recovery procedure that
+ begins when three duplicate ACKs are received and ends when either a
+ retransmission timeout occurs or an ACK arrives that acknowledges all
+
+
+
+Floyd, et al. Standards Track [Page 3]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ of the data up to and including the data that was outstanding when
+ the Fast Recovery procedure began.
+
+ The NewReno algorithm specified in this document differs from the
+ implementation in [RFC2581] in the introduction of the variable
+ "recover" in step 1, in the response to a partial or new
+ acknowledgement in step 5, and in modifications to step 1 and the
+ addition of step 6 for avoiding multiple Fast Retransmits caused by
+ the retransmission of packets already received by the receiver.
+
+ The algorithm specified in this document uses a variable "recover",
+ whose initial value is the initial send sequence number.
+
+ 1) Three duplicate ACKs:
+ When the third duplicate ACK is received and the sender is not
+ already in the Fast Recovery procedure, check to see if the
+ Cumulative Acknowledgement field covers more than "recover". If
+ so, go to Step 1A. Otherwise, go to Step 1B.
+
+ 1A) Invoking Fast Retransmit:
+ If so, then set ssthresh to no more than the value given in
+ equation 1 below. (This is equation 3 from [RFC2581]).
+
+ ssthresh = max (FlightSize / 2, 2*SMSS) (1)
+
+ In addition, record the highest sequence number transmitted in
+ the variable "recover", and go to Step 2.
+
+ 1B) Not invoking Fast Retransmit:
+ Do not enter the Fast Retransmit and Fast Recovery procedure. In
+ particular, do not change ssthresh, do not go to Step 2 to
+ retransmit the "lost" segment, and do not execute Step 3 upon
+ subsequent duplicate ACKs.
+
+ 2) Entering Fast Retransmit:
+ Retransmit the lost segment and set cwnd to ssthresh plus 3*SMSS.
+ This artificially "inflates" the congestion window by the number
+ of segments (three) that have left the network and the receiver
+ has buffered.
+
+ 3) Fast Recovery:
+ For each additional duplicate ACK received while in Fast
+ Recovery, increment cwnd by SMSS. This artificially inflates the
+ congestion window in order to reflect the additional segment that
+ has left the network.
+
+
+
+
+
+
+Floyd, et al. Standards Track [Page 4]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ 4) Fast Recovery, continued:
+ Transmit a segment, if allowed by the new value of cwnd and the
+ receiver's advertised window.
+
+ 5) When an ACK arrives that acknowledges new data, this ACK could be
+ the acknowledgment elicited by the retransmission from step 2, or
+ elicited by a later retransmission.
+
+ Full acknowledgements:
+ If this ACK acknowledges all of the data up to and including
+ "recover", then the ACK acknowledges all the intermediate
+ segments sent between the original transmission of the lost
+ segment and the receipt of the third duplicate ACK. Set cwnd to
+ either (1) min (ssthresh, FlightSize + SMSS) or (2) ssthresh,
+ where ssthresh is the value set in step 1; this is termed
+ "deflating" the window. (We note that "FlightSize" in step 1
+ referred to the amount of data outstanding in step 1, when Fast
+ Recovery was entered, while "FlightSize" in step 5 refers to the
+ amount of data outstanding in step 5, when Fast Recovery is
+ exited.) If the second option is selected, the implementation is
+ encouraged to take measures to avoid a possible burst of data, in
+ case the amount of data outstanding in the network is much less
+ than the new congestion window allows. A simple mechanism is to
+ limit the number of data packets that can be sent in response to
+ a single acknowledgement; this is known as "maxburst_" in the NS
+ simulator. Exit the Fast Recovery procedure.
+
+ Partial acknowledgements:
+ If this ACK does *not* acknowledge all of the data up to and
+ including "recover", then this is a partial ACK. In this case,
+ retransmit the first unacknowledged segment. Deflate the
+ congestion window by the amount of new data acknowledged by the
+ cumulative acknowledgement field. If the partial ACK
+ acknowledges at least one SMSS of new data, then add back SMSS
+ bytes to the congestion window. As in Step 3, this artificially
+ inflates the congestion window in order to reflect the additional
+ segment that has left the network. Send a new segment if
+ permitted by the new value of cwnd. This "partial window
+ deflation" attempts to ensure that, when Fast Recovery eventually
+ ends, approximately ssthresh amount of data will be outstanding
+ in the network. Do not exit the Fast Recovery procedure (i.e.,
+ if any duplicate ACKs subsequently arrive, execute Steps 3 and 4
+ above).
+
+ For the first partial ACK that arrives during Fast Recovery, also
+ reset the retransmit timer. Timer management is discussed in
+ more detail in Section 4.
+
+
+
+
+Floyd, et al. Standards Track [Page 5]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ 6) Retransmit timeouts:
+ After a retransmit timeout, record the highest sequence number
+ transmitted in the variable "recover" and exit the Fast Recovery
+ procedure if applicable.
+
+ Step 1 specifies a check that the Cumulative Acknowledgement field
+ covers more than "recover". Because the acknowledgement field
+ contains the sequence number that the sender next expects to receive,
+ the acknowledgement "ack_number" covers more than "recover" when:
+
+ ack_number - 1 > recover;
+
+ i.e., at least one byte more of data is acknowledged beyond the
+ highest byte that was outstanding when Fast Retransmit was last
+ entered.
+
+ Note that in Step 5, the congestion window is deflated after a
+ partial acknowledgement is received. The congestion window was
+ likely to have been inflated considerably when the partial
+ acknowledgement was received. In addition, depending on the original
+ pattern of packet losses, the partial acknowledgement might
+ acknowledge nearly a window of data. In this case, if the congestion
+ window was not deflated, the data sender might be able to send nearly
+ a window of data back-to-back.
+
+ This document does not specify the sender's response to duplicate
+ ACKs when the Fast Retransmit/Fast Recovery algorithm is not invoked.
+ This is addressed in other documents, such as those describing the
+ Limited Transmit procedure [RFC3042]. This document also does not
+ address issues of adjusting the duplicate acknowledgement threshold,
+ but assumes the threshold specified in the IETF standards; the
+ current standard is RFC 2581, which specifies a threshold of three
+ duplicate acknowledgements.
+
+ As a final note, we would observe that in the absence of the SACK
+ option, the data sender is working from limited information. When
+ the issue of recovery from multiple dropped packets from a single
+ window of data is of particular importance, the best alternative
+ would be to use the SACK option.
+
+4. Resetting the Retransmit Timer in Response to Partial
+ Acknowledgements
+
+ One possible variant to the response to partial acknowledgements
+ specified in Section 3 concerns when to reset the retransmit timer
+ after a partial acknowledgement. The algorithm in Section 3, Step 5,
+ resets the retransmit timer only after the first partial ACK. In
+ this case, if a large number of packets were dropped from a window of
+
+
+
+Floyd, et al. Standards Track [Page 6]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ data, the TCP data sender's retransmit timer will ultimately expire,
+ and the TCP data sender will invoke Slow-Start. (This is illustrated
+ on page 12 of [F98].) We call this the Impatient variant of NewReno.
+ We note that the Impatient variant in Section 3 doesn't follow the
+ recommended algorithm in RFC 2988 of restarting the retransmit timer
+ after every packet transmission or retransmission [RFC2988, Step
+ 5.1].
+
+ In contrast, the NewReno simulations in [FF96] illustrate the
+ algorithm described above with the modification that the retransmit
+ timer is reset after each partial acknowledgement. We call this the
+ Slow-but-Steady variant of NewReno. In this case, for a window with
+ a large number of packet drops, the TCP data sender retransmits at
+ most one packet per roundtrip time. (This behavior is illustrated in
+ the New-Reno TCP simulation of Figure 5 in [FF96], and on page 11 of
+ [F98]).
+
+ When N packets have been dropped from a window of data for a large
+ value of N, the Slow-but-Steady variant can remain in Fast Recovery
+ for N round-trip times, retransmitting one more dropped packet each
+ round-trip time; for these scenarios, the Impatient variant gives a
+ faster recovery and better performance. The tests "ns test-suite-
+ newreno.tcl impatient1" and "ns test-suite-newreno.tcl slow1" in the
+ NS simulator illustrate such a scenario, where the Impatient variant
+ performs better than the Slow-but-Steady variant. The Impatient
+ variant can be particularly important for TCP connections with large
+ congestion windows, as illustrated by the tests "ns test-suite-
+ newreno.tcl impatient4" and "ns test-suite-newreno.tcl slow4" in the
+ NS simulator.
+
+ One can also construct scenarios where the Slow-but-Steady variant
+ gives better performance than the Impatient variant. As an example,
+ this occurs when only a small number of packets are dropped, the RTO
+ is sufficiently small that the retransmit timer expires, and
+ performance would have been better without a retransmit timeout. The
+ tests "ns test-suite-newreno.tcl impatient2" and "ns test-suite-
+ newreno.tcl slow2" in the NS simulator illustrate such a scenario.
+
+ The Slow-but-Steady variant can also achieve higher goodput than the
+ Impatient variant, by avoiding unnecessary retransmissions. This
+ could be of special interest for cellular links, where every
+ transmission costs battery power and money. The tests "ns test-
+ suite-newreno.tcl impatient3" and "ns test-suite-newreno.tcl slow3"
+ in the NS simulator illustrate such a scenario. The Slow-but-Steady
+ variant can also be more robust to delay variation in the network,
+ where a delay spike might force the Impatient variant into a timeout
+ and go-back-N recovery.
+
+
+
+
+Floyd, et al. Standards Track [Page 7]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ Neither of the two variants discussed above are optimal. Our
+ recommendation is for the Impatient variant, as specified in Section
+ 3 of this document, because of the poor performance of the Slow-but-
+ Steady variant for TCP connections with large congestion windows.
+
+ One possibility for a more optimal algorithm would be one that
+ recovered from multiple packet drops as quickly as does slow-start,
+ while resetting the retransmit timers after each partial
+ acknowledgement, as described in the section below. We note,
+ however, that there is a limitation to the potential performance in
+ this case in the absence of the SACK option.
+
+5. Retransmissions after a Partial Acknowledgement
+
+ One possible variant to the response to partial acknowledgements
+ specified in Section 3 would be to retransmit more than one packet
+ after each partial acknowledgement, and to reset the retransmit timer
+ after each retransmission. The algorithm specified in Section 3
+ retransmits a single packet after each partial acknowledgement. This
+ is the most conservative alternative, in that it is the least likely
+ to result in an unnecessarily-retransmitted packet. A variant that
+ would recover faster from a window with many packet drops would be to
+ effectively Slow-Start, retransmitting two packets after each partial
+ acknowledgement. Such an approach would take less than N roundtrip
+ times to recover from N losses [Hoe96]. However, in the absence of
+ SACK, recovering as quickly as slow-start introduces the likelihood
+ of unnecessarily retransmitting packets, and this could significantly
+ complicate the recovery mechanisms.
+
+ We note that the response to partial acknowledgements specified in
+ Section 3 of this document and in RFC 2582 differs from the response
+ in [FF96], even though both approaches only retransmit one packet in
+ response to a partial acknowledgement. Step 5 of Section 3 specifies
+ that the TCP sender responds to a partial ACK by deflating the
+ congestion window by the amount of new data acknowledged, adding back
+ SMSS bytes if the partial ACK acknowledges at least SMSS bytes of new
+ data, and sending a new segment if permitted by the new value of
+ cwnd. Thus, only one previously-sent packet is retransmitted in
+ response to each partial acknowledgement, but additional new packets
+ might be transmitted as well, depending on the amount of new data
+ acknowledged by the partial acknowledgement. In contrast, the
+ variant of NewReno illustrated in [FF96] simply set the congestion
+ window to ssthresh when a partial acknowledgement was received. The
+ approach in [FF96] is more conservative, and does not attempt to
+ accurately track the actual number of outstanding packets after a
+ partial acknowledgement is received. While either of these
+ approaches gives acceptable performance, the variant specified in
+ Section 3 recovers more smoothly when multiple packets are dropped
+
+
+
+Floyd, et al. Standards Track [Page 8]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ from a window of data. (The [FF96] behavior can be seen in the NS
+ simulator by setting the variable "partial_window_deflation_" for
+ "Agent/TCP/Newreno" to 0; the behavior specified in Section 3 is
+ achieved by setting "partial_window_deflation_" to 1.)
+
+6. Avoiding Multiple Fast Retransmits
+
+ This section describes the motivation for the sender's state variable
+ "recover", and discusses possible heuristics for distinguishing
+ between a retransmitted packet that was dropped, and three duplicate
+ acknowledgements from the unnecessary retransmission of three
+ packets.
+
+ In the absence of the SACK option or timestamps, a duplicate
+ acknowledgement carries no information to identify the data packet or
+ packets at the TCP data receiver that triggered that duplicate
+ acknowledgement. In this case, the TCP data sender is unable to
+ distinguish between a duplicate acknowledgement that results from a
+ lost or delayed data packet, and a duplicate acknowledgement that
+ results from the sender's unnecessary retransmission of a data packet
+ that had already been received at the TCP data receiver. Because of
+ this, with the Retransmit and Fast Recovery algorithms in Reno TCP,
+ multiple segment losses from a single window of data can sometimes
+ result in unnecessary multiple Fast Retransmits (and multiple
+ reductions of the congestion window) [F94].
+
+ With the Fast Retransmit and Fast Recovery algorithms in Reno TCP,
+ the performance problems caused by multiple Fast Retransmits are
+ relatively minor compared to the potential problems with Tahoe TCP,
+ which does not implement Fast Recovery. Nevertheless, unnecessary
+ Fast Retransmits can occur with Reno TCP unless some explicit
+ mechanism is added to avoid this, such as the use of the "recover"
+ variable. (This modification is called "bugfix" in [F98], and is
+ illustrated on pages 7 and 9 of that document. Unnecessary Fast
+ Retransmits for Reno without "bugfix" is illustrated on page 6 of
+ [F98].)
+
+ Section 3 of [RFC2582] defined a default variant of NewReno TCP that
+ did not use the variable "recover", and did not check if duplicate
+ ACKs cover the variable "recover" before invoking Fast Retransmit.
+ With this default variant from RFC 2582, the problem of multiple Fast
+ Retransmits from a single window of data can occur after a Retransmit
+ Timeout (as in page 8 of [F98]) or in scenarios with reordering (as
+ in the validation test "./test-all-newreno newreno5_noBF" in
+ directory "tcl/test" of the NS simulator. This gives performance
+ similar to that on page 8 of [F03].) RFC 2582 also defined Careful
+ and Less Careful variants of the NewReno algorithm, and recommended
+ the Careful variant.
+
+
+
+Floyd, et al. Standards Track [Page 9]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ The algorithm specified in Section 3 of this document corresponds to
+ the Careful variant of NewReno TCP from RFC 2582, and eliminates the
+ problem of multiple Fast Retransmits. This algorithm uses the
+ variable "recover", whose initial value is the initial send sequence
+ number. After each retransmit timeout, the highest sequence number
+ transmitted so far is recorded in the variable "recover".
+
+ If, after a retransmit timeout, the TCP data sender retransmits three
+ consecutive packets that have already been received by the data
+ receiver, then the TCP data sender will receive three duplicate
+ acknowledgements that do not cover more than "recover". In this
+ case, the duplicate acknowledgements are not an indication of a new
+ instance of congestion. They are simply an indication that the
+ sender has unnecessarily retransmitted at least three packets.
+
+ However, when a retransmitted packet is itself dropped, the sender
+ can also receive three duplicate acknowledgements that do not cover
+ more than "recover". In this case, the sender would have been better
+ off if it had initiated Fast Retransmit. For a TCP that implements
+ the algorithm specified in Section 3 of this document, the sender
+ does not infer a packet drop from duplicate acknowledgements in this
+ scenario. As always, the retransmit timer is the backup mechanism
+ for inferring packet loss in this case.
+
+ There are several heuristics, based on timestamps or on the amount of
+ advancement of the cumulative acknowledgement field, that allow the
+ sender to distinguish, in some cases, between three duplicate
+ acknowledgements following a retransmitted packet that was dropped,
+ and three duplicate acknowledgements from the unnecessary
+ retransmission of three packets [Gur03, GF04]. The TCP sender MAY
+ use such a heuristic to decide to invoke a Fast Retransmit in some
+ cases, even when the three duplicate acknowledgements do not cover
+ more than "recover".
+
+ For example, when three duplicate acknowledgements are caused by the
+ unnecessary retransmission of three packets, this is likely to be
+ accompanied by the cumulative acknowledgement field advancing by at
+ least four segments. Similarly, a heuristic based on timestamps uses
+ the fact that when there is a hole in the sequence space, the
+ timestamp echoed in the duplicate acknowledgement is the timestamp of
+ the most recent data packet that advanced the cumulative
+ acknowledgement field [RFC1323]. If timestamps are used, and the
+ sender stores the timestamp of the last acknowledged segment, then
+ the timestamp echoed by duplicate acknowledgements can be used to
+ distinguish between a retransmitted packet that was dropped and three
+ duplicate acknowledgements from the unnecessary retransmission of
+ three packets. The heuristics are illustrated in the NS simulator in
+ the validation test "./test-all-newreno".
+
+
+
+Floyd, et al. Standards Track [Page 10]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+6.1. ACK Heuristic
+
+ If the ACK-based heuristic is used, then following the advancement of
+ the cumulative acknowledgement field, the sender stores the value of
+ the previous cumulative acknowledgement as prev_highest_ack, and
+ stores the latest cumulative ACK as highest_ack. In addition, the
+ following step is performed if Step 1 in Section 3 fails, before
+ proceeding to Step 1B.
+
+ 1*) If the Cumulative Acknowledgement field didn't cover more than
+ "recover", check to see if the congestion window is greater than
+ SMSS bytes and the difference between highest_ack and
+ prev_highest_ack is at most 4*SMSS bytes. If true, duplicate
+ ACKs indicate a lost segment (proceed to Step 1A in Section 3).
+ Otherwise, duplicate ACKs likely result from unnecessary
+ retransmissions (proceed to Step 1B in Section 3).
+
+ The congestion window check serves to protect against fast retransmit
+ immediately after a retransmit timeout, similar to the
+ "exitFastRetrans_" variable in NS. Examples of applying the ACK
+ heuristic are in validation tests "./test-all-newreno
+ newreno_rto_loss_ack" and "./test-all-newreno newreno_rto_dup_ack" in
+ directory "tcl/test" of the NS simulator.
+
+ If several ACKs are lost, the sender can see a jump in the cumulative
+ ACK of more than three segments, and the heuristic can fail. A
+ validation test for this scenario is "./test-all-newreno
+ newreno_rto_loss_ackf". RFC 2581 recommends that a receiver should
+ send duplicate ACKs for every out-of-order data packet, such as a
+ data packet received during Fast Recovery. The ACK heuristic is more
+ likely to fail if the receiver does not follow this advice, because
+ then a smaller number of ACK losses are needed to produce a
+ sufficient jump in the cumulative ACK.
+
+6.2. Timestamp Heuristic
+
+ If this heuristic is used, the sender stores the timestamp of the
+ last acknowledged segment. In addition, the second paragraph of step
+ 1 in Section 3 is replaced as follows:
+
+ 1**) If the Cumulative Acknowledgement field didn't cover more than
+ "recover", check to see if the echoed timestamp in the last
+ non-duplicate acknowledgment equals the stored timestamp. If
+ true, duplicate ACKs indicate a lost segment (proceed to Step 1A
+ in Section 3). Otherwise, duplicate ACKs likely result from
+ unnecessary retransmissions (proceed to Step 1B in Section 3).
+
+
+
+
+
+Floyd, et al. Standards Track [Page 11]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ Examples of applying the timestamp heuristic are in validation tests
+ "./test-all-newreno newreno_rto_loss_tsh" and "./test-all-newreno
+ newreno_rto_dup_tsh". The timestamp heuristic works correctly, both
+ when the receiver echoes timestamps as specified by [RFC1323], and by
+ its revision attempts. However, if the receiver arbitrarily echoes
+ timestamps, the heuristic can fail. The heuristic can also fail if a
+ timeout was spurious and returning ACKs are not from retransmitted
+ segments. This can be prevented by detection algorithms such as
+ [RFC3522].
+
+7. Implementation Issues for the Data Receiver
+
+ [RFC2581] specifies that "Out-of-order data segments SHOULD be
+ acknowledged immediately, in order to accelerate loss recovery."
+ Neal Cardwell has noted that some data receivers do not send an
+ immediate acknowledgement when they send a partial acknowledgment,
+ but instead wait first for their delayed acknowledgement timer to
+ expire [C98]. As [C98] notes, this severely limits the potential
+ benefit of NewReno by delaying the receipt of the partial
+ acknowledgement at the data sender. Echoing RFC 2581, our
+ recommendation is that the data receiver send an immediate
+ acknowledgement for an out-of-order segment, even when that out-of-
+ order segment fills a hole in the buffer.
+
+8. Implementation Issues for the Data Sender
+
+ In Section 3, Step 5 above, it is noted that implementations should
+ take measures to avoid a possible burst of data when leaving Fast
+ Recovery, in case the amount of new data that the sender is eligible
+ to send due to the new value of the congestion window is large. This
+ can arise during NewReno when ACKs are lost or treated as pure window
+ updates, thereby causing the sender to underestimate the number of
+ new segments that can be sent during the recovery procedure.
+ Specifically, bursts can occur when the FlightSize is much less than
+ the new congestion window when exiting from Fast Recovery. One
+ simple mechanism to avoid a burst of data when leaving Fast Recovery
+ is to limit the number of data packets that can be sent in response
+ to a single acknowledgment. (This is known as "maxburst_" in the ns
+ simulator.) Other possible mechanisms for avoiding bursts include
+ rate-based pacing, or setting the slow-start threshold to the
+ resultant congestion window and then resetting the congestion window
+ to FlightSize. A recommendation on the general mechanism to avoid
+ excessively bursty sending patterns is outside the scope of this
+ document.
+
+ An implementation may want to use a separate flag to record whether
+ or not it is presently in the Fast Recovery procedure. The use of
+ the value of the duplicate acknowledgment counter for this purpose is
+
+
+
+Floyd, et al. Standards Track [Page 12]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ not reliable because it can be reset upon window updates and out-of-
+ order acknowledgments.
+
+ When not in Fast Recovery, the value of the state variable "recover"
+ should be pulled along with the value of the state variable for
+ acknowledgments (typically, "snd_una") so that, when large amounts of
+ data have been sent and acked, the sequence space does not wrap and
+ falsely indicate that Fast Recovery should not be entered (Section 3,
+ step 1, last paragraph).
+
+ It is important for the sender to respond correctly to duplicate ACKs
+ received when the sender is no longer in Fast Recovery (e.g., because
+ of a Retransmit Timeout). The Limited Transmit procedure [RFC3042]
+ describes possible responses to the first and second duplicate
+ acknowledgements. When three or more duplicate acknowledgements are
+ received, the Cumulative Acknowledgement field doesn't cover more
+ than "recover", and a new Fast Recovery is not invoked, it is
+ important that the sender not execute the Fast Recovery steps (3) and
+ (4) in Section 3. Otherwise, the sender could end up in a chain of
+ spurious timeouts. We mention this only because several NewReno
+ implementations had this bug, including the implementation in the NS
+ simulator. (This bug in the NS simulator was fixed in July 2003,
+ with the variable "exitFastRetrans_".)
+
+9. Simulations
+
+ Simulations with NewReno are illustrated with the validation test
+ "tcl/test/test-all-newreno" in the NS simulator. The command
+ "../../ns test-suite-newreno.tcl reno" shows a simulation with Reno
+ TCP, illustrating the data sender's lack of response to a partial
+ acknowledgement. In contrast, the command "../../ns test-suite-
+ newreno.tcl newreno_B" shows a simulation with the same scenario
+ using the NewReno algorithms described in this paper.
+
+10. Comparisons between Reno and NewReno TCP
+
+ As we stated in the introduction, we believe that the NewReno
+ modification described in this document improves the performance of
+ the Fast Retransmit and Fast Recovery algorithms of Reno TCP in a
+ wide variety of scenarios. This has been discussed in some depth in
+ [FF96], which illustrates Reno TCP's poor performance when multiple
+ packets are dropped from a window of data and also illustrates
+ NewReno TCP's good performance in that scenario.
+
+ We do, however, know of one scenario where Reno TCP gives better
+ performance than NewReno TCP, that we describe here for the sake of
+ completeness. Consider a scenario with no packet loss, but with
+ sufficient reordering so that the TCP sender receives three duplicate
+
+
+
+Floyd, et al. Standards Track [Page 13]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ acknowledgements. This will trigger the Fast Retransmit and Fast
+ Recovery algorithms. With Reno TCP or with Sack TCP, this will
+ result in the unnecessary retransmission of a single packet, combined
+ with a halving of the congestion window (shown on pages 4 and 6 of
+ [F03]). With NewReno TCP, however, this reordering will also result
+ in the unnecessary retransmission of an entire window of data (shown
+ on page 5 of [F03]).
+
+ While Reno TCP performs better than NewReno TCP in the presence of
+ reordering, NewReno's superior performance in the presence of
+ multiple packet drops generally outweighs its less optimal
+ performance in the presence of reordering. (Sack TCP is the
+ preferred solution, with good performance in both scenarios.) This
+ document recommends the Fast Retransmit and Fast Recovery algorithms
+ of NewReno TCP instead of those of Reno TCP for those TCP connections
+ that do not support SACK. We would also note that NewReno's Fast
+ Retransmit and Fast Recovery mechanisms are widely deployed in TCP
+ implementations in the Internet today, as documented in [PF01]. For
+ example, tests of TCP implementations in several thousand web servers
+ in 2001 showed that for those TCP connections where the web browser
+ was not SACK-capable, more web servers used the Fast Retransmit and
+ Fast Recovery algorithms of NewReno than those of Reno or Tahoe TCP
+ [PF01].
+
+11. Changes Relative to RFC 2582
+
+ The purpose of this document is to advance the NewReno's Fast
+ Retransmit and Fast Recovery algorithms in RFC 2582 to Standards
+ Track.
+
+ The main change in this document relative to RFC 2582 is to specify
+ the Careful variant of NewReno's Fast Retransmit and Fast Recovery
+ algorithms. The base algorithm described in RFC 2582 did not attempt
+ to avoid unnecessary multiple Fast Retransmits that can occur after a
+ timeout (described in more detail in the section above). However,
+ RFC 2582 also defined "Careful" and "Less Careful" variants that
+ avoid these unnecessary Fast Retransmits, and recommended the Careful
+ variant. This document specifies the previously-named "Careful"
+ variant as the basic version of NewReno. As described below, this
+ algorithm uses a variable "recover", whose initial value is the send
+ sequence number.
+
+ The algorithm specified in Section 3 checks whether the
+ acknowledgement field of a partial acknowledgement covers *more* than
+ "recover", as defined in Section 3. Another possible variant would
+ be to simply require that the acknowledgement field covers *more than
+ or equal to* "recover" before initiating another Fast Retransmit. We
+ called this the Less Careful variant in RFC 2582.
+
+
+
+Floyd, et al. Standards Track [Page 14]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ There are two separate scenarios in which the TCP sender could
+ receive three duplicate acknowledgements acknowledging "recover" but
+ no more than "recover". One scenario would be that the data sender
+ transmitted four packets with sequence numbers higher than "recover",
+ that the first packet was dropped in the network, and the following
+ three packets triggered three duplicate acknowledgements
+ acknowledging "recover". The second scenario would be that the
+ sender unnecessarily retransmitted three packets below "recover", and
+ that these three packets triggered three duplicate acknowledgements
+ acknowledging "recover". In the absence of SACK, the TCP sender is
+ unable to distinguish between these two scenarios.
+
+ For the Careful variant of Fast Retransmit, the data sender would
+ have to wait for a retransmit timeout in the first scenario, but
+ would not have an unnecessary Fast Retransmit in the second scenario.
+ For the Less Careful variant to Fast Retransmit, the data sender
+ would Fast Retransmit as desired in the first scenario, and would
+ unnecessarily Fast Retransmit in the second scenario. This document
+ only specifies the Careful variant in Section 3. Unnecessary Fast
+ Retransmits with the Less Careful variant in scenarios with
+ reordering are illustrated in page 8 of [F03].
+
+ The document also specifies two heuristics that the TCP sender MAY
+ use to decide to invoke Fast Retransmit even when the three duplicate
+ acknowledgements do not cover more than "recover". These heuristics,
+ an ACK-based heuristic and a timestamp heuristic, are described in
+ Sections 6.1 and 6.2 respectively.
+
+12. Conclusions
+
+ This document specifies the NewReno Fast Retransmit and Fast Recovery
+ algorithms for TCP. This NewReno modification to TCP can even be
+ important for TCP implementations that support the SACK option,
+ because the SACK option can only be used for TCP connections when
+ both TCP end-nodes support the SACK option. NewReno performs better
+ than Reno (RFC 2581) in a number of scenarios discussed herein.
+
+ A number of options to the basic algorithm presented in Section 3 are
+ also described. These include the handling of the retransmission
+ timer (Section 4), the response to partial acknowledgments (Section
+ 5), and the value of the congestion window when leaving Fast Recovery
+ (section 3, step 5). Our belief is that the differences between
+ these variants of NewReno are small compared to the differences
+ between Reno and NewReno. That is, the important thing is to
+ implement NewReno instead of Reno, for a TCP connection without SACK;
+ it is less important exactly which of the variants of NewReno is
+ implemented.
+
+
+
+
+Floyd, et al. Standards Track [Page 15]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+13. Security Considerations
+
+ RFC 2581 discusses general security considerations concerning TCP
+ congestion control. This document describes a specific algorithm
+ that conforms with the congestion control requirements of RFC 2581,
+ and so those considerations apply to this algorithm, too. There are
+ no known additional security concerns for this specific algorithm.
+
+14. Acknowledgements
+
+ Many thanks to Anil Agarwal, Mark Allman, Armando Caro, Jeffrey Hsu,
+ Vern Paxson, Kacheong Poon, Keyur Shah, and Bernie Volz for detailed
+ feedback on this document or on its precursor, RFC 2582.
+
+15. References
+
+15.1. Normative References
+
+ [RFC2018] Mathis, M., Mahdavi, J., Floyd, S. and A. Romanow, "TCP
+ Selective Acknowledgement Options", RFC 2018, October 1996.
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+ [RFC2581] Allman, M., Paxson, V. and W. Stevens, "TCP Congestion
+ Control", RFC 2581, April 1999.
+
+ [RFC2582] Floyd, S. and T. Henderson, "The NewReno Modification to
+ TCP's Fast Recovery Algorithm", RFC 2582, April 1999.
+
+ [RFC2988] Paxson, V. and M. Allman, "Computing TCP's Retransmission
+ Timer", RFC 2988, November 2000.
+
+ [RFC3042] Allman, M., Balakrishnan, H. and S. Floyd, "Enhancing TCP's
+ Loss Recovery Using Limited Transmit", RFC 3042, January
+ 2001.
+
+15.2. Informative References
+
+ [C98] Cardwell, N., "delayed ACKs for retransmitted packets:
+ ouch!". November 1998, Email to the tcpimpl mailing list,
+ Message-ID "Pine.LNX.4.02A.9811021421340.26785-
+ 100000@sake.cs.washington.edu", archived at "http://tcp-
+ impl.lerc.nasa.gov/tcp-impl".
+
+
+
+
+
+
+
+Floyd, et al. Standards Track [Page 16]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ [F98] Floyd, S., Revisions to RFC 2001, "Presentation to the
+ TCPIMPL Working Group", August 1998. URLs
+ "ftp://ftp.ee.lbl.gov/talks/sf-tcpimpl-aug98.ps" and
+ "ftp://ftp.ee.lbl.gov/talks/sf-tcpimpl-aug98.pdf".
+
+ [F03] Floyd, S., "Moving NewReno from Experimental to Proposed
+ Standard? Presentation to the TSVWG Working Group", March
+ 2003. URLs "http://www.icir.org/floyd/talks/newreno-
+ Mar03.ps" and "http://www.icir.org/floyd/talks/newreno-
+ Mar03.pdf".
+
+ [FF96] Fall, K. and S. Floyd, "Simulation-based Comparisons of
+ Tahoe, Reno and SACK TCP", Computer Communication Review,
+ July 1996. URL "ftp://ftp.ee.lbl.gov/papers/sacks.ps.Z".
+
+ [F94] Floyd, S., "TCP and Successive Fast Retransmits", Technical
+ report, October 1994. URL
+ "ftp://ftp.ee.lbl.gov/papers/fastretrans.ps".
+
+ [GF04] Gurtov, A. and S. Floyd, "Resolving Acknowledgment
+ Ambiguity in non-SACK TCP", Next Generation Teletraffic and
+ Wired/Wireless Advanced Networking (NEW2AN'04), February
+ 2004. URL "http://www.cs.helsinki.fi/u/gurtov/papers/
+ heuristics.html".
+
+ [Gur03] Gurtov, A., "[Tsvwg] resolving the problem of unnecessary
+ fast retransmits in go-back-N", email to the tsvwg mailing
+ list, message ID <3F25B467.9020609@cs.helsinki.fi>, July
+ 28, 2003. URL "http://www1.ietf.org/mail-archive/working-
+ groups/tsvwg/current/msg04334.html".
+
+ [Hen98] Henderson, T., Re: NewReno and the 2001 Revision. September
+ 1998. Email to the tcpimpl mailing list, Message ID
+ "Pine.BSI.3.95.980923224136.26134A-
+ 100000@raptor.CS.Berkeley.EDU", archived at "http://tcp-
+ impl.lerc.nasa.gov/tcp-impl".
+
+ [Hoe95] Hoe, J., "Startup Dynamics of TCP's Congestion Control and
+ Avoidance Schemes", Master's Thesis, MIT, 1995.
+
+ [Hoe96] Hoe, J., "Improving the Start-up Behavior of a Congestion
+ Control Scheme for TCP", ACM SIGCOMM, August 1996. URL
+ "http://www.acm.org/sigcomm/sigcomm96/program.html".
+
+ [LM97] Lin, D. and R. Morris, "Dynamics of Random Early
+ Detection", SIGCOMM 97, September 1997. URL
+ "http://www.acm.org/sigcomm/sigcomm97/program.html".
+
+
+
+
+Floyd, et al. Standards Track [Page 17]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+ [NS] The Network Simulator (NS). URL
+ "http://www.isi.edu/nsnam/ns/".
+
+ [PF01] Padhye, J. and S. Floyd, "Identifying the TCP Behavior of
+ Web Servers", June 2001, SIGCOMM 2001.
+
+ [RFC1323] Jacobson, V., Braden, R. and D. Borman, "TCP Extensions for
+ High Performance", RFC 1323, May 1992.
+
+ [RFC3517] Blanton, E., Allman, M., Fall, K. and L. Wang, "A
+ Conservative Selective Acknowledgment (SACK)-based Loss
+ Recovery Algorithm for TCP", RFC 3517, April 2003.
+
+ [RFC3522] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm for
+ TCP", RFC 3522, April 2003.
+
+Authors' Addresses
+
+ Sally Floyd
+ International Computer Science Institute
+
+ Phone: +1 (510) 666-2989
+ EMail: floyd@acm.org
+ URL: http://www.icir.org/floyd/
+
+
+ Tom Henderson
+ The Boeing Company
+
+ EMail: thomas.r.henderson@boeing.com
+
+
+ Andrei Gurtov
+ TeliaSonera
+
+ EMail: andrei.gurtov@teliasonera.com
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Floyd, et al. Standards Track [Page 18]
+
+RFC 3782 NewReno Modification to Fast Recovery Algorithm April 2004
+
+
+Full Copyright Statement
+
+ Copyright (C) The Internet Society (2004). This document is subject
+ to the rights, licenses and restrictions contained in BCP 78, and
+ except as set forth therein, the authors retain all their rights.
+
+ This document and the information contained herein are provided on an
+ "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
+ REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
+ INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
+ THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
+ WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Intellectual Property
+
+ The IETF takes no position regarding the validity or scope of any
+ Intellectual Property Rights or other rights that might be claimed
+ to pertain to the implementation or use of the technology
+ described in this document or the extent to which any license
+ under such rights might or might not be available; nor does it
+ represent that it has made any independent effort to identify any
+ such rights. Information on the procedures with respect to
+ rights in RFC documents can be found in BCP 78 and BCP 79.
+
+ Copies of IPR disclosures made to the IETF Secretariat and any
+ assurances of licenses to be made available, or the result of an
+ attempt made to obtain a general license or permission for the use
+ of such proprietary rights by implementers or users of this
+ specification can be obtained from the IETF on-line IPR repository
+ at http://www.ietf.org/ipr.
+
+ The IETF invites any interested party to bring to its attention
+ any copyrights, patents or patent applications, or other
+ proprietary rights that may cover technology that may be required
+ to implement this standard. Please address the information to the
+ IETF at ietf-ipr@ietf.org.
+
+Acknowledgement
+
+ Funding for the RFC Editor function is currently provided by the
+ Internet Society.
+
+
+
+
+
+
+
+
+
+Floyd, et al. Standards Track [Page 19]
+