Diffstat (limited to 'doc/rfc/rfc6298.txt')
-rw-r--r-- doc/rfc/rfc6298.txt 619
1 files changed, 619 insertions, 0 deletions
diff --git a/doc/rfc/rfc6298.txt b/doc/rfc/rfc6298.txt
new file mode 100644
index 0000000..4466ef5
--- /dev/null
+++ b/doc/rfc/rfc6298.txt
@@ -0,0 +1,619 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF)                         V. Paxson
+Request for Comments: 6298                              ICSI/UC Berkeley
+Obsoletes: 2988                                                M. Allman
+Updates: 1122                                                       ICSI
+Category: Standards Track                                         J. Chu
+ISSN: 2070-1721                                                   Google
+                                                              M. Sargent
+                                                                    CWRU
+                                                               June 2011
+
+
+ Computing TCP's Retransmission Timer
+
+Abstract
+
+ This document defines the standard algorithm that Transmission
+ Control Protocol (TCP) senders are required to use to compute and
+ manage their retransmission timer. It expands on the discussion in
+ Section 4.2.3.1 of RFC 1122 and upgrades the requirement of
+ supporting the algorithm from a SHOULD to a MUST. This document
+ obsoletes RFC 2988.
+
+Status of This Memo
+
+ This is an Internet Standards Track document.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Further information on
+ Internet Standards is available in Section 2 of RFC 5741.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ http://www.rfc-editor.org/info/rfc6298.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Paxson, et al. Standards Track [Page 1]
+
+RFC 6298 Computing TCP's Retransmission Timer June 2011
+
+
+Copyright Notice
+
+ Copyright (c) 2011 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (http://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Simplified BSD License text as described in Section 4.e of
+ the Trust Legal Provisions and are provided without warranty as
+ described in the Simplified BSD License.
+
+1. Introduction
+
+ The Transmission Control Protocol (TCP) [Pos81] uses a retransmission
+ timer to ensure data delivery in the absence of any feedback from the
+ remote data receiver. The duration of this timer is referred to as
+ RTO (retransmission timeout). RFC 1122 [Bra89] specifies that the
+ RTO should be calculated as outlined in [Jac88].
+
+ This document codifies the algorithm for setting the RTO. In
+ addition, this document expands on the discussion in Section 4.2.3.1
+ of RFC 1122 and upgrades the requirement of supporting the algorithm
+ from a SHOULD to a MUST. RFC 5681 [APB09] outlines the algorithm TCP
+ uses to begin sending after the RTO expires and a retransmission is
+ sent. This document does not alter the behavior outlined in RFC 5681
+ [APB09].
+
+ In some situations, it may be beneficial for a TCP sender to be more
+ conservative than the algorithms detailed in this document allow.
+ However, a TCP MUST NOT be more aggressive than the following
+ algorithms allow. This document obsoletes RFC 2988 [PA00].
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in [Bra97].
+
+2. The Basic Algorithm
+
+ To compute the current RTO, a TCP sender maintains two state
+ variables, SRTT (smoothed round-trip time) and RTTVAR (round-trip
+ time variation). In addition, we assume a clock granularity of G
+ seconds.
+
+ The rules governing the computation of SRTT, RTTVAR, and RTO are as
+ follows:
+
+ (2.1) Until a round-trip time (RTT) measurement has been made for a
+ segment sent between the sender and receiver, the sender SHOULD
+ set RTO <- 1 second, though the "backing off" on repeated
+ retransmission discussed in (5.5) still applies.
+
+ Note that the previous version of this document used an initial
+ RTO of 3 seconds [PA00]. A TCP implementation MAY still use
+ this value (or any other value > 1 second). This change in the
+ lower bound on the initial RTO is discussed in further detail
+ in Appendix A.
+
+ (2.2) When the first RTT measurement R is made, the host MUST set
+
+     SRTT   <- R
+     RTTVAR <- R/2
+     RTO    <- SRTT + max (G, K*RTTVAR)
+
+ where K = 4.
+
+ (2.3) When a subsequent RTT measurement R' is made, a host MUST set
+
+     RTTVAR <- (1 - beta) * RTTVAR + beta * |SRTT - R'|
+     SRTT   <- (1 - alpha) * SRTT + alpha * R'
+
+ The value of SRTT used in the update to RTTVAR is its value
+ before updating SRTT itself using the second assignment. That
+ is, RTTVAR and SRTT MUST be updated in the above order.
+
+ The above SHOULD be computed using alpha=1/8 and beta=1/4 (as
+ suggested in [JK88]).
+
+ After the computation, a host MUST update
+ RTO <- SRTT + max (G, K*RTTVAR)
+
+ (2.4) Whenever RTO is computed, if it is less than 1 second, then the
+ RTO SHOULD be rounded up to 1 second.
+
+ Traditionally, TCP implementations use coarse grain clocks to
+ measure the RTT and trigger the RTO, which imposes a large
+ minimum value on the RTO. Research suggests that a large
+ minimum RTO is needed to keep TCP conservative and avoid
+ spurious retransmissions [AP99]. Therefore, this specification
+ requires a large minimum RTO as a conservative approach, while
+ at the same time acknowledging that at some future point,
+ research may show that a smaller minimum RTO is acceptable or
+ superior.
+
+ (2.5) A maximum value MAY be placed on RTO provided it is at least 60
+ seconds.
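
Rules (2.1) through (2.5) can be sketched as a small estimator. This is
a minimal illustration, not a reference implementation: the class name
RtoEstimator, the method on_rtt_sample, and the 1 ms default for G are
assumptions made for the example, and the 60-second cap is the smallest
maximum that (2.5) permits rather than a required value.

```python
from dataclasses import dataclass
from typing import Optional

K = 4            # multiplier from (2.2)
ALPHA = 1 / 8    # SRTT gain suggested in (2.3)
BETA = 1 / 4     # RTTVAR gain suggested in (2.3)
MIN_RTO = 1.0    # lower bound from (2.4), in seconds
MAX_RTO = 60.0   # smallest cap permitted by (2.5), in seconds


@dataclass
class RtoEstimator:
    """Hypothetical helper tracking SRTT, RTTVAR, and RTO in seconds."""
    granularity: float = 0.001       # clock granularity G (assumed 1 ms)
    srtt: Optional[float] = None     # None until the first RTT measurement
    rttvar: Optional[float] = None
    rto: float = 1.0                 # initial RTO from (2.1)

    def on_rtt_sample(self, r: float) -> float:
        """Fold one RTT measurement into the state; return the new RTO."""
        if self.srtt is None:
            # (2.2) first measurement
            self.srtt = r
            self.rttvar = r / 2
        else:
            # (2.3) RTTVAR is updated first, using the old SRTT
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - r)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * r
        rto = self.srtt + max(self.granularity, K * self.rttvar)
        # (2.4) round up to 1 second, then (2.5) apply the optional cap
        self.rto = min(max(rto, MIN_RTO), MAX_RTO)
        return self.rto
```

With two consecutive samples of 0.5 seconds, the first call yields an
RTO of 1.5 seconds (SRTT 0.5, RTTVAR 0.25) and the second yields 1.25
seconds (RTTVAR decays to 0.1875 while SRTT stays at 0.5).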
+
+3. Taking RTT Samples
+
+ TCP MUST use Karn's algorithm [KP87] for taking RTT samples. That
+ is, RTT samples MUST NOT be made using segments that were
+ retransmitted (and thus for which it is ambiguous whether the reply
+ was for the first instance of the packet or a later instance). The
+ only case when TCP can safely take RTT samples from retransmitted
+ segments is when the TCP timestamp option [JBB92] is employed, since
+ the timestamp option removes the ambiguity regarding which instance
+ of the data segment triggered the acknowledgment.
+
+ Traditionally, TCP implementations have taken one RTT measurement at
+ a time (typically, once per RTT). However, when using the timestamp
+ option, each ACK can be used as an RTT sample. RFC 1323 [JBB92]
+ suggests that TCP connections utilizing large congestion windows
+ should take many RTT samples per window of data to avoid aliasing
+ effects in the estimated RTT. A TCP implementation MUST take at
+ least one RTT measurement per RTT (unless that is not possible per
+ Karn's algorithm).
+
+ For fairly modest congestion window sizes, research suggests that
+ timing each segment does not lead to a better RTT estimator [AP99].
+ Additionally, when multiple samples are taken per RTT, the alpha and
+ beta defined in Section 2 may keep an inadequate RTT history. A
+ method for changing these constants is currently an open research
+ question.
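
The sampling rule above can be sketched as a single decision function.
The names SentSegment, rtt_sample, and ts_echo are hypothetical, and the
echoed timestamp is simplified to a reading of the sender's clock in
seconds; the real timestamp option carries opaque tick values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SentSegment:
    """Hypothetical record of one outstanding segment."""
    send_time: float             # time of the first transmission, seconds
    retransmitted: bool = False  # set once the segment has been sent again


def rtt_sample(seg: SentSegment, ack_time: float,
               ts_echo: Optional[float] = None) -> Optional[float]:
    """Return an RTT sample for the ACK of `seg`, or None if ambiguous."""
    if ts_echo is not None:
        # The echoed timestamp identifies which transmission was ACKed,
        # so the sample is safe even for a retransmitted segment.
        return ack_time - ts_echo
    if seg.retransmitted:
        # Karn's algorithm: take no sample from a retransmitted segment.
        return None
    return ack_time - seg.send_time
```

A caller would feed every non-None result into the Section 2
computation and simply skip the update when None is returned.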
+
+4. Clock Granularity
+
+ There is no requirement for the clock granularity G used for
+ computing RTT measurements and the different state variables.
+ However, if the K*RTTVAR term in the RTO calculation equals zero, the
+ variance term MUST be rounded to G seconds (i.e., use the equation
+ given in step 2.3).
+
+     RTO <- SRTT + max (G, K*RTTVAR)
+
+ Experience has shown that finer clock granularities (<= 100 msec)
+ perform somewhat better than coarser granularities.
+
+ Note that [Jac88] outlines several clever tricks that can be used to
+ obtain better precision from coarse granularity timers. These
+ changes are widely implemented in current TCP implementations.
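
One way to see why the variance term is rounded up to G: on a coarse
clock, a path whose RTT is shorter than one tick can be measured as
zero. The sketch below assumes a clock that truncates toward zero and a
granularity of 500 ms; both are illustrative choices, as are the
function names. It shows the RTO before the 1-second floor of (2.4) is
applied.

```python
K = 4  # multiplier from (2.2)


def measured_rtt(true_rtt: float, g: float) -> float:
    """RTT as read from a clock that advances in steps of g seconds.

    Truncation toward zero is an assumed clock model.
    """
    return int(true_rtt / g) * g


def unclamped_rto(srtt: float, rttvar: float, g: float) -> float:
    """RTO <- SRTT + max(G, K*RTTVAR), before the (2.4) lower bound."""
    return srtt + max(g, K * rttvar)


g = 0.5                           # assumed 500 ms clock
r = measured_rtt(0.2, g)          # a 200 ms path reads as zero ticks
rto = unclamped_rto(r, r / 2, g)  # first-sample rule (2.2)
```

Here r is 0.0 and rto is 0.5. Without the max (G, ...) term, SRTT and
K*RTTVAR would both be zero and the computed value would be zero as
well.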
+
+5. Managing the RTO Timer
+
+ An implementation MUST manage the retransmission timer(s) in such a
+ way that a segment is never retransmitted too early, i.e., less than
+ one RTO after the previous transmission of that segment.
+
+ The following is the RECOMMENDED algorithm for managing the
+ retransmission timer:
+
+ (5.1) Every time a packet containing data is sent (including a
+ retransmission), if the timer is not running, start it running
+ so that it will expire after RTO seconds (for the current value
+ of RTO).
+
+ (5.2) When all outstanding data has been acknowledged, turn off the
+ retransmission timer.
+
+ (5.3) When an ACK is received that acknowledges new data, restart the
+ retransmission timer so that it will expire after RTO seconds
+ (for the current value of RTO).
+
+ When the retransmission timer expires, do the following:
+
+ (5.4) Retransmit the earliest segment that has not been acknowledged
+ by the TCP receiver.
+
+ (5.5) The host MUST set RTO <- RTO * 2 ("back off the timer"). The
+ maximum value discussed in (2.5) above may be used to provide
+ an upper bound to this doubling operation.
+
+ (5.6) Start the retransmission timer, such that it expires after RTO
+ seconds (for the value of RTO after the doubling operation
+ outlined in 5.5).
+
+ (5.7) If the timer expires awaiting the ACK of a SYN segment and the
+ TCP implementation is using an RTO less than 3 seconds, the RTO
+ MUST be re-initialized to 3 seconds when data transmission
+ begins (i.e., after the three-way handshake completes).
+
+ This represents a change from the previous version of this
+ document [PA00] and is discussed in Appendix A.
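
Steps (5.1) through (5.7) can be sketched as a small state holder. The
class and method names are hypothetical, time is passed in explicitly
instead of being read from a real clock, and the handling of (5.7)
reflects one reading of the rule: an RTO that has already backed off to
3 seconds or more is left unchanged, which is the more conservative
choice.

```python
from typing import Optional

MAX_RTO = 60.0  # smallest upper bound permitted by (2.5), in seconds


class RetransmitTimer:
    """Hypothetical timer state; times are seconds on the caller's clock."""

    def __init__(self, rto: float = 1.0) -> None:
        self.rto = rto
        self.deadline: Optional[float] = None  # None means "not running"
        self.syn_timed_out = False

    def on_data_sent(self, now: float) -> None:
        # (5.1) start the timer only if it is not already running
        if self.deadline is None:
            self.deadline = now + self.rto

    def on_new_ack(self, now: float, all_acked: bool) -> None:
        # (5.2) stop when nothing is outstanding, otherwise (5.3) restart
        self.deadline = None if all_acked else now + self.rto

    def on_expiry(self, now: float, awaiting_syn_ack: bool = False) -> None:
        # (5.4) the caller retransmits the earliest unacknowledged segment
        # (5.5) back off, bounded by the optional maximum
        self.rto = min(self.rto * 2, MAX_RTO)
        # (5.6) restart with the doubled RTO
        self.deadline = now + self.rto
        if awaiting_syn_ack:
            self.syn_timed_out = True

    def on_handshake_complete(self) -> None:
        # (5.7) after a SYN timeout, use at least 3 seconds for data
        if self.syn_timed_out and self.rto < 3.0:
            self.rto = 3.0
```

Starting from an RTO of 1 second, seven consecutive expiries produce
the sequence 2, 4, 8, 16, 32, 60, 60 under the assumed 60-second cap.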
+
+ Note that after retransmitting, once a new RTT measurement is
+ obtained (which can only happen when new data has been sent and
+ acknowledged), the computations outlined in Section 2 are performed,
+ including the computation of RTO, which may result in "collapsing"
+ RTO back down after it has been subject to exponential back off (rule
+ 5.5).
+
+ Note that a TCP implementation MAY clear SRTT and RTTVAR after
+ backing off the timer multiple times as it is likely that the current
+ SRTT and RTTVAR are bogus in this situation. Once SRTT and RTTVAR
+ are cleared, they should be initialized with the next RTT sample
+ taken per (2.2) rather than using (2.3).
+
+6. Security Considerations
+
+ This document requires a TCP to wait for a given interval before
+ retransmitting an unacknowledged segment. An attacker could cause a
+ TCP sender to compute a large value of RTO by adding delay to a timed
+ packet's latency, or that of its acknowledgment. However, the
+ ability to add delay to a packet's latency often coincides with the
+ ability to cause the packet to be lost, so it is difficult to see
+ what an attacker might gain from such an attack that could cause more
+ damage than simply discarding some of the TCP connection's packets.
+
+ The Internet, to a considerable degree, relies on the correct
+ implementation of the RTO algorithm (as well as those described in
+ RFC 5681) in order to preserve network stability and avoid congestion
+ collapse. An attacker could cause TCP endpoints to respond more
+ aggressively in the face of congestion by forging acknowledgments for
+ segments before the receiver has actually received the data, thus
+ lowering RTO to an unsafe value. But to do so requires spoofing the
+ acknowledgments correctly, which is difficult unless the attacker can
+ monitor traffic along the path between the sender and the receiver.
+ In addition, even if the attacker can cause the sender's RTO to reach
+ too small a value, it appears the attacker cannot leverage this into
+ much of an attack (compared to the other damage they can do if they
+ can spoof packets belonging to the connection), since the sending TCP
+ will still back off its timer in the face of an incorrectly
+ transmitted packet's loss due to actual congestion.
+
+ The security considerations in RFC 5681 [APB09] are also applicable
+ to this document.
+
+7. Changes from RFC 2988
+
+ This document reduces the initial RTO from the previous 3 seconds
+ [PA00] to 1 second, unless the SYN or the ACK of the SYN is lost, in
+ which case the default RTO is reverted to 3 seconds before data
+ transmission begins.
+
+8. Acknowledgments
+
+ The RTO algorithm described in this memo was originated by Van
+ Jacobson in [Jac88].
+
+ Much of the data that motivated changing the initial RTO from 3
+ seconds to 1 second came from Robert Love, Andre Broido, and Mike
+ Belshe.
+
+9. References
+
+9.1. Normative References
+
+ [APB09] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
+ Control", RFC 5681, September 2009.
+
+ [Bra89] Braden, R., Ed., "Requirements for Internet Hosts -
+ Communication Layers", STD 3, RFC 1122, October 1989.
+
+ [Bra97] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+ [JBB92] Jacobson, V., Braden, R., and D. Borman, "TCP Extensions for
+ High Performance", RFC 1323, May 1992.
+
+ [Pos81] Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
+ September 1981.
+
+9.2. Informative References
+
+ [AP99] Allman, M. and V. Paxson, "On Estimating End-to-End Network
+ Path Properties", SIGCOMM 99.
+
+ [Chu09] Chu, J., "Tuning TCP Parameters for the 21st Century",
+ http://www.ietf.org/proceedings/75/slides/tcpm-1.pdf, July
+ 2009.
+
+ [SLS09] Schulman, A., Levin, D., and N. Spring, "CRAWDAD data set
+ umd/sigcomm2008 (v. 2009-03-02)",
+ http://crawdad.cs.dartmouth.edu/umd/sigcomm2008, March 2009.
+
+ [HKA04] Henderson, T., Kotz, D., and I. Abyzov, "CRAWDAD trace
+ dartmouth/campus/tcpdump/fall03 (v. 2004-11-09)",
+ http://crawdad.cs.dartmouth.edu/dartmouth/campus/
+ tcpdump/fall03, November 2004.
+
+ [Jac88] Jacobson, V., "Congestion Avoidance and Control", Computer
+ Communication Review, vol. 18, no. 4, pp. 314-329, Aug.
+ 1988.
+
+ [JK88] Jacobson, V. and M. Karels, "Congestion Avoidance and
+ Control", ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z.
+
+ [KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time
+ Estimates in Reliable Transport Protocols", SIGCOMM 87.
+
+ [PA00] Paxson, V. and M. Allman, "Computing TCP's Retransmission
+ Timer", RFC 2988, November 2000.
+
+Appendix A. Rationale for Lowering the Initial RTO
+
+ Choosing a reasonable initial RTO requires balancing two competing
+ considerations:
+
+ 1. The initial RTO should be sufficiently large to cover most of the
+ end-to-end paths to avoid spurious retransmissions and their
+ associated negative performance impact.
+
+ 2. The initial RTO should be small enough to ensure a timely recovery
+ from packet loss occurring before an RTT sample is taken.
+
+ Traditionally, TCP has used 3 seconds as the initial RTO [Bra89]
+ [PA00]. This document calls for lowering this value to 1 second
+ using the following rationale:
+
+ - Modern networks are simply faster than the state-of-the-art was at
+ the time the initial RTO of 3 seconds was defined.
+
+ - Studies have found that the round-trip times of more than 97.5% of
+ the connections observed in a large scale analysis were less than 1
+ second [Chu09], suggesting that 1 second meets criterion 1 above.
+
+ - In addition, the studies observed retransmission rates within the
+ three-way handshake of roughly 2%. This shows that reducing the
+ initial RTO has benefit to a non-negligible set of connections.
+
+ - However, roughly 2.5% of the connections studied in [Chu09] have an
+ RTT longer than 1 second. For those connections, a 1 second
+ initial RTO guarantees a retransmission during connection
+ establishment (needed or not).
+
+ When this happens, this document calls for reverting to an initial
+ RTO of 3 seconds for the data transmission phase. Therefore, the
+ implications of the spurious retransmission are modest: (1) an
+ extra SYN is transmitted into the network, and (2) according to RFC
+ 5681 [APB09] the initial congestion window will be limited to 1
+ segment. While (2) clearly puts such connections at a
+ disadvantage, this document at least resets the RTO such that the
+ connection will not continually run into problems with a short
+ timeout. (Of course, if the RTT is more than 3 seconds, the
+ connection will still encounter difficulties. But that is not a
+ new issue for TCP.)
+
+ In addition, we note that when using timestamps, TCP will be able
+ to take an RTT sample even in the presence of a spurious
+ retransmission, facilitating convergence to a correct RTT estimate
+ when the RTT exceeds 1 second.
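
The effect of a path whose RTT exceeds the initial RTO can be sketched
by counting SYN timer expiries before the SYN-ACK arrives. The function
below is a hypothetical illustration that assumes no packet loss (so
every retransmission it counts is spurious) and omits the optional
upper bound on RTO from (2.5).

```python
def syn_retransmissions(rtt: float, initial_rto: float) -> int:
    """Count SYN timer expiries before the SYN-ACK arrives at time `rtt`.

    Assumes no loss, so every retransmission counted here is spurious.
    """
    count = 0
    rto = initial_rto
    expiry = rto              # first expiry, measured from the original SYN
    while expiry < rtt:
        count += 1
        rto *= 2              # back off as in (5.5)
        expiry += rto
    return count
```

For a path with a 1.5-second RTT, an initial RTO of 1 second gives one
spurious SYN retransmission, while an initial RTO of 3 seconds gives
none. For a 3.5-second RTT the counts are two and one respectively.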
+
+ As an additional check on the results presented in [Chu09], we
+ analyzed packet traces of client behavior collected at four different
+ vantage points at different times, as follows:
+
+ Name       Dates           Pkts.  Cnns.  Clnts.  Servs.
+ --------------------------------------------------------
+ LBL-1      Oct/05--Mar/06  292M   242K   228     74K
+ LBL-2      Nov/09--Feb/10  1.1B   1.2M   1047    38K
+ ICSI-1     Sep/11--18/07   137M   2.1M   193     486K
+ ICSI-2     Sep/11--18/08   163M   1.9M   177     277K
+ ICSI-3     Sep/14--21/09   334M   3.1M   170     253K
+ ICSI-4     Sep/11--18/10   298M   5M     183     189K
+ Dartmouth  Jan/4--21/04    1B     4M     3782    132K
+ SIGCOMM    Aug/17--21/08   11.6M  133K   152     29K
+
+ The "LBL" data was taken at the Lawrence Berkeley National
+ Laboratory, the "ICSI" data from the International Computer Science
+ Institute, the "SIGCOMM" data from the wireless network that served
+ the attendees of SIGCOMM 2008, and the "Dartmouth" data was collected
+ from Dartmouth College's wireless network. The latter two datasets
+ are available from the CRAWDAD data repository [HKA04] [SLS09]. The
+ table lists the dates of the data collections, the number of packets
+ collected, the number of TCP connections observed, the number of
+ local clients monitored, and the number of remote servers contacted.
+ We consider only connections initiated near the tracing vantage
+ point.
+
+ Analysis of these datasets finds the prevalence of retransmitted SYNs
+ to be between 0.03% (ICSI-4) and roughly 2% (LBL-1 and Dartmouth).
+
+ We then analyzed the data to determine the number of additional and
+ spurious retransmissions that would have been incurred if the initial
+ RTO were assumed to be 1 second. In most of the datasets, the
+ proportion of connections with spurious retransmits was less than
+ 0.1%. However, in the Dartmouth dataset, approximately 1.1% of the
+ connections would have sent a spurious retransmit with a lower
+ initial RTO. We attribute this to the fact that the monitored
+ network is wireless and therefore susceptible to additional delays
+ from RF effects.
+
+ Finally, there are obviously performance benefits from retransmitting
+ lost SYNs with a reduced initial RTO. Across our datasets, the
+ percentage of connections that retransmitted a SYN and would realize
+ at least a 10% performance improvement by using the smaller initial
+ RTO specified in this document ranges from 43% (LBL-1) to 87%
+ (ICSI-4). The percentage of connections that would realize at least
+ a 50% performance improvement ranges from 17% (ICSI-1 and SIGCOMM) to
+ 73% (ICSI-4).
+
+ From the data to which we have access, we conclude that the lower
+ initial RTO is likely to be beneficial to many connections, and
+ harmful to relatively few.
+
+ Authors' Addresses
+
+ Vern Paxson
+ ICSI/UC Berkeley
+ 1947 Center Street
+ Suite 600
+ Berkeley, CA 94704-1198
+
+ Phone: 510-666-2882
+ EMail: vern@icir.org
+ http://www.icir.org/vern/
+
+
+ Mark Allman
+ ICSI
+ 1947 Center Street
+ Suite 600
+ Berkeley, CA 94704-1198
+
+ Phone: 440-235-1792
+ EMail: mallman@icir.org
+ http://www.icir.org/mallman/
+
+
+ H.K. Jerry Chu
+ Google, Inc.
+ 1600 Amphitheatre Parkway
+ Mountain View, CA 94043
+
+ Phone: 650-253-3010
+ EMail: hkchu@google.com
+
+
+ Matt Sargent
+ Case Western Reserve University
+ Olin Building
+ 10900 Euclid Avenue
+ Room 505
+ Cleveland, OH 44106
+
+ Phone: 440-223-5932
+ EMail: mts71@case.edu