diff options
Diffstat (limited to 'doc/rfc/rfc6298.txt')
-rw-r--r-- | doc/rfc/rfc6298.txt | 619 |
1 files changed, 619 insertions, 0 deletions
diff --git a/doc/rfc/rfc6298.txt b/doc/rfc/rfc6298.txt new file mode 100644 index 0000000..4466ef5 --- /dev/null +++ b/doc/rfc/rfc6298.txt @@ -0,0 +1,619 @@ + + + + + + +Internet Engineering Task Force (IETF) V. Paxson +Request for Comments: 6298 ICSI/UC Berkeley +Obsoletes: 2988 M. Allman +Updates: 1122 ICSI +Category: Standards Track J. Chu +ISSN: 2070-1721 Google + M. Sargent + CWRU + June 2011 + + + Computing TCP's Retransmission Timer + +Abstract + + This document defines the standard algorithm that Transmission + Control Protocol (TCP) senders are required to use to compute and + manage their retransmission timer. It expands on the discussion in + Section 4.2.3.1 of RFC 1122 and upgrades the requirement of + supporting the algorithm from a SHOULD to a MUST. This document + obsoletes RFC 2988. + +Status of This Memo + + This is an Internet Standards Track document. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Further information on + Internet Standards is available in Section 2 of RFC 5741. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + http://www.rfc-editor.org/info/rfc6298. + + + + + + + + + + + + + + + + +Paxson, et al. Standards Track [Page 1] + +RFC 6298 Computing TCP's Retransmission Timer June 2011 + + +Copyright Notice + + Copyright (c) 2011 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (http://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. Code Components extracted from this document must + include Simplified BSD License text as described in Section 4.e of + the Trust Legal Provisions and are provided without warranty as + described in the Simplified BSD License. + +1. Introduction + + The Transmission Control Protocol (TCP) [Pos81] uses a retransmission + timer to ensure data delivery in the absence of any feedback from the + remote data receiver. The duration of this timer is referred to as + RTO (retransmission timeout). RFC 1122 [Bra89] specifies that the + RTO should be calculated as outlined in [Jac88]. + + This document codifies the algorithm for setting the RTO. In + addition, this document expands on the discussion in Section 4.2.3.1 + of RFC 1122 and upgrades the requirement of supporting the algorithm + from a SHOULD to a MUST. RFC 5681 [APB09] outlines the algorithm TCP + uses to begin sending after the RTO expires and a retransmission is + sent. This document does not alter the behavior outlined in RFC 5681 + [APB09]. + + In some situations, it may be beneficial for a TCP sender to be more + conservative than the algorithms detailed in this document allow. + However, a TCP MUST NOT be more aggressive than the following + algorithms allow. This document obsoletes RFC 2988 [PA00]. + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in [Bra97]. + +2. The Basic Algorithm + + To compute the current RTO, a TCP sender maintains two state + variables, SRTT (smoothed round-trip time) and RTTVAR (round-trip + time variation). In addition, we assume a clock granularity of G + seconds. + + + + + +Paxson, et al. Standards Track [Page 2] + +RFC 6298 Computing TCP's Retransmission Timer June 2011 + + + The rules governing the computation of SRTT, RTTVAR, and RTO are as + follows: + + (2.1) Until a round-trip time (RTT) measurement has been made for a + segment sent between the sender and receiver, the sender SHOULD + set RTO <- 1 second, though the "backing off" on repeated + retransmission discussed in (5.5) still applies. + + Note that the previous version of this document used an initial + RTO of 3 seconds [PA00]. A TCP implementation MAY still use + this value (or any other value > 1 second). This change in the + lower bound on the initial RTO is discussed in further detail + in Appendix A. + + (2.2) When the first RTT measurement R is made, the host MUST set + + SRTT <- R + RTTVAR <- R/2 + RTO <- SRTT + max (G, K*RTTVAR) + + where K = 4. + + (2.3) When a subsequent RTT measurement R' is made, a host MUST set + + RTTVAR <- (1 - beta) * RTTVAR + beta * |SRTT - R'| + SRTT <- (1 - alpha) * SRTT + alpha * R' + + The value of SRTT used in the update to RTTVAR is its value + before updating SRTT itself using the second assignment. That + is, updating RTTVAR and SRTT MUST be computed in the above + order. + + The above SHOULD be computed using alpha=1/8 and beta=1/4 (as + suggested in [JK88]). + + After the computation, a host MUST update + RTO <- SRTT + max (G, K*RTTVAR) + + (2.4) Whenever RTO is computed, if it is less than 1 second, then the + RTO SHOULD be rounded up to 1 second. + + Traditionally, TCP implementations use coarse grain clocks to + measure the RTT and trigger the RTO, which imposes a large + minimum value on the RTO. Research suggests that a large + minimum RTO is needed to keep TCP conservative and avoid + spurious retransmissions [AP99]. Therefore, this specification + requires a large minimum RTO as a conservative approach, while + + + + +Paxson, et al. Standards Track [Page 3] + +RFC 6298 Computing TCP's Retransmission Timer June 2011 + + + at the same time acknowledging that at some future point, + research may show that a smaller minimum RTO is acceptable or + superior. + + (2.5) A maximum value MAY be placed on RTO provided it is at least 60 + seconds. + +3. Taking RTT Samples + + TCP MUST use Karn's algorithm [KP87] for taking RTT samples. That + is, RTT samples MUST NOT be made using segments that were + retransmitted (and thus for which it is ambiguous whether the reply + was for the first instance of the packet or a later instance). The + only case when TCP can safely take RTT samples from retransmitted + segments is when the TCP timestamp option [JBB92] is employed, since + the timestamp option removes the ambiguity regarding which instance + of the data segment triggered the acknowledgment. + + Traditionally, TCP implementations have taken one RTT measurement at + a time (typically, once per RTT). However, when using the timestamp + option, each ACK can be used as an RTT sample. RFC 1323 [JBB92] + suggests that TCP connections utilizing large congestion windows + should take many RTT samples per window of data to avoid aliasing + effects in the estimated RTT. A TCP implementation MUST take at + least one RTT measurement per RTT (unless that is not possible per + Karn's algorithm). + + For fairly modest congestion window sizes, research suggests that + timing each segment does not lead to a better RTT estimator [AP99]. + Additionally, when multiple samples are taken per RTT, the alpha and + beta defined in Section 2 may keep an inadequate RTT history. A + method for changing these constants is currently an open research + question. + +4. Clock Granularity + + There is no requirement for the clock granularity G used for + computing RTT measurements and the different state variables. + However, if the K*RTTVAR term in the RTO calculation equals zero, the + variance term MUST be rounded to G seconds (i.e., use the equation + given in step 2.3). + + RTO <- SRTT + max (G, K*RTTVAR) + + Experience has shown that finer clock granularities (<= 100 msec) + perform somewhat better than coarser granularities. + + + + + +Paxson, et al. Standards Track [Page 4] + +RFC 6298 Computing TCP's Retransmission Timer June 2011 + + + Note that [Jac88] outlines several clever tricks that can be used to + obtain better precision from coarse granularity timers. These + changes are widely implemented in current TCP implementations. + +5. Managing the RTO Timer + + An implementation MUST manage the retransmission timer(s) in such a + way that a segment is never retransmitted too early, i.e., less than + one RTO after the previous transmission of that segment. + + The following is the RECOMMENDED algorithm for managing the + retransmission timer: + + (5.1) Every time a packet containing data is sent (including a + retransmission), if the timer is not running, start it running + so that it will expire after RTO seconds (for the current value + of RTO). + + (5.2) When all outstanding data has been acknowledged, turn off the + retransmission timer. + + (5.3) When an ACK is received that acknowledges new data, restart the + retransmission timer so that it will expire after RTO seconds + (for the current value of RTO). + + When the retransmission timer expires, do the following: + + (5.4) Retransmit the earliest segment that has not been acknowledged + by the TCP receiver. + + (5.5) The host MUST set RTO <- RTO * 2 ("back off the timer"). The + maximum value discussed in (2.5) above may be used to provide + an upper bound to this doubling operation. + + (5.6) Start the retransmission timer, such that it expires after RTO + seconds (for the value of RTO after the doubling operation + outlined in 5.5). + + (5.7) If the timer expires awaiting the ACK of a SYN segment and the + TCP implementation is using an RTO less than 3 seconds, the RTO + MUST be re-initialized to 3 seconds when data transmission + begins (i.e., after the three-way handshake completes). + + This represents a change from the previous version of this + document [PA00] and is discussed in Appendix A. + + + + + + +Paxson, et al. Standards Track [Page 5] + +RFC 6298 Computing TCP's Retransmission Timer June 2011 + + + Note that after retransmitting, once a new RTT measurement is + obtained (which can only happen when new data has been sent and + acknowledged), the computations outlined in Section 2 are performed, + including the computation of RTO, which may result in "collapsing" + RTO back down after it has been subject to exponential back off (rule + 5.5). + + Note that a TCP implementation MAY clear SRTT and RTTVAR after + backing off the timer multiple times as it is likely that the current + SRTT and RTTVAR are bogus in this situation. Once SRTT and RTTVAR + are cleared, they should be initialized with the next RTT sample + taken per (2.2) rather than using (2.3). + +6. Security Considerations + + This document requires a TCP to wait for a given interval before + retransmitting an unacknowledged segment. An attacker could cause a + TCP sender to compute a large value of RTO by adding delay to a timed + packet's latency, or that of its acknowledgment. However, the + ability to add delay to a packet's latency often coincides with the + ability to cause the packet to be lost, so it is difficult to see + what an attacker might gain from such an attack that could cause more + damage than simply discarding some of the TCP connection's packets. + + The Internet, to a considerable degree, relies on the correct + implementation of the RTO algorithm (as well as those described in + RFC 5681) in order to preserve network stability and avoid congestion + collapse. An attacker could cause TCP endpoints to respond more + aggressively in the face of congestion by forging acknowledgments for + segments before the receiver has actually received the data, thus + lowering RTO to an unsafe value. But to do so requires spoofing the + acknowledgments correctly, which is difficult unless the attacker can + monitor traffic along the path between the sender and the receiver. + In addition, even if the attacker can cause the sender's RTO to reach + too small a value, it appears the attacker cannot leverage this into + much of an attack (compared to the other damage they can do if they + can spoof packets belonging to the connection), since the sending TCP + will still back off its timer in the face of an incorrectly + transmitted packet's loss due to actual congestion. + + The security considerations in RFC 5681 [APB09] are also applicable + to this document. + + + + + + + + + +Paxson, et al. Standards Track [Page 6] + +RFC 6298 Computing TCP's Retransmission Timer June 2011 + + +7. Changes from RFC 2988 + + This document reduces the initial RTO from the previous 3 seconds + [PA00] to 1 second, unless the SYN or the ACK of the SYN is lost, in + which case the default RTO is reverted to 3 seconds before data + transmission begins. + +8. Acknowledgments + + The RTO algorithm described in this memo was originated by Van + Jacobson in [Jac88]. + + Much of the data that motivated changing the initial RTO from 3 + seconds to 1 second came from Robert Love, Andre Broido, and Mike + Belshe. + +9. References + +9.1. Normative References + + [APB09] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion + Control", RFC 5681, September 2009. + + [Bra89] Braden, R., Ed., "Requirements for Internet Hosts - + Communication Layers", STD 3, RFC 1122, October 1989. + + [Bra97] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [JBB92] Jacobson, V., Braden, R., and D. Borman, "TCP Extensions for + High Performance", RFC 1323, May 1992. + + [Pos81] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, + September 1981. + +9.2. Informative References + + [AP99] Allman, M. and V. Paxson, "On Estimating End-to-End Network + Path Properties", SIGCOMM 99. + + [Chu09] Chu, J., "Tuning TCP Parameters for the 21st Century", + http://www.ietf.org/proceedings/75/slides/tcpm-1.pdf, July + 2009. + + [SLS09] Schulman, A., Levin, D., and Spring, N., "CRAWDAD data set + umd/sigcomm2008 (v. 2009-03-02)", + http://crawdad.cs.dartmouth.edu/umd/sigcomm2008, March, 2009. + + + + +Paxson, et al. Standards Track [Page 7] + +RFC 6298 Computing TCP's Retransmission Timer June 2011 + + + [HKA04] Henderson, T., Kotz, D., and Abyzov, I., "CRAWDAD trace + dartmouth/campus/tcpdump/fall03 (v. 2004-11-09)", + http://crawdad.cs.dartmouth.edu/dartmouth/campus/ + tcpdump/fall03, November 2004. + + [Jac88] Jacobson, V., "Congestion Avoidance and Control", Computer + Communication Review, vol. 18, no. 4, pp. 314-329, Aug. + 1988. + + [JK88] Jacobson, V. and M. Karels, "Congestion Avoidance and + Control", ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z. + + [KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time + Estimates in Reliable Transport Protocols", SIGCOMM 87. + + [PA00] Paxson, V. and M. Allman, "Computing TCP's Retransmission + Timer", RFC 2988, November 2000. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Paxson, et al. Standards Track [Page 8] + +RFC 6298 Computing TCP's Retransmission Timer June 2011 + + +Appendix A. Rationale for Lowering the Initial RTO + + Choosing a reasonable initial RTO requires balancing two competing + considerations: + + 1. The initial RTO should be sufficiently large to cover most of the + end-to-end paths to avoid spurious retransmissions and their + associated negative performance impact. + + 2. The initial RTO should be small enough to ensure a timely recovery + from packet loss occurring before an RTT sample is taken. + + Traditionally, TCP has used 3 seconds as the initial RTO [Bra89] + [PA00]. This document calls for lowering this value to 1 second + using the following rationale: + + - Modern networks are simply faster than the state-of-the-art was at + the time the initial RTO of 3 seconds was defined. + + - Studies have found that the round-trip times of more than 97.5% of + the connections observed in a large scale analysis were less than 1 + second [Chu09], suggesting that 1 second meets criterion 1 above. + + - In addition, the studies observed retransmission rates within the + three-way handshake of roughly 2%. This shows that reducing the + initial RTO has benefit to a non-negligible set of connections. + + - However, roughly 2.5% of the connections studied in [Chu09] have an + RTT longer than 1 second. For those connections, a 1 second + initial RTO guarantees a retransmission during connection + establishment (needed or not). + + When this happens, this document calls for reverting to an initial + RTO of 3 seconds for the data transmission phase. Therefore, the + implications of the spurious retransmission are modest: (1) an + extra SYN is transmitted into the network, and (2) according to RFC + 5681 [APB09] the initial congestion window will be limited to 1 + segment. While (2) clearly puts such connections at a + disadvantage, this document at least resets the RTO such that the + connection will not continually run into problems with a short + timeout. (Of course, if the RTT is more than 3 seconds, the + connection will still encounter difficulties. But that is not a + new issue for TCP.) + + In addition, we note that when using timestamps, TCP will be able + to take an RTT sample even in the presence of a spurious + retransmission, facilitating convergence to a correct RTT estimate + when the RTT exceeds 1 second. + + + +Paxson, et al. Standards Track [Page 9] + +RFC 6298 Computing TCP's Retransmission Timer June 2011 + + + As an additional check on the results presented in [Chu09], we + analyzed packet traces of client behavior collected at four different + vantage points at different times, as follows: + + Name Dates Pkts. Cnns. Clnts. Servs. + -------------------------------------------------------- + LBL-1 Oct/05--Mar/06 292M 242K 228 74K + LBL-2 Nov/09--Feb/10 1.1B 1.2M 1047 38K + ICSI-1 Sep/11--18/07 137M 2.1M 193 486K + ICSI-2 Sep/11--18/08 163M 1.9M 177 277K + ICSI-3 Sep/14--21/09 334M 3.1M 170 253K + ICSI-4 Sep/11--18/10 298M 5M 183 189K + Dartmouth Jan/4--21/04 1B 4M 3782 132K + SIGCOMM Aug/17--21/08 11.6M 133K 152 29K + + The "LBL" data was taken at the Lawrence Berkeley National + Laboratory, the "ICSI" data from the International Computer Science + Institute, the "SIGCOMM" data from the wireless network that served + the attendees of SIGCOMM 2008, and the "Dartmouth" data was collected + from Dartmouth College's wireless network. The latter two datasets + are available from the CRAWDAD data repository [HKA04] [SLS09]. The + table lists the dates of the data collections, the number of packets + collected, the number of TCP connections observed, the number of + local clients monitored, and the number of remote servers contacted. + We consider only connections initiated near the tracing vantage + point. + + Analysis of these datasets finds the prevalence of retransmitted SYNs + to be between 0.03% (ICSI-4) to roughly 2% (LBL-1 and Dartmouth). + + We then analyzed the data to determine the number of additional and + spurious retransmissions that would have been incurred if the initial + RTO was assumed to be 1 second. In most of the datasets, the + proportion of connections with spurious retransmits was less than + 0.1%. However, in the Dartmouth dataset, approximately 1.1% of the + connections would have sent a spurious retransmit with a lower + initial RTO. We attribute this to the fact that the monitored + network is wireless and therefore susceptible to additional delays + from RF effects. + + Finally, there are obviously performance benefits from retransmitting + lost SYNs with a reduced initial RTO. Across our datasets, the + percentage of connections that retransmitted a SYN and would realize + at least a 10% performance improvement by using the smaller initial + RTO specified in this document ranges from 43% (LBL-1) to 87% + (ICSI-4). The percentage of connections that would realize at least + a 50% performance improvement ranges from 17% (ICSI-1 and SIGCOMM) to + 73% (ICSI-4). + + + +Paxson, et al. Standards Track [Page 10] + +RFC 6298 Computing TCP's Retransmission Timer June 2011 + + + From the data to which we have access, we conclude that the lower + initial RTO is likely to be beneficial to many connections, and + harmful to relatively few. + + Authors' Addresses + + Vern Paxson + ICSI/UC Berkeley + 1947 Center Street + Suite 600 + Berkeley, CA 94704-1198 + + Phone: 510-666-2882 + EMail: vern@icir.org + http://www.icir.org/vern/ + + + Mark Allman + ICSI + 1947 Center Street + Suite 600 + Berkeley, CA 94704-1198 + + Phone: 440-235-1792 + EMail: mallman@icir.org + http://www.icir.org/mallman/ + + + H.K. Jerry Chu + Google, Inc. + 1600 Amphitheatre Parkway + Mountain View, CA 94043 + + Phone: 650-253-3010 + EMail: hkchu@google.com + + + Matt Sargent + Case Western Reserve University + Olin Building + 10900 Euclid Avenue + Room 505 + Cleveland, OH 44106 + + Phone: 440-223-5932 + EMail: mts71@case.edu + + + + + +Paxson, et al. Standards Track [Page 11] + |