1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
|
Network Working Group P. Sarolahti
Request for Comments: 5682 Nokia Research Center
Updates: 4138 M. Kojo
Category: Standards Track University of Helsinki
K. Yamamoto
M. Hata
NTT Docomo
September 2009
Forward RTO-Recovery (F-RTO): An Algorithm for Detecting
Spurious Retransmission Timeouts with TCP
Abstract
The purpose of this document is to move the F-RTO (Forward
RTO-Recovery) functionality for TCP in RFC 4138 from
Experimental to Standards Track status. The F-RTO support for Stream
Control Transmission Protocol (SCTP) in RFC 4138 remains with
Experimental status. See Appendix B for the differences between this
document and RFC 4138.
Spurious retransmission timeouts cause suboptimal TCP performance
because they often result in unnecessary retransmission of the last
window of data. This document describes the F-RTO detection
algorithm for detecting spurious TCP retransmission timeouts. F-RTO
is a TCP sender-only algorithm that does not require any TCP options
to operate. After retransmitting the first unacknowledged segment
triggered by a timeout, the F-RTO algorithm of the TCP sender
monitors the incoming acknowledgments to determine whether the
timeout was spurious. It then decides whether to send new segments
or retransmit unacknowledged segments. The algorithm effectively
helps to avoid additional unnecessary retransmissions and thereby
improves TCP performance in the case of a spurious timeout.
Status of This Memo
This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.
Sarolahti, et al. Standards Track [Page 1]
^L
RFC 5682 F-RTO September 2009
Copyright and License Notice
Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the BSD License.
Table of Contents
1. Introduction ....................................................3
1.1. Conventions and Terminology ................................5
2. Basic F-RTO Algorithm ...........................................5
2.1. The Algorithm ..............................................5
2.2. Discussion .................................................7
3. SACK-Enhanced Version of the F-RTO Algorithm ....................9
3.1. The Algorithm ..............................................9
3.2. Discussion ................................................11
4. Taking Actions after Detecting Spurious RTO ....................11
5. Evaluation of RFC 4138 .........................................12
6. Security Considerations ........................................13
7. Acknowledgments ................................................14
Appendix A. Discussion of Window-Limited Cases ....................15
Appendix B. Changes since RFC 4138 ................................16
References ........................................................16
Normative References ...........................................16
Informative References .........................................17
Sarolahti, et al. Standards Track [Page 2]
^L
RFC 5682 F-RTO September 2009
1. Introduction
The Transmission Control Protocol (TCP) [Pos81] has two methods for
triggering retransmissions. First, the TCP sender relies on incoming
duplicate acknowledgments (ACKs), which indicate that the receiver is
missing some of the data. After a required number of successive
duplicate ACKs have arrived at the sender, it retransmits the first
unacknowledged segment [APB09] and continues with a loss recovery
algorithm such as NewReno [FHG04] or SACK-based (Selective
Acknowledgment) loss recovery [BAFW03]. Second, the TCP sender
maintains a retransmission timer that triggers retransmission of
segments, if they have not been acknowledged before the
retransmission timeout (RTO) occurs. When the retransmission timeout
occurs, the TCP sender enters the RTO recovery where the congestion
window is initialized to one segment and unacknowledged segments are
retransmitted using the slow-start algorithm. The retransmission
timer is adjusted dynamically, based on the measured round-trip times
[PA00].
It has been pointed out that the retransmission timer can expire
spuriously and cause unnecessary retransmissions when no segments
have been lost [LK00, GL02, LM03]. After a spurious retransmission
timeout, the late acknowledgments of the original segments arrive at
the sender, usually triggering unnecessary retransmissions of a whole
window of segments during the RTO recovery. Furthermore, after a
spurious retransmission timeout, a conventional TCP sender increases
the congestion window on each late acknowledgment in slow start.
This injects a large number of data segments into the network within
one round-trip time, thus violating the packet conservation principle
[Jac88].
There are a number of potential reasons for spurious retransmission
timeouts. First, some mobile networking technologies involve sudden
delay spikes on transmission because of actions taken during a hand-
off. Second, a hand-off may take place from a low latency path to a
high latency path, suddenly increasing the round-trip time beyond the
current RTO value. Third, on a low-bandwidth link the arrival of
competing traffic (possibly with higher priority), or some other
change in available bandwidth, can cause a sudden increase of the
round-trip time. This may trigger a spurious retransmission timeout.
A persistently reliable link layer can also cause a sudden delay when
a data frame and several retransmissions of it are lost for some
reason. This document does not distinguish between the different
causes of such a delay spike. Rather, it discusses the spurious
retransmission timeouts caused by a delay spike in general.
Sarolahti, et al. Standards Track [Page 3]
^L
RFC 5682 F-RTO September 2009
This document describes the F-RTO detection algorithm for TCP. It is
based on the detection mechanism of the "Forward RTO-Recovery"
(F-RTO) algorithm [SKR03] that is used for detecting spurious
retransmission timeouts and thus avoids unnecessary retransmissions
following the retransmission timeout. When the timeout is not
spurious, the F-RTO algorithm reverts back to the conventional RTO
recovery algorithm, and therefore has similar behavior and
performance. In contrast to alternative algorithms proposed for
detecting unnecessary retransmissions (Eifel [LK00, LM03] and DSACK-
based (Duplicate SACK) algorithms [BA04]), F-RTO does not require any
TCP options for its operation, and it can be implemented by modifying
only the TCP sender. The Eifel algorithm uses TCP timestamps [BBJ92]
for detecting a spurious timeout upon arrival of the first
acknowledgment after the retransmission. The DSACK-based algorithms
require that the TCP Selective Acknowledgment Option [MMFR96], with
the DSACK extension [FMMP00], is in use. With DSACK, the TCP
receiver can report if it has received a duplicate segment, enabling
the sender to detect afterwards whether it has retransmitted segments
unnecessarily. The F-RTO algorithm only attempts to detect and avoid
unnecessary retransmissions after an RTO. Eifel and DSACK can also
be used for detecting unnecessary retransmissions caused by other
events, such as packet reordering.
When the retransmission timer expires, the F-RTO sender retransmits
the first unacknowledged segment as usual [APB09]. Deviating from
the normal operation after a timeout, it then tries to transmit new,
previously unsent data for the first acknowledgment that arrives
after the timeout, given that the acknowledgment advances the window.
If the second acknowledgment that arrives after the timeout advances
the window (i.e., acknowledges data that was not retransmitted), the
F-RTO sender declares the timeout spurious and exits the RTO
recovery. However, if either of these two acknowledgments is a
duplicate ACK, there will not be sufficient evidence of a spurious
timeout. Therefore, the F-RTO sender retransmits the unacknowledged
segments in slow start similar to the traditional algorithm. With a
SACK-enhanced version of the F-RTO algorithm, spurious timeouts may
be detected even if duplicate ACKs arrive after an RTO
retransmission.
This document specifies the F-RTO algorithm for TCP only, replacing
the F-RTO functionality with TCP in RFC 4138 [SK05] and moving it
from Experimental to Standards Track status. The algorithm can also
be applied to the Stream Control Transmission Protocol (SCTP) [Ste07]
that has acknowledgment and packet retransmission concepts similar to
TCP. The considerations on applying F-RTO to SCTP are discussed in
RFC 4138, but the F-RTO support for SCTP remains with Experimental
status.
Sarolahti, et al. Standards Track [Page 4]
^L
RFC 5682 F-RTO September 2009
This document is organized as follows. Section 2 describes the basic
F-RTO algorithm, and the SACK-enhanced F-RTO algorithm is given in
Section 3. Section 4 discusses the possible actions to be taken
after detecting a spurious RTO. Section 5 summarizes the experience
with F-RTO implementations and the experimental results, and Section
6 discusses the security considerations.
1.1. Conventions and Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in BCP 14, RFC 2119
[RFC2119] and indicate requirement levels for protocols.
2. Basic F-RTO Algorithm
A timeout is considered spurious if it would have been avoided had
the sender waited longer for an acknowledgment to arrive [LM03].
F-RTO affects the TCP sender behavior only after a retransmission
timeout. Otherwise, the TCP behavior remains the same. When the
retransmission timer expires, the F-RTO algorithm monitors incoming
acknowledgments, and if the TCP sender gets an acknowledgment for a
segment that was not retransmitted due to the timeout, the F-RTO
algorithm declares a timeout spurious. The actions taken in response
to a spurious timeout are not specified in this document, but we
discuss some alternatives in Section 4. This section introduces the
algorithm and then discusses the different steps of the algorithm in
more detail.
Following the practice used with the Eifel Detection algorithm
[LM03], we use the "SpuriousRecovery" variable to indicate whether
the retransmission is declared spurious by the sender. This variable
can be used as an input for a corresponding response algorithm. With
F-RTO, the value of SpuriousRecovery can be either SPUR_TO
(indicating a spurious retransmission timeout) or FALSE (indicating
that the timeout is not declared spurious and the TCP sender should
follow the conventional RTO recovery algorithm). In addition, we use
the "recover" variable specified in the NewReno algorithm [FHG04].
2.1. The Algorithm
A TCP sender implementing the basic F-RTO algorithm MUST take the
following steps after the retransmission timer expires. If the
retransmission timer expires again during the execution of the F-RTO
algorithm, the TCP sender MUST re-start the algorithm processing from
step 1. If the sender implements some loss recovery algorithm other
than Reno or NewReno [FHG04], the F-RTO algorithm SHOULD NOT be
entered when earlier fast recovery is underway.
Sarolahti, et al. Standards Track [Page 5]
^L
RFC 5682 F-RTO September 2009
The F-RTO algorithm takes different actions based on whether an
incoming acknowledgment advances the cumulative acknowledgment point
for a received in-order segment, or whether it is a duplicate
acknowledgment to indicate an out-of-order segment. Duplicate
acknowledgment is defined in [APB09]. The F-RTO algorithm does not
specify actions for receiving a segment that neither acknowledges new
data nor is a duplicate acknowledgment. The TCP sender SHOULD ignore
such segments and wait for a segment that either acknowledges new
data or is a duplicate acknowledgment.
1) When the retransmission timer expires, retransmit the first
unacknowledged segment and set SpuriousRecovery to FALSE. If the
TCP sender is already in RTO recovery AND "recover" is larger than
or equal to SND.UNA (the oldest unacknowledged sequence number
[Pos81]), do not enter step 2 of this algorithm. Instead, store
the highest sequence number transmitted so far in variable
"recover" and continue with slow-start retransmissions following
the conventional RTO recovery algorithm.
2) When the first acknowledgment after the RTO retransmission arrives
at the TCP sender, store the highest sequence number transmitted
so far in variable "recover". The TCP sender chooses one of the
following actions, depending on whether the ACK advances the
window or whether it is a duplicate ACK.
a) If the acknowledgment is a duplicate ACK, OR the Acknowledgment
field covers "recover" but not more than "recover", OR the
acknowledgment does not acknowledge all of the data that was
retransmitted in step 1, revert to the conventional RTO
recovery and continue by retransmitting unacknowledged data in
slow start. Do not enter step 3 of this algorithm. The
SpuriousRecovery variable remains as FALSE.
b) Else, if the acknowledgment advances the window AND the
Acknowledgment field does not cover "recover", transmit up to
two new (previously unsent) segments and enter step 3 of this
algorithm. If the TCP sender does not have enough unsent data,
it can send only one segment. In addition, the TCP sender MAY
override the Nagle algorithm [Nag84] and immediately send a
segment if needed. Note that sending two segments in this step
is allowed by TCP congestion control requirements [APB09]: an
F-RTO TCP sender simply chooses different segments to transmit.
If the TCP sender does not have any new data to send, or the
advertised window prohibits new transmissions, the recommended
action is to skip step 3 of this algorithm and continue with
slow-start retransmissions, following the conventional RTO
Sarolahti, et al. Standards Track [Page 6]
^L
RFC 5682 F-RTO September 2009
recovery algorithm. However, alternative ways of handling the
window-limited cases that could result in better performance
are discussed in Appendix A.
3) When the second acknowledgment after the RTO retransmission
arrives at the TCP sender, the TCP sender either declares the
timeout spurious, or starts retransmitting the unacknowledged
segments.
a) If the acknowledgment is a duplicate ACK, set the congestion
window to no more than 3 * MSS (where MSS indicates Maximum
Segment Size), and continue with the slow-start algorithm
retransmitting unacknowledged segments. The congestion window
can be set to 3 * MSS, because two round-trip times have
elapsed since the RTO, and a conventional TCP sender would have
increased cwnd to 3 during the same time. Leave
SpuriousRecovery set to FALSE.
b) If the acknowledgment advances the window (i.e., if it
acknowledges data that was not retransmitted after the
timeout), declare the timeout spurious, set SpuriousRecovery to
SPUR_TO, and set the value of the "recover" variable to SND.UNA
(the oldest unacknowledged sequence number [Pos81]).
2.2. Discussion
The F-RTO sender takes cautious actions when it receives duplicate
acknowledgments after a retransmission timeout. Because duplicate
ACKs may indicate that segments have been lost, reliably detecting a
spurious timeout is difficult due to the lack of additional
information. Therefore, it is prudent to follow the conventional TCP
recovery in those cases.
The condition in step 1 prevents the execution of the F-RTO algorithm
in case a previous RTO recovery is underway when the retransmission
timer expires, except in case the retransmission timer expires
multiple times for the same segment. If the retransmission timer
expires during an earlier RTO-based loss recovery, acknowledgments
for retransmitted segments may falsely lead the TCP sender to declare
the timeout spurious.
If the first acknowledgment after the RTO retransmission covers the
"recover" point at algorithm step (2a), there is not enough evidence
that a non-retransmitted segment has arrived at the receiver after
the timeout. This is a common case when a fast retransmission is
lost and has been retransmitted again after an RTO, while the rest of
Sarolahti, et al. Standards Track [Page 7]
^L
RFC 5682 F-RTO September 2009
the unacknowledged segments were successfully delivered to the TCP
receiver before the retransmission timeout. Therefore, the timeout
cannot be declared spurious in this case.
If the first acknowledgment after the RTO retransmission does not
acknowledge all of the data that was retransmitted in step 1, the TCP
sender reverts to the conventional RTO recovery. Otherwise, a
malicious receiver acknowledging partial segments could cause the
sender to declare the timeout spurious in a case where data was lost.
The TCP sender is allowed to send two new segments in algorithm
branch (2b) because the conventional TCP sender would transmit two
segments when the first new ACK arrives after the RTO retransmission.
If sending new data is not possible in algorithm branch (2b), or if
the receiver window limits the transmission, the TCP sender has to
send something in order to prevent the TCP transfer from stalling.
If no segments were sent, the pipe between sender and receiver might
run out of segments, and no further acknowledgments would arrive.
Therefore, in the window-limited case, the recommendation is to
revert to the conventional RTO recovery with slow-start
retransmissions. Appendix A discusses some alternative solutions for
window-limited situations.
If the retransmission timeout is declared spurious, the TCP sender
sets the value of the "recover" variable to SND.UNA in order to allow
fast retransmit [FHG04]. The "recover" variable was proposed for
avoiding unnecessary, multiple fast retransmits when the
retransmission timer expires during fast recovery with NewReno TCP.
Because the F-RTO sender retransmits only the segment that triggered
the timeout, the problem of unnecessary multiple fast retransmits
[FHG04] cannot occur. Therefore, if three duplicate ACKs arrive at
the sender after the timeout, they probably indicate a packet loss,
and thus fast retransmit should be used to allow efficient recovery.
If there are not enough duplicate ACKs arriving at the sender after a
packet loss, the retransmission timer expires again and the sender
enters step 1 of this algorithm.
When the timeout is declared spurious, the TCP sender cannot detect
whether the unnecessary RTO retransmission was lost. In principle,
the loss of the RTO retransmission should be taken as a congestion
signal. Thus, there is a small possibility that the F-RTO sender
will violate the congestion control rules, if it chooses to fully
revert congestion control parameters after detecting a spurious
timeout. The Eifel Detection algorithm has a similar property, while
the DSACK option can be used to detect whether the retransmitted
segment was successfully delivered to the receiver.
Sarolahti, et al. Standards Track [Page 8]
^L
RFC 5682 F-RTO September 2009
The F-RTO algorithm has a side effect on the TCP round-trip time
measurement. Because the TCP sender can avoid most of the
unnecessary retransmissions after detecting a spurious timeout, the
sender is able to take round-trip time samples on the delayed
segments. If the regular RTO recovery was used without TCP
timestamps, this would not be possible due to the retransmission
ambiguity. As a result, the RTO is likely to have more accurate and
larger values with F-RTO than with the regular TCP after a spurious
timeout that was triggered due to delayed segments. We believe this
is an advantage in networks that are prone to delay spikes.
There are some situations where the F-RTO algorithm may not avoid
unnecessary retransmissions after a spurious timeout. If packet
reordering or packet duplication occurs on the segment that triggered
the spurious timeout, the F-RTO algorithm may not detect the spurious
timeout due to incoming duplicate ACKs. Additionally, if a spurious
timeout occurs during fast recovery, the F-RTO algorithm often cannot
detect the spurious timeout because the segments that were
transmitted before the fast recovery trigger duplicate ACKs.
However, we consider these cases rare, and note that in cases where
F-RTO fails to detect the spurious timeout, it retransmits the
unacknowledged segments in slow start, and thus performs the same as
the regular RTO recovery.
3. SACK-Enhanced Version of the F-RTO Algorithm
This section describes an alternative version of the F-RTO algorithm
that uses the TCP Selective Acknowledgment Option [MMFR96]. By using
the SACK option, the TCP sender detects spurious timeouts in most of
the cases when packet reordering or packet duplication is present.
If the SACK information acknowledges new data that was not
transmitted after the RTO retransmission, the sender may declare the
timeout spurious, even when duplicate ACKs follow the RTO.
3.1. The Algorithm
Given that the TCP Selective Acknowledgment Option [MMFR96] is
enabled for a TCP connection, a TCP sender MAY apply the SACK-
enhanced F-RTO algorithm. If the sender applies the SACK-enhanced
F-RTO algorithm, it MUST follow the steps below. This algorithm
SHOULD NOT be applied if the TCP sender is already in loss recovery
when a retransmission timeout occurs.
The steps of the SACK-enhanced version of the F-RTO algorithm are as
follows. If the retransmission timer expires again during the
execution of the SACK-enhanced F-RTO algorithm, the TCP sender MUST
re-start the algorithm processing from step 1.
Sarolahti, et al. Standards Track [Page 9]
^L
RFC 5682 F-RTO September 2009
1) When the retransmission timer expires, retransmit the first
unacknowledged segment and set SpuriousRecovery to FALSE.
Following the recommendation in the SACK specification [MMFR96],
reset the SACK scoreboard. If "RecoveryPoint" is larger than or
equal to SND.UNA, do not enter step 2 of this algorithm. Instead,
set variable "RecoveryPoint" to indicate the highest sequence
number transmitted so far and continue with slow-start
retransmissions following the conventional RTO recovery algorithm.
2) Wait until the acknowledgment of the data retransmitted due to the
timeout arrives at the sender. If duplicate ACKs arrive before
the cumulative acknowledgment for retransmitted data, adjust the
scoreboard according to the incoming SACK information. Stay in
step 2 and wait for the next new acknowledgment. If the
retransmission timeout expires again, go to step 1 of the
algorithm. When a new acknowledgment arrives, set variable
"RecoveryPoint" to indicate the highest sequence number
transmitted so far.
a) If the Cumulative Acknowledgment field covers "RecoveryPoint"
but not more than "RecoveryPoint", revert to the conventional
RTO recovery and set the congestion window to no more than 2 *
MSS, like a regular TCP would do. Do not enter step 3 of this
algorithm.
b) Else, if the Cumulative Acknowledgment field does not cover
"RecoveryPoint" but is larger than SND.UNA, transmit up to two
new (previously unsent) segments and proceed to step 3. If the
TCP sender is not able to transmit any previously unsent data
-- either due to receiver window limitation or because it does
not have any new data to send -- the recommended action is to
refrain from entering step 3 of this algorithm. Rather,
continue with slow-start retransmissions following the
conventional RTO recovery algorithm.
It is also possible to apply some of the alternatives for
handling window-limited cases discussed in Appendix A.
3) The next acknowledgment arrives at the sender. Either a duplicate
ACK or a new cumulative ACK (advancing the window) applies in this
step. Other types of ACKs are ignored without any action.
a) If the Cumulative Acknowledgment field or the SACK information
covers more than "RecoveryPoint", set the congestion window to
no more than 3 * MSS and proceed with the conventional RTO
recovery, retransmitting unacknowledged segments. Take this
branch also when the acknowledgment is a duplicate ACK and it
does not acknowledge any new, previously unacknowledged data
Sarolahti, et al. Standards Track [Page 10]
^L
RFC 5682 F-RTO September 2009
below "RecoveryPoint" in the SACK information. Leave
SpuriousRecovery set to FALSE.
b) If the Cumulative Acknowledgment field or a SACK information in
the ACK does not cover more than "RecoveryPoint" AND it
acknowledges data that was not acknowledged earlier (either
with cumulative acknowledgment or using SACK information),
declare the timeout spurious and set SpuriousRecovery to
SPUR_TO. The retransmission timeout can be declared spurious,
because the segment acknowledged with this ACK was transmitted
before the timeout.
If there are unacknowledged holes between the received SACK
information, those segments are retransmitted similarly to the
conventional SACK recovery algorithm [BAFW03]. If the algorithm
exits with SpuriousRecovery set to SPUR_TO, "RecoveryPoint" is set to
SND.UNA, thus allowing fast recovery on incoming duplicate
acknowledgments.
3.2. Discussion
The SACK-enhanced algorithm works on the same principle as the basic
algorithm, but by utilizing the additional information from the SACK
option. When a genuine retransmission timeout occurs during a steady
state of a connection, it can be assumed that there are no segments
left in the pipe. Otherwise, the acknowledgments triggered by these
segments would have triggered the SACK loss recovery or transmission
of new segments. Therefore, if the F-RTO sender receives
acknowledgments for segments transmitted before the retransmission
timeout in response to the two new segments sent at the algorithm
step 2, the normal operation of TCP has been just delayed, and the
retransmission timeout is considered spurious. Note that this
reasoning works only when the TCP sender is not in loss recovery at
the time the retransmission timeout occurs. The condition in step 1
checking that "RecoveryPoint" is larger than or equal to SND.UNA
prevents the execution of the F-RTO algorithm in case a previous loss
recovery, either RTO recovery or SACK loss recovery, is underway when
the retransmission timer expires. It, however, allows the execution
of the F-RTO algorithm, if the retransmission timer expires multiple
times for the same segment.
4. Taking Actions after Detecting Spurious RTO
Upon a retransmission timeout, a conventional TCP sender assumes that
outstanding segments are lost and starts retransmitting the
unacknowledged segments. When the retransmission timeout is detected
to be spurious, the TCP sender should not continue retransmitting
based on the timeout. For example, if the sender was in congestion
Sarolahti, et al. Standards Track [Page 11]
^L
RFC 5682 F-RTO September 2009
avoidance phase transmitting new, previously unsent segments, it
should continue transmitting previously unsent segments in congestion
avoidance.
There are currently two alternatives specified for a spurious timeout
response algorithm, the Eifel Response Algorithm [LG05], and an
algorithm for adapting the retransmission timeout after a spurious
RTO [BBA06]. If no specific response algorithm is implemented, the
TCP SHOULD respond to spurious timeout conservatively, applying the
TCP congestion control specification [APB09]. Different response
algorithms for spurious retransmission timeouts have been analyzed in
some research papers [GL03, Sar03] and IETF documents [SL03].
5. Evaluation of RFC 4138
F-RTO was first specified in an Experimental RFC (RFC 4138) that has
been implemented in a number of operating systems since it was
published. Gained experience has been documented in a separate
document [KYHS07], and can be summarized as follows.
If the TCP sender employs F-RTO, it is able to detect spurious RTOs
and avoid the unnecessary retransmission of the whole window of data.
Because F-RTO avoids the unnecessary retransmissions after a spurious
RTO, it is able to adhere to the packet conservation principle,
unlike a regular TCP that enters the slow-start recovery
unnecessarily and inappropriately restarts the ACK clock while there
are segments outstanding in the network. When a spurious RTO has
been detected, a sender can select an appropriate congestion control
response instead of setting the congestion window to one segment.
Because F-RTO avoids unnecessary retransmissions, it is able to take
the round-trip time of the delayed segments into account when
calculating the RTO estimate, which may help in avoiding further
spurious retransmission timeouts.
Experimental results with the basic F-RTO have been reported in an
emulated network using a Linux implementation [SKR03]. Also,
different congestion control responses along with the SACK-enhanced
version of F-RTO were tested in a similar environment [Sar03]. There
are publications analyzing F-RTO performance over commercial Wideband
Code Division Multiple Access (W-CDMA) networks, and in an emulated
High-Speed Downlink Packet Access (HSDPA) network [Yam05, Hok05].
Also, Microsoft reported positive experiences with their
implementation of F-RTO at the IETF-68 meeting.
It is known that some spurious RTOs may remain undetected by F-RTO if
duplicate acknowledgments arrive at the sender immediately after the
spurious RTO, for example due to packet reordering or packet loss.
There are rare corner cases where F-RTO could "hide" a packet loss
Sarolahti, et al. Standards Track [Page 12]
^L
RFC 5682 F-RTO September 2009
and therefore lead to inappropriate behavior with non-conservative
congestion control response: first, if a massive packet reordering
occurred so that the acknowledgment of RTO retransmission arrived at
the sender before the acknowledgments of original transmissions, the
sender might not detect the loss of the segment that triggered the
RTO. Second, a malicious receiver could lead F-RTO to make a wrong
conclusion after an RTO by acknowledging segments it has not
received. Such a receiver would, however, risk breaking the
consistency of the TCP state between the sender and receiver, causing
the connection to become unusable, which cannot be of any benefit to
the receiver. Therefore, we believe it is not likely that receivers
would start employing such tricks on a significant scale. Finally,
loss of the unnecessary RTO retransmission cannot be detected without
using some explicit acknowledgment scheme such as DSACK. This is
common to the other mechanisms for detecting spurious RTO, as well as
to regular TCP that does not use DSACK. We note that if the
congestion control response to spurious RTO is conservative enough,
the above corner cases do not cause problems due to increased
congestion.
6. Security Considerations
The main security threat regarding F-RTO is the possibility that a
receiver could mislead the sender into setting too large a congestion
window after an RTO. There are two possible ways a malicious
receiver could trigger a wrong output from the F-RTO algorithm.
First, the receiver can acknowledge data that it has not received.
Second, it can delay acknowledgment of a segment it has received
earlier, and acknowledge the segment after the TCP sender has been
deluded to enter algorithm step 3.
If the receiver acknowledges a segment it has not really received,
the sender can be led to declare spurious timeout in the F-RTO
algorithm, step 3. However, because the sender will have an
incorrect state, it cannot retransmit the segment that has never
reached the receiver. Therefore, this attack is unlikely to be
useful for the receiver to maliciously gain a larger congestion
window.
A common case for a retransmission timeout is that a fast
retransmission of a segment is lost. If all other segments have been
received, the RTO retransmission causes the whole window to be
acknowledged at once. This case is recognized in F-RTO algorithm
branch (2a). However, if the receiver only acknowledges one segment
after receiving the RTO retransmission, and then the rest of the
segments, it could cause the timeout to be declared spurious when it
is not. Therefore, it is suggested that, when an RTO occurs during
Sarolahti, et al. Standards Track [Page 13]
^L
RFC 5682 F-RTO September 2009
the fast recovery phase, the sender would not fully revert the
congestion window even if the timeout was declared spurious.
Instead, the sender would reduce the congestion window to 1.
If there is more than one segment missing at the time of a
retransmission timeout, the receiver does not benefit from misleading
the sender to declare a spurious timeout because the sender would
have to go through another recovery period to retransmit the missing
segments, usually after an RTO has elapsed.
7. Acknowledgments
The authors would like to thank Alfred Hoenes, Ilpo Jarvinen, and
Murari Sridharan for the comments on this document.
We are also thankful to Reiner Ludwig, Andrei Gurtov, Josh Blanton,
Mark Allman, Sally Floyd, Yogesh Swami, Mika Liljeberg, Ivan Arias
Rodriguez, Sourabh Ladha, Martin Duke, Motoharu Miyake, Ted Faber,
Samu Kontinen, and Kostas Pentikousis who gave valuable feedback
during the preparation of RFC 4138, the precursor of this document.
Sarolahti, et al. Standards Track [Page 14]
^L
RFC 5682 F-RTO September 2009
Appendix A. Discussion of Window-Limited Cases
When the advertised window limits the transmission of two new
previously unsent segments, or there are no new data to send, it is
recommended in F-RTO algorithm step (2b) that the TCP sender continue
with the conventional RTO recovery algorithm. The disadvantage is
that the sender may continue unnecessary retransmissions due to
possible spurious timeout. This section briefly discusses the
options that can potentially improve performance when transmitting
previously unsent data is not possible.
- The TCP sender could reserve an unused space of a size of one or
two segments in the advertised window to ensure the use of
algorithms such as F-RTO or Limited Transmit [ABF01] in receiver
window-limited situations. On the other hand, while doing this,
the TCP sender should ensure that the window of outstanding
segments is large enough for proper utilization of the available
pipe.
- Use additional information if available, e.g., TCP timestamps with
the Eifel Detection algorithm, for detecting a spurious timeout.
However, Eifel detection may yield different results from F-RTO
when ACK losses and an RTO occur within the same round-trip time
[SKR03].
- Retransmit data from the tail of the retransmission queue and
continue with step 3 of the F-RTO algorithm. It is possible that
the retransmission will be made unnecessarily. Furthermore, the
operation of the SACK-based F-RTO algorithm would need to consider
this case separately, to not use the retransmitted segment to
indicate spurious timeout. Given these considerations, this option
is not recommended.
- Send a zero-sized segment below SND.UNA, similar to a TCP Keep-
Alive probe, and continue with step 3 of the F-RTO algorithm.
Because the receiver replies with a duplicate ACK, the sender is
able to detect whether the timeout was spurious from the incoming
acknowledgment. This method does not send data unnecessarily, but
it delays the recovery by one round-trip time in cases where the
timeout was not spurious. Therefore, this method is not
encouraged.
- In receiver-limited cases, send one octet of new data, regardless
of the advertised window limit, and continue with step 3 of the
F-RTO algorithm. It is possible that the receiver will have free
buffer space to receive the data by the time the segment has
Sarolahti, et al. Standards Track [Page 15]
^L
RFC 5682 F-RTO September 2009
propagated through the network, in which case no harm is done. If
the receiver is not capable of receiving the segment, it rejects
the segment and sends a duplicate ACK.
Appendix B. Changes since RFC 4138
Changes from RFC 4138 are summarized below, apart from minor
editing and language improvements.
* Modified the basic F-RTO algorithm and the SACK-enhanced F-RTO
algorithm to prevent the TCP sender from applying the F-RTO
algorithm if the retransmission timer expires when an earlier RTO
recovery is underway, except when the retransmission timer expires
multiple times for the same segment.
* Clarified behavior on multiple timeouts.
* Added a paragraph on acknowledgments that do not acknowledge new
data but are not duplicate acknowledgments.
* Clarified the SACK-algorithm a bit, and added one paragraph of
description of the basic idea of the algorithm.
* Removed SCTP considerations.
* Removed earlier Appendix sections, except Appendix C from RFC 4138,
which is now Appendix A.
* Clarified text about the possible response algorithms.
* Added section that summarizes the evaluation of RFC 4138.
References
Normative References
[APB09] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
Control", RFC 5681, September 2009.
[BAFW03] Blanton, E., Allman, M., Fall, K., and L. Wang, "A
Conservative Selective Acknowledgment (SACK)-based Loss
Recovery Algorithm for TCP", RFC 3517, April 2003.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
Sarolahti, et al. Standards Track [Page 16]
^L
RFC 5682 F-RTO September 2009
[FHG04] Floyd, S., Henderson, T., and A. Gurtov, "The NewReno
Modification to TCP's Fast Recovery Algorithm", RFC 3782,
April 2004.
[MMFR96] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
Selective Acknowledgment Options", RFC 2018, October 1996.
[PA00] Paxson, V. and M. Allman, "Computing TCP's Retransmission
Timer", RFC 2988, November 2000.
[Pos81] Postel, J., "Transmission Control Protocol", STD 7, RFC
793, September 1981.
Informative References
[ABF01] Allman, M., Balakrishnan, H., and S. Floyd, "Enhancing
TCP's Loss Recovery Using Limited Transmit", RFC 3042,
January 2001.
[BA04] Blanton, E. and M. Allman, "Using TCP Duplicate Selective
Acknowledgement (DSACKs) and Stream Control Transmission
Protocol (SCTP) Duplicate Transmission Sequence Numbers
(TSNs) to Detect Spurious Retransmissions", RFC 3708,
February 2004.
[BBA06] Blanton, J., Blanton, E., and M. Allman, "Using Spurious
Retransmissions to Adapt the Retransmission Timeout", Work
in Progress, December 2006.
[BBJ92] Jacobson, V., Braden, R., and D. Borman, "TCP Extensions
for High Performance", RFC 1323, May 1992.
[FMMP00] Floyd, S., Mahdavi, J., Mathis, M., and M. Podolsky, "An
Extension to the Selective Acknowledgement (SACK) Option
for TCP", RFC 2883, July 2000.
[GL02] Gurtov A. and R. Ludwig, "Evaluating the Eifel Algorithm
for TCP in a GPRS Network", In Proc. European Wireless,
Florence, Italy, February 2002.
[GL03] Gurtov A. and R. Ludwig, "Responding to Spurious Timeouts
in TCP", In Proc. IEEE INFOCOM 03, San Francisco, CA, USA,
March 2003.
[Jac88] Jacobson, V., "Congestion Avoidance and Control", In Proc.
ACM SIGCOMM 88.
Sarolahti, et al. Standards Track [Page 17]
^L
RFC 5682 F-RTO September 2009
[Hok05] Hokamura, A., et al., "Performance Evaluation of F-RTO and
Eifel Response Algorithms over W-CDMA packet network", In
Proc. Wireless Personal Multimedia Communications
(WPMC'05), Sept. 2005.
[KYHS07] Kojo, M., Yamamoto, K., Hata, M., and P. Sarolahti,
"Evaluation of RFC 4138", Work in Progress, November 2007.
[LG05] Ludwig, R. and A. Gurtov, "The Eifel Response Algorithm for
TCP", RFC 4015, February 2005.
[LK00] Ludwig R. and R.H. Katz, "The Eifel Algorithm: Making TCP
Robust Against Spurious Retransmissions", ACM SIGCOMM
Computer Communication Review, 30(1), January 2000.
[LM03] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm for
TCP", RFC 3522, April 2003.
[Nag84] Nagle, J., "Congestion control in IP/TCP internetworks",
RFC 896, January 1984.
[SK05] Sarolahti, P. and M. Kojo, "Forward RTO-Recovery (F-RTO):
An Algorithm for Detecting Spurious Retransmission Timeouts
with TCP and the Stream Control Transmission Protocol
(SCTP)", RFC 4138, August 2005.
[SKR03] Sarolahti, P., Kojo, M., and K. Raatikainen, "F-RTO: An
Enhanced Recovery Algorithm for TCP Retransmission
Timeouts", ACM SIGCOMM Computer Communication Review,
33(2), April 2003.
[Sar03] Sarolahti, P., "Congestion Control on Spurious TCP
Retransmission Timeouts", In Proc. of IEEE Globecom 2003,
San Francisco, CA, USA. December 2003.
[SL03] Swami Y. and K. Le, "DCLOR: De-correlated Loss Recovery
using SACK Option for spurious timeouts", Work in Progress,
September 2003.
[Ste07] Stewart, R., Ed., "Stream Control Transmission Protocol",
RFC 4960, September 2007.
[Yam05] Yamamoto, K., et al., "Effects of F-RTO and Eifel Response
Algorithms for W-CDMA and HSDPA networks", In Proc.
Wireless Personal Multimedia Communications (WPMC'05),
September 2005.
Sarolahti, et al. Standards Track [Page 18]
^L
RFC 5682 F-RTO September 2009
Authors' Addresses
Pasi Sarolahti
Nokia Research Center
P.O. Box 407
FI-00045 NOKIA GROUP
Finland
Phone: +358 50 4876607
EMail: pasi.sarolahti@iki.fi
Markku Kojo
University of Helsinki
P.O. Box 68
FI-00014 UNIVERSITY OF HELSINKI
Finland
Phone: +358 9 19151305
EMail: kojo@cs.helsinki.fi
Kazunori Yamamoto
NTT Docomo, Inc.
3-5 Hikarinooka, Yokosuka, Kanagawa, 239-8536, Japan
Phone: +81-46-840-3812
EMail: yamamotokaz@nttdocomo.co.jp
Max Hata
NTT Docomo, Inc.
3-5 Hikarinooka, Yokosuka, Kanagawa, 239-8536, Japan
Phone: +81-46-840-3812
EMail: hatama@s1.nttdocomo.co.jp
Sarolahti, et al. Standards Track [Page 19]
^L
|