Network Working Group R. Fox
Request for Comments: 1106 Tandem
June 1989
TCP Big Window and Nak Options
Status of this Memo
This memo discusses two extensions to the TCP protocol to provide a
more efficient operation over a network with a high bandwidth*delay
product. The extensions described in this document have been
implemented and shown to work using resources at NASA. This memo
describes an Experimental Protocol; these extensions are not proposed
as an Internet standard, but as a starting point for further
research. Distribution of this memo is unlimited.
Abstract
Two extensions to the TCP protocol are described in this RFC in order
to provide a more efficient operation over a network with a high
bandwidth*delay product. The main issue that still needs to be
solved is congestion versus noise. This issue is touched on in this
memo, but further research is still needed on the applicability of
the extensions in the Internet as a whole infrastructure and not just
high bandwidth*delay product networks. Even with this outstanding
issue, this document does describe the use of these options in the
isolated satellite network environment to help facilitate more
efficient use of this special medium to help offload bulk data
transfers from links needed for interactive use.
1. Introduction
Recent work on TCP has shown great performance gains over a variety
of network paths [1]. However, these changes still do not work well
over network paths that have a large round trip delay (satellite with
a 600 ms round trip delay) or a very large bandwidth
(transcontinental DS3 line). These two networks exhibit a higher
bandwidth*delay product, over 10**6 bits, than the 10**5 bits that
TCP is currently limited to. This high bandwidth*delay product
refers to the amount of data that may be unacknowledged so that all
of the network's bandwidth is being utilized by TCP. This may also be
referred to as "filling the pipe" [2] so that the sender of data can
always put data onto the network and the receiver will always have
something to read, and neither end of the connection will be forced
to wait for the other end.
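As a rough illustration of the figures above, the following sketch computes the bandwidth*delay product of a satellite path and compares it with what a 16 bit window can cover. The 600 ms round trip delay comes from the text; the T1 line rate of 1.544 Mbit/s is the standard value and is assumed here, and the function and constant names are illustrative only.

```python
# Bandwidth*delay product: the number of bits that must be in flight
# (sent but not yet acknowledged) to keep a path fully utilized.

T1_BITS_PER_SEC = 1_544_000          # assumed standard T1 line rate
SATELLITE_RTT_SEC = 0.600            # round trip delay quoted in the text
MAX_16BIT_WINDOW_BYTES = 2**16 - 1   # largest value the 16 bit window field can hold


def bandwidth_delay_bits(bits_per_sec: float, rtt_sec: float) -> float:
    """Return the bandwidth*delay product of a path in bits."""
    return bits_per_sec * rtt_sec


pipe_bits = bandwidth_delay_bits(T1_BITS_PER_SEC, SATELLITE_RTT_SEC)
window_bits = MAX_16BIT_WINDOW_BYTES * 8

print(f"T1 satellite pipe : {pipe_bits:,.0f} bits")
print(f"16 bit TCP window : {window_bits:,} bits")
print(f"share of the pipe one full window covers: {window_bits / pipe_bits:.2f}")
```

Under these assumptions a full 16 bit window covers only a little over half of the pipe, so the sender must stop and wait for acks even when no data is lost.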
After the last batch of algorithm improvements to TCP, performance
over high bandwidth*delay networks is still very poor. It appears
that no algorithm changes alone will make any significant
improvements over high bandwidth*delay networks, but will require an
extension to the protocol itself. This RFC discusses two possible
options to TCP for this purpose.
The two options implemented and discussed in this RFC are:
1. NAKs
This extension allows the receiver of data to inform the sender
that a packet of data was not received and needs to be resent.
This option proves to be useful over any network path (both high
and low bandwidth*delay type networks) that experiences periodic
errors such as lost packets, noisy links, or dropped packets due
to congestion. The information conveyed by this option is
advisory and, if ignored, has no effect on TCP whatsoever.
2. Big Windows
This option provides a method of expanding the current 16 bit (64
Kbytes) TCP window to 32 bits, of which 30 bits (over 1 gigabyte)
are allowed for the receive window. (30 bits is the maximum window
size allowed in TCP because TCP must be able to distinguish old data
from new data. For a good explanation please see [2].) No
changes are required to the standard TCP header [6]. The 16 bit
field in the TCP header that is used to convey the receive window
will remain unchanged. The 32 bit receive window is achieved
through the use of an option that contains the upper half of the
window. It is this option that is necessary to fill large data
pipes such as a satellite link.
This RFC is broken up into the following sections: section 2 will
discuss the operation of the NAK option in greater detail, section 3
will discuss the big window option in greater detail. Section 4 will
discuss other effects of the big windows and nak feature when used
together. Included in this section will be a brief discussion on the
effects of congestion versus noise to TCP and possible options for
satellite networks. Section 5 will be a conclusion with some hints
as to what future development may be done at NASA, and then an
appendix containing some test results is included.
2. NAK Option
Any packet loss in a high bandwidth*delay network will have a
catastrophic effect on throughput because of the simple
acknowledgement of TCP. TCP always acks the stream of data that has
successfully been received and tells the sender the next byte of data
of the stream that is expected. If a packet is lost and succeeding
packets arrive, the current protocol has no way of telling the sender
that it missed one packet but received the following packets. TCP
currently resends all of the data, either after a timeout or when the
sender suspects a lost packet because of the duplicate ack algorithm
[1], until the receiver receives the lost packet and can then ack it
as well as the succeeding packets received. On a normal low
bandwidth*delay network this effect is minimal if the timeout period
is set short enough. However, on a long delay network such as a T1
satellite channel this is catastrophic because by the time the lost
packet can be sent and the ack returned the TCP window would have
been exhausted and both the sender and receiver would be temporarily
stalled waiting for the packet and ack to fully travel the data pipe.
This causes the pipe to become empty and requires the sender to
refill the pipe after the ack is received. This will cause a minimum
of 3*X bandwidth loss, where X is the one way delay of the medium and
may be much higher depending on the size of the timeout period and
bandwidth*delay product. It is 1X for the packet to be resent, 1X for
the ack to be received and 1X for the next packet being sent to reach
the destination. This calculation assumes that the window size is
much smaller than the pipe size (window = 1/2 data pipe or 1X), which
is the typical case with the current TCP window limitation over long
delay networks such as a T1 satellite link.
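The 3*X figure can be turned into a number of bytes for a concrete link. The sketch below is only an illustration: it assumes a one way delay X of 300 ms (half of the 600 ms round trip delay mentioned in the introduction) and the standard T1 rate of 1.544 Mbit/s, and the function name is hypothetical.

```python
# Minimum cost of one lost packet when the window is smaller than the pipe:
# 1X to resend the packet, 1X for its ack to return, and 1X for the next
# new packet to reach the destination.

T1_BITS_PER_SEC = 1_544_000   # assumed standard T1 line rate
ONE_WAY_DELAY_SEC = 0.300     # assumed: half of a 600 ms round trip delay


def min_stall_cost_bytes(one_way_delay_sec: float, bits_per_sec: float) -> float:
    """Bytes the link could have carried during the minimum 3*X stall."""
    stall_sec = 3 * one_way_delay_sec
    return stall_sec * bits_per_sec / 8


lost_bytes = min_stall_cost_bytes(ONE_WAY_DELAY_SEC, T1_BITS_PER_SEC)
print(f"capacity idle for 3*X: about {lost_bytes:,.0f} bytes per lost packet")
```

This is a lower bound; as the text notes, the real loss grows with the retransmission timeout.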
An attempt to reduce this wasted bandwidth from 3*X was introduced in
[1] by having the sender resend a packet after it notices that a
number of consecutively received acks completely acknowledges already
acknowledged data. On a typical network this will reduce the lost
bandwidth to almost nil, since the packet will be resent before the
TCP window is exhausted and with the data pipe being much smaller
than the TCP window, the data pipe will not become empty and no
bandwidth will be lost. On a high delay network the reduction of
lost bandwidth is minimal such that lost bandwidth is still
significant. On a very noisy satellite, for instance, the lost
bandwidth is very high (see appendix for some performance figures)
and performance is very poor.
There are two methods of informing the sender of lost data: selective
acknowledgements and NAKs. Selective acknowledgements have
been the object of research in a number of experimental protocols
including VMTP [3], NETBLT [4], and SatFTP [5]. The idea behind
selective acks is that the receiver tells the sender which pieces it
received so that the sender can resend the data not acked but already
sent once. NAKs on the other hand, tell the sender that a particular
packet of data needs to be resent.
There are a couple of disadvantages of selective acks. Namely, in
some of the protocols mentioned above, the receiver waits a certain
time before sending the selective ack so that acks may be bundled up.
This delay can cause some wasted bandwidth and requires more complex
state information than the simple nak. Even if the receiver doesn't
bundle up the selective acks but sends them as it notices that
packets have been lost, more complex state information is needed to
determine which packets have been acked and which packets need to be
resent. With naks, only the immediate data needed to move the left
edge of the window is naked, thus almost completely eliminating all
state information.
The selective ack has one advantage over naks. If the link is very
noisy and packets are being lost close together, then the sender will
find out about all of the missing data at once and can send all of
the missing data out immediately in an attempt to move the left
window edge in the acknowledge number of the TCP header, thus keeping
the data pipe flowing. Whereas with naks, the sender will be
notified of lost packets one at a time and this will cause the sender
to process extra packets compared to selective acks. However,
empirical studies have shown that most lost packets occur far enough
apart that the advantage of selective acks over naks is rarely seen.
Also, if naks are sent out as soon as a packet has been determined
lost, then the advantage of selective acks becomes no more than
possibly a more aesthetic algorithm for handling lost data, but
offers no gains over naks as described in this paper. It is for this
reason that the simplicity of naks was chosen over selective acks for
the current implementation.
2.1 Implementation details
When the receiver of data notices a gap between the expected sequence
number and the actual sequence number of the packet received, the
receiver can assume that the data between the two sequence numbers is
either going to arrive late or is lost forever. Since the receiver
can not distinguish between the two events a nak should be sent in
the TCP option field. Naking a packet still destined to arrive has
the effect of causing the sender to resend the packet, wasting one
packet's worth of bandwidth. Since this event is fairly rare, the
lost bandwidth is insignificant as compared to that of not sending a
nak when the packet is not going to arrive. The option will take the
form as follows:
+========+=========+=========================+================+
+option= + length= + sequence number of + number of +
+ A + 7 + first byte being naked + segments naked +
+========+=========+=========================+================+
This option contains the first sequence number not received and a
count of how many segments need to be resent, where a segment is the
size of the current TCP MSS being used for the
connection. Since a nak is an advisory piece of information, the
sending of a nak is unreliable and no means for retransmitting a nak
is provided at this time.
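A minimal sketch of how the 7 byte option above could be packed and parsed follows. The field widths (a 4 byte sequence number and a 1 byte segment count) are inferred from length=7, and the option kind value is an arbitrary placeholder, since the diagram only calls it "A". All function names are hypothetical.

```python
import struct

NAK_OPTION_KIND = 0xA0   # placeholder for "option=A"; not an assigned number
NAK_OPTION_LEN = 7       # total option length given in the diagram
_NAK_FORMAT = "!BBIB"    # kind, length, 32 bit sequence number, segment count


def encode_nak(first_missing_seq: int, segments: int) -> bytes:
    """Build the nak option for a gap starting at first_missing_seq."""
    if not 0 <= first_missing_seq < 2**32:
        raise ValueError("sequence number must fit in 32 bits")
    if not 1 <= segments <= 255:
        raise ValueError("segment count must fit in one byte")
    return struct.pack(_NAK_FORMAT, NAK_OPTION_KIND, NAK_OPTION_LEN,
                       first_missing_seq, segments)


def decode_nak(option: bytes) -> tuple[int, int]:
    """Return (first_missing_seq, segments) from an encoded nak option."""
    kind, length, seq, segments = struct.unpack(_NAK_FORMAT, option)
    if kind != NAK_OPTION_KIND or length != NAK_OPTION_LEN:
        raise ValueError("not a nak option")
    return seq, segments


def segments_missing(expected_seq: int, received_seq: int, mss: int) -> int:
    """Number of MSS sized segments needed to cover the gap (rounded up)."""
    gap = (received_seq - expected_seq) % 2**32   # tolerate sequence wraparound
    return -(-gap // mss)


option = encode_nak(1000, segments_missing(1000, 2461, 1460))
print(option.hex(), decode_nak(option))
```

Because the count is expressed in units of the MSS, a gap that is not a whole number of segments is rounded up here; the memo does not say how a partial segment should be counted, so that choice is an assumption.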
When the sender of data receives the option it may either choose to
do nothing or resend the missing data immediately and then
continue sending data where it left off before receiving the nak.
The receiver will keep track of the last nak sent so that it will not
repeat the same nak. If it were to repeat the same nak the protocol
could get into the mode where on every reception of data the receiver
would nak the first missing data frame. Since the data pipe may be
very large by the time the first nak is read and responded to by the
sender, many naks would have been sent by the receiver. Since the
sender does not know that the naks are repetitious it will resend the
data each time, thus wasting the network bandwidth with useless
retransmissions of the same piece of data. Having an unreliable nak
may result in a nak being damaged and not being received by the
sender, and in this case, we will let the tcp recover by its normal
means. Empirical data has shown that the likelihood of the nak being
lost is quite small and thus, this advisory nak option works quite
well.
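The receiver behaviour described above, nak a gap once and never repeat the same nak, might be sketched as follows. This is a simplified model rather than the tested implementation: it ignores sequence number wraparound, keeps only the lengths of out of order segments, and uses hypothetical class and method names.

```python
from typing import Optional


class NakReceiver:
    """Tracks the next expected byte and decides when a nak is worth sending."""

    def __init__(self, initial_seq: int, mss: int) -> None:
        self.rcv_nxt = initial_seq                 # next in-order byte expected
        self.mss = mss
        self.last_nak_seq: Optional[int] = None    # start of the last gap naked
        self.buffered: dict[int, int] = {}         # out of order: seq -> length

    def on_segment(self, seq: int, length: int) -> Optional[tuple[int, int]]:
        """Process one segment; return (first_missing_seq, segments) to nak, or None."""
        if seq < self.rcv_nxt:
            return None                            # old duplicate, nothing to do
        if seq == self.rcv_nxt:
            self.rcv_nxt += length
            # Pull in buffered segments that have now become contiguous.
            while self.rcv_nxt in self.buffered:
                self.rcv_nxt += self.buffered.pop(self.rcv_nxt)
            return None
        # A gap exists between rcv_nxt and this segment.
        self.buffered[seq] = length
        if self.last_nak_seq == self.rcv_nxt:
            return None                            # this gap was already naked
        self.last_nak_seq = self.rcv_nxt
        gap = seq - self.rcv_nxt
        return self.rcv_nxt, -(-gap // self.mss)   # round up to whole segments


receiver = NakReceiver(initial_seq=0, mss=100)
for seq in (0, 200, 300, 100, 600):
    print(seq, receiver.on_segment(seq, 100))
```

Only the segments at 200 and 600 produce a nak in this run; the segment at 300 arrives while the same gap is still open and is therefore silent.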
3. Big Window Option
Currently TCP has a 16 bit window limitation built into the protocol.
This limits the amount of outstanding unacknowledged data to 64
Kbytes. We have already seen that some networks have a pipe larger
than 64 Kbytes. A T1 satellite channel and a cross country DS3
network with a 30ms delay have data pipes much larger than 64 Kbytes.
Thus, even on a perfectly conditioned link with no bandwidth wasted
due to errors, the data pipe will not be filled and bandwidth will be
wasted. What is needed is the ability to send more unacknowledged
data. This is achieved by having bigger windows, bigger than the
current limitation of 16 bits. This option expands the window
size to 30 bits, or over 1 gigabyte, by literally expanding the window
size mechanism currently used by TCP. The added option contains the
upper 15 bits of the window while the lower 16 bits will continue to
go where they normally go [6] in the TCP header.
A TCP session will use the big window options only if both sides
agree to use them, otherwise the option is not used and the normal 16
bit windows will be used. Once the 2 sides agree to use the big
windows then every packet thereafter will be expected to contain the
window option with the current upper 15 bits of the window. The
negotiation to decide whether or not to use the bigger windows takes
place during the SYN and SYN ACK segments of the TCP connection
startup process. The originator of the connection will include in
the SYN segment the following option:
  1 byte     1 byte       4 bytes
+=========+==========+===============+
+option=B + length=6 + 30 bit window +
+=========+==========+===============+
If the other end of the connection wants to use big windows it will
include the same option back in the SYN ACK segment that it must
send. At this point, both sides have agreed to use big windows and
the specified windows will be used. It should be noted that the SYN
and SYN ACK segments will use the small windows, and once the big
window option has been negotiated then the bigger windows will be
used.
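The negotiation can be sketched as a check that both the SYN and the SYN ACK carry option B. The parser below follows the standard TCP option layout of RFC 793 [6] (kind 0 ends the list, kind 1 is a one byte no-operation, every other option is kind, length, data). The kind value for option B is an arbitrary placeholder and the function names are hypothetical.

```python
import struct
from typing import Optional

BIG_WINDOW_OFFER_KIND = 0xB0   # placeholder for "option=B"; not an assigned number
BIG_WINDOW_OFFER_LEN = 6
MAX_BIG_WINDOW = 2**30 - 1     # 30 bit limit stated in the text


def encode_offer(window: int) -> bytes:
    """Build option B carrying the 30 bit window in a 4 byte field."""
    if not 0 <= window <= MAX_BIG_WINDOW:
        raise ValueError("window must fit in 30 bits")
    return struct.pack("!BBI", BIG_WINDOW_OFFER_KIND, BIG_WINDOW_OFFER_LEN, window)


def find_offer(options: bytes) -> Optional[int]:
    """Scan a TCP option list and return the offered window, or None."""
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:                      # End of Option List
            break
        if kind == 1:                      # No-Operation, a single byte
            i += 1
            continue
        if i + 1 >= len(options):
            break                          # truncated option
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                          # malformed option
        if kind == BIG_WINDOW_OFFER_KIND and length == BIG_WINDOW_OFFER_LEN:
            return struct.unpack("!I", options[i + 2:i + 6])[0]
        i += length                        # skip options we do not understand
    return None


def big_windows_agreed(syn_options: bytes, syn_ack_options: bytes) -> bool:
    """Big windows are used only if both SYN and SYN ACK carried option B."""
    return (find_offer(syn_options) is not None
            and find_offer(syn_ack_options) is not None)


MSS_OPTION = bytes([2, 4, 0x05, 0xB4])     # standard MSS option, MSS = 1460
syn = MSS_OPTION + encode_offer(200_000)
print(big_windows_agreed(syn, encode_offer(150_000)))   # peer echoed option B
print(big_windows_agreed(syn, MSS_OPTION))              # peer did not
```

Skipping unknown options by their length field is what lets a host that implements the option talk to one that does not.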
Once both sides have agreed to use 32 bit windows the protocol will
function just as it did before with no difference in operation, even
in the event of lost packets. This claim holds true since the
rcv_wnd and snd_wnd variables of tcp contain the 16 bit windows until
the big window option is negotiated and then they are replaced with
the appropriate 32 bit values. Thus, the use of big windows becomes
part of the state information kept by TCP.
Other methods of expanding the windows have been presented, including
a window multiple [2] or streaming [5], but this solution is more
elegant in the sense that it is a true extension of the window that
one day may easily become part of the protocol and not just be an
option to the protocol.
3.1 How does it work
Once a connection has decided to use big windows every succeeding
packet must contain the following option:
+=========+==========+==========================+
+option=C + length=4 + upper 15 bits of rcv_wnd +
+=========+==========+==========================+
With all segments sent, the sender supplies the size of its receive
window. If the connection is only using 16 bits then this option is
not supplied, otherwise the lower 16 bits of the receive window go
into the tcp header where it currently resides [6] and the upper 15
bits of the window is put into the data portion of the option C.
When the receiver processes the packet it must first reform the
window and then process the packet as it would in the absence of the
option.
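Splitting the receive window between the header field and option C, and reforming it on receipt, could look like the sketch below. The option kind is an arbitrary placeholder for "C", the 2 byte data field is inferred from length=4, and the names are hypothetical. The sketch enforces the 30 bit maximum stated earlier, although a 15 bit upper part plus a 16 bit lower part could in principle carry 31 bits.

```python
import struct

BIG_WINDOW_KIND = 0xC0      # placeholder for "option=C"; not an assigned number
BIG_WINDOW_LEN = 4          # kind, length, and a 2 byte data field
MAX_BIG_WINDOW = 2**30 - 1  # 30 bit limit stated in the text


def split_window(rcv_wnd: int) -> tuple[int, bytes]:
    """Return (16 bit header window field, encoded option C)."""
    if not 0 <= rcv_wnd <= MAX_BIG_WINDOW:
        raise ValueError("window must fit in 30 bits")
    low16 = rcv_wnd & 0xFFFF    # stays in the normal TCP header field
    upper = rcv_wnd >> 16       # travels in the option
    return low16, struct.pack("!BBH", BIG_WINDOW_KIND, BIG_WINDOW_LEN, upper)


def join_window(header_window: int, option: bytes) -> int:
    """Reform the full window from the header field and option C."""
    kind, length, upper = struct.unpack("!BBH", option)
    if kind != BIG_WINDOW_KIND or length != BIG_WINDOW_LEN:
        raise ValueError("not a big window option")
    return ((upper & 0x7FFF) << 16) | (header_window & 0xFFFF)


low, option_c = split_window(200_000)
print(low, option_c.hex(), join_window(low, option_c))
```

A window of 65535 bytes or less yields an upper part of zero, so the header field alone carries the same value it would without the option.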
3.2 Impact of changes
In implementing the first version of the big window option there was
very little change required to the source. State information must be
added to the protocol to determine if the big window option is to be
used and all 16 bit variables that dealt with window information must
now become 32 bit quantities. A future document will describe in
more detail the changes required to the 4.3 bsd tcp source code.
Test results of the window change only are presented in the appendix.
Expanding 16 bit quantities to 32 bit quantities in the TCP control
block in the source (4.3 bsd source) may cause the structure to
become larger than the mbuf used to hold it. Care must be taken to
ensure this doesn't occur on your system, or unpredictable events
may take place.
4. Effects of Big Windows and Naks when used together
With big windows alone, transfer times over a satellite were quite
impressive in the absence of any introduced errors. However, when
an error simulator was used to create random errors during transfers,
performance went down extremely fast. When the nak option was added
to the big window option performance in the face of errors went up
some but not to the level that was expected. This section will
discuss some issues that were overcome to produce the results given
in the appendix.
4.1 Window Size and Nak benefits
Without errors, the window size required to keep the data pipe full
is equal to the round trip delay * throughput desired, or the data
pipe bandwidth (called Z from now on). This and other calculations
assume that processing time of the hosts is negligible. In the event
of an error (without NAKs), the window size needs to become larger
than Z in order to keep the data pipe full while the sender is
waiting for the ack of the resent packet. If the window size is
equal to Z and we assume that the retransmission timer is equal
to Z, then when a packet is lost, the retransmission timer will go
off as the last piece of data in the window is sent. In this case,
the lost piece of data can be resent with no delay. The data pipe
will empty out because it will take 1/2Z worth of data to get the ack
back to the sender, and an additional 1/2Z worth of data to get the data
pipe refilled with new data. This causes the required window to be
2Z, 1Z to keep the data pipe full during normal operations and 1Z to
keep the data pipe full while waiting for a lost packet to be resent
and acked.
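To put numbers on Z and 2Z, the sketch below uses the link described in the appendix (1.544 Mbit with a 580 ms round trip delay) and, like the text, assumes host processing time is negligible. The function names are illustrative only.

```python
T1_BITS_PER_SEC = 1_544_000       # link rate given in the appendix
RTT_SEC = 0.580                   # round trip delay given in the appendix
SMALL_WINDOW_BYTES = 64 * 1024    # limit imposed by the 16 bit window


def pipe_bytes(rtt_sec: float, bits_per_sec: float) -> float:
    """Z: round trip delay * throughput, expressed in bytes."""
    return rtt_sec * bits_per_sec / 8


def window_needed(rtt_sec: float, bits_per_sec: float,
                  survive_one_loss: bool) -> float:
    """1Z keeps the pipe full; 2Z keeps it full across one retransmission."""
    z = pipe_bytes(rtt_sec, bits_per_sec)
    return 2 * z if survive_one_loss else z


z = pipe_bytes(RTT_SEC, T1_BITS_PER_SEC)
two_z = window_needed(RTT_SEC, T1_BITS_PER_SEC, survive_one_loss=True)
print(f"Z  = {z:,.0f} bytes")
print(f"2Z = {two_z:,.0f} bytes")
print(f"64 Kbyte window covers {SMALL_WINDOW_BYTES / z:.0%} of Z")
```

Both values exceed what the 16 bit window field can express, which is why the error-free case already needs the big window option on this link.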
If the same scenario in the last paragraph is used with the addition
of NAKs, the required window size still needs to be 2Z to avoid
wasting any bandwidth in the event of a dropped packet. This appears
to mean that the nak option does not provide any benefits at all.
Testing showed that the retransmission timer was larger than the data
pipe and in the event of errors became much bigger than the data
pipe, because of the retransmission backoff. Thus, the nak option
bounds the required window to 2Z such that in the event of an error
there is no lost bandwidth, even with the retransmission timer
fluctuations. The results in the appendix show that by using naks,
bandwidth waste associated with the retransmission timer facility is
eliminated.
4.2 Congestion vs Noise
An issue that must be looked at when implementing both the NAKs and
big window scheme together is in the area of congestion versus lost
packets due to the medium, or noise. In the recent algorithm
enhancements [1], slow start was introduced so that whenever a data
transfer is being started on a connection or right after a dropped
packet, the effective send window would be set to a very small size
(typically would equal the MSS being used). This is done so that a
new connection would not cause congestion by immediately overloading
the network, and so that an existing connection would back off the
network if a packet was dropped due to congestion and allow the
network to clear up. If a connection using big windows loses a
packet due to the medium (a packet corrupted by an error) the last
thing that should be done is to close the send window so that the
connection can only send 1 packet and must use the slow start
algorithm to slowly work itself back up to sending full windows worth
of data. This algorithm would quickly limit the usefulness of the
big window and nak options over lossy links.
On the other hand, if a packet was dropped due to congestion and the
sender assumes the packet was dropped because of noise the sender
will continue sending large amounts of data. This action will cause
the congestion to continue, more packets will be dropped, and that
part of the network will collapse. In this instance, the sender
would want to back off from sending at the current window limit.
Using the current slow start mechanism over a satellite builds up the
window too slowly [1]. Possibly a better solution would be for the
window to be opened 2*Rlog2(W) instead of R*log2(W) [1] (open window
by 2 packets instead of 1 for each acked packet). This will reduce
the wasted bandwidth by opening the window much quicker while giving
the network a chance to clear up. More experimentation is necessary
to find the optimal rate of opening the window, especially when large
windows are being used.
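The effect of opening the window by 2 packets instead of 1 for each acked packet can be explored with a toy model. The model below is an assumption made for illustration and is not taken from the memo or from [1]: every segment in the current window is acked exactly once per round trip, there are no delayed acks and no losses, and the function name is hypothetical. It counts round trips rather than reproducing the rate expressions quoted above.

```python
def round_trips_to_open(target_segments: int, growth_per_ack: int) -> int:
    """Round trips for the window to grow from 1 segment to the target size."""
    window = 1
    trips = 0
    while window < target_segments:
        # Each acked segment opens the window by growth_per_ack segments.
        window += window * growth_per_ack
        trips += 1
    return trips


RTT_SEC = 0.580   # round trip delay of the satellite link in the appendix
TARGET = 128      # hypothetical full window, in segments

for growth in (1, 2):
    trips = round_trips_to_open(TARGET, growth)
    print(f"open by {growth} per ack: {trips} round trips, "
          f"about {trips * RTT_SEC:.2f} s")
```

In this model the window doubles each round trip with a growth of 1 and triples with a growth of 2, so the saving is a couple of round trips for a window of this size. On a long delay path each round trip saved is more than half a second.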
The current recommendation for TCP is to use the slow start mechanism
in the event of any lost packet. If an application knows that it
will be using a satellite with a high error rate, it doesn't make
sense to force it to use the slow start mechanism for every dropped
packet. Instead, the application should be able to choose what
action should happen in the event of a lost packet. In the BSD
environment, a setsockopt call should be provided so that the
application may inform TCP to handle lost packets in a special way
for this particular connection. If the error rate of a link is
known to be small, then using slow start with the modified rate from
above will cause the amount of bandwidth lost to be very small with
respect to the amount of bandwidth actually utilized. In this case,
the setsockopt call should not be used. What is really needed is a
way for a host to determine if a packet or packets are being dropped
due to congestion or noise. Then, the host can choose to do the
right thing. This will require a mechanism like source quench to be
used. For this to happen more experimentation is necessary to
determine a solid definition on the use of this mechanism. At present,
it is believed by some that using source quench to avoid congestion
only adds to the problem rather than helping to suppress it.
The TCP used to gather the results in the appendix for the big window
with nak experiment, assumed that lost packets were the result of
noise and not congestion. This assumption was used to show how to
make the current TCP work in such an environment. The actual
satellite used in the experiment (when the satellite simulator was
not used) only experienced an error rate around 10e-10. With this
error rate it is suggested that in practice when big windows are used
over the link, TCP should use the slow start mechanism for all lost
packets with the 2*Rlog2(W) rate discussed above. Under most
situations when long delay networks are being used (transcontinental
DS3 networks using fiber with very low error rates, or satellite
links with low error rates) big windows and naks should be used with
the assumption that lost packets are the result of congestion until a
better algorithm is devised [7].
Another problem noticed while testing the effects of slow start over
a satellite link was that, at times, the retransmission timer was set
so restrictively that milliseconds before a naked packet's ack was
received, the retransmission timer would go off due to a timed packet
within the send window. The timer was set at the round trip delay of
the network allowing no time for packet processing. If this timer
went off due to congestion then backing off is the right thing to do,
otherwise to avoid the scenario discovered by experimentation, the
transmit timer should be set a little longer so that the
retransmission timer does not go off too early. Care must be taken
to make sure the right thing is done in the implementation in
question so that a packet isn't retransmitted too soon, and blamed on
congestion when in fact, the ack is on its way.
4.3 Duplicate Acks
Another problem found with the 4.3bsd implementation is in the area
of duplicate acks. When the sender of data receives a certain number
of acks (3 in the current Berkeley release) that acknowledge
previously acked data, it assumes that a packet has been lost. It
will then resend the one packet assumed lost and close its send
window as if the network is congested, and the slow start algorithm
mentioned above will be used to open the send window. This facility is
no longer needed since the sender can use the reception of a nak as
its indicator that a particular packet was dropped. If the nak
packet is lost then the retransmit timer will go off and the packet
will be retransmitted by normal means. If a sender's algorithm
continues to count duplicate acks the sender will find itself
possibly receiving many duplicate acks after it has already resent
the packet due to a nak being received because of the large size of
the data pipe. By receiving all of these duplicate acks the sender
may find itself doing nothing but resending the same packet of data
unnecessarily while keeping the send window closed for absolutely no
reason. By removing this feature of the implementation a user can
expect to find a satellite connection working much better in the face
of errors and other connections should not see any performance loss,
but a slight improvement in performance if anything at all.
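The interaction between naks and duplicate ack counting can be illustrated with a deliberately simplified sender model. It is a sketch under stated assumptions, not the 4.3bsd logic: the duplicate ack counter is assumed to restart after every resend, the nak and the duplicate acks are treated as separate events, and the function name is hypothetical. The threshold of 3 is the value quoted above.

```python
DUP_ACK_THRESHOLD = 3   # value quoted in the text for the Berkeley release


def count_resends(events: list[str], count_dup_acks: bool) -> int:
    """Count how often the missing packet is resent for a stream of events.

    events holds "nak" or "dupack" entries in arrival order.
    """
    resends = 0
    dup_acks = 0
    for event in events:
        if event == "nak":
            resends += 1                    # resend as soon as the nak arrives
        elif event == "dupack" and count_dup_acks:
            dup_acks += 1
            if dup_acks == DUP_ACK_THRESHOLD:
                resends += 1                # duplicate ack rule fires again
                dup_acks = 0                # assumption: counter restarts
    return resends


# With a large data pipe, many duplicate acks are already in flight by the
# time the sender reacts to the nak.
events = ["nak"] + ["dupack"] * 12
print("with duplicate ack counting   :", count_resends(events, True))
print("without duplicate ack counting:", count_resends(events, False))
```

In this model the extra resends are all copies of a packet that the nak has already caused to be resent, and each one would also close the send window.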
5. Conclusion
This paper has described two new options that if used will make TCP a
more efficient protocol in the face of errors and a more efficient
protocol over networks that have a high bandwidth*delay product
without decreasing performance over more common networks. If a
system that implements the options talks with one that does not, the
two systems should still be able to communicate with no problems.
This assumes that the system doesn't use the option numbers defined
in this paper in some other way or doesn't panic when faced with an
option that the machine does not implement. Currently at NASA, there
are many machines that do not implement either option and communicate
just fine with the systems that do implement them.
The drive for implementing big windows has been the direct result of
trying to make TCP more efficient over large delay networks [2,3,4,5]
such as a T1 satellite. However, another practical use of large
windows is becoming more apparent as the local area networks being
developed are becoming faster and supporting much larger MTU's.
Hyperchannel, for instance, has been stated to be able to support 1
Mega bit MTU's in their new line of products. With the current
implementation of TCP, hyperchannel is not used as efficiently as it
could be because the physical medium's MTU is larger than the
maximum window of the protocol being used. By increasing the TCP
window size, better utilization of networks like hyperchannel will be
gained instantly because the sender can send 64 Kbyte packets (IP
limitation) but not have to operate in a stop and wait fashion.
Future work is being started to increase the IP maximum datagram size
so that even better utilization of fast local area networks will be
seen by having the TCP/IP protocols being able to send large packets
over mediums with very large MTUs. This will hopefully eliminate
the network protocol as the bottleneck in data transfers while
workstations and workstation file system technology advance even
further than they already have.
An area of concern when using the big window mechanism is the use of
machine resources. When running over a satellite and a packet is
dropped such that 2Z (where Z is the round trip delay) worth of data
is unacknowledged, both ends of the connection need to be able to
buffer the data using machine mbufs (or whatever mechanism the
machine uses), usually a valuable and scarce commodity. If the
window size is not chosen properly, some machines will crash when the
memory is all used up, or it will keep other parts of the system from
running. Thus, setting the window to some fairly large arbitrary
number is not a good idea, especially on a general purpose machine
where many users log on at any time. What is currently being
engineered at NASA is the ability for certain programs to use the
setsockopt feature of 4.3bsd to ask for big windows, such that the
average user may not have access to the large windows, thus limiting
the use of big windows to applications that absolutely need them and
to protect a valuable system resource.
6. References
[1] Jacobson, V., "Congestion Avoidance and Control", SIGCOMM 88,
Stanford, Ca., August 1988.
[2] Jacobson, V., and R. Braden, "TCP Extensions for Long-Delay
Paths", LBL, USC/Information Sciences Institute, RFC 1072,
October 1988.
[3] Cheriton, D., "VMTP: Versatile Message Transaction Protocol", RFC
1045, Stanford University, February 1988.
[4] Clark, D., M. Lambert, and L. Zhang, "NETBLT: A Bulk Data
Transfer Protocol", RFC 998, MIT, March 1987.
[5] Fox, R., "Draft of Proposed Solution for High Delay Circuit File
Transfer", GE/NAS Internal Document, March 1988.
[6] Postel, J., "Transmission Control Protocol - DARPA Internet
Program Protocol Specification", RFC 793, DARPA, September 1981.
[7] Leiner, B., "Critical Issues in High Bandwidth Networking", RFC
1077, DARPA, November 1988.
7. Appendix
Both options have been implemented and tested. Contained in this
section is some performance data gathered to support the use of these two
options. The satellite channel used was a 1.544 Mbit link with a
580ms round trip delay. All values are given as units of bytes.
TCP with Big Windows, No Naks:

            |---------------transfer rates----------------------|
Window Size |  no error  |  10e-7 error rate | 10e-6 error rate |
-----------------------------------------------------------------
   64K      |    94K     |       53K         |       14K        |
-----------------------------------------------------------------
   72K      |   106K     |       51K         |       15K        |
-----------------------------------------------------------------
   80K      |   115K     |       42K         |       14K        |
-----------------------------------------------------------------
   92K      |   115K     |       43K         |       14K        |
-----------------------------------------------------------------
  100K      |   135K     |       66K         |       15K        |
-----------------------------------------------------------------
  112K      |   126K     |       53K         |       17K        |
-----------------------------------------------------------------
  124K      |   154K     |       45K         |       14K        |
-----------------------------------------------------------------
  136K      |   160K     |       66K         |       15K        |
-----------------------------------------------------------------
  156K      |   167K     |       45K         |       14K        |
-----------------------------------------------------------------

Figure 1.
TCP with Big Windows, and Naks:

            |---------------transfer rates----------------------|
Window Size |  no error  |  10e-7 error rate | 10e-6 error rate |
-----------------------------------------------------------------
   64K      |    95K     |       83K         |       43K        |
-----------------------------------------------------------------
   72K      |   104K     |       87K         |       49K        |
-----------------------------------------------------------------
   80K      |   117K     |       96K         |       62K        |
-----------------------------------------------------------------
   92K      |   124K     |      119K         |       39K        |
-----------------------------------------------------------------
  100K      |   140K     |      124K         |       35K        |
-----------------------------------------------------------------
  112K      |   151K     |      126K         |       53K        |
-----------------------------------------------------------------
  124K      |   160K     |      140K         |       36K        |
-----------------------------------------------------------------
  136K      |   167K     |      148K         |       38K        |
-----------------------------------------------------------------
  156K      |   167K     |      160K         |       38K        |
-----------------------------------------------------------------

Figure 2.
With a 10e-6 error rate, many naks as well as data packets were
dropped, causing the wild swing in transfer times. Also, please note
that the machines used are SGI Iris 2500 Turbos with the 3.6 OS with
the new TCP enhancements. The Irises are slower than a Sun 3/260,
but due to some source code restrictions the Iris was used. Initial
results on the Sun showed slightly higher
performance and less variance.
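The two figures can be compared directly. The sketch below copies the "10e-7 error rate" columns from Figure 1 and Figure 2 and prints the ratio of the rate with naks to the rate without them for each window size. The utilization line additionally assumes that the rates are per second and that K means 1000 bytes, neither of which the appendix states.

```python
WINDOW_SIZES_K = [64, 72, 80, 92, 100, 112, 124, 136, 156]
NO_NAK_RATE_K = [53, 51, 42, 43, 66, 53, 45, 66, 45]     # Figure 1, 10e-7 column
NAK_RATE_K = [83, 87, 96, 119, 124, 126, 140, 148, 160]  # Figure 2, 10e-7 column

speedups = [with_nak / without
            for with_nak, without in zip(NAK_RATE_K, NO_NAK_RATE_K)]

for window, speedup in zip(WINDOW_SIZES_K, speedups):
    print(f"{window:>4}K window: {speedup:.2f}x with naks")

# Assumption: rates are bytes per second and K = 1000.
LINK_BYTES_PER_SEC = 1_544_000 / 8    # 1.544 Mbit link from the appendix
best_no_error = 167_000               # best "no error" rate in either figure
print(f"best no-error utilization: {best_no_error / LINK_BYTES_PER_SEC:.1%}")
```

The ratio grows with the window size in these measurements, which is consistent with the argument in section 4.1 that naks remove the bandwidth lost to the retransmission timer.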
Author's Address
Richard Fox
950 Linden #208
Sunnyvale, Cal, 94086
EMail: rfox@tandem.com