diff --git a/doc/rfc/rfc813.txt b/doc/rfc/rfc813.txt
new file mode 100644
index 0000000..5817050
--- /dev/null
+++ b/doc/rfc/rfc813.txt
@@ -0,0 +1,1167 @@
+
+RFC: 813
+
+
+
+ WINDOW AND ACKNOWLEDGEMENT STRATEGY IN TCP
+
+ David D. Clark
+ MIT Laboratory for Computer Science
+ Computer Systems and Communications Group
+ July, 1982
+
+
+ 1. Introduction
+
+
+ This document describes implementation strategies to deal with two
+
+mechanisms in TCP, the window and the acknowledgement. These mechanisms
+
+are described in the specification document, but it is possible, while
+
+complying with the specification, to produce implementations which yield
+
+very bad performance. Happily, the pitfalls possible in window and
+
+acknowledgement strategies are very easy to avoid.
+
+
+   It is a much more difficult exercise to verify the performance of a
+
+specification than its correctness.  Certainly, we have less experience
+
+in this area, and we lack any useful formal technique.
+
+Nonetheless, it is important to attempt a specification in this area,
+
+because different implementors might otherwise choose superficially
+
+reasonable algorithms which interact poorly with each other. This
+
+document presents a particular set of algorithms which have received
+
+testing in the field, and which appear to work properly with each other.
+
+With more experience, these algorithms may become part of the formal
+
+specification: until such time their use is recommended.
+
+
+   2.  The Mechanisms
+
+
+ The acknowledgement mechanism is at the heart of TCP. Very simply,
+
+when data arrives at the recipient, the protocol requires that it send
+
+back an acknowledgement of this data. The protocol specifies that the
+
+bytes of data are sequentially numbered, so that the recipient can
+
+acknowledge data by naming the highest numbered byte of data it has
+
+received, which also acknowledges the previous bytes (actually, it
+
+identifies the first byte of data which it has not yet received, but
+
+this is a small detail). The protocol contains only a general assertion
+
+that data should be acknowledged promptly, but gives no more specific
+
+indication as to how quickly an acknowledgement must be sent, or how
+
+much data should be acknowledged in each separate acknowledgement.
+
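+   As an illustration of this cumulative scheme, a minimal sketch in C
+
+might look as follows (the structure and names here are invented, not
+
+taken from the specification):
+
+     /* The acknowledgement field names the first byte not yet
+        received; returning it implicitly acknowledges all of the
+        earlier bytes.                                               */
+     #include <stdint.h>
+
+     struct rcv_state {
+         uint32_t rcv_nxt;        /* next sequence number expected   */
+     };
+
+     /* Called after an in-order segment [seq, seq+len) is accepted. */
+     static uint32_t ack_value(struct rcv_state *r,
+                               uint32_t seq, uint32_t len)
+     {
+         if (seq == r->rcv_nxt)
+             r->rcv_nxt += len;   /* advance past the new bytes      */
+         return r->rcv_nxt;       /* first byte not yet received     */
+     }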
+
+ The window mechanism is a flow control tool. Whenever appropriate,
+
+the recipient of data returns to the sender a number, which is (more or
+
+less) the size of the buffer which the receiver currently has available
+
+for additional data. This number of bytes, called the window, is the
+
+maximum which the sender is permitted to transmit until the receiver
+
+returns some additional window. Sometimes, the receiver will have no
+
+buffer space available, and will return a window value of zero. Under
+
+these circumstances, the protocol requires the sender to send a small
+
+segment to the receiver now and then, to see if more data is accepted.
+
+If the window remains closed at zero for some substantial period, and
+
+the sender can obtain no response from the receiver, the protocol
+
+requires the sender to conclude that the receiver has failed, and to
+
+close the connection. Again, there is very little performance
+
+information in the specification, describing under what circumstances
+
+the window should be increased, and how the sender should respond to
+
+such revised information.
+
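+   The zero-window behavior described above might be sketched as
+
+follows; the probe limit and the helper function are invented purely
+
+for illustration:
+
+     #include <stdbool.h>
+
+     #define MAX_SILENT_PROBES 20   /* give up after this many       */
+
+     struct send_state {
+         unsigned offered_window;   /* last window from the receiver */
+         unsigned silent_probes;    /* probes with no response yet   */
+     };
+
+     /* Placeholder: transmit a small (one byte) probe segment.      */
+     static void send_probe_segment(void) { }
+
+     /* Run from a periodic timer while the window is closed; returns
+        false when the sender should conclude the receiver has failed
+        and close the connection.                                    */
+     static bool probe_zero_window(struct send_state *s)
+     {
+         if (s->offered_window > 0)
+             return true;           /* the window has reopened       */
+         if (s->silent_probes >= MAX_SILENT_PROBES)
+             return false;
+         send_probe_segment();      /* a small segment "now and then" */
+         s->silent_probes++;
+         return true;
+     }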
+
+ A bad implementation of the window algorithm can lead to extremely
+
+poor performance overall. The degradations which occur in throughput
+
+and CPU utilization can easily be several factors of ten, not just a
+
+fractional increase. This particular phenomenon is specific enough that
+
+it has been given the name of Silly Window Syndrome, or SWS.  Happily,
+
+SWS is easy to avoid if a few simple rules are observed. The most
+
+important function of this memo is to describe SWS, so that implementors
+
+will understand the general nature of the problem, and to describe
+
+algorithms which will prevent its occurrence. This document also
+
+describes performance enhancing algorithms which relate to
+
+acknowledgement, and discusses the way acknowledgement and window
+
+algorithms interact as part of SWS.
+
+
+   3.  Silly Window Syndrome
+
+
+ In order to understand SWS, we must first define two new terms.
+
+Superficially, the window mechanism is very simple: there is a number,
+
+called "the window", which is returned from the receiver to the sender.
+
+However, we must have a more detailed way of talking about the meaning
+
+of this number. The receiver of data computes a value which we will
+
+call the "offered window". In a simple case, the offered window
+
+corresponds to the amount of buffer space available in the receiver.
+
+This correspondence is not necessarily exact, but is a suitable model
+
+for the discussion to follow. It is the offered window which is
+
+actually transmitted back from the receiver to the sender. The sender
+
+uses the offered window to compute a different value, the "usable
+
+window", which is the offered window minus the amount of outstanding
+
+unacknowledged data. The usable window is less than or equal to the
+
+offered window, and can be much smaller.
+
+
+ Consider the following simple example. The receiver initially
+
+provides an offered window of 1,000. The sender uses up this window by
+
+sending five segments of 200 bytes each. The receiver, on processing
+
+the first of these segments, returns an acknowledgement which also
+
+contains an updated window value. Let us assume that the receiver of
+
+the data has removed the first 200 bytes from the buffer, so that the
+
+receiver once again has 1,000 bytes of available buffer. Therefore, the
+
+receiver would return, as before, an offered window of 1,000 bytes. The
+
+sender, on receipt of this first acknowledgement, now computes the
+
+additional number of bytes which may be sent. In fact, of the 1,000
+
+bytes which the recipient is prepared to receive at this time, 800 are
+
+already in transit, having been sent in response to the previous offered
+
+window. In this case, the usable window is only 200 bytes.
+
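+   The arithmetic of this example can be written out as a small C
+
+sketch (the function and variable names are invented):
+
+     #include <stdio.h>
+     #include <stdint.h>
+
+     /* usable = offered - (data sent but not yet acknowledged) */
+     static uint32_t usable_window(uint32_t offered,
+                                   uint32_t snd_nxt, uint32_t snd_una)
+     {
+         uint32_t in_flight = snd_nxt - snd_una;
+         return in_flight >= offered ? 0 : offered - in_flight;
+     }
+
+     int main(void)
+     {
+         /* Five 200-byte segments sent (bytes 0 through 999), the
+            first one acknowledged, and a fresh offered window of
+            1,000 bytes.                                             */
+         printf("%u\n", usable_window(1000, 1000, 200));   /* 200 */
+         return 0;
+     }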
+
+ Let us now consider how SWS arises. To continue the previous
+
+example, assume that at some point, when the sender computes a useable
+
+window of 200 bytes, it has only 50 bytes to send until it reaches a
+
+"push" point. It thus sends 50 bytes in one segment, and 150 bytes in
+
+the next segment. Sometime later, this 50-byte segment will arrive at
+
+the recipient, which will process and remove the 50 bytes and once again
+
+return an offered window of 1,000 bytes. However, the sender will now
+
+compute that there are 950 bytes in transit in the network, so that the
+
+useable window is now only 50 bytes. Thus, the sender will once again
+
+send a 50 byte segment, even though there is no longer a natural
+
+boundary to force it.
+
+
+ In fact, whenever the acknowledgement of a small segment comes
+
+back, the useable window associated with that acknowledgement will cause
+
+another segment of the same small size to be sent, until some
+
+abnormality breaks the pattern. It is easy to see how small segments
+
+arise, because natural boundaries in the data occasionally cause the
+
+sender to take a computed useable window and divide it up between two
+
+segments. Once that division has occurred, there is no natural way for
+
+those useable window allocations to be recombined; thus the breaking up
+
+of the useable window into small pieces will persist.
+
+
+ Thus, SWS is a degeneration in the throughput which develops over
+
+time, during a long data transfer. If the sender ever stops, as for
+
+example when it runs out of data to send, the receiver will eventually
+
+acknowledge all the outstanding data, so that the useable window
+
+computed by the sender will equal the full offered window of the
+
+receiver. At this point the situation will have healed, and further
+
+data transmission over the link will occur efficiently. However, in
+
+large file transfers, which occur without interruption, SWS can cause
+
+appalling performance. The network between the sender and the receiver
+
+becomes clogged with many small segments, and an equal number of
+
+acknowledgements, which in turn causes lost segments, which triggers
+
+massive retransmission. Bad cases of SWS have been seen in which the
+
+average segment size was one-tenth of the size the sender and receiver
+
+were prepared to deal with, and the average number of retransmissions
+
+per successful segment sent was five.
+
+
+ Happily, SWS is trivial to avoid. The following sections describe
+
+two algorithms, one executed by the sender, and one by the receiver,
+
+which appear to eliminate SWS completely. Actually, either algorithm by
+
+itself is sufficient to prevent SWS, and thus protect a host from a
+
+foreign implementation which has failed to deal properly with this
+
+problem. The two algorithms taken together produce an additional
+
+reduction in CPU consumption, observed in practice to be as high as a
+
+factor of four.
+
+
+ 4. Improved Window Algorithms
+
+
+ The receiver of data can take a very simple step to eliminate SWS.
+
+When it disposes of a small amount of data, it can artificially reduce
+
+the offered window in subsequent acknowledgements, so that the useable
+
+window computed by the sender does not permit the sending of any further
+
+data. At some later time, when the receiver has processed a
+
+substantially larger amount of incoming data, the artificial limitation
+
+on the offered window can be removed all at once, so that the sender
+
+computes a sudden large jump rather than a sequence of small jumps in
+
+the useable window.
+
+
+ At this level, the algorithm is quite simple, but in order to
+
+determine exactly when the window should be opened up again, it is
+
+necessary to look at some of the other details of the implementation.
+
+Depending on whether the window is held artificially closed for a short
+
+or long time, two problems will develop. The one we have already
+
+discussed -- never closing the window artificially -- will lead to SWS.
+
+On the other hand, if the window is only opened infrequently, the
+
+pipeline of data in the network between the sender and the receiver may
+
+have emptied out while the sender was being held off, so that a delay is
+
+introduced before additional data arrives from the sender. This delay
+
+does reduce throughput, but it does not consume network resources or CPU
+
+resources in the process, as does SWS. Thus, it is in this direction
+
+that one ought to overcompensate. For a simple implementation, a rule
+
+of thumb that seems to work in practice is to artificially reduce the
+
+offered window until the reduction constitutes one half of the available
+
+space, at which point increase the window to advertise the entire space
+
+again. In any event, one ought to make the chunk by which the window is
+
+opened at least permit one reasonably large segment. (If the receiver
+
+is so short of buffers that it can never advertise a large enough buffer
+
+to permit at least one large segment, it is hopeless to expect any sort
+
+of high throughput.)
+
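+   A sketch of this receiver-side rule of thumb follows; the names are
+
+invented, and both quantities are measured from the current
+
+acknowledgement point:
+
+     #include <stdint.h>
+
+     struct rcv_window {
+         uint32_t available;  /* buffer space actually free          */
+         uint32_t offered;    /* window currently offered to sender  */
+     };
+
+     /* Window value to place in the next acknowledgement. */
+     static uint32_t window_to_offer(struct rcv_window *w)
+     {
+         /* Hold the offered window at its old value until the
+            artificial reduction amounts to half the available space. */
+         if (w->available <= w->offered ||
+             w->available - w->offered < w->available / 2)
+             return w->offered;
+
+         /* Then advertise the entire space in one large jump, so the
+            sender sees one big increase rather than many small ones. */
+         w->offered = w->available;
+         return w->offered;
+     }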
+
+ There is an algorithm that the sender can use to achieve the same
+
+effect described above: a very simple and elegant rule first described
+
+by Michael Greenwald at MIT. The sender of the data uses the offered
+
+window to compute a useable window, and then compares the useable window
+
+to the offered window, and refrains from sending anything if the ratio
+
+of useable to offered is less than a certain fraction. Clearly, if the
+
+computed useable window is small compared to the offered window, this
+
+means that a substantial amount of previously sent information is still
+
+in the pipeline from the sender to the receiver, which in turn means
+
+that the sender can count on being granted a larger useable window in
+
+the future. Until the useable window reaches a certain amount, the
+
+sender should simply refuse to send anything.
+
+
+ Simple experiments suggest that the exact value of the ratio is not
+
+very important, but that a value of about 25 percent is sufficient to
+
+avoid SWS and achieve reasonable throughput, even for machines with a
+
+small offered window. An additional enhancement which might help
+
+throughput would be to attempt to hold off sending until one can send a
+
+maximum size segment. Another enhancement would be to send anyway, even
+
+if the ratio is small, if the useable window is sufficient to hold the
+
+data available up to the next "push point".
+
+
+ This algorithm at the sender end is very simple. Notice that it is
+
+not necessary to set a timer to protect against protocol lockup when
+
+postponing the send operation. Further acknowledgements, as they
+
+arrive, will inevitably change the ratio of offered to useable window.
+
+(To see this, note that when all the data in the catanet pipeline has
+
+arrived at the receiver, the resulting acknowledgement must yield an
+
+offered window and useable window that equal each other.) If the
+
+expected acknowledgements do not arrive, the retransmission mechanism
+
+will come into play to assure that something finally happens. Thus, to
+
+add this algorithm to an existing TCP implementation usually requires
+
+one line of code. As part of the send algorithm it is already necessary
+
+to compute the useable window from the offered window. It is a simple
+
+matter to add a line of code which, if the ratio is less than a certain
+
+percent, sets the useable window to zero. The results of SWS are so
+
+devastating that no sender should be without this simple piece of
+
+insurance.
+
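+   In C, that line of insurance might look like this; the 25 percent
+
+figure is the threshold suggested above, and the names are invented:
+
+     #include <stdint.h>
+
+     static uint32_t sws_usable_window(uint32_t offered,
+                                       uint32_t unacked)
+     {
+         uint32_t usable = offered > unacked ? offered - unacked : 0;
+
+         /* Refuse to send while the usable window is less than one
+            quarter of the offered window; later acknowledgements
+            will raise the ratio toward one.                         */
+         if (usable < offered / 4)
+             usable = 0;
+         return usable;
+     }
+
+   The enhancements mentioned earlier (waiting until a maximum size
+
+segment can be sent, or sending anyway up to a push point) would be
+
+additional tests at the same place in the code.
+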
+
+ 5. Improved Acknowledgement Algorithms
+
+
+ In the beginning of this paper, an overly simplistic implementation
+
+of TCP was described, which led to SWS. One of the characteristics of
+
+this implementation was that the recipient of data sent a separate
+
+acknowledgement for every segment that it received. This compulsive
+
+acknowledgement was one of the causes of SWS, because each
+
+acknowledgement provided some new useable window.  Even if one of the
+
+algorithms described above is used to eliminate SWS, overly frequent
+
+acknowledgement still has a substantial problem: it greatly increases
+
+the processing time at the sender's end.  Measurements
+
+of TCP implementations, especially on large operating systems, indicate
+
+that most of the overhead of dealing with a segment is not in the
+
+processing at the TCP or IP level, but simply in the scheduling of the
+
+handler which is required to deal with the segment. A steady dribble of
+
+acknowledgements causes a high overhead in scheduling, with very little
+
+to show for it. This waste is to be avoided if possible.
+
+
+ There are two reasons for prompt acknowledgement. One is to
+
+prevent retransmission. We will discuss later how to determine whether
+
+unnecessary retransmission is occurring. The other reason one
+
+acknowledges promptly is to permit further data to be sent. However,
+
+the previous section makes quite clear that it is not always desirable
+
+to send a little bit of data, even though the receiver may have room for
+
+it. Therefore, one can state a general rule that under normal
+
+operation, the receiver of data need not, and for efficiency reasons
+
+should not, acknowledge the data unless either the acknowledgement is
+
+intended to produce an increased useable window, is necessary in order
+
+to prevent retransmission or is being sent as part of a reverse
+
+direction segment being sent for some other reason. We will consider an
+
+algorithm to achieve these goals.
+
+
+ Only the recipient of the data can control the generation of
+
+acknowledgements. Once an acknowledgement has been sent from the
+
+receiver back to the sender, the sender must process it. Although the
+
+extra overhead is incurred at the sender's end, it is entirely under the
+
+receiver's control. Therefore, we must now describe an algorithm which
+
+occurs at the receiver's end. Obviously, the algorithm must have the
+
+following general form:  sometimes the receiver of data, upon processing
+
+a segment, decides not to send an acknowledgement now, but to postpone
+
+the acknowledgement until some time in the future, perhaps by setting a
+
+timer. The peril of this approach is that on many large operating
+
+systems it is extremely costly to respond to a timer event, almost as
+
+costly as to respond to an incoming segment. Clearly, if the receiver
+
+of the data, in order to avoid extra overhead at the sender end, spends
+
+a great deal of time responding to timer interrupts, no overall benefit
+
+has been achieved, for efficiency at the sender end is achieved by great
+
+thrashing at the receiver end. We must find an algorithm that avoids
+
+both of these perils.
+
+
+ The following scheme seems a good compromise. The receiver of data
+
+will refrain from sending an acknowledgement under certain
+
+circumstances, in which case it must set a timer which will cause the
+
+acknowledgement to be sent later. However, the receiver should do this
+
+only where it is a reasonable guess that some other event will intervene
+
+and prevent the necessity of the timer interrupt. The most obvious
+
+event on which to depend is the arrival of another segment. So, if a
+
+segment arrives, postpone sending an acknowledgement if both of the
+
+following conditions hold. First, the push bit is not set in the
+
+segment, since it is a reasonable assumption that there is more data
+
+coming in a subsequent segment. Second, there is no revised window
+
+information to be sent back.
+
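+   In outline, the decision on each arriving segment is as follows;
+
+all of the names here are invented, and the timer interval is the one
+
+suggested below:
+
+     #include <stdbool.h>
+
+     #define ACK_DELAY_MS 250
+
+     struct seg { bool push; };   /* the only field this sketch needs */
+
+     /* Placeholder environment hooks, so the sketch is complete.     */
+     static void send_ack(void)              { }
+     static void start_ack_timer(int ms)     { (void)ms; }
+     static bool window_update_pending(void) { return false; }
+
+     static void on_segment_arrival(const struct seg *s)
+     {
+         /* Acknowledge at once if more data is unlikely to follow
+            soon, or if revised window information must be reported.  */
+         if (s->push || window_update_pending()) {
+             send_ack();
+             return;
+         }
+         /* Otherwise postpone, counting on the next segment (or,
+            rarely, the timer) to trigger the acknowledgement.        */
+         start_ack_timer(ACK_DELAY_MS);
+     }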
+
+   This algorithm will ensure that the timer, although set, is seldom
+
+used. The interval of the timer is related to the expected inter-
+
+segment delay, which is in turn a function of the particular network
+
+through which the data is flowing. For the Arpanet, a reasonable
+
+interval seems to be 200 to 300 milliseconds. Appendix A describes an
+
+adaptive algorithm for measuring this delay.
+
+
+ The section on improved window algorithms described both a receiver
+
+algorithm and a sender algorithm, and suggested that both should be
+
+used. The reason for this is now clear. While the sender algorithm is
+
+extremely simple, and useful as insurance, the receiver algorithm is
+
+required in order that this improved acknowledgement strategy work. If
+
+the receipt of every segment causes a new window value to be returned,
+
+then of necessity an acknowledgement will be sent for every data
+
+segment. When, according to the strategy of the previous section, the
+
+receiver determines to artificially reduce the offered window, that is
+
+precisely the circumstance under which an acknowledgement need not be
+
+sent. When the receiver window algorithm and the receiver
+
+acknowledgement algorithm are used together, it will be seen that
+
+sending an acknowledgement will be triggered by one of the following
+
+events. First, a push bit has been received. Second, a temporary pause
+
+in the data stream is detected. Third, the offered window has been
+
+artificially reduced to one-half its actual value.
+
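+   Putting the two receiver algorithms together, the decision reduces
+
+to a test of these three events (again, the names are invented):
+
+     #include <stdbool.h>
+
+     static bool push_seen;        /* 1: a push bit has arrived       */
+     static bool ack_timer_fired;  /* 2: a pause in the data stream   */
+     static unsigned available;    /* free buffer space               */
+     static unsigned offered;      /* window currently being offered  */
+
+     static bool should_ack_now(void)
+     {
+         bool half_withheld = available > offered &&
+                              available - offered >= available / 2;
+         return push_seen || ack_timer_fired || half_withheld;  /* 3 */
+     }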
+
+ In the beginning of this section, it was pointed out that there are
+
+two reasons why one must acknowledge data. Our consideration at this
+
+point has been concerned only with the first, that an acknowledgement
+
+must be returned as part of triggering the sending of new data. It is
+
+also necessary to acknowledge whenever the failure to do so would
+
+trigger retransmission by the sender. Since the retransmission interval
+
+is selected by the sender, the receiver of the data cannot make a
+
+precise determination of when the acknowledgement must be sent.
+
+However, there is a rough rule the sender can use to avoid
+
+retransmission, provided that the receiver is reasonably well behaved.
+
+
+   We will assume that the sender of the data uses the optional algorithm
+
+described in the TCP specification, in which the roundtrip delay is
+
+measured using an exponential decay smoothing algorithm. Retransmission
+
+of a segment occurs if the measured delay for that segment exceeds the
+
+smoothed average by some factor. To see how retransmission might be
+
+triggered, one must consider the pattern of segment arrivals at the
+
+receiver. The goal of our strategy was that the sender should send off
+
+a number of segments in close sequence, and receive one acknowledgement
+
+for the whole burst. The acknowledgement will be generated by the
+
+receiver at the time that the last segment in the burst arrives at the
+
+receiver. (To ensure the prompt return of the acknowledgement, the
+
+sender could turn on the "push" bit in the last segment of the burst.)
+
+The delay observed at the sender between the initial transmission of a
+
+segment and the receipt of the acknowledgement will include both the
+
+network transit time and the holding time at the receiver.  The
+
+holding time will be greatest for the first segments in the burst, and
+
+smallest for the last segments in the burst. Thus, the smoothing
+
+algorithm will measure a delay which is roughly proportional to the
+
+average roundtrip delay for all the segments in the burst. Problems
+
+will arise if the average delay is substantially smaller than the
+
+maximum delay and the smoothing algorithm used has a very small
+
+threshold for triggering retransmission. The widest variation between
+
+average and maximum delay will occur when network transit time is
+
+negligible, and all delay is processing time.  In this case, the maximum
+
+will be twice the average (the holding times within a burst range
+
+roughly evenly from zero up to the full burst time, so the average is
+
+half the maximum), and the threshold that controls retransmission
+
+should therefore be somewhat more than a factor of two.
+
+
+ In practice, retransmission of the first segments of a burst has
+
+not been a problem because the delay measured consists of the network
+
+roundtrip delay, as well as the delay due to withholding the
+
+acknowledgement, and the roundtrip tends to dominate except in very low
+
+roundtrip time situations (such as when sending to oneself for test
+
+purposes). This low roundtrip situation can be covered very simply by
+
+including a minimum value below which the roundtrip estimate is not
+
+permitted to drop.
+
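+   A sketch of this retransmission logic follows; the smoothing gain,
+
+threshold factor, and minimum value are chosen purely for
+
+illustration:
+
+     static double srtt = 1.0;     /* smoothed roundtrip time, sec   */
+
+     #define RTT_GAIN        0.1   /* weight of each new sample      */
+     #define RETX_THRESHOLD  2.0   /* retransmit past factor * srtt  */
+     #define RTT_FLOOR       0.5   /* minimum value for the estimate */
+
+     static void rtt_sample(double measured)
+     {
+         srtt = (1.0 - RTT_GAIN) * srtt + RTT_GAIN * measured;
+         if (srtt < RTT_FLOOR)
+             srtt = RTT_FLOOR;     /* covers the very low roundtrip
+                                      (self-send) situation          */
+     }
+
+     static int should_retransmit(double elapsed)
+     {
+         return elapsed > RETX_THRESHOLD * srtt;
+     }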
+
+
+ In our experiments with this algorithm, retransmission due to
+
+faulty calculation of the roundtrip delay occurred only once, when the
+
+parameters of the exponential smoothing algorithm had been misadjusted
+
+so that they were only taking into account the last two or three
+
+segments sent.  Clearly, this caused trouble, since the last two or
+
+three segments of any burst are the ones whose holding time at the
+
+receiver is minimal, so the resulting estimate was much lower than
+
+appropriate. Once the parameters of the algorithm had been adjusted so
+
+that the number of segments taken into account was approximately twice
+
+the number of segments in a burst of average size, with a threshold
+
+factor of 1.5, no further retransmission due to this problem has been
+
+identified, even when sending to ourselves and when sending over
+
+high-delay nets.
+
+
+ 6. Conservative Vs. Optimistic Windows
+
+
+ According to the TCP specification, the offered window is presumed
+
+to have some relationship to the amount of data which the receiver is
+
+actually prepared to receive. However, it is not necessarily an exact
+
+correspondence. We will use the term "conservative window" to describe
+
+the case where the offered window is precisely no larger than the actual
+
+buffering available. The drawback to conservative window algorithms is
+
+that they can produce very low throughput in long delay situations. It
+
+is easy to see that the maximum throughput of a conservative window
+
+algorithm is one bufferfull every roundtrip delay in the net, since the
+
+next bufferfull cannot be launched until the updated
+
+window/acknowledgement information from the previous transmission has
+
+made the roundtrip.
+
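+   A back-of-the-envelope illustration of this limit, with made-up
+
+values for the buffer size and the delay:
+
+     #include <stdio.h>
+
+     int main(void)
+     {
+         double buffer_bytes   = 4096.0;  /* receiver's buffering    */
+         double roundtrip_secs = 0.5;     /* a long-delay path       */
+
+         /* One bufferfull per roundtrip, however fast the links.    */
+         printf("at most %.0f bytes per second\n",
+                buffer_bytes / roundtrip_secs);       /* 8192        */
+         return 0;
+     }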
+
+
+ In certain cases, it may be possible to increase the overall
+
+throughput of the transmission by increasing the offered window over the
+
+actual buffer available at the receiver. Such a strategy we will call
+
+an "optimistic window" strategy. The optimistic strategy works if the
+
+network delivers the data to the recipient sufficiently slowly that it
+
+can process the data fast enough to keep the buffer from overflowing.
+
+If the receiver is faster than the sender, one could, with luck, permit
+
+an infinitely optimistic window, in which the sender is simply permitted
+
+to send full-speed. If the sender is faster than the receiver, however,
+
+and the window is too optimistic, then some segments will cause a buffer
+
+overflow, and will be discarded. Therefore, the correct strategy to
+
+implement an optimistic window is to increase the window size until
+
+segments start to be lost. This only works if it is possible to detect
+
+that the segment has been lost. In some cases, it is easy to do,
+
+because the segment is partially processed inside the receiving host
+
+before it is thrown away. In other cases, overflows may actually cause
+
+the network interface to be clogged, which will cause the segments to be
+
+lost elsewhere in the net. It is inadvisable to attempt an optimistic
+
+window strategy unless one is certain that the algorithm can detect the
+
+resulting lost segments. However, the increase in throughput which is
+
+possible from optimistic windows is quite substantial.  Any system with
+
+small buffer space should seriously consider the merit of optimistic
+
+windows.
+
+
+ The selection of an appropriate window algorithm is actually more
+
+complicated than even the above discussion suggests. The following
+
+considerations are not presented with the intention that they be
+
+incorporated in current implementations of TCP, but as background for
+
+the sophisticated designer who is attempting to understand how his TCP
+
+will respond to a variety of networks, with different speed and delay
+
+characteristics. The particular pattern of windows and acknowledgements
+
+sent from receiver to sender influences two characteristics of the data
+
+being sent. First, they control the average data rate. Clearly, the
+
+average rate of the sender cannot exceed the average rate of the
+
+receiver, or long-term buffer overflow will occur. Second, they
+
+influence the burstiness of the data coming from the sender. Burstiness
+
+has both advantages and disadvantages. The advantage of burstiness is
+
+that it reduces the CPU processing necessary to send the data. This
+
+follows from the observed fact, especially on large machines, that most
+
+of the cost of sending a segment is not the TCP or IP processing, but
+
+the scheduling overhead of getting started.
+
+
+ On the other hand, the disadvantage of burstiness is that it may
+
+cause buffers to overflow, either in the eventual recipient, which was
+
+discussed above, or in an intermediate gateway, a problem ignored in
+
+this paper.  The algorithms described above attempt to strike a balance
+
+between excessive burstiness, which in the extreme cases can cause
+
+delays because a burst is not requested soon enough, and excessive
+
+fragmentation of the data stream into small segments, which we
+
+identified as Silly Window Syndrome.
+
+
+ Under conditions of extreme delay in the network, none of the
+
+algorithms described above will achieve adequate throughput.
+
+Conservative window algorithms have a predictable throughput limit,
+
+which is one windowfull per roundtrip delay. Attempts to solve this by
+
+optimistic window strategies may cause buffer overflows due to the
+
+bursty nature of the arriving data. A very sophisticated way to solve
+
+this is for the receiver, having measured by some means the roundtrip
+
+delay and intersegment arrival rate of the actual connection, to open
+
+his window, not in one optimistic increment of gigantic proportion, but
+
+in a number of smaller optimistic increments, which have been carefully
+
+spaced using a timer so that the resulting smaller bursts which arrive
+
+are each sufficiently small to fit into the existing buffers. One could
+
+visualize this as a number of requests flowing backwards through the net
+
+which trigger in return a number of bursts which flow back spaced evenly
+
+from the sender to the receiver. The overall result is that the
+
+receiver uses the window mechanism to control the burstiness of the
+
+arrivals, and the average rate.
+
+
+ To my knowledge, no such strategy has been implemented in any TCP.
+
+First, we do not normally have delays high enough to require this kind
+
+of treatment. Second, the strategy described above is probably not
+
+stable unless it is very carefully balanced. Just as buses on a single
+
+bus route tend to bunch up, bursts which start out equally spaced could
+
+well end up piling into each other, and forming the single large burst
+
+which the receiver was hoping to avoid. It is important to understand
+
+this extreme case, however, in order to understand the limits beyond
+
+which TCP, as normally implemented with either conservative or simple
+
+optimistic windows, can no longer be expected to deliver throughput
+
+which is a reasonable percentage of the actual network capacity.
+
+
+ 7. Conclusions
+
+
+ This paper describes three simple algorithms for performance
+
+enhancement in TCP, one at the sender end and two at the receiver. The
+
+sender algorithm is to refrain from sending if the useable window is
+
+smaller than 25 percent of the offered window. The receiver algorithms
+
+are first, to artificially reduce the offered window when processing new
+
+data if the resulting reduction does not represent more than some
+
+fraction, say 50 percent, of the actual space available, and second, to
+
+refrain from sending an acknowledgment at all if two simple conditions
+
+hold.
+
+
+ Either of these algorithms will prevent the worst aspects of Silly
+
+Window Syndrome, and when these algorithms are used together, they will
+
+produce a substantial improvement in CPU utilization by eliminating the
+
+processing of excess acknowledgements.
+
+
+ Preliminary experiments with these algorithms suggest that they
+
+work, and work very well. Both the sender and receiver algorithms have
+
+been shown to eliminate SWS, even when talking to fairly silly
+
+algorithms at the other end. The Multics mailer, in particular, had
+
+suffered substantial attacks of SWS while sending large mail to a number
+
+of hosts. We believe that implementation of the sender side algorithm
+
+has eliminated every known case of SWS detected in our mailer.
+
+Implementation of the receiver side algorithm produced substantial
+
+improvements in CPU time when Multics was the sending system.  Multics
+
+is a typical large operating system, with scheduling costs which are
+
+large compared to the actual processing time for protocol handlers.
+
+Tests were done sending from Multics to a host which implemented the SWS
+
+suppression algorithm, and which could either refrain or not from
+
+sending acknowledgements on each segment. As predicted, suppressing the
+
+return acknowledgements did not influence the throughput for large data
+
+transfer at all, since the throttling effect was elsewhere. However,
+
+the CPU time required to process the data at the Multics end was cut by
+
+a factor of four.  (In this experiment, the bursts of data being sent
+
+were approximately eight segments long.  Thus, the number of
+
+acknowledgements in the two experiments differed by a factor of eight.)
+
+
+ An important consideration in evaluating these algorithms is that
+
+they must not cause the protocol implementations to deadlock. All of
+
+the recommendations in this document have the characteristic that they
+
+suggest one refrain from doing something even though the protocol
+
+specification permits one to do it. The possibility exists that if one
+
+refrains from doing something now one may never get to do it later, and
+
+both ends will halt, even though it would appear superficially that the
+
+transaction can continue.
+
+
+ Formally, the idea that things continue to work is referred to as
+
+"liveness". One of the defects of ad hoc solutions to performance
+
+problems is the possibility that two different approaches will interact
+
+to prevent liveness. It is believed that the algorithms described in
+
+this paper are always live, and that is one of the reasons why there is
+
+a strong advantage in uniform use of this particular proposal, except in
+
+cases where it is explicitly demonstrated not to work.
+
+
+ The argument for liveness in these solutions proceeds as follows.
+
+First, the sender algorithm can only be stopped by one thing, a refusal
+
+of the receiver to acknowledge sent data. As long as the receiver
+
+continues to acknowledge data, the ratio of useable window to offered
+
+window will approach one, and eventually the sender must continue to
+
+send. However, notice that the receiver algorithm we have advocated
+
+involves refraining from acknowledging. Therefore, we certainly do have
+
+a situation where improper operation of this algorithm can prevent
+
+liveness.
+
+
+ What we must show is that the receiver of the data, if it chooses
+
+to refrain from acknowledging, will do so only for a short time, and not
+
+forever. The design of the algorithm described above was intended to
+
+achieve precisely this goal: whenever the receiver of data refrained
+
+from sending an acknowledgement it was required to set a timer. The
+
+only event that was permitted to clear that timer was the receipt of
+
+another segment, which essentially reset the timer, and started it going
+
+again.  Thus, an acknowledgement will be sent as soon as the flow of
+
+incoming data pauses long enough for the timer to expire.  This has
+
+precisely the effect desired:  if the data flow
+
+appears to be disrupted for any reason, the receiver responds by sending
+
+an up-to-date acknowledgement. In fact, the receiver algorithm is
+
+designed to be more robust than this, for transmission of an
+
+acknowledgment is triggered by two events, either a cessation of data or
+
+a reduction in the amount of offered window to 50 percent of the actual
+
+value.  The latter is the condition which will normally trigger the
+
+transmission of the acknowledgement.
+
+
+
+
+
+
+ APPENDIX A
+
+
+ Dynamic Calculation of Acknowledgement Delay
+
+
+ The text suggested that when setting a timer to postpone the
+
+sending of an acknowledgement, a fixed interval of 200 to 300
+
+milliseconds would work properly in practice. This has not been
+
+verified over a wide variety of network delays, and clearly if there is
+
+a very slow net which stretches out the intersegment arrival time, a
+
+fixed interval will fail. In a sophisticated TCP, which is expected to
+
+adjust dynamically (rather than manually) to changing network
+
+conditions, it would be appropriate to measure this interval and respond
+
+dynamically. The following algorithm, which has been relegated to an
+
+Appendix because it has not been tested, seems sensible.  Whenever a
+
+segment arrives which does not have the push bit on in it, start a
+
+timer, which runs until the next segment arrives. Average these
+
+interarrival intervals, using an exponential decay smoothing function
+
+tuned to take into account perhaps the last ten or twenty segments that
+
+have come in. Occasionally, there will be a long interarrival period,
+
+even for a segment which does not terminate a piece of data being
+
+pushed, perhaps because a window has gone to zero or some glitch in the
+
+sender or the network has held up the data. Therefore, examine each
+
+interarrival interval, and discard it from the smoothing algorithm if it
+
+exceeds the current estimate by some amount, perhaps a ratio of two or
+
+four times. By rejecting the larger intersegment arrival intervals, one
+
+should obtain a smoothed estimate of the interarrival of segments inside
+
+a burst. The number need not be exact, since the timer which triggers
+
+acknowledgement can add a fairly generous fudge factor to this without
+
+causing trouble with the sender's estimate of the retransmission
+
+interval, so long as the fudge factor is constant.
+
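+   A sketch of this (untested) estimator, with illustrative constants:
+
+     static double inter_est = 0.25;  /* smoothed interarrival, sec  */
+
+     #define INTER_GAIN     0.1    /* ~ the last ten or twenty       */
+     #define OUTLIER_FACTOR 2.0    /* reject samples above this      */
+     #define ACK_FUDGE      2.0    /* constant fudge for the timer   */
+
+     /* Called with the gap between successive non-pushed segments.  */
+     static void interarrival_sample(double gap)
+     {
+         if (gap > OUTLIER_FACTOR * inter_est)
+             return;               /* closed window or glitch: skip  */
+         inter_est = (1.0 - INTER_GAIN) * inter_est
+                     + INTER_GAIN * gap;
+     }
+
+     /* Interval to use when postponing an acknowledgement. */
+     static double ack_delay(void)
+     {
+         return ACK_FUDGE * inter_est;
+     }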
+