author    | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100
committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100
commit    | 4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree      | e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc813.txt
parent    | ea76e11061bda059ae9f9ad130a9895cc85607db (diff)
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc813.txt')
-rw-r--r-- | doc/rfc/rfc813.txt | 1167
1 file changed, 1167 insertions, 0 deletions
diff --git a/doc/rfc/rfc813.txt b/doc/rfc/rfc813.txt new file mode 100644 index 0000000..5817050 --- /dev/null +++ b/doc/rfc/rfc813.txt @@ -0,0 +1,1167 @@ + +RFC: 813 + + + + WINDOW AND ACKNOWLEDGEMENT STRATEGY IN TCP + + David D. Clark + MIT Laboratory for Computer Science + Computer Systems and Communications Group + July, 1982 + + + 1. Introduction + + + This document describes implementation strategies to deal with two + +mechanisms in TCP, the window and the acknowledgement. These mechanisms + +are described in the specification document, but it is possible, while + +complying with the specification, to produce implementations which yield + +very bad performance. Happily, the pitfalls possible in window and + +acknowledgement strategies are very easy to avoid. + + + It is a much more difficult exercise to verify the performance of a + +specification than the correctness. Certainly, we have less experience + +in this area, and we certainly lack any useful formal technique. + +Nonetheless, it is important to attempt a specification in this area, + +because different implementors might otherwise choose superficially + +reasonable algorithms which interact poorly with each other. This + +document presents a particular set of algorithms which have received + +testing in the field, and which appear to work properly with each other. + +With more experience, these algorithms may become part of the formal + +specification: until such time their use is recommended. + + 2 + + +2. The Mechanisms + + + The acknowledgement mechanism is at the heart of TCP. Very simply, + +when data arrives at the recipient, the protocol requires that it send + +back an acknowledgement of this data. The protocol specifies that the + +bytes of data are sequentially numbered, so that the recipient can + +acknowledge data by naming the highest numbered byte of data it has + +received, which also acknowledges the previous bytes (actually, it + +identifies the first byte of data which it has not yet received, but + +this is a small detail). The protocol contains only a general assertion + +that data should be acknowledged promptly, but gives no more specific + +indication as to how quickly an acknowledgement must be sent, or how + +much data should be acknowledged in each separate acknowledgement. + + + The window mechanism is a flow control tool. Whenever appropriate, + +the recipient of data returns to the sender a number, which is (more or + +less) the size of the buffer which the receiver currently has available + +for additional data. This number of bytes, called the window, is the + +maximum which the sender is permitted to transmit until the receiver + +returns some additional window. Sometimes, the receiver will have no + +buffer space available, and will return a window value of zero. Under + +these circumstances,the protocol requires the sender to send a small + +segment to the receiver now and then, to see if more data is accepted. + +If the window remains closed at zero for some substantial period, and + +the sender can obtain no response from the receiver, the protocol + +requires the sender to conclude that the receiver has failed, and to + +close the connection. Again, there is very little performance + + 3 + + +information in the specification, describing under what circumstances + +the window should be increased, and how the sender should respond to + +such revised information. + + + A bad implementation of the window algorithm can lead to extremely + +poor performance overall. 
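The bookkeeping behind these two mechanisms is small enough to sketch. The fragment below, in C, is illustrative only: the field names (snd_una, snd_nxt, snd_wnd) are conventions assumed for the sketch, not the fields of any particular implementation, and sequence-number wraparound is ignored for brevity.

    #include <stdbool.h>
    #include <stdint.h>

    struct send_state {
        uint32_t snd_una;   /* oldest byte sent but not yet acknowledged  */
        uint32_t snd_nxt;   /* next byte to be transmitted                */
        uint32_t snd_wnd;   /* window most recently offered by receiver   */
    };

    /* An acknowledgement names the first byte the receiver has not yet
     * received; everything before it is acknowledged implicitly.        */
    static void on_ack(struct send_state *s, uint32_t ack, uint32_t wnd)
    {
        if (ack > s->snd_una)
            s->snd_una = ack;
        s->snd_wnd = wnd;
    }

    /* The sender may transmit only while the data in flight is smaller
     * than the offered window.  When the window is zero it must still
     * probe occasionally with a small segment, as described above.      */
    static bool may_send(const struct send_state *s)
    {
        return (s->snd_nxt - s->snd_una) < s->snd_wnd;
    }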
The degradations which occur in throughput + +and CPU utilizations can easily be several factors of ten, not just a + +fractional increase. This particular phenomenon is specific enough that + +it has been given the name of Silly Window Syndrome, or SWS. Happily + +SWS is easy to avoid if a few simple rules are observed. The most + +important function of this memo is to describe SWS, so that implementors + +will understand the general nature of the problem, and to describe + +algorithms which will prevent its occurrence. This document also + +describes performance enhancing algorithms which relate to + +acknowledgement, and discusses the way acknowledgement and window + +algorithms interact as part of SWS. + + + 3. SILLY WINDOW SYNDROME + + + In order to understand SWS, we must first define two new terms. + +Superficially, the window mechanism is very simple: there is a number, + +called "the window", which is returned from the receiver to the sender. + +However, we must have a more detailed way of talking about the meaning + +of this number. The receiver of data computes a value which we will + +call the "offered window". In a simple case, the offered window + +corresponds to the amount of buffer space available in the receiver. + +This correspondence is not necessarily exact, but is a suitable model + +for the discussion to follow. It is the offered window which is + + 4 + + +actually transmitted back from the receiver to the sender. The sender + +uses the offered window to compute a different value, the "usable + +window", which is the offered window minus the amount of outstanding + +unacknowledged data. The usable window is less than or equal to the + +offered window, and can be much smaller. + + + Consider the following simple example. The receiver initially + +provides an offered window of 1,000. The sender uses up this window by + +sending five segments of 200 bytes each. The receiver, on processing + +the first of these segments, returns an acknowledgement which also + +contains an updated window value. Let us assume that the receiver of + +the data has removed the first 200 bytes from the buffer, so that the + +receiver once again has 1,000 bytes of available buffer. Therefore, the + +receiver would return, as before, an offered window of 1,000 bytes. The + +sender, on receipt of this first acknowledgement, now computes the + +additional number of bytes which may be sent. In fact, of the 1,000 + +bytes which the recipient is prepared to receive at this time, 800 are + +already in transit, having been sent in response to the previous offered + +window. In this case, the usable window is only 200 bytes. + + + Let us now consider how SWS arises. To continue the previous + +example, assume that at some point, when the sender computes a useable + +window of 200 bytes, it has only 50 bytes to send until it reaches a + +"push" point. It thus sends 50 bytes in one segment, and 150 bytes in + +the next segment. Sometime later, this 50-byte segment will arrive at + +the recipient, which will process and remove the 50 bytes and once again + +return an offered window of 1,000 bytes. However, the sender will now + + 5 + + +compute that there are 950 bytes in transit in the network, so that the + +useable window is now only 50 bytes. Thus, the sender will once again + +send a 50 byte segment, even though there is no longer a natural + +boundary to force it. 
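The arithmetic behind this cycle is easy to reproduce. The sketch below, in C, replays it under the simplifying assumption that the receiver always re-offers its full 1,000-byte window as each segment is consumed; the segment sizes are those of the example above, and the point is that the 50- and 150-byte pieces reappear on every pass.

    #include <stdio.h>

    /* Usable window = offered window minus data already in flight. */
    static unsigned usable(unsigned offered, unsigned in_flight)
    {
        return in_flight < offered ? offered - in_flight : 0;
    }

    int main(void)
    {
        const unsigned offered = 1000;        /* re-offered after each segment */
        unsigned in_flight = 1000;            /* window fully used             */
        const unsigned acked[] = { 50, 150, 200, 200, 200, 200 };

        for (int i = 0; i < 6; i++) {
            in_flight -= acked[i];            /* one segment acknowledged      */
            unsigned u = usable(offered, in_flight);
            printf("acked %3u bytes -> usable window %3u -> send %3u\n",
                   acked[i], u, u);
            in_flight += u;                   /* sender refills the window     */
        }
        return 0;
    }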
+ + + In fact, whenever the acknowledgement of a small segment comes + +back, the useable window associated with that acknowledgement will cause + +another segment of the same small size to be sent, until some + +abnormality breaks the pattern. It is easy to see how small segments + +arise, because natural boundaries in the data occasionally cause the + +sender to take a computed useable window and divide it up between two + +segments. Once that division has occurred, there is no natural way for + +those useable window allocations to be recombined; thus the breaking up + +of the useable window into small pieces will persist. + + + Thus, SWS is a degeneration in the throughput which develops over + +time, during a long data transfer. If the sender ever stops, as for + +example when it runs out of data to send, the receiver will eventually + +acknowledge all the outstanding data, so that the useable window + +computed by the sender will equal the full offered window of the + +receiver. At this point the situation will have healed, and further + +data transmission over the link will occur efficiently. However, in + +large file transfers, which occur without interruption, SWS can cause + +appalling performance. The network between the sender and the receiver + +becomes clogged with many small segments, and an equal number of + +acknowledgements, which in turn causes lost segments, which triggers + +massive retransmission. Bad cases of SWS have been seen in which the + + 6 + + +average segment size was one-tenth of the size the sender and receiver + +were prepared to deal with, and the average number of retransmission per + +successful segments sent was five. + + + Happily, SWS is trivial to avoid. The following sections describe + +two algorithms, one executed by the sender, and one by the receiver, + +which appear to eliminate SWS completely. Actually, either algorithm by + +itself is sufficient to prevent SWS, and thus protect a host from a + +foreign implementation which has failed to deal properly with this + +problem. The two algorithms taken together produce an additional + +reduction in CPU consumption, observed in practice to be as high as a + +factor of four. + + + 4. Improved Window Algorithms + + + The receiver of data can take a very simple step to eliminate SWS. + +When it disposes of a small amount of data, it can artificially reduce + +the offered window in subsequent acknowledgements, so that the useable + +window computed by the sender does not permit the sending of any further + +data. At some later time, when the receiver has processed a + +substantially larger amount of incoming data, the artificial limitation + +on the offered window can be removed all at once, so that the sender + +computes a sudden large jump rather than a sequence of small jumps in + +the useable window. + + + At this level, the algorithm is quite simple, but in order to + +determine exactly when the window should be opened up again, it is + +necessary to look at some of the other details of the implementation. + + 7 + + +Depending on whether the window is held artificially closed for a short + +or long time, two problems will develop. The one we have already + +discussed -- never closing the window artificially -- will lead to SWS. 
+ +On the other hand, if the window is only opened infrequently, the + +pipeline of data in the network between the sender and the receiver may + +have emptied out while the sender was being held off, so that a delay is + +introduced before additional data arrives from the sender. This delay + +does reduce throughput, but it does not consume network resources or CPU + +resources in the process, as does SWS. Thus, it is in this direction + +that one ought to overcompensate. For a simple implementation, a rule + +of thumb that seems to work in practice is to artificially reduce the + +offered window until the reduction constitutes one half of the available + +space, at which point increase the window to advertise the entire space + +again. In any event, one ought to make the chunk by which the window is + +opened at least permit one reasonably large segment. (If the receiver + +is so short of buffers that it can never advertise a large enough buffer + +to permit at least one large segment, it is hopeless to expect any sort + +of high throughput.) + + + There is an algorithm that the sender can use to achieve the same + +effect described above: a very simple and elegant rule first described + +by Michael Greenwald at MIT. The sender of the data uses the offered + +window to compute a useable window, and then compares the useable window + +to the offered window, and refrains from sending anything if the ratio + +of useable to offered is less than a certain fraction. Clearly, if the + +computed useable window is small compared to the offered window, this + +means that a substantial amount of previously sent information is still + + 8 + + +in the pipeline from the sender to the receiver, which in turn means + +that the sender can count on being granted a larger useable window in + +the future. Until the useable window reaches a certain amount, the + +sender should simply refuse to send anything. + + + Simple experiments suggest that the exact value of the ratio is not + +very important, but that a value of about 25 percent is sufficient to + +avoid SWS and achieve reasonable throughput, even for machines with a + +small offered window. An additional enhancement which might help + +throughput would be to attempt to hold off sending until one can send a + +maximum size segment. Another enhancement would be to send anyway, even + +if the ratio is small, if the useable window is sufficient to hold the + +data available up to the next "push point". + + + This algorithm at the sender end is very simple. Notice that it is + +not necessary to set a timer to protect against protocol lockup when + +postponing the send operation. Further acknowledgements, as they + +arrive, will inevitably change the ratio of offered to useable window. + +(To see this, note that when all the data in the catanet pipeline has + +arrived at the receiver, the resulting acknowledgement must yield an + +offered window and useable window that equal each other.) If the + +expected acknowledgements do not arrive, the retransmission mechanism + +will come into play to assure that something finally happens. Thus, to + +add this algorithm to an existing TCP implementation usually requires + +one line of code. As part of the send algorithm it is already necessary + +to compute the useable window from the offered window. It is a simple + +matter to add a line of code which, if the ratio is less than a certain + + 9 + + +percent, sets the useable window to zero. 
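In C, this sender-side insurance might look like the fragment below. The 25 percent figure and the names are illustrative; only the if test corresponds to the added line the text describes, since the rest of the computation is, as noted, already required.

    #include <stdint.h>

    /* Compute the usable window, but refuse to use it when it has shrunk
     * below one quarter of the offered window.  Further acknowledgements,
     * or in the worst case the retransmission timer, guarantee progress. */
    static uint32_t usable_window(uint32_t offered, uint32_t in_flight)
    {
        uint32_t usable = in_flight < offered ? offered - in_flight : 0;

        if (usable < offered / 4)      /* the one added line of insurance */
            usable = 0;

        return usable;
    }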
The results of SWS are so + +devastating that no sender should be without this simple piece of + +insurance. + + + 5. Improved Acknowledgement Algorithms + + + In the beginning of this paper, an overly simplistic implementation + +of TCP was described, which led to SWS. One of the characteristics of + +this implementation was that the recipient of data sent a separate + +acknowledgement for every segment that it received. This compulsive + +acknowledgement was one of the causes of SWS, because each + +acknowledgement provided some new useable window, but even if one of the + +algorithms described above is used to eliminate SWS, overly frequent + +acknowledgement still has a substantial problem, which is that it + +greatly increases the processing time at the sender's end. Measurement + +of TCP implementations, especially on large operating systems, indicate + +that most of the overhead of dealing with a segment is not in the + +processing at the TCP or IP level, but simply in the scheduling of the + +handler which is required to deal with the segment. A steady dribble of + +acknowledgements causes a high overhead in scheduling, with very little + +to show for it. This waste is to be avoided if possible. + + + There are two reasons for prompt acknowledgement. One is to + +prevent retransmission. We will discuss later how to determine whether + +unnecessary retransmission is occurring. The other reason one + +acknowledges promptly is to permit further data to be sent. However, + +the previous section makes quite clear that it is not always desirable + +to send a little bit of data, even though the receiver may have room for + + 10 + + +it. Therefore, one can state a general rule that under normal + +operation, the receiver of data need not, and for efficiency reasons + +should not, acknowledge the data unless either the acknowledgement is + +intended to produce an increased useable window, is necessary in order + +to prevent retransmission or is being sent as part of a reverse + +direction segment being sent for some other reason. We will consider an + +algorithm to achieve these goals. + + + Only the recipient of the data can control the generation of + +acknowledgements. Once an acknowledgement has been sent from the + +receiver back to the sender, the sender must process it. Although the + +extra overhead is incurred at the sender's end, it is entirely under the + +receiver's control. Therefore, we must now describe an algorithm which + +occurs at the receiver's end. Obviously, the algorithm must have the + +following general form; sometimes the receiver of data, upon processing + +a segment, decides not to send an acknowledgement now, but to postpone + +the acknowledgement until some time in the future, perhaps by setting a + +timer. The peril of this approach is that on many large operating + +systems it is extremely costly to respond to a timer event, almost as + +costly as to respond to an incoming segment. Clearly, if the receiver + +of the data, in order to avoid extra overhead at the sender end, spends + +a great deal of time responding to timer interrupts, no overall benefit + +has been achieved, for efficiency at the sender end is achieved by great + +thrashing at the receiver end. We must find an algorithm that avoids + +both of these perils. + + + The following scheme seems a good compromise. 
The receiver of data + + 11 + + +will refrain from sending an acknowledgement under certain + +circumstances, in which case it must set a timer which will cause the + +acknowledgement to be sent later. However, the receiver should do this + +only where it is a reasonable guess that some other event will intervene + +and prevent the necessity of the timer interrupt. The most obvious + +event on which to depend is the arrival of another segment. So, if a + +segment arrives, postpone sending an acknowledgement if both of the + +following conditions hold. First, the push bit is not set in the + +segment, since it is a reasonable assumption that there is more data + +coming in a subsequent segment. Second, there is no revised window + +information to be sent back. + + + This algorithm will insure that the timer, although set, is seldom + +used. The interval of the timer is related to the expected inter- + +segment delay, which is in turn a function of the particular network + +through which the data is flowing. For the Arpanet, a reasonable + +interval seems to be 200 to 300 milliseconds. Appendix A describes an + +adaptive algorithm for measuring this delay. + + + The section on improved window algorithms described both a receiver + +algorithm and a sender algorithm, and suggested that both should be + +used. The reason for this is now clear. While the sender algorithm is + +extremely simple, and useful as insurance, the receiver algorithm is + +required in order that this improved acknowledgement strategy work. If + +the receipt of every segment causes a new window value to be returned, + +then of necessity an acknowledgement will be sent for every data + +segment. When, according to the strategy of the previous section, the + + 12 + + +receiver determines to artificially reduce the offered window, that is + +precisely the circumstance under which an acknowledgement need not be + +sent. When the receiver window algorithm and the receiver + +acknowledgement algorithm are used together, it will be seen that + +sending an acknowledgement will be triggered by one of the following + +events. First, a push bit has been received. Second, a temporary pause + +in the data stream is detected. Third, the offered window has been + +artificially reduced to one-half its actual value. + + + In the beginning of this section, it was pointed out that there are + +two reasons why one must acknowledge data. Our consideration at this + +point has been concerned only with the first, that an acknowledgement + +must be returned as part of triggering the sending of new data. It is + +also necessary to acknowledge whenever the failure to do so would + +trigger retransmission by the sender. Since the retransmission interval + +is selected by the sender, the receiver of the data cannot make a + +precise determination of when the acknowledgement must be sent. + +However, there is a rough rule the sender can use to avoid + +retransmission, provided that the receiver is reasonably well behaved. + + + We will assume that sender of the data uses the optional algorithm + +described in the TCP specification, in which the roundtrip delay is + +measured using an exponential decay smoothing algorithm. Retransmission + +of a segment occurs if the measured delay for that segment exceeds the + +smoothed average by some factor. To see how retransmission might be + +triggered, one must consider the pattern of segment arrivals at the + +receiver. 
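Taken together, the receiver-side rules described so far can be sketched as follows. This is only an illustration: the structure fields, the helper stubs, and the use of a single delayed-acknowledgement timer are assumptions made for the sketch, not requirements stated in the text.

    #include <stdbool.h>
    #include <stdint.h>

    struct rcv_state {
        uint32_t buf_total;      /* total receive buffer, in bytes          */
        uint32_t buf_avail;      /* space actually free right now           */
        uint32_t advertised;     /* window most recently offered to sender  */
        bool     timer_armed;    /* delayed-ack timer, roughly 200-300 ms   */
    };

    static void send_ack_now(struct rcv_state *r)  { (void)r; /* stub */ }
    static void arm_ack_timer(struct rcv_state *r) { (void)r; /* stub */ }

    static void on_segment(struct rcv_state *r, uint32_t len, bool push_bit)
    {
        r->buf_avail  -= len;                       /* data queued for user */
        r->advertised -= len < r->advertised ? len : r->advertised;

        /* Re-open the window only when the amount being withheld reaches
         * half of the real space (buf_avail is assumed to be replenished
         * elsewhere as the user consumes data).                           */
        uint32_t withheld = r->buf_avail > r->advertised
                          ? r->buf_avail - r->advertised : 0;
        bool window_jump = withheld >= r->buf_total / 2;

        if (push_bit || window_jump) {
            r->advertised = r->buf_avail;    /* one large jump, no dribble  */
            send_ack_now(r);
            r->timer_armed = false;
        } else if (!r->timer_armed) {
            arm_ack_timer(r);                /* catches a pause in the data */
            r->timer_armed = true;
        }
    }

An acknowledgement therefore goes out on a push bit, on a half-buffer window jump, or when the timer expires because the data stream has paused, which is the set of triggering events listed above.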
The goal of our strategy was that the sender should send off + + 13 + + +a number of segments in close sequence, and receive one acknowledgement + +for the whole burst. The acknowledgement will be generated by the + +receiver at the time that the last segment in the burst arrives at the + +receiver. (To ensure the prompt return of the acknowledgement, the + +sender could turn on the "push" bit in the last segment of the burst.) + +The delay observed at the sender between the initial transmission of a + +segment and the receipt of the acknowledgement will include both the + +network transit time, plus the holding time at the receiver. The + +holding time will be greatest for the first segments in the burst, and + +smallest for the last segments in the burst. Thus, the smoothing + +algorithm will measure a delay which is roughly proportional to the + +average roundtrip delay for all the segments in the burst. Problems + +will arise if the average delay is substantially smaller than the + +maximum delay and the smoothing algorithm used has a very small + +threshold for triggering retransmission. The widest variation between + +average and maximum delay will occur when network transit time is + +negligible, and all delay is processing time. In this case, the maximum + +will be twice the average (by simple algebra) so the threshold that + +controls retransmission should be somewhat more than a factor of two. + + + In practice, retransmission of the first segments of a burst has + +not been a problem because the delay measured consists of the network + +roundtrip delay, as well as the delay due to withholding the + +acknowledgement, and the roundtrip tends to dominate except in very low + +roundtrip time situations (such as when sending to one's self for test + +purposes). This low roundtrip situation can be covered very simply by + +including a minimum value below which the roundtrip estimate is not + +permitted to drop. + + 14 + + + In our experiments with this algorithm, retransmission due to + +faulty calculation of the roundtrip delay occurred only once, when the + +parameters of the exponential smoothing algorithm had been misadjusted + +so that they were only taking into account the last two or three + +segments sent. Clearly, this will cause trouble since the last two or + +three segments of any burst are the ones whose holding time at the + +receiver is minimal, so the resulting total estimate was much lower than + +appropriate. Once the parameters of the algorithm had been adjusted so + +that the number of segments taken into account was approximately twice + +the number of segments in a burst of average size, with a threshold + +factor of 1.5, no further retransmission has ever been identified due to + +this problem, including when sending to ourself and when sending over + +high delay nets. + + + 6. Conservative Vs. Optimistic Windows + + + According to the TCP specification, the offered window is presumed + +to have some relationship to the amount of data which the receiver is + +actually prepared to receive. However, it is not necessarily an exact + +correspondence. We will use the term "conservative window" to describe + +the case where the offered window is precisely no larger than the actual + +buffering available. The drawback to conservative window algorithms is + +that they can produce very low throughput in long delay situations. 
It + +is easy to see that the maximum input of a conservative window algorithm + +is one bufferfull every roundtrip delay in the net, since the next + +bufferfull cannot be launched until the updated window/acknowledgement + +information from the previous transmission has made the roundtrip. + + 15 + + + In certain cases, it may be possible to increase the overall + +throughput of the transmission by increasing the offered window over the + +actual buffer available at the receiver. Such a strategy we will call + +an "optimistic window" strategy. The optimistic strategy works if the + +network delivers the data to the recipient sufficiently slowly that it + +can process the data fast enough to keep the buffer from overflowing. + +If the receiver is faster than the sender, one could, with luck, permit + +an infinitely optimistic window, in which the sender is simply permitted + +to send full-speed. If the sender is faster than the receiver, however, + +and the window is too optimistic, then some segments will cause a buffer + +overflow, and will be discarded. Therefore, the correct strategy to + +implement an optimistic window is to increase the window size until + +segments start to be lost. This only works if it is possible to detect + +that the segment has been lost. In some cases, it is easy to do, + +because the segment is partially processed inside the receiving host + +before it is thrown away. In other cases, overflows may actually cause + +the network interface to be clogged, which will cause the segments to be + +lost elsewhere in the net. It is inadvisable to attempt an optimistic + +window strategy unless one is certain that the algorithm can detect the + +resulting lost segments. However, the increase in throughput which is + +possible from optimistic windows is quite substantial. Any systems with + +small buffer space should seriously consider the merit of optimistic + +windows. + + + The selection of an appropriate window algorithm is actually more + +complicated than even the above discussion suggests. The following + +considerations are not presented with the intention that they be + + 16 + + +incorporated in current implementations of TCP, but as background for + +the sophisticated designer who is attempting to understand how his TCP + +will respond to a variety of networks, with different speed and delay + +characteristics. The particular pattern of windows and acknowledgements + +sent from receiver to sender influences two characteristics of the data + +being sent. First, they control the average data rate. Clearly, the + +average rate of the sender cannot exceed the average rate of the + +receiver, or long-term buffer overflow will occur. Second, they + +influence the burstiness of the data coming from the sender. Burstiness + +has both advantages and disadvantages. The advantage of burstiness is + +that it reduces the CPU processing necessary to send the data. This + +follows from the observed fact, especially on large machines, that most + +of the cost of sending a segment is not the TCP or IP processing, but + +the scheduling overhead of getting started. + + + On the other hand, the disadvantage of burstiness is that it may + +cause buffers to overflow, either in the eventual recipient, which was + +discussed above, or in an intermediate gateway, a problem ignored in + +this paper. 
The algorithms described above attempts to strike a balance + +between excessive burstiness, which in the extreme cases can cause + +delays because a burst is not requested soon enough, and excessive + +fragmentation of the data stream into small segments, which we + +identified as Silly Window Syndrome. + + + Under conditions of extreme delay in the network, none of the + +algorithms described above will achieve adequate throughput. + +Conservative window algorithms have a predictable throughput limit, + + 17 + + +which is one windowfull per roundtrip delay. Attempts to solve this by + +optimistic window strategies may cause buffer overflows due to the + +bursty nature of the arriving data. A very sophisticated way to solve + +this is for the receiver, having measured by some means the roundtrip + +delay and intersegment arrival rate of the actual connection, to open + +his window, not in one optimistic increment of gigantic proportion, but + +in a number of smaller optimistic increments, which have been carefully + +spaced using a timer so that the resulting smaller bursts which arrive + +are each sufficiently small to fit into the existing buffers. One could + +visualize this as a number of requests flowing backwards through the net + +which trigger in return a number of bursts which flow back spaced evenly + +from the sender to the receiver. The overall result is that the + +receiver uses the window mechanism to control the burstiness of the + +arrivals, and the average rate. + + + To my knowledge, no such strategy has been implemented in any TCP. + +First, we do not normally have delays high enough to require this kind + +of treatment. Second, the strategy described above is probably not + +stable unless it is very carefully balanced. Just as buses on a single + +bus route tend to bunch up, bursts which start out equally spaced could + +well end up piling into each other, and forming the single large burst + +which the receiver was hoping to avoid. It is important to understand + +this extreme case, however, in order to understand the limits beyond + +which TCP, as normally implemented, with either conservative or simple + +optimistic windows can be expected to deliver throughput which is a + +reasonable percentage of the actual network capacity. + + 18 + + + 7. Conclusions + + + This paper describes three simple algorithms for performance + +enhancement in TCP, one at the sender end and two at the receiver. The + +sender algorithm is to refrain from sending if the useable window is + +smaller than 25 percent of the offered window. The receiver algorithms + +are first, to artificially reduce the offered window when processing new + +data if the resulting reduction does not represent more than some + +fraction, say 50 percent, of the actual space available, and second, to + +refrain from sending an acknowledgment at all if two simple conditions + +hold. + + + Either of these algorithms will prevent the worst aspects of Silly + +Window Syndrome, and when these algorithms are used together, they will + +produce substantial improvement in CPU utilization, by eliminating the + +process of excess acknowledgements. + + + Preliminary experiments with these algorithms suggest that they + +work, and work very well. Both the sender and receiver algorithms have + +been shown to eliminate SWS, even when talking to fairly silly + +algorithms at the other end. The Multics mailer, in particular, had + +suffered substantial attacks of SWS while sending large mail to a number + +of hosts. 
We believe that implementation of the sender side algorithm + +has eliminated every known case of SWS detected in our mailer. + +Implementation of the receiver side algorithm produced substantial + +improvements of CPU time when Multics was the sending system. Multics + +is a typical large operating system, with scheduling costs which are + +large compared to the actual processing time for protocol handlers. + + 19 + + +Tests were done sending from Multics to a host which implemented the SWS + +suppression algorithm, and which could either refrain or not from + +sending acknowledgements on each segment. As predicted, suppressing the + +return acknowledgements did not influence the throughput for large data + +transfer at all, since the throttling effect was elsewhere. However, + +the CPU time required to process the data at the Multics end was cut by + +a factor of four (In this experiment, the bursts of data which were + +being sent were approximately eight segments. Thus, the number of + +acknowledgements in the two experiments differed by a factor of eight.) + + + An important consideration in evaluating these algorithms is that + +they must not cause the protocol implementations to deadlock. All of + +the recommendations in this document have the characteristic that they + +suggest one refrain from doing something even though the protocol + +specification permits one to do it. The possibility exists that if one + +refrains from doing something now one may never get to do it later, and + +both ends will halt, even though it would appear superficially that the + +transaction can continue. + + + Formally, the idea that things continue to work is referred to as + +"liveness". One of the defects of ad hoc solutions to performance + +problems is the possibility that two different approaches will interact + +to prevent liveness. It is believed that the algorithms described in + +this paper are always live, and that is one of the reasons why there is + +a strong advantage in uniform use of this particular proposal, except in + +cases where it is explicitly demonstrated not to work. + + + The argument for liveness in these solutions proceeds as follows. + + 20 + + +First, the sender algorithm can only be stopped by one thing, a refusal + +of the receiver to acknowledge sent data. As long as the receiver + +continues to acknowledge data, the ratio of useable window to offered + +window will approach one, and eventually the sender must continue to + +send. However, notice that the receiver algorithm we have advocated + +involves refraining from acknowledging. Therefore, we certainly do have + +a situation where improper operation of this algorithm can prevent + +liveness. + + + What we must show is that the receiver of the data, if it chooses + +to refrain from acknowledging, will do so only for a short time, and not + +forever. The design of the algorithm described above was intended to + +achieve precisely this goal: whenever the receiver of data refrained + +from sending an acknowledgement it was required to set a timer. The + +only event that was permitted to clear that timer was the receipt of + +another segment, which essentially reset the timer, and started it going + +again. Thus, an acknowledgement will be sent as soon as no data has + +been received. This has precisely the effect desired: if the data flow + +appears to be disrupted for any reason, the receiver responds by sending + +an up-to-date acknowledgement. 
In fact, the receiver algorithm is + +designed to be more robust than this, for transmission of an + +acknowledgment is triggered by two events, either a cessation of data or + +a reduction in the amount of offered window to 50 percent of the actual + +value. This is the condition which will normally trigger the + +transmission of this acknowledgement. + + 21 + + + + + + APPENDIX A + + + Dynamic Calculation of Acknowledgement Delay + + + The text suggested that when setting a timer to postpone the + +sending of an acknowledgement, a fixed interval of 200 to 300 + +milliseconds would work properly in practice. This has not been + +verified over a wide variety of network delays, and clearly if there is + +a very slow net which stretches out the intersegment arrival time, a + +fixed interval will fail. In a sophisticated TCP, which is expected to + +adjust dynamically (rather than manually) to changing network + +conditions, it would be appropriate to measure this interval and respond + +dynamically. The following algorithm, which has been relegated to an + +Appendix, because it has not been tested, seems sensible. Whenever a + +segment arrives which does not have the push bit on in it, start a + +timer, which runs until the next segment arrives. Average these + +interarrival intervals, using an exponential decay smoothing function + +tuned to take into account perhaps the last ten or twenty segments that + +have come in. Occasionally, there will be a long interarrival period, + +even for a segment which is does not terminate a piece of data being + +pushed, perhaps because a window has gone to zero or some glitch in the + +sender or the network has held up the data. Therefore, examine each + +interarrival interval, and discard it from the smoothing algorithm if it + +exceeds the current estimate by some amount, perhaps a ratio of two or + +four times. By rejecting the larger intersegment arrival intervals, one + +should obtain a smoothed estimate of the interarrival of segments inside + + 22 + + +a burst. The number need not be exact, since the timer which triggers + +acknowledgement can add a fairly generous fudge factor to this without + +causing trouble with the sender's estimate of the retransmission + +interval, so long as the fudge factor is constant. + + |
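Although untested, the estimator in Appendix A is simple to state precisely. The sketch below is one possible rendering in C; the gain, the rejection ratio, and the fudge factor are illustrative values chosen within the ranges the appendix suggests.

    /* Smoothed intra-burst inter-arrival time, with outlier rejection. */
    #define GAIN          0.1    /* roughly the last ten to twenty segments  */
    #define REJECT_RATIO  3.0    /* ignore gaps more than ~2-4x the estimate */
    #define FUDGE_MS      50.0   /* generous constant margin for the timer   */

    static double interarrival_ms = 100.0;    /* initial guess */

    static void interarrival_sample(double measured_ms)
    {
        /* A very long gap usually means a zero window or a glitch, not a
         * change in the spacing of segments inside a burst; discard it.   */
        if (measured_ms > REJECT_RATIO * interarrival_ms)
            return;
        interarrival_ms = (1.0 - GAIN) * interarrival_ms + GAIN * measured_ms;
    }

    /* Interval for the timer that postpones an acknowledgement. */
    static double ack_delay_timer_ms(void)
    {
        return interarrival_ms + FUDGE_MS;
    }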