diff --git a/doc/rfc/rfc998.txt b/doc/rfc/rfc998.txt
new file mode 100644
index 0000000..a572d6c
--- /dev/null
+++ b/doc/rfc/rfc998.txt
@@ -0,0 +1,1237 @@
+
+Network Working Group David D. Clark
+Request for Comments: 998 Mark L. Lambert
+Obsoletes: RFC 969 Lixia Zhang
+ MIT
+ March 1987
+
+
+ NETBLT: A Bulk Data Transfer Protocol
+
+
+1. Status
+
+ This document is a description of, and a specification for, the
+ NETBLT protocol. It is a revision of the specification published in
+ NIC RFC-969. The protocol has been revised after extensive research
+ into NETBLT's performance over long-delay, high-bandwidth satellite
+ channels. Most of the changes in the protocol specification have to
+ do with the computation and use of data timers in a multiple
+ buffering data transfer model.
+
+ This document is published for discussion and comment, and does not
+ constitute a standard. The proposal may change and certain parts of
+ the protocol have not yet been specified; implementation of this
+ document is therefore not advised.
+
+2. Introduction
+
+ NETBLT (NETwork BLock Transfer) is a transport level protocol
+ intended for the rapid transfer of a large quantity of data between
+ computers. It provides a transfer that is reliable and flow
+ controlled, and is designed to provide maximum throughput over a wide
+ variety of networks. Although NETBLT currently runs on top of the
+ Internet Protocol (IP), it should be able to operate on top of any
+ datagram protocol similar in function to IP.
+
+ NETBLT's motivation is to achieve higher throughput than other
+ protocols might offer. The protocol achieves this goal by trying to
+ minimize the effect of several network-related problems: network
+ congestion, delays over satellite links, and packet loss.
+
+ Its transmission rate-control algorithms deal well with network
+ congestion; its multiple-buffering capability allows high throughput
+ over long-delay satellite channels, and its various
+ timeout/retransmit algorithms minimize the effect of packet loss
+ during a transfer. Most importantly, NETBLT's features give it good
+ performance over long-delay channels without impairing performance
+ over high-speed LANs.
+
+
+
+
+
+
+
+Clark, Lambert, & Zhang [Page 1]
+
+RFC 998 March 1987
+
+
+ The protocol works by opening a connection between two "clients" (the
+ "sender" and the "receiver"), transferring the data in a series of
+ large data aggregates called "buffers", and then closing the
+ connection. Because the amount of data to be transferred can be very
+ large, the client is not required to provide at once all the data to
+ the protocol module. Instead, the data is provided by the client in
+ buffers. The NETBLT layer transfers each buffer as a sequence of
+ packets; since each buffer is composed of a large number of packets,
+ the per-buffer interaction between NETBLT and its client is far more
+ efficient than a per-packet interaction would be.
+
+ In its simplest form, a NETBLT transfer works as follows: the
+ sending client loads a buffer of data and calls down to the NETBLT
+ layer to transfer it. The NETBLT layer breaks the buffer up into
+ packets and sends these packets across the network in Internet
+ datagrams. The receiving NETBLT layer loads these packets into a
+ matching buffer provided by the receiving client. When the last
+ packet in the buffer has arrived, the receiving NETBLT checks to see
+ that all packets in that buffer have been correctly received. If
+ some packets are missing, the receiving NETBLT requests that they be
+ resent. When the buffer has been completely transmitted, the
+ receiving client is notified by its NETBLT layer. The receiving
+ client disposes of the buffer and provides a new buffer to receive
+ more data. The receiving NETBLT notifies the sender that the new
+ buffer is ready, and the sender prepares and sends the next buffer in
+ the same manner. This continues until all the data has been sent; at
+ that time the sender notifies the receiver that the transmission has
+ been completed. The connection is then closed.
+
+   As described above, the NETBLT protocol is "lock-step": action halts
+   after a buffer is transmitted, and begins again only after
+   confirmation is received from the receiver of data.  To avoid this
+   idle time, NETBLT provides for multiple buffering, a transfer model
+   in which the sending NETBLT can transmit new buffers while earlier
+   buffers are waiting for confirmation from the receiving NETBLT.
+   Multiple buffering makes packet flow essentially continuous and
+   markedly improves performance.
+
+ The remainder of this document describes NETBLT in detail. The next
+ sections describe the philosophy behind a number of protocol
+ features: packetization, flow control, transfer reliability, and
+ connection management. The final sections describe NETBLT's packet
+ formats.
+
+3. Buffers and Packets
+
+   NETBLT is designed to permit transfer of very large amounts of data
+ between two clients. During connection setup the sending NETBLT can
+ inform the receiving NETBLT of the transfer size; the maximum
+ transfer length is 2**32 bytes. This limit should permit any
+ practical application. The transfer size parameter is for the use of
+ the receiving client; the receiving NETBLT makes no use of it. A
+
+
+
+Clark, Lambert, & Zhang [Page 2]
+
+RFC 998 March 1987
+
+
+ NETBLT receiver accepts data until told by the sender that the
+ transfer is complete.
+
+ The data to be sent must be broken up into buffers by the client.
+ Each buffer must be the same size, save for the last buffer. During
+ connection setup, the sending and receiving NETBLTs negotiate the
+ buffer size, based on limits provided by the clients. Buffer sizes
+ are in bytes only; the client is responsible for placing data in
+ buffers on byte boundaries.
+
+ NETBLT has been designed and should be implemented to work with
+ buffers of any size. The only fundamental limitation on buffer size
+ should be the amount of memory available to the client. Buffers
+ should be as large as possible since this minimizes the number of
+ buffer transmissions and therefore improves performance.
+
+ NETBLT is designed to require a minimum amount of memory, allowing
+ the client to allocate as much memory as possible for buffer storage.
+ In particular, NETBLT does not keep buffer copies for retransmission
+ purposes. Instead, data to be retransmitted is recopied directly
+ from the client buffer. This means that the client cannot release
+ buffer storage piece by piece as the buffer is sent, but this has not
+ been a problem in preliminary NETBLT implementations.
+
+ Buffers are broken down by the NETBLT layer into sequences of DATA
+ packets. As with the buffer size, the DATA packet size is negotiated
+ between the sending and receiving NETBLTs during connection setup.
+ Unlike buffer size, DATA packet size is visible only to the NETBLT
+ layer.
+
+ All DATA packets save the last packet in a buffer must be the same
+ size. Packets should be as large as possible, since NETBLT's
+ performance is directly related to packet size. At the same time,
+ the packets should not be so large as to cause internetwork
+ fragmentation, since this normally causes performance degradation.
+
+ All buffers save the last buffer must be the same size; the last
+ buffer can be any size required to complete the transfer. Since the
+ receiving NETBLT does not know the transfer size in advance, it needs
+ some way of identifying the last packet in each buffer. For this
+ reason, the last packet of every buffer is not a DATA packet but
+ rather an LDATA packet. DATA and LDATA packets are identical save
+ for the packet type.
+
+4. Flow Control
+
+ NETBLT uses two strategies for flow control, one internal and one at
+ the client level.
+
+ The sending and receiving NETBLTs transmit data in buffers; client
+ flow control is therefore at a buffer level. Before a buffer can be
+
+
+
+Clark, Lambert, & Zhang [Page 3]
+
+RFC 998 March 1987
+
+
+ transmitted, NETBLT confirms that both clients have set up matching
+ buffers, that one is ready to send data, and that the other is ready
+ to receive data. Either client can therefore control the flow of
+ data by not providing a new buffer. Clients cannot stop a buffer
+ transfer once it is in progress.
+
+ Since buffers can be quite large, there has to be another method for
+ flow control that is used during a buffer transfer. The NETBLT layer
+ provides this form of flow control.
+
+ There are several flow control problems that could arise while a
+ buffer is being transmitted. If the sending NETBLT is transferring
+   data faster than the receiving NETBLT can process it, the receiver's
+   buffer space for unprocessed packets could overflow, causing packet
+   loss.  Similarly, a slow gateway or intermediate network could
+ cause packets to collect and overflow network packet buffer space.
+ Packets will then be lost within the network. This problem is
+ particularly acute for NETBLT because NETBLT buffers will generally
+ be quite large, and therefore composed of many packets.
+
+ A traditional solution to packet flow control is a window system, in
+ which the sending end is permitted to send only a certain number of
+ packets at a time. Unfortunately, flow control using windows tends
+ to result in low throughput. Windows must be kept small in order to
+ avoid overflowing hosts and gateways, and cannot easily be updated,
+ since an end-to-end exchange is required for each window change.
+
+ To permit high throughput over a variety of networks and gateways,
+ NETBLT uses a novel flow control method: rate control. The
+ transmission rate is negotiated by the sending and receiving NETBLTs
+ during connection setup and after each buffer transmission. The
+ sender uses timers, rather than messages from the receiver, to
+ maintain the negotiated rate.
+
+ In its simplest form, rate control specifies a minimum time period
+ per packet transmission. This can cause performance problems for
+ several reasons. First, the transmission time for a single packet is
+ very small, frequently smaller than the granularity of the timing
+ mechanism. Also, the overhead required to maintain timing mechanisms
+ on a per packet basis is relatively high and lowers performance.
+
+ The solution is to control the transmission rate of groups of
+ packets, rather than single packets. The sender transmits a burst of
+ packets over a negotiated time interval, then sends another burst.
+ In this way, the overhead decreases by a factor of the burst size,
+ and the per-burst transmission time is long enough that timing
+   mechanisms will work properly.  NETBLT's rate control therefore has
+   two parts, a burst size and a burst rate, with (burst rate)/(burst
+   size) equal to the average transmission time per packet.
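+
+   As an illustration only (not part of this specification), the
+   following sketch shows one way a sender might pace its bursts; the
+   function and parameter names are assumptions, and the average
+   transmission time per packet works out to burst_rate_ms/burst_size:
+
+      import time
+
+      def send_with_rate_control(packets, burst_size, burst_rate_ms,
+                                 transmit):
+          # Send one burst of burst_size packets per burst_rate_ms.
+          for i in range(0, len(packets), burst_size):
+              start = time.monotonic()
+              for pkt in packets[i:i + burst_size]:
+                  transmit(pkt)        # hand the packet to the network
+              # Sleep out the rest of the negotiated burst interval.
+              elapsed = time.monotonic() - start
+              time.sleep(max(0.0, burst_rate_ms / 1000.0 - elapsed))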
+
+
+
+
+
+Clark, Lambert, & Zhang [Page 4]
+
+RFC 998 March 1987
+
+
+ The burst size and burst rate should be based not only on the packet
+ transmission and processing speed which each end can handle, but also
+ on the capacities of any intermediate gateways or networks.
+ Following are some intuitive values for packet size, buffer size,
+ burst size, and burst rate.
+
+ Packet sizes can be as small as 128 bytes. Performance with packets
+ this small is almost always bad, because of the high per-packet
+ processing overhead. Even the default Internet Protocol packet size
+ of 576 bytes is barely big enough for adequate performance. Most
+ networks do not support packet sizes much larger than one or two
+ thousand bytes, and packets of this size can also get fragmented when
+ traveling over intermediate networks, lowering performance.
+
+ The size of a NETBLT buffer is limited only by the amount of memory
+ available to a client. Theoretically, buffers of 100 Kbytes or more
+ are possible. This would mean the transmission of 50 to 100 packets
+ per buffer.
+
+ The burst size and burst rate are obviously very machine dependent.
+ There is a certain amount of transmission overhead in the sending and
+ receiving machines associated with maintaining timers and scheduling
+ processes. This overhead can be minimized by sending packets in
+ large bursts. There are also limitations imposed on the burst size
+ by the number of available packet buffers in the operating system
+ kernel. On most modern operating systems, a burst size of between
+ five and ten packets should reduce the overhead to an acceptable
+ level. A preliminary NETBLT implementation for the IBM PC/AT sends
+ packets in bursts of five. It could send more, but is limited by the
+ available memory.
+
+ The burst rate is in part determined by the granularity of the
+ sender's timing mechanism, and in part by the processing speed of the
+ receiver and any intermediate gateways. It is also directly related
+ to the burst size. Burst rates from 20 to 45 milliseconds per 5-
+ packet burst have been tried on the IBM PC/AT and Symbolics 3600
+ NETBLT implementations with good results within a single local-area
+ network. This value clearly depends on the network bandwidth and
+ packet buffering available.
+
+ All NETBLT flow control parameters (packet size, buffer size, burst
+ size, and burst rate) are negotiated during connection setup. The
+ negotiation process is the same for all parameters. The client
+ initiating the connection (the active end) proposes and sends a set
+ of values for each parameter in its connection request. The other
+ client (the passive end) compares these values with the highest-
+ performance values it can support. The passive end can then modify
+ any of the parameters, but only by making them more restrictive. The
+ modified parameters are then sent back to the active end in its
+ response message.
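+
+   A minimal sketch of this negotiation rule follows (illustrative
+   only; the parameter names are assumptions).  Sizes may only shrink,
+   and the burst rate, being a time per burst, may only grow:
+
+      def negotiate(proposed, supported):
+          # Passive end: each parameter may only become more restrictive.
+          return {
+              "packet_size": min(proposed["packet_size"],
+                                 supported["packet_size"]),
+              "buffer_size": min(proposed["buffer_size"],
+                                 supported["buffer_size"]),
+              "burst_size": min(proposed["burst_size"],
+                                supported["burst_size"]),
+              "burst_rate_ms": max(proposed["burst_rate_ms"],
+                                   supported["burst_rate_ms"]),
+          }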
+
+
+
+
+Clark, Lambert, & Zhang [Page 5]
+
+RFC 998 March 1987
+
+
+ The burst size and burst rate can also be re-negotiated after each
+ buffer transmission to adjust the transfer rate according to the
+ performance observed from transferring the previous buffer. The
+ receiving end sends burst size and burst rate values in its OK
+ messages (described later). The sender compares these values with
+ the values it can support. Again, it may then modify any of the
+ parameters, but only by making them more restrictive. The modified
+ parameters are then communicated to the receiver in a NULL-ACK
+ packet, described later.
+
+   Obviously each of these parameters depends on many factors -- gateway
+   and host processing speeds, available memory, timer granularity --
+   some of which cannot be checked by either client.  Each client must
+   therefore make the best guess it can, tuning for performance on
+   subsequent transfers.
+
+5. The NETBLT Transfer Model
+
+   Each NETBLT transfer has three stages: connection setup, data
+   transfer, and connection close.  The stages are described in detail
+   below, along with methods for ensuring that each stage completes
+   reliably.
+
+5.1. Connection Setup
+
+ A NETBLT connection is set up by an exchange of two packets between
+ the active NETBLT and the passive NETBLT. Note that either NETBLT
+ can send or receive data; the words "active" and "passive" are only
+ used to differentiate the end making the connection request from the
+ end responding to the connection request. The active end sends an
+ OPEN packet; the passive end acknowledges the OPEN packet in one of
+ two ways. It can either send a REFUSED packet, indicating that the
+ connection cannot be completed for some reason, or it can complete
+ the connection setup by sending a RESPONSE packet. At this point the
+ transfer can begin.
+
+ As discussed in the previous section, the OPEN and RESPONSE packets
+ are used to negotiate flow control parameters. Other parameters used
+ in the data transfer are also negotiated. These parameters are (1)
+   the maximum number of buffers that can be in transit at any one
+   time, and (2) whether or not DATA packet data will be checksummed.
+   NETBLT
+ automatically checksums all non-DATA/LDATA packets. If the
+ negotiated checksum flag is set to TRUE (1), both the header and the
+ data of a DATA/LDATA packet are checksummed; if set to FALSE (0),
+ only the header is checksummed. The checksum value is the bitwise
+ negation of the ones-complement sum of the 16-bit words being
+ checksummed.
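+
+   For illustration only, the checksum computation might be sketched as
+   follows (the word-wise big-endian interpretation and the assumption
+   of even-length input are not specified here):
+
+      def netblt_checksum(data):
+          # Ones-complement sum of 16-bit words, bitwise negated.
+          total = 0
+          for i in range(0, len(data), 2):
+              total += (data[i] << 8) | data[i + 1]
+              total = (total & 0xFFFF) + (total >> 16)  # fold carry in
+          return ~total & 0xFFFF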
+
+ Finally, each end transmits its death-timeout value in seconds in
+ either the OPEN or the RESPONSE packet. The death-timeout value will
+ be used to determine the frequency with which to send KEEPALIVE
+
+
+
+Clark, Lambert, & Zhang [Page 6]
+
+RFC 998 March 1987
+
+
+ packets during idle periods of an opened connection (death timers and
+ KEEPALIVE packets are described in the following section).
+
+ The active end specifies a passive client through a client-specific
+ "well-known" 16 bit port number on which the passive end listens.
+ The active end identifies itself through a 32 bit Internet address
+ and a unique 16 bit port number.
+
+ In order to allow the active and passive ends to communicate
+ miscellaneous useful information, an unstructured, variable-length
+ field is provided in OPEN and RESPONSE packets for any client-
+ specific information that may be required. In addition, a "reason
+ for refusal" field is provided in REFUSED packets.
+
+ Recovery for lost OPEN and RESPONSE packets is provided by the use of
+ timers. The active end sets a timer when it sends an OPEN packet.
+   When the timer expires, another OPEN packet is sent; this repeats
+   until some predetermined maximum number of OPEN packets has been
+   sent.  The
+ timer is cleared upon receipt of a RESPONSE packet.
+
+ To prevent duplication of OPEN and RESPONSE packets, the OPEN packet
+ contains a 32 bit connection unique ID that must be returned in the
+ RESPONSE packet. This prevents the initiator from confusing the
+ response to the current request with the response to an earlier
+ connection request (there can only be one connection between any two
+ ports). Any OPEN or RESPONSE packet with a destination port matching
+ that of an open connection has its unique ID checked. If the unique
+ ID of the packet matches the unique ID of the connection, then the
+ packet type is checked. If it is a RESPONSE packet, it is treated as
+ a duplicate and ignored. If it is an OPEN packet, the passive NETBLT
+ sends another RESPONSE (assuming that a previous RESPONSE packet was
+ sent and lost, causing the initiating NETBLT to retransmit its OPEN
+ packet). A non-matching unique ID must be treated as an attempt to
+ open a second connection between the same port pair and is rejected
+ by sending an ABORT message.
+
+5.2. Data Transfer
+
+ The simplest model of data transfer proceeds as follows. The sending
+ client sets up a buffer full of data. The receiving NETBLT sends a
+ GO message inside a CONTROL packet to the sender, signifying that it
+ too has set up a buffer and is ready to receive data. Once the GO
+ message is received, the sender transmits the buffer as a series of
+ DATA packets followed by an LDATA packet. When the last packet in
+ the buffer has been received, the receiver sends a RESEND message
+ inside a CONTROL packet containing a list of packets that were not
+ received. The sender resends these packets. This process continues
+ until there are no missing packets. At that time the receiver sends
+ an OK message inside a CONTROL packet, sets up another buffer to
+ receive data, and sends another GO message. The sender, having
+ received the OK message, sets up another buffer, waits for the GO
+
+
+
+Clark, Lambert, & Zhang [Page 7]
+
+RFC 998 March 1987
+
+
+ message, and repeats the process.
+
+ The above data transfer model is effectively a lock-step protocol,
+ and causes time to be wasted while the sending NETBLT waits for
+ permission to send a new buffer. A more efficient transfer model
+ uses multiple buffering to increase performance. Multiple buffering
+ is a technique in which the sender and receiver allocate and transmit
+ buffers in a manner that allows error recovery or successful
+ transmission confirmation of previous buffers to be concurrent with
+ transmission of the current buffer.
+
+ During the connection setup phase, one of the negotiated parameters
+ is the number of concurrent buffers permitted during the transfer.
+ If there is more than one buffer available, transfer of the next
+ buffer may start right after the current buffer finishes. This is
+ illustrated in the following example:
+
+ Assume two buffers A and B in a multiple-buffer transfer, with A
+ preceding B. When A has been transferred and the sending NETBLT is
+ waiting for either an OK or a RESEND message for it, the sending
+ NETBLT can start sending B immediately, keeping data flowing at a
+   stable rate.  If the receiver of data sends an OK for A, all is well;
+   if the sender instead receives a RESEND, the missing packets
+   specified in the RESEND message are retransmitted.
+
+ In the multiple-buffer transfer model, all packets to be sent are
+ re-ordered by buffer number (lowest number first), with the transfer
+ rate specified by the burst size and burst rate. Since buffer
+ numbers increase monotonically, packets from an earlier buffer will
+ always precede packets from a later buffer.
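+
+   As a small illustration (not part of the specification), keeping the
+   send queue sorted by buffer number and then by packet number yields
+   exactly this ordering:
+
+      # (buffer number, packet number) pairs awaiting transmission
+      queue = [(3, 0), (2, 4), (2, 1), (3, 1)]
+      queue.sort()          # earlier buffers always precede later ones
+      print(queue)          # [(2, 1), (2, 4), (3, 0), (3, 1)]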
+
+ Having several buffers transmitting concurrently is actually not that
+ much more complicated than transmitting a single buffer at a time.
+ The key is to visualize each buffer as a finite state machine;
+ several buffers are merely a group of finite state machines, each in
+ one of several states. The transfer process consists of moving
+ buffers through various states until the entire transmission has
+ completed.
+
+ There are several obvious flaws in the data transfer model as
+ described above. First, what if the GO, OK, or RESEND messages are
+ lost? The sender cannot act on a packet it has not received, so the
+ protocol will hang. Second, if an LDATA packet is lost, how does the
+ receiver know when the buffer has been transmitted? Solutions for
+ each of these problems are presented below.
+
+5.2.1. Recovering from Lost Control Messages
+
+ NETBLT solves the problem of lost OK, GO, and RESEND messages in two
+ ways. First, it makes use of a control timer. The receiver can send
+ one or more control messages (OK, GO, or RESEND) within a single
+
+
+
+Clark, Lambert, & Zhang [Page 8]
+
+RFC 998 March 1987
+
+
+ CONTROL packet. Whenever the receiver sends a control packet, it
+ sets a control timer. This timer is either "reset" (set again) or
+ "cleared" (deactivated), under the following conditions:
+
+ When the control timer expires, the receiving NETBLT resends the
+   control packet and resets the timer.  The receiving NETBLT continues
+   to resend the control packet each time the control timer expires,
+   until either the control timer is cleared or the receiving NETBLT's
+ death timer (described later) expires (at which time it shuts down
+ the connection).
+
+ Each control message includes a sequence number which starts at one
+ and increases by one for each control message sent. The sending
+ NETBLT checks the sequence number of every incoming control message
+ against all other sequence numbers it has received. It stores the
+ highest sequence number below which all other received sequence
+ numbers are consecutive (in following paragraphs this is called the
+ high-acknowledged-sequence-number) and returns this number in every
+ packet flowing back to the receiver. The receiver is permitted to
+ clear its control timer when it receives a packet from the sender
+ with a high-acknowledged-sequence-number greater than or equal to the
+ highest sequence number in the control packet just sent.
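+
+   A minimal sketch of this bookkeeping (illustrative only; the data
+   structure used here is an assumption):
+
+      def high_acknowledged(received):
+          # Highest N such that control messages 1..N have all arrived;
+          # 'received' is a set of control-message sequence numbers.
+          n = 0
+          while (n + 1) in received:
+              n += 1
+          return n
+
+      print(high_acknowledged({1, 2, 3, 5}))   # 3: number 4 is missing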
+
+ Ideally, a NETBLT implementation should be able to cope with out-of-
+ sequence control messages, perhaps collecting them for later
+ processing, or even processing them immediately. If an incoming
+ control message "fills" a "hole" in a group of message sequence
+ numbers, the implementation could even be clever enough to detect
+ this and adjust its outgoing sequence value accordingly.
+
+ The sending NETBLT, upon receiving a CONTROL packet, should act on
+ the packet as quickly as possible. It either sets up a new buffer
+ (upon receipt of an OK message for a previous buffer), marks data for
+ resending (upon receipt of a RESEND message), or prepares a buffer
+ for sending (upon receipt of a GO message). If the sending NETBLT is
+ not in a position to send data, it should send a NULL-ACK packet,
+ which contains its high-acknowledged-sequence-number (this permits
+ the receiving NETBLT to acknowledge any outstanding control
+ messages), and wait until it can send more data. In all of these
+ cases, the system overhead for a response to the incoming control
+ message should be small and relatively constant.
+
+ The small amount of message-processing overhead allows accurate
+ control timers to be set for all types of control messages with a
+ single, simple algorithm -- the network round-trip transit time, plus
+ a variance factor. This is more efficient than schemes used by other
+ protocols, where timer value calculation has been a problem because
+ the processing time for a particular packet can vary greatly
+ depending on the packet type.
+
+ Control timer value estimation is extremely important in a high-
+
+
+
+Clark, Lambert, & Zhang [Page 9]
+
+RFC 998 March 1987
+
+
+ performance protocol like NETBLT. A long control timer causes the
+ receiving NETBLT to wait for long periods of time before
+   retransmitting unacknowledged messages.  A short control timer value
+   causes the sending NETBLT to receive many duplicate control messages,
+   which it can reject, but only at a cost in processing time.
+
+ In addition to the use of control timers, NETBLT reduces lost control
+ messages by using a single long-lived control packet; the packet is
+ treated like a FIFO queue, with new control messages added on at the
+ end and acknowledged control messages removed from the front. The
+ implementation places control messages in the control packet and
+ transmits the entire control packet, consisting of any unacknowledged
+ control messages plus new messages just added. The entire control
+ packet is also transmitted whenever the control timer expires. Since
+ control packet transmissions are fairly frequent, unacknowledged
+ messages may be transmitted several times before they are finally
+ acknowledged. This redundant transmission of control messages
+ provides automatic recovery for most control message losses over a
+ noisy channel.
+
+ This scheme places some burdens on the receiver of the control
+ messages. It must be able to quickly reject duplicate control
+ messages, since a given message may be retransmitted several times
+ before its acknowledgement is received and it is removed from the
+ control packet. Typically this is fairly easy to do; the sender of
+ data merely throws away any control messages with sequence numbers
+ lower than its high-acknowledged-sequence-number.
+
+ Another problem with this scheme is that the control packet may
+ become larger than the maximum allowable packet size if too many
+ control messages are placed into it. This has not been a problem in
+ the current NETBLT implementations: a typical control packet size is
+ 1000 bytes; RESEND control messages average about 20 bytes in length,
+ GO messages are 8 bytes long, and OK messages are 16 bytes long.
+ This allows 50-80 control messages to be placed in the control
+ packet, more than enough for reasonable transfers. Other
+ implementations can provide for multiple control packets if a single
+   control packet proves insufficient.
+
+ The control timer value must be carefully estimated. It can have as
+ its initial value an arbitrary number. Subsequent control packets
+ should have their timer values based on the network round-trip
+ transit time (i.e. the time between sending the control packet and
+ receiving the acknowledgment of all messages in the control packet)
+ plus a variance factor. The timer value should be continually
+ updated, based on a smoothed average of collected round-trip transit
+ times.
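+
+   One possible form of this calculation, sketched for illustration
+   only (the smoothing gain and variance factor shown are assumed
+   values, not specified by this document):
+
+      ALPHA = 0.875    # weight given to the previous estimate (assumed)
+      BETA = 2.0       # variance factor (assumed)
+
+      def update_control_timer(smoothed_rtt, measured_rtt):
+          # Smooth the measured round trip and derive a timer value.
+          smoothed_rtt = (ALPHA * smoothed_rtt +
+                          (1.0 - ALPHA) * measured_rtt)
+          return smoothed_rtt, BETA * smoothed_rtt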
+
+
+
+
+
+
+
+Clark, Lambert, & Zhang [Page 10]
+
+RFC 998 March 1987
+
+
+5.2.2. Recovering from Lost LDATA Packets
+
+ NETBLT solves the problem of LDATA packet loss by using a data timer
+ for each buffer at the receiving end. The simplest data timer model
+ has a data timer set when a buffer is ready to be received; if the
+ data timer expires, the receiving NETBLT assumes a lost LDATA packet
+ and sends a RESEND message requesting all missing DATA packets in the
+ buffer. When all packets have been received, the timer is cleared.
+
+ Data timer values are not based on network round-trip transit time;
+ instead they are based on the amount of time taken to transfer a
+ buffer (as determined by the number of DATA packet bursts in the
+ buffer times the burst rate) plus a variance factor <1>.
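+
+   For example (an illustrative calculation with assumed numbers): a
+   20000-byte buffer of 1000-byte DATA packets sent in 5-packet bursts
+   at 30 milliseconds per burst takes 4 bursts, so the loose data timer
+   might be set to 4 * 30 ms plus a variance allowance:
+
+      def data_timer_ms(buffer_size, packet_size, burst_size,
+                        burst_rate_ms, variance_ms):
+          packets = -(-buffer_size // packet_size)   # round up
+          bursts = -(-packets // burst_size)         # round up
+          return bursts * burst_rate_ms + variance_ms
+
+      print(data_timer_ms(20000, 1000, 5, 30, 100))  # 4 * 30 + 100 = 220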
+
+ Obviously an accurate estimation of the data timer value is very
+ important. A short data timer value causes the receiving NETBLT to
+ send unnecessary RESEND packets. This causes serious performance
+ degradation since the sending NETBLT has to stop what it is doing and
+ resend a number of DATA packets.
+
+ Data timer setting and clearing turns out to be fairly complicated,
+ particularly in a multiple-buffering transfer model. In
+ understanding how and when data timers are set and cleared, it is
+ helpful to visualize each buffer as a finite-state machine and take a
+ look at the various states.
+
+ The state sequence for a sending buffer is simple. When a GO message
+ for the buffer is received, the buffer is created, filled with data,
+ and placed in a SENDING state. When an OK for that buffer has been
+ received, it goes into a SENT state and is disposed of.
+
+ The state sequence for a receiving buffer is a little more
+ complicated. Assume existence of a buffer A. When a control message
+ for A is sent, the buffer moves into state ACK-WAIT (it is waiting
+ for acknowledgement of the control message).
+
+ As soon as the control message has been acknowledged, buffer A moves
+ from the ACK-WAIT state into the ACKED state (it is now waiting for
+ DATA packets to arrive). At this point, A's data timer is set and
+ the control message removed from the control packet. Estimation of
+ the data timer value at this point is quite difficult. In a
+ multiple-buffer transfer model, the receiving NETBLT can send several
+ GO messages at once. A single DATA packet from the sending NETBLT
+ could acknowledge all the GO messages, causing several buffers to
+ start up data timers. Clearly each of the data timers must be set in
+ a manner that takes into account each buffer's place in the order of
+ transmission. Packets for a buffer A - 1 will always be transmitted
+ before packets in A, so A's data timer must take into account the
+ arrival of all of A - 1's DATA packets as well as arrival of its own
+ DATA packets. This means that the timer values become increasingly
+ less accurate for higher-numbered buffers. Because this data timer
+
+
+
+Clark, Lambert, & Zhang [Page 11]
+
+RFC 998 March 1987
+
+
+ value can be quite inaccurate, it is called a "loose" data timer.
+ The loose data timer value is recalculated later (using the same
+ algorithm, but with updated information), giving a "tight" timer, as
+ described below.
+
+ When the first DATA packet for A arrives, A moves from the ACKED
+ state to the RECEIVING state and its data timer is set to a new
+ "tight" value. The tight timer value is calculated in the same
+ manner as the loose timer, but it is more accurate since we have
+ moved forward in time and those buffers numbered lower than A have
+ presumably been dealt with (or their packets would have arrived
+ before A's), leaving fewer packets to arrive between the setting of
+ the data timer and the arrival of the last DATA packet in A.
+
+ The receiving NETBLT also sets the tight data timers of any buffers
+ numbered lower than A that are also in the ACKED state. This is done
+ as an optimization: we know that buffers are processed in order,
+ lowest number first. If a buffer B numbered lower than A is in the
+ ACKED state, its DATA packets should arrive before A's. Since A's
+ have arrived first, B's must have gotten lost. Since B's loose data
+ timer has not expired (it would then have sent a RESEND message and
+ be in the ACK-WAIT state), we set the tight timer, allowing the
+ missing packets to be detected earlier. An immediate RESEND is not
+ sent because it is possible that A's packet was re-ordered before B's
+ by the network, and that B's packets may arrive shortly.
+
+ When all DATA packets for A have been received, it moves from the
+ RECEIVING state to the RECEIVED state and is disposed of. Had any
+ packets been missing, A's data timer would have expired and A would
+ have moved into the ACK-WAIT state after sending a RESEND message.
+ The state progression would then move as in the above example.
+
+ The control and data timer system can be summarized as follows:
+ normally, the receiving NETBLT is working under one of two types of
+ timers, a control timer or a data timer. There is one data timer per
+ buffer transmission and one control timer per control packet. The
+ data timer is active while its buffer is in either the ACKED (loose
+ data timer value is used) or the RECEIVING (tight data timer value is
+   used) state; a control timer is active whenever the receiving NETBLT
+ has any unacknowledged control messages in its control packet.
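+
+   The buffer states described above might be represented as follows
+   (an illustrative sketch only; the names mirror the prose):
+
+      from enum import Enum, auto
+
+      class RecvBufferState(Enum):
+          ACK_WAIT = auto()    # control message sent, awaiting its ack
+          ACKED = auto()       # ack received; loose data timer running
+          RECEIVING = auto()   # first DATA packet seen; tight timer set
+          RECEIVED = auto()    # all DATA packets present; disposed of
+
+      class SendBufferState(Enum):
+          SENDING = auto()     # GO received, buffer being transmitted
+          SENT = auto()        # OK received, buffer disposed of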
+
+5.2.3. Death Timers and Keepalive Packets
+
+ The above system still leaves a few problems. If the sending NETBLT
+ is not ready to send, it sends a single NULL-ACK packet to clear any
+ outstanding control timers at the receiving end. After this the
+ receiver will wait. The sending NETBLT could die and the receiver,
+ with its control timer cleared, would hang. Also, the above system
+ puts timers only on the receiving NETBLT. The sending NETBLT has no
+ timers; if the receiving NETBLT dies, the sending NETBLT will hang
+ while waiting for control messages to arrive.
+
+
+
+Clark, Lambert, & Zhang [Page 12]
+
+RFC 998 March 1987
+
+
+ The solution to the above two problems is the use of a death timer
+ and a keepalive packet for both the sending and receiving NETBLTs.
+ As soon as the connection is opened, each end sets a death timer;
+ this timer is reset every time a packet is received. When a NETBLT's
+ death timer expires, it can assume the other end has died and can
+ close the connection.
+
+ It is possible that the sending or receiving NETBLTs will have to
+ wait for long periods while their respective clients get buffer space
+ and load their buffers with data. Since a NETBLT waiting for buffer
+ space is in a perfectly valid state, the protocol must have some
+ method for preventing the other end's death timer from expiring. The
+ solution is to use a KEEPALIVE packet, which is sent repeatedly at
+ fixed intervals when a NETBLT cannot send other packets. Since the
+ death timer is reset whenever a packet is received, it will never
+ expire as long as the other end sends packets.
+
+ The frequency with which KEEPALIVE packets are transmitted is
+ computed as follows: At connection startup, each NETBLT chooses a
+ death-timer value and sends it to the other end in either the OPEN or
+ the RESPONSE packet. The other end takes the death-timeout value and
+ uses it to compute a frequency with which to send KEEPALIVE packets.
+ The KEEPALIVE frequency should be high enough that several KEEPALIVE
+ packets can be lost before the other end's death timer expires (e.g.
+ death timer value divided by four).
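+
+   For illustration (assumed values only): with a death-timeout of 120
+   seconds and the divide-by-four rule above, KEEPALIVE packets would
+   be sent every 30 seconds, so several in a row could be lost without
+   the peer's death timer expiring:
+
+      def keepalive_interval(peer_death_timeout_s, factor=4):
+          # Several KEEPALIVEs may be lost before the peer's timer fires.
+          return peer_death_timeout_s / factor
+
+      print(keepalive_interval(120))   # 30.0 seconds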
+
+ The death timer value is relatively easy to estimate. Since it is
+ continually reset, it need not be based on the transfer size.
+ Instead, it should be based at least in part on the type of
+ application using NETBLT. User applications should have smaller
+ death timeout values to avoid forcing humans to wait long periods of
+ time for a death timeout to occur. Machine applications can have
+ longer timeout values.
+
+5.3. Closing the Connection
+
+ There are three ways to close a connection: a connection close, a
+ "quit", or an "abort".
+
+5.3.1. Successful Transfer
+
+ After a successful data transfer, NETBLT closes the connection. When
+ the sender is transmitting the last buffer of data, it sets a "last-
+ buffer" flag on every DATA packet in the buffer. This means that no
+ NEW data will be transmitted. The receiver knows the transfer has
+ completed successfully when all of the following are true: (1) it has
+ received DATA packets with a "last-buffer" flag set, (2) all its
+ control messages have been acknowledged, and (3) it has no
+ outstanding buffers with missing packets. At that point, the
+ receiver is permitted to close its half of the connection. The
+ sender knows the transfer has completed when the following are true:
+
+
+
+Clark, Lambert, & Zhang [Page 13]
+
+RFC 998 March 1987
+
+
+ (1) it has transmitted DATA packets with a "last-buffer" flag set and
+ (2) it has received OK messages for all its buffers. At that point,
+ it "dallies" for a predetermined period of time before closing its
+   half of the connection.  This dally period ensures that, if the
+   NULL-ACK packet acknowledging the receiver's last OK message was
+   lost, the receiver has time to
+ retransmit the OK message, receive a new NULL-ACK, and recognize a
+ successful transfer. The dally timer value MUST be based on the
+ receiver's control timer value; it must be long enough to allow the
+ receiver's control timer to expire so that the OK message can be re-
+   sent.  For this reason, all OK messages contain (in addition to new
+   burst size and burst rate values) the receiver's current control
+ timer value in milliseconds. The sender uses this value to compute
+ its dally timer value.
+
+ Since the dally timer value may be quite large, the receiving NETBLT
+ is permitted to "short-circuit" the sending NETBLT's dally timer by
+ transmitting a DONE packet. The DONE packet is transmitted when the
+ receiver knows the transfer has been successfully completed. When
+ the sender receives a DONE packet, it is allowed to clear its dally
+ timer and close its half of the connection immediately. The DONE
+ packet is not reliably transmitted, since failure to receive it only
+   means that the sending NETBLT will take longer to close its half of
+   the connection (as it waits for its dally timer to expire).
+
+5.3.2. Client QUIT
+
+ During a NETBLT transfer, one client may send a QUIT packet to the
+ other if it thinks that the other client is malfunctioning. Since
+ the QUIT occurs at a client level, the QUIT transmission can only
+ occur between buffer transmissions. The NETBLT receiving the QUIT
+ packet can take no action other than immediately notifying its client
+ and transmitting a QUITACK packet. The QUIT sender must time out and
+ retransmit until a QUITACK has been received or its death timer
+ expires. The sender of the QUITACK dallies before quitting, so that
+ it can respond to a retransmitted QUIT.
+
+5.3.3. NETBLT ABORT
+
+ An ABORT takes place when a NETBLT layer thinks that it or its
+ opposite is malfunctioning. Since the ABORT originates in the NETBLT
+ layer, it can be sent at any time. The ABORT implies that the NETBLT
+ layer is malfunctioning, so no transmit reliability is expected, and
+   the sender can immediately close its connection.
+
+6. Protocol Layering Structure
+
+ NETBLT is implemented directly on top of the Internet Protocol (IP).
+ It has been assigned an official protocol number of 30 (decimal).
+
+
+
+
+
+
+Clark, Lambert, & Zhang [Page 14]
+
+RFC 998 March 1987
+
+
+7. Planned Enhancements
+
+ As currently specified, NETBLT has no algorithm for determining its
+ rate-control parameters (burst rate, burst size, etc.). In initial
+ performance testing, these parameters have been set by the person
+ performing the test. We are now exploring ways to have NETBLT set
+ and adjust its rate-control parameters automatically.
+
+8. Packet Formats
+
+ NETBLT packets are divided into three categories, all of which share
+ a common packet header. First, there are those packets that travel
+ only from data sender to receiver; these contain the high-
+ acknowledged-sequence-numbers which the receiver uses for control
+ message transmission reliability. These packets are the NULL-ACK,
+ DATA, and LDATA packets. Second, there is a packet that travels only
+ from receiver to sender. This is the CONTROL packet; each CONTROL
+ packet can contain an arbitrary number of control messages (GO, OK,
+ or RESEND), each with its own sequence number. Finally, there are
+   those packets which either have special ways of ensuring reliability,
+ or are not reliably transmitted. These are the OPEN, RESPONSE,
+ REFUSED, QUIT, QUITACK, DONE, KEEPALIVE, and ABORT packets. Of
+ these, all save the DONE packet can be sent by both sending and
+ receiving NETBLTs.
+
+ All packets are "longword-aligned", i.e. all packets are a multiple
+ of 4 bytes in length and all 4-byte fields start on a longword
+ boundary. All arbitrary-length string fields are terminated with at
+ least one null byte, with extra null bytes added at the end to create
+ a field that is a multiple of 4 bytes long.
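+
+   As an illustration of this layout (not part of the specification;
+   network byte order and the example field values are assumptions),
+   the 12-byte header shared by all packet types could be packed as:
+
+      import struct
+
+      # checksum, version, type, length, local port, foreign port,
+      # then two bytes of longword-alignment padding
+      COMMON_HEADER = struct.Struct("!H B B H H H 2x")
+
+      hdr = COMMON_HEADER.pack(0, 4, 9, 16, 3001, 3000)  # example values
+      assert len(hdr) % 4 == 0                           # longword-aligned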
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Clark, Lambert, & Zhang [Page 15]
+
+RFC 998 March 1987
+
+
+ Packet Formats for NETBLT
+
+ OPEN (type 0) and RESPONSE (type 1):
+
+ 1 2 3
+ 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
+ +---------------+---------------+---------------+---------------+
+ | Checksum | Version | Type |
+ +---------------+---------------+---------------+---------------+
+ | Length | Local Port |
+ +---------------+---------------+---------------+---------------+
+ | Foreign Port | Longword Alignment Padding |
+ +---------------+---------------+---------------+---------------+
+ | Connection Unique ID |
+ +---------------+---------------+---------------+---------------+
+ | Buffer Size |
+ +---------------+---------------+---------------+---------------+
+ | Transfer Size |
+ +---------------+---------------+---------------+---------------+
+ | DATA packet size | Burst Size |
+ +---------------+---------------+---------------+---------------+
+ | Burst Rate | Death Timer Value |
+ +---------------+---------------+---------------+---------------+
+ | Reserved (MBZ) |C|M| Maximum # Outstanding Buffers |
+ +---------------+---------------+---------------+---------------+
+ | Client String ...
+ +---------------+---------------+---------------
+ Longword Alignment Padding |
+ ---------------+-------------------------------+
+
+ Checksum: packet checksum (algorithm is described in the section
+ "Connection Setup")
+
+ Version: the NETBLT protocol version number
+
+ Type: the NETBLT packet type number (OPEN = 0, RESPONSE = 1,
+ etc.)
+
+ Length: the total length (NETBLT header plus data, if present)
+ of the NETBLT packet in bytes
+
+ Local Port: the local NETBLT's 16-bit port number
+
+ Foreign Port: the foreign NETBLT's 16-bit port number
+
+ Connection UID: the 32 bit connection UID specified in the
+ section "Connection Setup".
+
+ Buffer size: the size in bytes of each NETBLT buffer (save the
+ last)
+
+
+
+
+Clark, Lambert, & Zhang [Page 16]
+
+RFC 998 March 1987
+
+
+ Transfer size: (optional) the size in bytes of the transfer.
+
+ This is for client information only; the receiving NETBLT should
+ NOT make use of it.
+
+ Data packet size: length of each DATA packet in bytes
+
+ Burst Size: Number of DATA packets in a burst
+
+ Burst Rate: Transmit time in milliseconds of a single burst
+
+ Death timer: Packet sender's death timer value in seconds
+
+ "M": the transfer mode (0 = READ, 1 = WRITE)
+
+ "C": the DATA packet data checksum flag (0 = do not checksum
+ DATA packet data, 1 = do)
+
+ Maximum Outstanding Buffers: maximum number of buffers that can
+ be transferred before waiting for an OK message from the
+ receiving NETBLT.
+
+ Client string: an arbitrary, null-terminated, longword-aligned
+ string for use by NETBLT clients.
+
+ KEEPALIVE (type 2), QUITACK (type 4), and DONE (type 11)
+
+ 1 2 3
+ 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
+ +---------------+---------------+---------------+---------------+
+ | Checksum | Version | Type |
+ +---------------+---------------+---------------+---------------+
+ | Length | Local Port |
+ +---------------+---------------+---------------+---------------+
+ | Foreign Port | Longword Alignment Padding |
+ +---------------+---------------+---------------+---------------+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Clark, Lambert, & Zhang [Page 17]
+
+RFC 998 March 1987
+
+
+ QUIT (type 3), ABORT (type 5), and REFUSED (type 10)
+
+ 1 2 3
+ 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
+ +---------------+---------------+---------------+---------------+
+ | Checksum | Version | Type |
+ +---------------+---------------+---------------+---------------+
+ | Length | Local Port |
+ +---------------+---------------+---------------+---------------+
+ | Foreign Port | Longword Alignment Padding |
+ +---------------+---------------+---------------+---------------+
+ | Reason for QUIT/ABORT/REFUSE...
+ +---------------+---------------+---------------
+ Longword Alignment Padding |
+ ---------------+-------------------------------+
+
+ DATA (type 6) and LDATA (type 7):
+
+ 1 2 3
+ 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
+ +---------------+---------------+---------------+---------------+
+ | Checksum | Version | Type |
+ +---------------+---------------+---------------+---------------+
+ | Length | Local Port |
+ +---------------+---------------+---------------+---------------+
+ | Foreign Port | Longword Alignment Padding |
+ +---------------+---------------+---------------+---------------+
+ | Buffer Number |
+ +---------------+---------------+---------------+---------------+
+ | High Consecutive Seq Num Rcvd | Packet Number |
+ +---------------+---------------+---------------+---------------+
+ | Data Area Checksum Value | Reserved (MBZ) |L|
+ +---------------+---------------+---------------+---------------+
+
+ Buffer number: a 32 bit unique number assigned to every buffer.
+ Numbers are monotonically increasing.
+
+ High Consecutive Sequence Number Received: Highest control
+ message sequence number below which all sequence numbers received
+ are consecutive.
+
+ Packet number: monotonically increasing DATA packet identifier
+
+ Data Area Checksum Value: Checksum of the DATA packet's data.
+ Algorithm used is the same as that used to compute checksums of
+ other NETBLT packets.
+
+ "L" is a flag set when the buffer that this DATA packet belongs
+ to is the last buffer in the transfer.
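+
+   An illustrative parse of this fixed header follows (assumptions:
+   network byte order, and the "L" flag taken as the low-order bit of
+   the final 16-bit word):
+
+      import struct
+
+      DATA_HEADER = struct.Struct("!H B B H H H 2x I H H H H")  # 24 bytes
+
+      def parse_data_header(raw):
+          (cksum, version, ptype, length, lport, fport, buffer_number,
+           high_seq, packet_number, data_cksum,
+           flags) = DATA_HEADER.unpack(raw[:DATA_HEADER.size])
+          last_buffer = bool(flags & 0x1)      # the "L" flag
+          return buffer_number, packet_number, last_buffer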
+
+
+
+
+
+Clark, Lambert, & Zhang [Page 18]
+
+RFC 998 March 1987
+
+
+ NULL-ACK (type 8)
+
+ 1 2 3
+ 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
+ +---------------+---------------+---------------+---------------+
+ | Checksum | Version | Type |
+ +---------------+---------------+---------------+---------------+
+ | Length | Local Port |
+ +---------------+---------------+---------------+---------------+
+ | Foreign Port | Longword Alignment Padding |
+ +---------------+---------------+---------------+---------------+
+ | High Consecutive Seq Num Rcvd | New Burst Size |
+ +---------------+---------------+---------------+---------------+
+ | New Burst Rate | Longword Alignment Padding |
+ +---------------+---------------+---------------+---------------+
+
+ High Consecutive Sequence Number Received: same as in DATA/LDATA
+ packet
+
+ New Burst Size: Burst size as negotiated from value given by
+ receiving NETBLT in OK message
+
+ New burst rate: Burst rate as negotiated from value given
+ by receiving NETBLT in OK message. Value is in milliseconds.
+
+ CONTROL (type 9):
+
+ 1 2 3
+ 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
+ +---------------+---------------+---------------+---------------+
+ | Checksum | Version | Type |
+ +---------------+---------------+---------------+---------------+
+ | Length | Local Port |
+ +---------------+---------------+---------------+---------------+
+ | Foreign Port | Longword Alignment Padding |
+ +---------------+---------------+---------------+---------------+
+
+ Followed by any number of messages, each of which is longword
+ aligned, with the following formats:
+
+ GO message (type 0):
+
+ 1 2 3
+ 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
+ +---------------+---------------+---------------+---------------+
+ | Type | Word Padding | Sequence Number |
+ +---------------+---------------+---------------+---------------+
+ | Buffer Number |
+ +---------------+---------------+---------------+---------------+
+
+ Type: message type (GO = 0, OK = 1, RESEND = 2)
+
+
+
+Clark, Lambert, & Zhang [Page 19]
+
+RFC 998 March 1987
+
+
+ Sequence number: A 16 bit unique message number. Sequence
+ numbers must be monotonically increasing, starting from 1.
+
+ Buffer number: as in DATA/LDATA packet
+
+ OK message (type 1):
+
+ 1 2 3
+ 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
+ +---------------+---------------+---------------+---------------+
+ | Type | Word Padding | Sequence Number |
+ +---------------+---------------+---------------+---------------+
+ | Buffer Number |
+ +---------------+---------------+---------------+---------------+
+ | New Offered Burst Size | New Offered Burst Rate |
+ +---------------+---------------+---------------+---------------+
+ | Current control timer value | Longword Alignment Padding |
+ +---------------+---------------+---------------+---------------+
+
+ New offered burst size: burst size for subsequent buffer
+ transfers, possibly based on performance information for previous
+ buffer transfers.
+
+ New offered burst rate: burst rate for subsequent buffer
+ transfers, possibly based on performance information for previous
+ buffer transfers. Rate is in milliseconds.
+
+ Current control timer value: Receiving NETBLT's control timer
+ value in milliseconds.
+
+ RESEND Message (type 2):
+
+ 1 2 3
+ 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2
+ +---------------+---------------+---------------+---------------+
+ | Type | Word Padding | Sequence Number |
+ +---------------+---------------+---------------+---------------+
+ | Buffer Number |
+ +---------------+---------------+---------------+---------------+
+ | Number of Missing Packets | Longword Alignment Padding |
+ +---------------+---------------+---------------+---------------+
+ | Packet Number (2 bytes) ...
+ +---------------+---------------+----------
+ | Padding (if necessary) |
+ -----------+---------------+---------------+
+
+ Packet number: the 16 bit data packet identifier found in each
+ DATA packet.
+
+
+
+
+
+
+Clark, Lambert, & Zhang [Page 20]
+
+RFC 998 March 1987
+
+
+NOTES:
+
+ <1> When the buffer size is large, the variances in the round trip
+ delays of many packets may cancel each other out; this means the
+ variance value need not be very big. This expectation will be
+ explored in further testing.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Clark, Lambert, & Zhang [Page 21]
+