Diffstat (limited to 'doc/rfc/rfc2481.txt')
-rw-r--r-- doc/rfc/rfc2481.txt 1403
1 files changed, 1403 insertions, 0 deletions
diff --git a/doc/rfc/rfc2481.txt b/doc/rfc/rfc2481.txt
new file mode 100644
index 0000000..a04f95f
--- /dev/null
+++ b/doc/rfc/rfc2481.txt
@@ -0,0 +1,1403 @@
+
+
+
+
+
+
+Network Working Group                                    K. Ramakrishnan
+Request for Comments: 2481                            AT&T Labs Research
+Category: Experimental                                          S. Floyd
+                                                                    LBNL
+                                                            January 1999
+
+
+ A Proposal to add Explicit Congestion Notification (ECN) to IP
+
+Status of this Memo
+
+ This memo defines an Experimental Protocol for the Internet
+ community. It does not specify an Internet standard of any kind.
+ Discussion and suggestions for improvement are requested.
+ Distribution of this memo is unlimited.
+
+Copyright Notice
+
+ Copyright (C) The Internet Society (1999). All Rights Reserved.
+
+Abstract
+
+ This note describes a proposed addition of ECN (Explicit Congestion
+ Notification) to IP. TCP is currently the dominant transport
+ protocol used in the Internet. We begin by describing TCP's use of
+ packet drops as an indication of congestion. Next we argue that with
+ the addition of active queue management (e.g., RED) to the Internet
+ infrastructure, where routers detect congestion before the queue
+ overflows, routers are no longer limited to packet drops as an
+ indication of congestion. Routers could instead set a Congestion
+ Experienced (CE) bit in the packet header of packets from ECN-capable
+ transport protocols. We describe when the CE bit would be set in the
+ routers, and describe what modifications would be needed to TCP to
+ make it ECN-capable. Modifications to other transport protocols
+ (e.g., unreliable unicast or multicast, reliable multicast, other
+ reliable unicast transport protocols) could be considered as those
+ protocols are developed and advance through the standards process.
+
+1. Conventions and Acronyms
+
+ The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
+ SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
+ document, are to be interpreted as described in [B97].
+
+
+
+
+
+
+
+
+Ramakrishnan & Floyd Experimental [Page 1]
+
+RFC 2481 ECN to IP January 1999
+
+
+2. Introduction
+
+ TCP's congestion control and avoidance algorithms are based on the
+ notion that the network is a black-box [Jacobson88, Jacobson90]. The
+ network's state of congestion or otherwise is determined by end-
+ systems probing for the network state, by gradually increasing the
+ load on the network (by increasing the window of packets that are
+ outstanding in the network) until the network becomes congested and a
+ packet is lost. Treating the network as a "black-box" and treating
+ loss as an indication of congestion in the network is appropriate for
+ pure best-effort data carried by TCP which has little or no
+ sensitivity to delay or loss of individual packets. In addition,
+ TCP's congestion management algorithms have techniques built-in (such
+ as Fast Retransmit and Fast Recovery) to minimize the impact of
+ losses from a throughput perspective.
+
+ However, these mechanisms are not intended to help applications that
+ are in fact sensitive to the delay or loss of one or more individual
+ packets. Interactive traffic such as telnet, web-browsing, and
+ transfer of audio and video data can be sensitive to packet losses
+ (using an unreliable data delivery transport such as UDP) or to the
+ increased latency of the packet caused by the need to retransmit the
+ packet after a loss (for reliable data delivery such as TCP).
+
+ Since TCP determines the appropriate congestion window to use by
+ gradually increasing the window size until it experiences a dropped
+ packet, this causes the queues at the bottleneck router to build up.
+ With most packet drop policies at the router that are not sensitive
+ to the load placed by each individual flow, this means that some of
+ the packets of latency-sensitive flows are going to be dropped.
+ Active queue management mechanisms detect congestion before the queue
+ overflows, and provide an indication of this congestion to the end
+ nodes. The advantages of active queue management are discussed in
+ RFC 2309 [RFC2309]. Active queue management avoids some of the bad
+ properties of dropping on queue overflow, including the undesirable
+ synchronization of loss across multiple flows. More importantly,
+ active queue management means that transport protocols with
+ congestion control (e.g., TCP) do not have to rely on buffer overflow
+ as the only indication of congestion. This can reduce unnecessary
+ queueing delay for all traffic sharing that queue.
+
+ Active queue management mechanisms may use one of several methods for
+ indicating congestion to end-nodes. One is to use packet drops, as is
+ currently done. However, active queue management allows the router to
+ separate policies of queueing or dropping packets from the policies
+ for indicating congestion. Thus, active queue management allows
+ routers to use the Congestion Experienced (CE) bit in a packet header
+ as an indication of congestion, instead of relying solely on packet
+ drops.
+
+3. Assumptions and General Principles
+
+ In this section, we describe some of the important design principles
+ and assumptions that guided the design choices in this proposal.
+
+ (1) Congestion may persist over different time-scales. The time
+ scales that we are concerned with are congestion events that may
+ last longer than a round-trip time.
+ (2) The number of packets in an individual flow (e.g., TCP connection
+ or an exchange using UDP) may range from a small number of
+ packets to quite a large number. We are interested in managing
+ the congestion caused by flows that send enough packets so that
+ they are still active when network feedback reaches them.
+ (3) New mechanisms for congestion control and avoidance need to co-
+ exist and cooperate with existing mechanisms for congestion
+ control. In particular, new mechanisms have to co-exist with
+ TCP's current methods of adapting to congestion and with routers'
+ current practice of dropping packets in periods of congestion.
+ (4) Because ECN is likely to be adopted gradually, accommodating
+ migration is essential. Some routers may still only drop packets
+ to indicate congestion, and some end-systems may not be ECN-
+ capable. The most viable strategy is one that accommodates
+ incremental deployment without having to resort to "islands" of
+ ECN-capable and non-ECN-capable environments.
+ (5) Asymmetric routing is likely to be a normal occurrence in the
+ Internet. The path (sequence of links and routers) followed by
+ data packets may be different from the path followed by the
+ acknowledgment packets in the reverse direction.
+ (6) Many routers process the "regular" headers in IP packets more
+ efficiently than they process the header information in IP
+ options. This suggests keeping congestion experienced
+ information in the regular headers of an IP packet.
+ (7) It must be recognized that not all end-systems will cooperate in
+ mechanisms for congestion control. However, new mechanisms
+ shouldn't make it easier for TCP applications to disable TCP
+ congestion control. The benefit of lying about participating in
+ new mechanisms such as ECN-capability should be small.
+
+4. Random Early Detection (RED)
+
+ Random Early Detection (RED) is a mechanism for active queue
+ management that has been proposed to detect incipient congestion
+ [FJ93], and is currently being deployed in the Internet backbone
+ [RFC2309]. Although RED is meant to be a general mechanism using one
+ of several alternatives for congestion indication, in the current
+ environment of the Internet RED is restricted to using packet drops
+ as a mechanism for congestion indication. RED drops packets based on
+ the average queue length exceeding a threshold, rather than only when
+ the queue overflows. However, when RED drops packets before the
+ queue actually overflows, RED is not forced by memory limitations to
+ discard the packet.
+
+ RED could set a Congestion Experienced (CE) bit in the packet header
+ instead of dropping the packet, if such a bit was provided in the IP
+ header and understood by the transport protocol. The use of the CE
+ bit would allow the receiver(s) to receive the packet, avoiding the
+ potential for excessive delays due to retransmissions after packet
+ losses. We use the term 'CE packet' to denote a packet that has the
+ CE bit set.
+
+5. Explicit Congestion Notification in IP
+
+ We propose that the Internet provide a congestion indication for
+ incipient congestion (as in RED and earlier work [RJ90]) where the
+ notification can sometimes be through marking packets rather than
+ dropping them. This would require an ECN field in the IP header with
+ two bits. The ECN-Capable Transport (ECT) bit would be set by the
+ data sender to indicate that the end-points of the transport protocol
+ are ECN-capable. The CE bit would be set by the router to indicate
+ congestion to the end nodes. Routers that have a packet arriving at
+ a full queue would drop the packet, just as they do now.
+
+ Bits 6 and 7 in the IPv4 TOS octet are designated as the ECN field.
+ Bit 6 is designated as the ECT bit, and bit 7 is designated as the CE
+ bit. The IPv4 TOS octet corresponds to the Traffic Class octet in
+ IPv6. The definitions for the IPv4 TOS octet [RFC791] and the IPv6
+ Traffic Class octet are intended to be superseded by the DS
+ (Differentiated Services) Field [DIFFSERV]. Bits 6 and 7 are listed
+ in [DIFFSERV] as Currently Unused. Section 19 gives a brief history
+ of the TOS octet.
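+
+ The bit positions above can be illustrated with a minimal sketch.
+ It assumes the RFC 791 convention that bit 0 is the most significant
+ bit of the octet, so bit 6 has value 0x02 and bit 7 has value 0x01.
+ The helper names are illustrative only and are not defined by this
+ document.

```python
# Masks for the ECN field in the IPv4 TOS octet (IPv6 Traffic Class octet).
# Bits are numbered from the most significant end: bit 0 is 0x80, bit 7 is 0x01.
ECT_MASK = 0x02  # bit 6: ECN-Capable Transport
CE_MASK = 0x01   # bit 7: Congestion Experienced


def is_ect(tos: int) -> bool:
    """True if the sender marked the packet as ECN-capable."""
    return bool(tos & ECT_MASK)


def is_ce(tos: int) -> bool:
    """True if a router has set Congestion Experienced."""
    return bool(tos & CE_MASK)


def set_ce(tos: int) -> int:
    """Return the octet with the CE bit set; all other bits are unchanged."""
    return (tos | CE_MASK) & 0xFF
```

+ For instance, an octet of 0x02 is ECN-capable and unmarked, and
+ set_ce(0x02) yields 0x03.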
+
+ Because of the unstable history of the TOS octet, the use of the ECN
+ field as specified in this document cannot be guaranteed to be
+ backwards compatible with all past uses of these two bits. The
+ potential dangers of this lack of backwards compatibility are
+ discussed in Section 19.
+
+ Upon the receipt by an ECN-Capable transport of a single CE packet,
+ the congestion control algorithms followed at the end-systems MUST be
+ essentially the same as the congestion control response to a *single*
+ dropped packet. For example, for ECN-Capable TCP the source TCP is
+ required to halve its congestion window for any window of data
+ containing either a packet drop or an ECN indication. However, we
+ would like to point out some notable exceptions in the reaction of
+ the source TCP, related to following the shorter-time-scale details
+ of particular implementations of TCP. For TCP's response to an ECN
+ indication, we do not recommend such behavior as the slow-start of
+ Tahoe TCP in response to a packet drop, or Reno TCP's wait of roughly
+ half a round-trip time during Fast Recovery.
+
+ One reason for requiring that the congestion-control response to the
+ CE packet be essentially the same as the response to a dropped packet
+ is to accommodate the incremental deployment of ECN in both end-
+ systems and in routers. Some routers may drop ECN-Capable packets
+ (e.g., using the same RED policies for congestion detection) while
+ other routers set the CE bit, for equivalent levels of congestion.
+ Similarly, a router might drop a non-ECN-Capable packet but set the
+ CE bit in an ECN-Capable packet, for equivalent levels of congestion.
+ Different congestion control responses to a CE bit indication and to
+ a packet drop could result in unfair treatment for different flows.
+
+ An additional requirement is that the end-systems should react to
+ congestion at most once per window of data (i.e., at most once per
+ roundtrip time), to avoid reacting multiple times to multiple
+ indications of congestion within a roundtrip time.
+
+ For a router, the CE bit of an ECN-Capable packet should only be set
+ if the router would otherwise have dropped the packet as an
+ indication of congestion to the end nodes. When the router's buffer
+ is not yet full and the router is prepared to drop a packet to inform
+ end nodes of incipient congestion, the router should first check to
+ see if the ECT bit is set in that packet's IP header. If so, then
+ instead of dropping the packet, the router MAY instead set the CE bit
+ in the IP header.
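+
+ The router behavior described above might be sketched as follows.
+ This is only an illustration: the function name, the Action type,
+ and the two boolean inputs are hypothetical, and the sketch always
+ marks an ECN-Capable packet where this document says the router MAY
+ do so.

```python
from enum import Enum

ECT_MASK = 0x02  # bit 6 of the TOS octet
CE_MASK = 0x01   # bit 7 of the TOS octet


class Action(Enum):
    FORWARD = "forward"
    MARK = "mark"
    DROP = "drop"


def router_decision(tos: int, queue_full: bool, aqm_signals_congestion: bool):
    """Return (action, tos_after) for one arriving packet.

    queue_full: no buffer space is left, so the packet cannot be queued.
    aqm_signals_congestion: the queue manager (e.g., RED) has selected this
        packet as a congestion indication although buffer space remains.
    """
    if queue_full:
        # A full queue forces a drop regardless of the ECT bit.
        return Action.DROP, tos
    if aqm_signals_congestion:
        if tos & ECT_MASK:
            # ECN-Capable transport: mark the packet instead of dropping it.
            return Action.MARK, tos | CE_MASK
        # Non-ECN-Capable transport: the drop is the congestion indication.
        return Action.DROP, tos
    # No congestion indication: forward unchanged (an existing CE bit is kept).
    return Action.FORWARD, tos
```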
+
+ An environment where all end nodes were ECN-Capable could allow new
+ criteria to be developed for setting the CE bit, and new congestion
+ control mechanisms for end-node reaction to CE packets. However,
+ this is a research issue, and as such is not addressed in this
+ document.
+
+ When a CE packet is received by a router, the CE bit is left
+ unchanged, and the packet transmitted as usual. When severe
+ congestion has occurred and the router's queue is full, then the
+ router has no choice but to drop some packet when a new packet
+ arrives. We anticipate that such packet losses will become
+ relatively infrequent when a majority of end-systems become ECN-
+ Capable and participate in TCP or other compatible congestion control
+ mechanisms. In an adequately-provisioned network in such an ECN-
+ Capable environment, packet losses should occur primarily during
+ transients or in the presence of non-cooperating sources.
+
+ We expect that routers will set the CE bit in response to incipient
+ congestion as indicated by the average queue size, using the RED
+ algorithms suggested in [FJ93, RFC2309]. To the best of our
+ knowledge, this is the only proposal currently under discussion in
+ the IETF for routers to drop packets proactively, before the buffer
+ overflows. However, this document does not attempt to specify a
+ particular mechanism for active queue management, leaving that
+ endeavor, if needed, to other areas of the IETF. While ECN is
+ inextricably tied up with active queue management at the router, the
+ reverse does not hold; active queue management mechanisms have been
+ developed and deployed independently from ECN, using packet drops as
+ indications of congestion in the absence of ECN in the IP
+ architecture.
+
+6. Support from the Transport Protocol
+
+ ECN requires support from the transport protocol, in addition to the
+ functionality given by the ECN field in the IP packet header. The
+ transport protocol might require negotiation between the endpoints
+ during setup to determine that all of the endpoints are ECN-capable,
+ so that the sender can set the ECT bit in transmitted packets.
+ Second, the transport protocol must be capable of reacting
+ appropriately to the receipt of CE packets. This reaction could be
+ in the form of the data receiver informing the data sender of the
+ received CE packet (e.g., TCP), of the data receiver unsubscribing to
+ a layered multicast group (e.g., RLM [MJV96]), or of some other
+ action that ultimately reduces the arrival rate of that flow to that
+ receiver.
+
+ This document only addresses the addition of ECN Capability to TCP,
+ leaving issues of ECN and other transport protocols to further
+ research. For TCP, ECN requires three new mechanisms: negotiation
+ between the endpoints during setup to determine if they are both
+ ECN-capable; an ECN-Echo flag in the TCP header so that the data
+ receiver can inform the data sender when a CE packet has been
+ received; and a Congestion Window Reduced (CWR) flag in the TCP
+ header so that the data sender can inform the data receiver that the
+ congestion window has been reduced. The support required from other
+ transport protocols is likely to be different, particularly for
+ unreliable or reliable multicast transport protocols, and will have
+ to be determined as other transport protocols are brought to the IETF
+ for standardization.
+
+6.1. TCP
+
+ The following sections describe in detail the proposed use of ECN in
+ TCP. This proposal is described in essentially the same form in
+ [Floyd94]. We assume that the source TCP uses the standard congestion
+ control algorithms of Slow-start, Fast Retransmit and Fast Recovery
+ [RFC2001].
+
+ This proposal specifies two new flags in the Reserved field of the
+ TCP header. The TCP mechanism for negotiating ECN-Capability uses
+ the ECN-Echo flag in the TCP header. (This was called the ECN Notify
+ flag in some earlier documents.) Bit 9 in the Reserved field of the
+ TCP header is designated as the ECN-Echo flag. The location of the
+ 6-bit Reserved field in the TCP header is shown in Figure 3 of RFC
+ 793 [RFC793].
+
+ To enable the TCP receiver to determine when to stop setting the
+ ECN-Echo flag, we introduce a second new flag in the TCP header, the
+ Congestion Window Reduced (CWR) flag. The CWR flag is assigned to
+ Bit 8 in the Reserved field of the TCP header.
+
+ The use of these flags is described in the sections below.
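+
+ As an illustration of the two flag positions, the sketch below reads
+ them from a raw TCP header. It assumes that the bit numbers refer to
+ the 16-bit word at byte offset 12 of the header (Data Offset,
+ Reserved, and control bits), numbered from the most significant bit
+ as in Figure 3 of RFC 793. The function name is hypothetical.

```python
import struct

# Bit n of the 16-bit word (bit 0 = most significant) has value 1 << (15 - n).
CWR_MASK = 1 << (15 - 8)  # bit 8 -> 0x0080
ECE_MASK = 1 << (15 - 9)  # bit 9 -> 0x0040


def ecn_flags(tcp_header: bytes):
    """Return (ecn_echo, cwr) as booleans for a raw TCP header."""
    # Bytes 12-13 hold Data Offset (4 bits), Reserved (6 bits), flags (6 bits).
    (word,) = struct.unpack_from("!H", tcp_header, 12)
    return bool(word & ECE_MASK), bool(word & CWR_MASK)
```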
+
+6.1.1. TCP Initialization
+
+ In the TCP connection setup phase, the source and destination TCPs
+ exchange information about their desire and/or capability to use ECN.
+ Subsequent to the completion of this negotiation, the TCP sender sets
+ the ECT bit in the IP header of data packets to indicate to the
+ network that the transport is capable and willing to participate in
+ ECN for this packet. This will indicate to the routers that they may
+ mark this packet with the CE bit, if they would like to use that as a
+ method of congestion notification. If the TCP connection does not
+ wish to use ECN notification for a particular packet, the sending TCP
+ sets the ECT bit equal to 0 (i.e., not set), and the TCP receiver
+ ignores the CE bit in the received packet.
+
+ When a node sends a TCP SYN packet, it may set the ECN-Echo and CWR
+ flags in the TCP header. For a SYN packet, the setting of both the
+ ECN-Echo and CWR flags is defined as an indication that the sending
+ TCP is ECN-Capable, rather than as an indication of congestion or of
+ response to congestion. More precisely, a SYN packet with both the
+ ECN-Echo and CWR flags set indicates that the TCP implementation
+ transmitting the SYN packet will participate in ECN as both a sender
+ and receiver. As a receiver, it will respond to incoming data
+ packets that have the CE bit set in the IP header by setting the
+ ECN-Echo flag in outgoing TCP Acknowledgement (ACK) packets. As a
+ sender, it will respond to incoming packets that have the ECN-Echo
+ flag set by reducing the congestion window when appropriate.
+
+ When a node sends a SYN-ACK packet, it may set the ECN-Echo flag, but
+ it does not set the CWR flag. For a SYN-ACK packet, the pattern of
+ the ECN-Echo flag set and the CWR flag not set in the TCP header is
+ defined as an indication that the TCP transmitting the SYN-ACK packet
+ is ECN-Capable.
+
+ There is the question of why we chose to have the TCP sending the SYN
+ set two ECN-related flags in the Reserved field of the TCP header for
+ the SYN packet, while the responding TCP sending the SYN-ACK sets
+ only one ECN-related flag in the SYN-ACK packet. This asymmetry is
+ necessary for the robust negotiation of ECN-capability with deployed
+ TCP implementations. There exists at least one TCP implementation in
+ which TCP receivers set the Reserved field of the TCP header in ACK
+ packets (and hence the SYN-ACK) simply to reflect the Reserved field
+ of the TCP header in the received data packet. Because the TCP SYN
+ packet sets the ECN-Echo and CWR flags to indicate ECN-capability,
+ while the SYN-ACK packet sets only the ECN-Echo flag, the sending TCP
+ correctly interprets a receiver's reflection of its own flags in the
+ Reserved field as an indication that the receiver is not ECN-capable.
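+
+ The negotiation rules above can be condensed into a short sketch.
+ The function names are hypothetical; the flag patterns are the ones
+ given in this section.

```python
def syn_requests_ecn(ece: bool, cwr: bool) -> bool:
    """A SYN signals ECN-capability by setting both ECN-Echo and CWR."""
    return ece and cwr


def synack_accepts_ecn(ece: bool, cwr: bool) -> bool:
    """A SYN-ACK signals ECN-capability with ECN-Echo set and CWR not set."""
    return ece and not cwr


def ecn_negotiated(sent_ecn_syn: bool, synack_ece: bool, synack_cwr: bool) -> bool:
    """Decide, at the TCP that sent the SYN, whether both ends agreed on ECN."""
    return sent_ecn_syn and synack_accepts_ecn(synack_ece, synack_cwr)
```

+ A peer that merely reflects the Reserved field returns both flags
+ set, so ecn_negotiated(True, True, True) is False and the connection
+ proceeds without ECN.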
+
+6.1.2. The TCP Sender
+
+ For a TCP connection using ECN, data packets are transmitted with the
+ ECT bit set in the IP header (set to a "1"). If the sender receives
+ an ECN-Echo ACK packet (that is, an ACK packet with the ECN-Echo flag
+ set in the TCP header), then the sender knows that congestion was
+ encountered in the network on the path from the sender to the
+ receiver. The indication of congestion should be treated just as a
+ congestion loss in non-ECN-Capable TCP. That is, the TCP source
+ halves the congestion window "cwnd" and reduces the slow start
+ threshold "ssthresh". The sending TCP does NOT increase the
+ congestion window in response to the receipt of an ECN-Echo ACK
+ packet.
+
+ A critical condition is that TCP does not react to congestion
+ indications more than once every window of data (or more loosely,
+ more than once every round-trip time). That is, the TCP sender's
+ congestion window should be reduced only once in response to a series
+ of dropped and/or CE packets from a single window of data. In
+ addition, the TCP source should not decrease the slow-start
+ threshold, ssthresh, if it has been decreased within the last round
+ trip time. However, if any retransmitted packets are dropped or have
+ the CE bit set, then this is interpreted by the source TCP as a new
+ instance of congestion.
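+
+ One way to realize the "at most once per window of data" rule is
+ sketched below. The sketch rests on assumptions that are not part of
+ this document: it remembers the highest sequence number sent at the
+ time of a reduction (called "recover" here) and ignores further
+ ECN-Echo ACKs until an ACK passes that point, and it sets ssthresh
+ to the reduced window. Window growth, retransmission, and the
+ retransmit timer are omitted.

```python
class EcnSender:
    """Tracks only the ECN-related window reduction of a TCP sender."""

    def __init__(self, mss: int, cwnd: int) -> None:
        self.mss = mss
        self.cwnd = cwnd
        self.ssthresh = cwnd
        self.snd_nxt = 0          # next sequence number to be sent
        self.recover = None       # snd_nxt at the most recent reduction
        self.cwr_pending = False  # set CWR on the next data packet sent

    def on_send(self, nbytes: int) -> None:
        self.snd_nxt += nbytes

    def on_ack(self, ack_seq: int, ece: bool) -> bool:
        """Process an ACK; return True if the window was reduced."""
        if not ece:
            return False
        if self.recover is not None and ack_seq <= self.recover:
            # Still acknowledging data from the window already reacted to.
            return False
        # Halve the window, bounded below by one MSS.
        self.cwnd = max(self.cwnd // 2, self.mss)
        self.ssthresh = self.cwnd  # assumption: ssthresh follows the new cwnd
        self.recover = self.snd_nxt
        self.cwr_pending = True
        return True
```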
+
+ After the source TCP reduces its congestion window in response to a
+ CE packet, incoming acknowledgements that continue to arrive can
+ "clock out" outgoing packets as allowed by the reduced congestion
+ window. If the congestion window consists of only one MSS (maximum
+ segment size), and the sending TCP receives an ECN-Echo ACK packet,
+ then the sending TCP should in principle still reduce its congestion
+ window in half. However, the value of the congestion window is
+ bounded below by a value of one MSS. If the sending TCP were to
+ continue to send, using a congestion window of 1 MSS, this results in
+ the transmission of one packet per round-trip time. We believe it is
+ desirable to still reduce the sending rate of the TCP sender even
+ further, on receipt of an ECN-Echo packet when the congestion window
+ is one. We use the retransmit timer as a means to reduce the rate
+ further in this circumstance. Therefore, the sending TCP should also
+ reset the retransmit timer on receiving the ECN-Echo packet when the
+ congestion window is one. The sending TCP will then be able to send
+ a new packet when the retransmit timer expires.
+
+ [Floyd94] discusses TCP's response to ECN in more detail. [Floyd98]
+ discusses the validation test in the ns simulator, which illustrates
+ a wide range of ECN scenarios. These scenarios include the following:
+ an ECN followed by another ECN, a Fast Retransmit, or a Retransmit
+ Timeout; a Retransmit Timeout or a Fast Retransmit followed by an
+ ECN, and a congestion window of one packet followed by an ECN.
+
+ TCP follows existing algorithms for sending data packets in response
+ to incoming ACKs, multiple duplicate acknowledgements, or retransmit
+ timeouts [RFC2001].
+
+6.1.3. The TCP Receiver
+
+ When TCP receives a CE data packet at the destination end-system, the
+ TCP data receiver sets the ECN-Echo flag in the TCP header of the
+ subsequent ACK packet. If there is any ACK withholding implemented,
+ as in current "delayed-ACK" TCP implementations where the TCP
+ receiver can send an ACK for two arriving data packets, then the
+ ECN-Echo flag in the ACK packet will be set to the OR of the CE bits
+ of all of the data packets being acknowledged. That is, if any of
+ the received data packets are CE packets, then the returning ACK has
+ the ECN-Echo flag set.
+
+ To provide robustness against the possibility of a dropped ACK packet
+ carrying an ECN-Echo flag, the TCP receiver must set the ECN-Echo
+ flag in a series of ACK packets. The TCP receiver uses the CWR flag
+ to determine when to stop setting the ECN-Echo flag.
+
+ When an ECN-Capable TCP reduces its congestion window for any reason
+ (because of a retransmit timeout, a Fast Retransmit, or in response
+ to an ECN Notification), the TCP sets the CWR flag in the TCP header
+ of the first data packet sent after the window reduction. If that
+ data packet is dropped in the network, then the sending TCP will have
+ to reduce the congestion window again and retransmit the dropped
+ packet. Thus, the Congestion Window Reduced message is reliably
+ delivered to the data receiver.
+
+ After a TCP receiver sends an ACK packet with the ECN-Echo bit set,
+ that TCP receiver continues to set the ECN-Echo flag in ACK packets
+ until it receives a CWR packet (a packet with the CWR flag set).
+ After the receipt of the CWR packet, acknowledgements for subsequent
+ non-CE data packets do not have the ECN-Echo flag set. If another CE
+ packet is received by the data receiver, the receiver would once
+ again send ACK packets with the ECN-Echo flag set. While the receipt
+ of a CWR packet does not guarantee that the data sender received the
+ ECN-Echo message, this does indicate that the data sender reduced its
+ congestion window at some point *after* it sent the data packet for
+ which the CE bit was set.
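+
+ The receiver behavior described in this section amounts to a
+ one-bit latch, sketched below. The class and method names are
+ hypothetical. The order in which CWR and CE are evaluated for a
+ packet carrying both is an assumption of the sketch, not something
+ this document specifies.

```python
class EcnReceiver:
    """Decides whether outgoing ACKs carry the ECN-Echo flag."""

    def __init__(self) -> None:
        self.echo = False

    def on_data(self, ce: bool, cwr: bool) -> None:
        """Update state for one arriving data packet."""
        if cwr:
            # The sender has reduced its window: stop echoing.
            self.echo = False
        if ce:
            # A CE packet (re)starts echoing, even if it also carried CWR.
            self.echo = True

    def ece_for_next_ack(self) -> bool:
        """ECN-Echo value for the next ACK, including a delayed ACK that
        covers several data packets processed since the last ACK."""
        return self.echo
```

+ Because the latch stays set across packets, a delayed ACK covering
+ two data packets carries the OR of their CE bits.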
+
+ We have already specified that a TCP sender reduces its congestion
+ window at most once per window of data. This mechanism requires some
+ care to make sure that the sender reduces its congestion window at
+ most once per ECN indication, and that multiple ECN messages over
+ several successive windows of data are properly reported to the ECN
+ sender. This is discussed further in [Floyd98].
+
+6.1.4. Congestion on the ACK-path
+
+ For the current generation of TCP congestion control algorithms, pure
+ acknowledgement packets (i.e., packets that do not contain any
+ accompanying data) should be sent with the ECT bit off. Current TCP
+ receivers have no mechanisms for reducing traffic on the ACK-path in
+ response to congestion notification. Mechanisms for responding to
+ congestion on the ACK-path are areas for current and future research.
+ (One simple possibility would be for the sender to reduce its
+ congestion window when it receives a pure ACK packet with the CE bit
+ set). For current TCP implementations, a single dropped ACK generally
+ has only a very small effect on the TCP's sending rate.
+
+7. Summary of changes required in IP and TCP
+
+ Two bits need to be specified in the IP header, the ECN-Capable
+ Transport (ECT) bit and the Congestion Experienced (CE) bit. The ECT
+ bit set to "0" indicates that the transport protocol will ignore the
+ CE bit. This is the default value for the ECT bit. The ECT bit set
+ to "1" indicates that the transport protocol is willing and able to
+ participate in ECN.
+
+ The default value for the CE bit is "0". The router sets the CE bit
+ to "1" to indicate congestion to the end nodes. The CE bit in a
+ packet header should never be reset by a router from "1" to "0".
+
+ TCP requires three changes, a negotiation phase during setup to
+ determine if both end nodes are ECN-capable, and two new flags in the
+ TCP header, from the "reserved" flags in the TCP flags field. The
+ ECN-Echo flag is used by the data receiver to inform the data sender
+ of a received CE packet. The Congestion Window Reduced flag is used
+ by the data sender to inform the data receiver that the congestion
+ window has been reduced.
+
+8. Non-relationship to ATM's EFCI indicator or Frame Relay's FECN
+
+ Since the ATM and Frame Relay mechanisms for congestion indication
+ have typically been defined without any notion of average queue size
+ as the basis for determining that an intermediate node is congested,
+ we believe that they provide a very noisy signal. The TCP-sender
+ reaction specified in this draft for ECN is NOT the appropriate
+ reaction for such a noisy signal of congestion notification. It is
+ our expectation that ATM's EFCI and Frame Relay's FECN mechanisms
+ would be phased out over time within the ATM network. However, if
+ the routers that interface to the ATM network have a way of
+ maintaining the average queue at the interface, and use it to come to
+ a reliable determination that the ATM subnet is congested, they may
+ use the ECN notification that is defined here.
+
+ We emphasize that a *single* packet with the CE bit set in an IP
+ packet causes the transport layer to respond, in terms of congestion
+ control, as it would to a packet drop. As such, the CE bit is not a
+ good match to a transient signal such as one based on the
+ instantaneous queue size. However, experiments in techniques at
+ layer 2 (e.g., in ATM switches or Frame Relay switches) should be
+ encouraged. For example, using a scheme such as RED (where packet
+ marking is based on the average queue length exceeding a threshold),
+ layer 2 devices could provide a reasonably reliable indication of
+ congestion. When all the layer 2 devices in a path set that layer's
+ own Congestion Experienced bit (e.g., the EFCI bit for ATM, the FECN
+ bit in Frame Relay) in this reliable manner, then the interface
+ router to the layer 2 network could copy the state of that layer 2
+ Congestion Experienced bit into the CE bit in the IP header. We
+ recognize that this is not the current practice, nor is it in current
+ standards. However, encouraging experimentation in this manner may
+ provide the information needed to enable evolution of existing layer
+ 2 mechanisms to provide a more reliable means of congestion
+ indication, when they use a single bit for indicating congestion.
+
+9. Non-compliance by the End Nodes
+
+ This section discusses concerns about the vulnerability of ECN to
+ non-compliant end-nodes (i.e., end nodes that set the ECT bit in
+ transmitted packets but do not respond to received CE packets). We
+ argue that the addition of ECN to the IP architecture would not
+ significantly increase the current vulnerability of the architecture
+ to unresponsive flows.
+
+ Even for non-ECN environments, there are serious concerns about the
+ damage that can be done by non-compliant or unresponsive flows (that
+ is, flows that do not respond to congestion control indications by
+ reducing their arrival rate at the congested link). For example, an
+ end-node could "turn off congestion control" by not reducing its
+ congestion window in response to packet drops. This is a concern for
+ the current Internet. It has been argued that routers will have to
+ deploy mechanisms to detect and differentially treat packets from
+ non-compliant flows. It has also been argued that techniques such as
+ end-to-end per-flow scheduling and isolation of one flow from
+ another, differentiated services, or end-to-end reservations could
+ remove some of the more damaging effects of unresponsive flows.
+
+ It has been argued that dropping packets in itself may be an adequate
+ deterrent for non-compliance, and that the use of ECN removes this
+ deterrent. We would argue in response that (1) ECN-capable routers
+ preserve packet-dropping behavior in times of high congestion; and
+ (2) even in times of high congestion, dropping packets in itself is
+ not an adequate deterrent for non-compliance.
+
+ First, ECN-capable routers will only mark packets (as opposed to
+ dropping them) when the packet marking rate is reasonably low. During
+ periods where the average queue size exceeds an upper threshold, and
+ therefore the potential packet marking rate would be high, our
+ recommendation is that routers drop packets rather than set the CE
+ bit in packet headers.
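+
+ The recommendation above can be sketched as follows. This is a
+ minimal illustration, not a router implementation: the names
+ `congestion_action`, `avg_queue`, `max_th`, and `selected` are
+ assumptions introduced here, and the choice of which packets the
+ active queue management selects is left outside the sketch.
+
```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"
    MARK_CE = "mark_ce"
    DROP = "drop"


def congestion_action(avg_queue: float, max_th: float,
                      ect: bool, selected: bool) -> Action:
    """Decide what a router does with one arriving packet.

    avg_queue -- average queue size (hypothetical units)
    max_th    -- upper threshold above which marking is not used
    ect       -- True if the packet has the ECT bit set
    selected  -- True if active queue management picked this packet
                 to carry a congestion indication
    """
    if not selected:
        return Action.FORWARD
    if avg_queue > max_th:
        # High congestion: drop even packets from ECN-capable transports.
        return Action.DROP
    # Low or moderate congestion: mark if possible, otherwise drop.
    return Action.MARK_CE if ect else Action.DROP
```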
+
+ During the periods of low or moderate packet marking rates when ECN
+ would be deployed, there would be little deterrent effect on
+ unresponsive flows of dropping rather than marking those packets. For
+ example, delay-insensitive flows using reliable delivery might have
+ an incentive to increase rather than to decrease their sending rate
+ in the presence of dropped packets. Similarly, delay-sensitive flows
+ using unreliable delivery might increase their use of FEC in response
+ to an increased packet drop rate, increasing rather than decreasing
+ their sending rate. For the same reasons, we do not believe that
+ packet dropping itself is an effective deterrent for non-compliance
+ even in an environment of high packet drop rates.
+
+ Several methods have been proposed to identify and restrict non-
+ compliant or unresponsive flows. The addition of ECN to the network
+ environment would not in any way increase the difficulty of designing
+ and deploying such mechanisms. If anything, the addition of ECN to
+ the architecture would make the job of identifying unresponsive flows
+ slightly easier. For example, in an ECN-Capable environment routers
+ are not limited to information about packets that are dropped or have
+ the CE bit set at that router itself; in such an environment routers
+ could also take note of arriving CE packets that indicate congestion
+ encountered by that packet earlier in the path.
+
+10. Non-compliance in the Network
+
+ The breakdown of effective congestion control could be caused not
+ only by a non-compliant end-node, but also by the loss of the
+ congestion indication in the network itself. This could happen
+ through a rogue or broken router that set the ECT bit in a packet
+ from a non-ECN-capable transport, or "erased" the CE bit in arriving
+ packets. As one example, a rogue or broken router that "erased" the
+ CE bit in arriving CE packets would prevent that indication of
+ congestion from reaching downstream receivers. This could result in
+ the failure of congestion control for that flow and a resulting
+ increase in congestion in the network, ultimately resulting in
+ subsequent packets dropped for this flow as the average queue size
+ increased at the congested gateway.
+
+ The actions of a rogue or broken router could also result in an
+ unnecessary indication of congestion to the end-nodes. These actions
+ can include a router dropping a packet or setting the CE bit in the
+ absence of congestion. From a congestion control point of view,
+ setting the CE bit in the absence of congestion by a non-compliant
+ router would be no different than a router dropping a packet
+ unnecessarily. By "erasing" the ECT bit of a packet, a router could
+ also cause that packet to be dropped later in the network when it
+ would otherwise only have been marked.
+
+ Concerns regarding the loss of congestion indications from
+ encapsulated, dropped, or corrupted packets are discussed below.
+
+10.1. Encapsulated packets
+
+ Some care is required to handle the CE and ECT bits appropriately
+ when packets are encapsulated and de-encapsulated for tunnels.
+
+ When a packet is encapsulated, the following rules apply regarding
+ the ECT bit. First, if the ECT bit in the encapsulated ('inside')
+ header is a 0, then the ECT bit in the encapsulating ('outside')
+ header MUST be a 0. If the ECT bit in the inside header is a 1, then
+ the ECT bit in the outside header SHOULD be a 1.
+
+ When a packet is de-encapsulated, the following rules apply regarding
+ the CE bit. If the ECT bit is a 1 in both the inside and the outside
+ header, then the CE bit in the outside header MUST be ORed with the
+ CE bit in the inside header. (That is, in this case a CE bit of 1 in
+ the outside header must be copied to the inside header.) If the ECT
+ bit in either header is a 0, then the CE bit in the outside header is
+ ignored. This requirement for the treatment of de-encapsulated
+ packets does not currently apply to IPsec tunnels.
+
+ A specific example of the use of ECN with encapsulation occurs when a
+ flow wishes to use ECN-capability to avoid the danger of an
+ unnecessary packet drop for the encapsulated packet as a result of
+ congestion at an intermediate node in the tunnel. This functionality
+ can be supported by copying the ECN field in the inner IP header to
+ the outer IP header upon encapsulation, and using the ECN field in
+ the outer IP header to set the ECN field in the inner IP header upon
+ decapsulation. This effectively allows routers along the tunnel to
+ cause the CE bit to be set in the ECN field of the unencapsulated IP
+ header of an ECN-capable packet when such routers experience
+ congestion.
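+
+ The encapsulation and de-encapsulation rules of this section can be
+ sketched as follows for non-IPsec tunnels. This is an illustration
+ under stated assumptions: the ECN field is modeled as a two-bit
+ integer with ECT as the high bit and CE as the low bit (bits 6 and 7
+ of the TOS octet), and the function names are hypothetical.
+
```python
ECT = 0x02  # bit 6 of the TOS octet
CE = 0x01   # bit 7 of the TOS octet


def encapsulate(inner_ecn: int) -> int:
    """Return the ECN field for the outer header at tunnel ingress.

    Copying the inner field satisfies both rules: an inner ECT of 0
    gives an outer ECT of 0 (MUST), and an inner ECT of 1 gives an
    outer ECT of 1 (SHOULD).
    """
    return inner_ecn & (ECT | CE)


def decapsulate(inner_ecn: int, outer_ecn: int) -> int:
    """Return the ECN field for the inner header at tunnel egress."""
    if inner_ecn & ECT and outer_ecn & ECT:
        # Both headers are ECN-capable: OR the outer CE bit into the inner.
        return inner_ecn | (outer_ecn & CE)
    # ECT is 0 in at least one header: the outer CE bit is ignored.
    return inner_ecn


# A router inside the tunnel marks the outer header; the mark survives.
outer = encapsulate(ECT) | CE
assert decapsulate(ECT, outer) == ECT | CE
```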
+
+10.2. IPsec Tunnel Considerations
+
+ The IPsec protocol, as defined in [ESP, AH], does not include the IP
+ header's ECN field in any of its cryptographic calculations (in the
+ case of tunnel mode, the outer IP header's ECN field is not
+ included). Hence modification of the ECN field by a network node has
+ no effect on IPsec's end-to-end security, because it cannot cause any
+ IPsec integrity check to fail. As a consequence, IPsec does not
+ provide any defense against an adversary's modification of the ECN
+ field (i.e., a man-in-the-middle attack), as the adversary's
+ modification will also have no effect on IPsec's end-to-end security.
+ In some environments, the ability to modify the ECN field without
+ affecting IPsec integrity checks may constitute a covert channel; if
+ it is necessary to eliminate such a channel or reduce its bandwidth,
+ then the outer IP header's ECN field can be zeroed at the tunnel
+ ingress and egress nodes.
+
+ The IPsec protocol currently requires that the inner header's ECN
+ field not be changed by IPsec decapsulation processing at a tunnel
+ egress node. This ensures that an adversary's modifications to the
+ ECN field cannot be used to launch theft- or denial-of-service
+ attacks across an IPsec tunnel endpoint, as any such modifications
+ will be discarded at the tunnel endpoint. This document makes no
+ change to that IPsec requirement. As a consequence of the current
+ specification of the IPsec protocol, we suggest that experiments with
+ ECN not be carried out for flows that will undergo IPsec tunneling at
+ the present time.
+
+ If the IPsec specifications are modified in the future to permit a
+ tunnel egress node to modify the ECN field in an inner IP header
+ based on the ECN field value in the outer header (e.g., copying part
+ or all of the outer ECN field to the inner ECN field), or to permit
+ the ECN field of the outer IP header to be zeroed during
+ encapsulation, then experiments with ECN may be used in combination
+ with IPsec tunneling.
+
+ This discussion of ECN and IPsec tunnel considerations draws heavily
+ on related discussions and documents from the Differentiated Services
+ Working Group.
+
+10.3. Dropped or Corrupted Packets
+
+ An additional issue concerns a packet that has the CE bit set at one
+ router and is dropped by a subsequent router. For the proposed use
+ for ECN in this paper (that is, for a transport protocol such as TCP
+ for which a dropped data packet is an indication of congestion), end
+ nodes detect dropped data packets, and the congestion response of the
+ end nodes to a dropped data packet is at least as strong as the
+ congestion response to a received CE packet.
+
+ However, transport protocols such as TCP do not necessarily detect
+ all packet drops, such as the drop of a "pure" ACK packet; for
+ example, TCP does not reduce the arrival rate of subsequent ACK
+ packets in response to an earlier dropped ACK packet. Any proposal
+ for extending ECN-Capability to such packets would have to address
+ concerns raised by CE packets that were later dropped in the network.
+
+ Similarly, if a CE packet is dropped later in the network due to
+ corruption (bit errors), the end nodes should still invoke congestion
+ control, just as TCP would today in response to a dropped data
+ packet. This issue of corrupted CE packets would have to be
+ considered in any proposal for the network to distinguish between
+ packets dropped due to corruption, and packets dropped due to
+ congestion or buffer overflow.
+
+11. A summary of related work.
+
+ [Floyd94] considers the advantages and drawbacks of adding ECN to the
+ TCP/IP architecture. As shown in the simulation-based comparisons,
+ one advantage of ECN is to avoid unnecessary packet drops for short
+ or delay-sensitive TCP connections. A second advantage of ECN is in
+ avoiding some unnecessary retransmit timeouts in TCP. This paper
+ discusses in detail the integration of ECN into TCP's congestion
+ control mechanisms. The possible disadvantages of ECN discussed in
+ the paper are that a non-compliant TCP connection could falsely
+ advertise itself as ECN-capable, and that a TCP ACK packet carrying
+ an ECN-Echo message could itself be dropped in the network. The
+ first of these two issues is discussed in Section 9 of this document,
+ and the second is addressed by the proposal in Section 5.1.3 for a
+ CWR flag in the TCP header.
+
+ [CKLTZ97] reports on an experimental implementation of ECN in IPv6.
+ The experiments include an implementation of ECN in an existing
+ implementation of RED for FreeBSD. A number of experiments were run
+ to demonstrate the control of the average queue size in the router,
+ the performance of ECN for a single TCP connection at a congested
+ router, and fairness with multiple competing TCP connections. One
+ conclusion of the experiments is that dropping packets from a bulk-
+ data transfer can degrade performance much more severely than marking
+ packets.
+
+ Because the experimental implementation in [CKLTZ97] predates some of
+ the developments in this document, the implementation does not
+ conform to this document in all respects. For example, in the
+ experimental implementation the CWR flag is not used, but instead the
+ TCP receiver sends the ECN-Echo bit on a single ACK packet.
+
+ [K98] and [CKLTZ98] build on [CKLTZ97] to further analyze the
+ benefits of ECN for TCP. The conclusions are that ECN TCP gets
+ moderately better throughput than non-ECN TCP; that ECN TCP flows are
+ fair towards non-ECN TCP flows; and that ECN TCP is robust with two-
+ way traffic, congestion in both directions, and with multiple
+ congested gateways. Experiments with many short web transfers show
+ that, while most of the short connections have similar transfer times
+ with or without ECN, a small percentage of the short connections have
+ very long transfer times for the non-ECN experiments as compared to
+ the ECN experiments. This increased transfer time is particularly
+ dramatic for those short connections that have their first packet
+ dropped in the non-ECN experiments, and that therefore have to wait
+ six seconds for the retransmit timer to expire.
+
+ The ECN Web Page [ECN] has pointers to other implementations of ECN
+ in progress.
+
+12. Conclusions
+
+ Given the current effort to implement RED, we believe this is the
+ right time for router vendors to examine how to implement congestion
+ avoidance mechanisms that do not depend on packet drops alone. With
+ the increased deployment of applications and transports sensitive to
+ the delay and loss of a single packet (e.g., realtime traffic, short
+ web transfers), depending on packet loss as a normal congestion
+ notification mechanism appears to be insufficient (or at the very
+ least, non-optimal).
+
+13. Acknowledgements
+
+ Many people have made contributions to this RFC. In particular, we
+ would like to thank Kenjiro Cho for the proposal for the TCP
+ mechanism for negotiating ECN-Capability, Kevin Fall for the proposal
+ of the CWR bit, Steve Blake for material on IPv4 Header Checksum
+ Recalculation, Jamal Hadi Salim for discussions of ECN issues, and
+ Steve Bellovin, Jim Bound, Brian Carpenter, Paul Ferguson, Stephen
+ Kent, Greg Minshall, and Vern Paxson for discussions of security
+ issues. We also thank the Internet End-to-End Research Group for
+ ongoing discussions of these issues.
+
+
+14. References
+
+ [AH] Kent, S. and R. Atkinson, "IP Authentication Header",
+ RFC 2402, November 1998.
+
+ [B97] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+ [CKLT98] Chen, C., Krishnan, H., Leung, S., Tang, N., and Zhang,
+ L., "Implementing ECN for TCP/IPv6", presentation to the
+ ECN BOF at the L.A. IETF, March 1998, URL
+ "http://www.cs.ucla.edu/~hari/ecn-ietf.ps".
+
+ [DIFFSERV] Nichols, K., Blake, S., Baker, F. and D. Black,
+ "Definition of the Differentiated Services Field (DS
+ Field) in the IPv4 and IPv6 Headers", RFC 2474, December
+ 1998.
+
+ [ECN] "The ECN Web Page", URL
+ "http://www-nrg.ee.lbl.gov/floyd/ecn.html".
+
+ [ESP] Kent, S. and R. Atkinson, "IP Encapsulating Security
+ Payload", RFC 2406, November 1998.
+
+ [FJ93] Floyd, S., and Jacobson, V., "Random Early Detection
+ gateways for Congestion Avoidance", IEEE/ACM
+ Transactions on Networking, V.1 N.4, August 1993, p.
+ 397-413. URL "ftp://ftp.ee.lbl.gov/papers/early.pdf".
+
+ [Floyd94] Floyd, S., "TCP and Explicit Congestion Notification",
+ ACM Computer Communication Review, V. 24 N. 5, October
+ 1994, p. 10-23. URL
+ "ftp://ftp.ee.lbl.gov/papers/tcp_ecn.4.ps.Z".
+
+ [Floyd97] Floyd, S., and Fall, K., "Router Mechanisms to Support
+ End-to-End Congestion Control", Technical report,
+ February 1997. URL
+ "http://www-nrg.ee.lbl.gov/floyd/end2end-paper.html".
+
+ [Floyd98] Floyd, S., "The ECN Validation Test in the NS
+ Simulator", URL "http://www-mash.cs.berkeley.edu/ns/",
+ test tcl/test/test-all-ecn.
+
+ [K98] Krishnan, H., "Analyzing Explicit Congestion
+ Notification (ECN) benefits for TCP", Master's thesis,
+ UCLA, 1998, URL
+ "http://www.cs.ucla.edu/~hari/software/ecn/ecn_report.ps.gz".
+
+ [FRED] Lin, D., and Morris, R., "Dynamics of Random Early
+ Detection", SIGCOMM '97, September 1997. URL
+ "http://www.inria.fr/rodeo/sigcomm97/program.html#ab078".
+
+ [Jacobson88] V. Jacobson, "Congestion Avoidance and Control", Proc.
+ ACM SIGCOMM '88, pp. 314-329. URL
+ "ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z".
+
+ [Jacobson90] V. Jacobson, "Modified TCP Congestion Avoidance
+ Algorithm", Message to end2end-interest mailing list,
+ April 1990. URL
+ "ftp://ftp.ee.lbl.gov/email/vanj.90apr30.txt".
+
+ [MJV96] S. McCanne, V. Jacobson, and M. Vetterli, "Receiver-
+ driven Layered Multicast", SIGCOMM '96, August 1996, pp.
+ 117-130.
+
+ [RFC791] Postel, J., "Internet Protocol", STD 5, RFC 791,
+ September 1981.
+
+ [RFC793] Postel, J., "Transmission Control Protocol", STD 7, RFC
+ 793, September 1981.
+
+ [RFC1141] Mallory, T. and A. Kullberg, "Incremental Updating of
+ the Internet Checksum", RFC 1141, January 1990.
+
+ [RFC1349] Almquist, P., "Type of Service in the Internet Protocol
+ Suite", RFC 1349, July 1992.
+
+ [RFC1455] Eastlake, D., "Physical Link Security Type of Service",
+ RFC 1455, May 1993.
+
+ [RFC2001] Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast
+ Retransmit, and Fast Recovery Algorithms", RFC 2001,
+ January 1997.
+
+ [RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B.,
+ Deering, S., Estrin, D., Floyd, S., Jacobson, V.,
+ Minshall, G., Partridge, C., Peterson, L., Ramakrishnan,
+ K., Shenker, S., Wroclawski, J. and L. Zhang,
+ "Recommendations on Queue Management and Congestion
+ Avoidance in the Internet", RFC 2309, April 1998.
+
+ [RJ90] K. K. Ramakrishnan and Raj Jain, "A Binary Feedback
+ Scheme for Congestion Avoidance in Computer Networks",
+ ACM Transactions on Computer Systems, Vol.8, No.2, pp.
+ 158-181, May 1990.
+
+15. Security Considerations
+
+ Security considerations have been discussed in Section 9.
+
+16. IPv4 Header Checksum Recalculation
+
+ IPv4 header checksum recalculation is an issue with some high-end
+ router architectures using an output-buffered switch, since most if
+ not all of the header manipulation is performed on the input side of
+ the switch, while the ECN decision would need to be made local to the
+ output buffer. This is not an issue for IPv6, since there is no IPv6
+ header checksum. The IPv4 TOS octet is the last byte of a 16-bit
+ half-word.
+
+ RFC 1141 [RFC1141] discusses the incremental updating of the IPv4
+ checksum after the TTL field is decremented. The incremental
+ updating of the IPv4 checksum after the CE bit was set would work as
+ follows: Let HC be the original header checksum, and let HC' be the
+ new header checksum after the CE bit has been set. Then for header
+ checksums calculated with one's complement subtraction, HC' would be
+ recalculated as follows:
+
+    HC' = { HC - 1     HC > 1
+          { 0x0000     HC = 1
+
+ For header checksums calculated on two's complement machines, HC'
+ would be recalculated as follows after the CE bit was set:
+
+    HC' = { HC - 1     HC > 0
+          { 0xFFFE     HC = 0
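+
+ As a check on the two's complement rule, the sketch below compares
+ the incremental update against a full recomputation of the header
+ checksum. It assumes the CE bit was clear before it is set; the
+ sample header values and the function names are hypothetical and
+ chosen only for illustration.
+
```python
import struct


def internet_checksum(header: bytes) -> int:
    """Full computation: one's complement of the one's complement sum."""
    total = sum(word for (word,) in struct.iter_unpack("!H", header))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def checksum_after_setting_ce(hc: int) -> int:
    """Incremental update on a two's complement machine (CE goes 0 -> 1)."""
    return hc - 1 if hc > 0 else 0xFFFE


# Hypothetical 20-byte IPv4 header: ECT set, CE clear, checksum field zero.
header = bytearray(struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0x02, 40, 0x1234, 0, 64, 6, 0,
    bytes([192, 0, 2, 1]), bytes([198, 51, 100, 7]),
))
hc = internet_checksum(bytes(header))

header[1] |= 0x01  # set the CE bit (bit 7 of the TOS octet)
assert internet_checksum(bytes(header)) == checksum_after_setting_ce(hc)
```
+
+ Because the TOS octet is the low-order byte of its 16-bit word,
+ setting the CE bit adds exactly one to that word, which is why the
+ update reduces to subtracting one from the checksum.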
+
+17. The motivation for the ECT bit.
+
+ The need for the ECT bit is motivated by the fact that ECN will be
+ deployed incrementally in an Internet where some transport protocols
+ and routers understand ECN and some do not. With the ECT bit, the
+ router can drop packets from flows that are not ECN-capable, but can
+ *instead* set the CE bit in flows that *are* ECN-capable. Because the
+ ECT bit allows an end node to have the CE bit set in a packet
+ *instead* of having the packet dropped, an end node might have some
+ incentive to deploy ECN.
+
+ If there were no ECT indication, then the router would have to set the
+ CE bit for packets from both ECN-capable and non-ECN-capable flows.
+ In this case, there would be no incentive for end-nodes to deploy
+ ECN, and no viable path of incremental deployment from a non-ECN
+ world to an ECN-capable world. Consider the first stages of such an
+ incremental deployment, where a subset of the flows are ECN-capable.
+ At the onset of congestion, when the packet dropping/marking rate
+ would be low, routers would only set CE bits, rather than dropping
+ packets. However, only those flows that are ECN-capable would
+ understand and respond to CE packets. The result is that the ECN-
+ capable flows would back off, and the non-ECN-capable flows would be
+ unaware of the ECN signals and would continue to open their
+ congestion windows.
+
+ In this case, there are two possible outcomes: (1) the ECN-capable
+ flows back off, the non-ECN-capable flows get all of the bandwidth,
+ and congestion remains mild, or (2) the ECN-capable flows back off,
+ the non-ECN-capable flows don't, and congestion increases until the
+ router transitions from setting the CE bit to dropping packets.
+ While this second outcome evens out the fairness, the ECN-capable
+ flows would still receive little benefit from being ECN-capable,
+ because the increased congestion would drive the router to packet-
+ dropping behavior.
+
+ A flow that advertises itself as ECN-Capable but does not respond to
+ CE bits is functionally equivalent to a flow that turns off
+ congestion control, as discussed in Section 9.
+
+ Thus, in a world where a subset of the flows are ECN-capable, but
+ where ECN-capable flows have no mechanism for indicating that fact to
+ the routers, there would be less effective and less fair congestion
+ control in the Internet, resulting in a strong incentive for end
+ nodes not to deploy ECN.
+
+18. Why use two bits in the IP header?
+
+ Given the need for an ECT indication in the IP header, there still
+ remains the question of whether the ECT (ECN-Capable Transport) and
+ CE (Congestion Experienced) indications should be overloaded on a
+ single bit. This overloaded-one-bit alternative, explored in
+ [Floyd94], would involve a single bit with two values. One value,
+ "ECT and not CE", would represent an ECN-Capable Transport, and the
+ other value, "CE or not ECT", would represent either Congestion
+ Experienced or a non-ECN-Capable transport.
+
+ One difference between the one-bit and two-bit implementations
+ concerns packets that traverse multiple congested routers. Consider
+ a CE packet that arrives at a second congested router, and is
+ selected by the active queue management at that router for either
+ marking or dropping. In the one-bit implementation, the second
+ congested router has no choice but to drop the CE packet, because it
+ cannot distinguish between a CE packet and a non-ECT packet. In the
+ two-bit implementation, the second congested router has the choice of
+ either dropping the CE packet, or of leaving it alone with the CE bit
+ set.
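+
+ The difference at a second congested router can be sketched as
+ follows. The function names and the action labels are assumptions
+ made for this illustration; each function returns the set of actions
+ open to a router once its active queue management has selected the
+ packet for marking or dropping.
+
```python
def two_bit_options(ect: bool, ce: bool) -> set:
    """Actions available with separate ECT and CE bits."""
    if not ect:
        return {"drop"}                  # not ECN-capable: drop only
    if ce:
        return {"drop", "leave CE set"}  # already marked upstream
    return {"drop", "set CE"}


def one_bit_options(bit_is_ect_not_ce: bool) -> set:
    """Actions available with the single overloaded bit."""
    if bit_is_ect_not_ce:
        return {"drop", "set CE"}
    # "CE or not ECT": a CE packet cannot be told apart from a
    # non-ECT packet, so the router has no choice but to drop.
    return {"drop"}


# A CE packet arriving at a second congested router:
assert two_bit_options(ect=True, ce=True) == {"drop", "leave CE set"}
assert one_bit_options(bit_is_ect_not_ce=False) == {"drop"}
```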
+
+ Another difference between the one-bit and two-bit implementations
+ comes from the fact that with the one-bit implementation, receivers
+ in a single flow cannot distinguish between CE and non-ECT packets.
+ Thus, in the one-bit implementation an ECN-capable data sender would
+ have to unambiguously indicate to the receiver or receivers whether
+ each packet had been sent as ECN-Capable or as non-ECN-Capable. One
+ possibility would be for the sender to indicate in the transport
+ header whether the packet was sent as ECN-Capable. A second
+ possibility that would involve a functional limitation for the one-
+ bit implementation would be for the sender to unambiguously indicate
+ that it was going to send *all* of its packets as ECN-Capable or as
+ non-ECN-Capable. For a multicast transport protocol, this
+ unambiguous indication would have to be apparent to receivers joining
+ an on-going multicast session.
+
+ Another advantage of the two-bit approach is that it is somewhat more
+ robust. The most critical issue, discussed in Section 8, is that the
+ default indication should be that of a non-ECN-Capable transport. In
+ a two-bit implementation, this requirement for the default value
+ simply means that the ECT bit should be `OFF' by default. In the
+ one-bit implementation, this means that the single overloaded bit
+ should by default be in the "CE or not ECT" position. This is less
+ clear and straightforward, and possibly more open to incorrect
+ implementations either in the end nodes or in the routers.
+
+ In summary, while the one-bit implementation is possible, it has the
+ following significant limitations relative to the two-bit
+ implementation. First, the one-bit implementation has
+ more limited functionality for the treatment of CE packets at a
+ second congested router. Second, the one-bit implementation requires
+ either that extra information be carried in the transport header of
+ packets from ECN-Capable flows (to convey the functionality of the
+ second bit elsewhere, namely in the transport header), or that
+ senders in ECN-Capable flows accept the limitation that receivers
+ must be able to determine a priori which packets are ECN-Capable and
+ which are not ECN-Capable. Third, the one-bit implementation is
+ possibly more open to errors from faulty implementations that choose
+ the wrong default value for the ECN bit. We believe that the use of
+ the extra bit in the IP header for the ECT-bit is extremely valuable
+ to overcome these limitations.
+
+19. Historical definitions for the IPv4 TOS octet
+
+ RFC 791 [RFC791] defined the ToS (Type of Service) octet in the IP
+ header. In RFC 791, bits 6 and 7 of the ToS octet are listed as
+ "Reserved for Future Use", and are shown set to zero. The first two
+ fields of the ToS octet were defined as the Precedence and Type of
+ Service (TOS) fields.
+
+       0     1     2     3     4     5     6     7
+    +-----+-----+-----+-----+-----+-----+-----+-----+
+    |   PRECEDENCE    |       TOS       |  0  |  0  |  RFC 791
+    +-----+-----+-----+-----+-----+-----+-----+-----+
+
+ RFC 1122 included bits 6 and 7 in the TOS field, though it did not
+ discuss any specific use for those two bits:
+
+       0     1     2     3     4     5     6     7
+    +-----+-----+-----+-----+-----+-----+-----+-----+
+    |   PRECEDENCE    |             TOS             |  RFC 1122
+    +-----+-----+-----+-----+-----+-----+-----+-----+
+
+ The IPv4 TOS octet was redefined in RFC 1349 [RFC1349] as follows:
+
+       0     1     2     3     4     5     6     7
+    +-----+-----+-----+-----+-----+-----+-----+-----+
+    |   PRECEDENCE    |          TOS          | MBZ |  RFC 1349
+    +-----+-----+-----+-----+-----+-----+-----+-----+
+
+ Bit 6 in the TOS field was defined in RFC 1349 for "Minimize Monetary
+ Cost". In addition to the Precedence and Type of Service (TOS)
+ fields, the last field, MBZ (for "must be zero") was defined as
+ currently unused. RFC 1349 stated that "The originator of a datagram
+ sets [the MBZ] field to zero (unless participating in an Internet
+ protocol experiment which makes use of that bit)."
+
+ RFC 1455 [RFC1455] defined an experimental standard that used all
+ four bits in the TOS field to request a guaranteed level of link
+ security.
+
+ RFC 1349 is obsoleted by "Definition of the Differentiated Services
+ Field (DS Field) in the IPv4 and IPv6 Headers" [DIFFSERV], in which
+ bits 6 and 7 of the DS field are listed as Currently Unused (CU).
+ The first six bits of the DS field are defined as the Differentiated
+ Services CodePoint (DSCP):
+
+       0     1     2     3     4     5     6     7
+    +-----+-----+-----+-----+-----+-----+-----+-----+
+    |               DSCP                |    CU     |
+    +-----+-----+-----+-----+-----+-----+-----+-----+
+
+ Because of this unstable history, the definition of the ECN field in
+ this document cannot be guaranteed to be backwards compatible with
+ all past uses of these two bits. The damage that could be done by a
+ non-ECN-capable router would be to "erase" the CE bit for an ECN-
+ capable packet that arrived at the router with the CE bit set, or set
+ the CE bit even in the absence of congestion. This has been
+ discussed in Section 10 on "Non-compliance in the Network".
+
+ The damage that could be done in an ECN-capable environment by a
+ non-ECN-capable end-node transmitting packets with the ECT bit set
+ has been discussed in Section 9 on "Non-compliance by the End Nodes".
+
+AUTHORS' ADDRESSES
+
+ K. K. Ramakrishnan
+ AT&T Labs. Research
+
+ Phone: +1 (973) 360-8766
+ EMail: kkrama@research.att.com
+ URL: http://www.research.att.com/info/kkrama
+
+
+ Sally Floyd
+ Lawrence Berkeley National Laboratory
+
+ Phone: +1 (510) 486-7518
+ EMail: floyd@ee.lbl.gov
+ URL: http://www-nrg.ee.lbl.gov/floyd/
+
+Full Copyright Statement
+
+ Copyright (C) The Internet Society (1999). All Rights Reserved.
+
+ This document and translations of it may be copied and furnished to
+ others, and derivative works that comment on or otherwise explain it
+ or assist in its implementation may be prepared, copied, published
+ and distributed, in whole or in part, without restriction of any
+ kind, provided that the above copyright notice and this paragraph are
+ included on all such copies and derivative works. However, this
+ document itself may not be modified in any way, such as by removing
+ the copyright notice or references to the Internet Society or other
+ Internet organizations, except as needed for the purpose of
+ developing Internet standards in which case the procedures for
+ copyrights defined in the Internet Standards process must be
+ followed, or as required to translate it into languages other than
+ English.
+
+ The limited permissions granted above are perpetual and will not be
+ revoked by the Internet Society or its successors or assigns.
+
+ This document and the information contained herein is provided on an
+ "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
+ TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
+ BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
+ HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
+ MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.