1 files changed, 2299 insertions, 0 deletions
diff --git a/doc/rfc/rfc7141.txt b/doc/rfc/rfc7141.txt
new file mode 100644
index 0000000..8751058
--- /dev/null
+++ b/doc/rfc/rfc7141.txt
@@ -0,0 +1,2299 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF)                        B. Briscoe
+Request for Comments: 7141                                            BT
+BCP: 41                                                        J. Manner
+Updates: 2309, 2914                                     Aalto University
+Category: Best Current Practice                            February 2014
+ISSN: 2070-1721
+
+
+                Byte and Packet Congestion Notification
+
+Abstract
+
+   This document provides recommendations of best current practice for
+   dropping or marking packets using any active queue management (AQM)
+   algorithm, including Random Early Detection (RED), BLUE, Pre-
+   Congestion Notification (PCN), and newer schemes such as CoDel
+   (Controlled Delay) and PIE (Proportional Integral controller
+   Enhanced).  We give three strong recommendations: (1) packet size
+   should be taken into account when transports detect and respond to
+   congestion indications, (2) packet size should not be taken into
+   account when network equipment creates congestion signals (marking,
+   dropping), and therefore (3) in the specific case of RED, the byte-
+   mode packet drop variant that drops fewer small packets should not be
+   used.  This memo updates RFC 2309 to deprecate deliberate
+   preferential treatment of small packets in AQM algorithms.
+
+Status of This Memo
+
+   This memo documents an Internet Best Current Practice.
+
+   This document is a product of the Internet Engineering Task Force
+   (IETF).  It represents the consensus of the IETF community.  It has
+   received public review and has been approved for publication by the
+   Internet Engineering Steering Group (IESG).  Further information on
+   BCPs is available in Section 2 of RFC 5741.
+
+   Information about the current status of this document, any errata,
+   and how to provide feedback on it may be obtained at
+   http://www.rfc-editor.org/info/rfc7141.
+
+
+
+
+
+
+
+
+
+
+
+
+Briscoe & Manner          Best Current Practice                 [Page 1]
+
+RFC 7141         Byte and Packet Congestion Notification   February 2014
+
+
+Copyright Notice
+
+   Copyright (c) 2014 IETF Trust and the persons identified as the
+   document authors.  All rights reserved.
+
+   This document is subject to BCP 78 and the IETF Trust's Legal
+   Provisions Relating to IETF Documents
+   (http://trustee.ietf.org/license-info) in effect on the date of
+   publication of this document.  Please review these documents
+   carefully, as they describe your rights and restrictions with respect
+   to this document.  Code Components extracted from this document must
+   include Simplified BSD License text as described in Section 4.e of
+   the Trust Legal Provisions and are provided without warranty as
+   described in the Simplified BSD License.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Briscoe & Manner          Best Current Practice                 [Page 2]
+
+RFC 7141         Byte and Packet Congestion Notification   February 2014
+
+
+Table of Contents
+
+   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   4
+     1.1.  Terminology and Scoping . . . . . . . . . . . . . . . . .   6
+     1.2.  Example Comparing Packet-Mode Drop and Byte-Mode Drop . .   7
+   2.  Recommendations . . . . . . . . . . . . . . . . . . . . . . .   9
+     2.1.  Recommendation on Queue Measurement . . . . . . . . . . .   9
+     2.2.  Recommendation on Encoding Congestion Notification  . . .  10
+     2.3.  Recommendation on Responding to Congestion  . . . . . . .  11
+     2.4.  Recommendation on Handling Congestion Indications When
+           Splitting or Merging Packets  . . . . . . . . . . . . . .  12
+   3.  Motivating Arguments  . . . . . . . . . . . . . . . . . . . .  13
+     3.1.  Avoiding Perverse Incentives to (Ab)use Smaller Packets .  13
+     3.2.  Small != Control  . . . . . . . . . . . . . . . . . . . .  14
+     3.3.  Transport-Independent Network . . . . . . . . . . . . . .  14
+     3.4.  Partial Deployment of AQM . . . . . . . . . . . . . . . .  16
+     3.5.  Implementation Efficiency . . . . . . . . . . . . . . . .  17
+   4.  A Survey and Critique of Past Advice  . . . . . . . . . . . .  17
+     4.1.  Congestion Measurement Advice . . . . . . . . . . . . . .  18
+       4.1.1.  Fixed-Size Packet Buffers . . . . . . . . . . . . . .  18
+       4.1.2.  Congestion Measurement without a Queue  . . . . . . .  19
+     4.2.  Congestion Notification Advice  . . . . . . . . . . . . .  20
+       4.2.1.  Network Bias When Encoding  . . . . . . . . . . . . .  20
+       4.2.2.  Transport Bias When Decoding  . . . . . . . . . . . .  22
+       4.2.3.  Making Transports Robust against Control Packet
+               Losses  . . . . . . . . . . . . . . . . . . . . . . .  23
+       4.2.4.  Congestion Notification: Summary of Conflicting
+               Advice  . . . . . . . . . . . . . . . . . . . . . . .  24
+   5.  Outstanding Issues and Next Steps . . . . . . . . . . . . . .  25
+     5.1.  Bit-congestible Network . . . . . . . . . . . . . . . . .  25
+     5.2.  Bit- and Packet-Congestible Network . . . . . . . . . . .  26
+   6.  Security Considerations . . . . . . . . . . . . . . . . . . .  26
+   7.  Conclusions . . . . . . . . . . . . . . . . . . . . . . . . .  27
+   8.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  28
+   9.  References  . . . . . . . . . . . . . . . . . . . . . . . . .  28
+     9.1.  Normative References  . . . . . . . . . . . . . . . . . .  28
+     9.2.  Informative References  . . . . . . . . . . . . . . . . .  29
+   Appendix A.  Survey of RED Implementation Status  . . . . . . . .  33
+   Appendix B.  Sufficiency of Packet-Mode Drop  . . . . . . . . . .  34
+     B.1.  Packet-Size (In)Dependence in Transports  . . . . . . . .  35
+     B.2.  Bit-Congestible and Packet-Congestible Indications  . . .  38
+   Appendix C.  Byte-Mode Drop Complicates Policing Congestion
+                Response . . . . . . . . . . . . . . . . . . . . . .  39
+
+
+
+
+
+
+
+
+Briscoe & Manner          Best Current Practice                 [Page 3]
+
+RFC 7141         Byte and Packet Congestion Notification   February 2014
+
+
+1.  Introduction
+
+   This document provides recommendations of best current practice for
+   how we should correctly scale congestion control functions with
+   respect to packet size for the long term.  It also recognises that
+   expediency may be necessary to deal with existing widely deployed
+   protocols that don't live up to the long-term goal.
+
+   When signalling congestion, the problem of how (and whether) to take
+   packet sizes into account has exercised the minds of researchers and
+   practitioners for as long as active queue management (AQM) has been
+   discussed.  Indeed, one reason AQM was originally introduced was to
+   reduce the lock-out effects that small packets can have on large
+   packets in tail-drop queues.  This memo aims to state the principles
+   we should be using and to outline how these principles will affect
+   future protocol design, taking into account pre-existing deployments.
+
+   The question of whether to take into account packet size arises at
+   three stages in the congestion notification process:
+
+   Measuring congestion:  When a congested resource measures locally how
+      congested it is, should it measure its queue length in time,
+      bytes, or packets?
+
+   Encoding congestion notification into the wire protocol:  When a
+      congested network resource signals its level of congestion, should
+      the probability that it drops/marks each packet depend on the size
+      of the particular packet in question?
+
+   Decoding congestion notification from the wire protocol:  When a
+      transport interprets the notification in order to decide how much
+      to respond to congestion, should it take into account the size of
+      each missing or marked packet?
+
+   Consensus has emerged over the years concerning the first stage,
+   which Section 2.1 records in the RFC Series.  In summary: If
+   possible, it is best to measure congestion by time in the queue;
+   otherwise, the choice between bytes and packets solely depends on
+   whether the resource is congested by bytes or packets.
+
+   The controversy is mainly around the last two stages: whether to
+   allow for the size of the specific packet notifying congestion i)
+   when the network encodes or ii) when the transport decodes the
+   congestion notification.
+
+   Currently, the RFC series is silent on this matter other than a paper
+   trail of advice referenced from [RFC2309], which conditionally
+   recommends byte-mode (packet-size dependent) drop [pktByteEmail].
+
+
+
+Briscoe & Manner          Best Current Practice                 [Page 4]
+
+RFC 7141         Byte and Packet Congestion Notification   February 2014
+
+
+   Reducing the number of small packets dropped certainly has some
+   tempting advantages: i) it drops fewer control packets, which tend to
+   be small and ii) it makes TCP's bit rate less dependent on packet
+   size.  However, there are ways of addressing these issues at the
+   transport layer, rather than reverse engineering network forwarding
+   to fix the problems.
+
+   This memo updates [RFC2309] to deprecate deliberate preferential
+   treatment of packets in AQM algorithms solely because of their size.
+   It recommends that (1) packet size should be taken into account when
+   transports detect and respond to congestion indications, (2) not when
+   network equipment creates them.  This memo also adds to the
+   congestion control principles enumerated in BCP 41 [RFC2914].
+
+   In the particular case of Random Early Detection (RED), this means
+   that the byte-mode packet drop variant should not be used to drop
+   fewer small packets, because that creates a perverse incentive for
+   transports to use tiny segments, consequently also opening up a DoS
+   vulnerability.  Fortunately, all the RED implementers who responded
+   to our admittedly limited survey (Section 4.2.4) have not followed
+   the earlier advice to use byte-mode drop, so the position this memo
+   argues for seems to already exist in implementations.
+
+   However, at the transport layer, TCP congestion control is a widely
+   deployed protocol that doesn't scale with packet size (i.e., its
+   reduction in rate does not take into account the size of a lost
+   packet).  To date, this hasn't been a significant problem because
+   most TCP implementations have been used with similar packet sizes.
+   But, as we design new congestion control mechanisms, this memo
+   recommends that we build in scaling with packet size rather than
+   assuming that we should follow TCP's example.
+
+   This memo continues as follows.  First, it discusses terminology and
+   scoping.  Section 2 gives concrete formal recommendations, followed
+   by motivating arguments in Section 3.  We then critically survey the
+   advice given previously in the RFC Series and the research literature
+   (Section 4), referring to an assessment of whether or not this advice
+   has been followed in production networks (Appendix A).  To wrap up,
+   outstanding issues are discussed that will need resolution both to
+   inform future protocol designs and to handle legacy AQM deployments
+   (Section 5).  Then security issues are collected together in
+   Section 6 before conclusions are drawn in Section 7.  The interested
+   reader can find discussion of more detailed issues on the theme of
+   byte vs. packet in the appendices.
+
+   This memo intentionally includes a non-negligible amount of material
+   on the subject.  For the busy reader, Section 2 summarises the
+   recommendations for the Internet community.
+
+
+
+Briscoe & Manner          Best Current Practice                 [Page 5]
+
+RFC 7141         Byte and Packet Congestion Notification   February 2014
+
+
+1.1.  Terminology and Scoping
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+   document are to be interpreted as described in [RFC2119].
+
+   This memo applies to the design of all AQM algorithms, for example,
+   Random Early Detection (RED) [RFC2309], BLUE [BLUE02], Pre-Congestion
+   Notification (PCN) [RFC5670], Controlled Delay (CoDel) [CoDel], and
+   the Proportional Integral controller Enhanced (PIE) [PIE].
+   Throughout, RED is used as a concrete example because it is a widely
+   known and deployed AQM algorithm.  There is no intention to imply
+   that the advice is any less applicable to the other algorithms, nor
+   that RED is preferred.
+
+   Congestion Notification:  Congestion notification is a changing
+      signal that aims to communicate the probability that the network
+      resource(s) will not be able to forward the level of traffic load
+      offered (or that there is an impending risk that they will not be
+      able to).
+
+      The 'impending risk' qualifier is added, because AQM systems set a
+      virtual limit smaller than the actual limit to the resource, then
+      notify the transport when this virtual limit is exceeded in order
+      to avoid uncontrolled congestion of the actual capacity.
+
+      Congestion notification communicates a real number bounded by the
+      range [ 0 , 1 ].  This ties in with the most well-understood
+      measure of congestion notification: drop probability.
+
+   Explicit and Implicit Notification:  The byte vs. packet dilemma
+      concerns congestion notification irrespective of whether it is
+      signalled implicitly by drop or explicitly using ECN [RFC3168] or
+      PCN [RFC5670].  Throughout this document, unless clear from the
+      context, the term 'marking' will be used to mean notifying
+      congestion explicitly, while 'congestion notification' will be
+      used to mean notifying congestion either implicitly by drop or
+      explicitly by marking.
+
+   Bit-congestible vs. Packet-congestible:  If the load on a resource
+      depends on the rate at which packets arrive, it is called 'packet-
+      congestible'.  If the load depends on the rate at which bits
+      arrive, it is called 'bit-congestible'.
+
+
+
+
+
+
+
+
+Briscoe & Manner          Best Current Practice                 [Page 6]
+
+RFC 7141         Byte and Packet Congestion Notification   February 2014
+
+
+      Examples of packet-congestible resources are route look-up engines
+      and firewalls, because load depends on how many packet headers
+      they have to process.  Examples of bit-congestible resources are
+      transmission links, radio power, and most buffer memory, because
+      the load depends on how many bits they have to transmit or store.
+      Some machine architectures use fixed-size packet buffers, so
+      buffer memory in these cases is packet-congestible (see
+      Section 4.1.1).
+
+      The path through a machine will typically encounter both packet-
+      congestible and bit-congestible resources.  However, currently, a
+      design goal of network processing equipment such as routers and
+      firewalls is to size the packet-processing engine(s) relative to
+      the lines in order to keep packet processing uncongested, even
+      under worst-case packet rates with runs of minimum-size packets.
+      Therefore, packet congestion is currently rare (see Section 3.3 of
+      [RFC6077]), but there is no guarantee that it will not become more
+      common in the future.
+
+      Note that information is generally processed or transmitted with a
+      minimum granularity greater than a bit (e.g., octets).  The
+      appropriate granularity for the resource in question should be
+      used, but for the sake of brevity we will talk in terms of bytes
+      in this memo.
+
+   Coarser Granularity:  Resources may be congestible at higher levels
+      of granularity than bits or packets, for instance, stateful
+      firewalls are flow-congestible and call-servers are session-
+      congestible.  This memo focuses on congestion of connectionless
+      resources, but the same principles may be applicable for
+      congestion notification protocols controlling per-flow and per-
+      session processing or state.
+
+   RED Terminology:  In RED, whether to use packets or bytes when
+      measuring queues is called, respectively, 'packet-mode queue
+      measurement' or 'byte-mode queue measurement'.  And whether the
+      probability of dropping a particular packet is independent or
+      dependent on its size is called, respectively, 'packet-mode drop'
+      or 'byte-mode drop'.  The terms 'byte-mode' and 'packet-mode'
+      should not be used without specifying whether they apply to queue
+      measurement or to drop.
+
+1.2.  Example Comparing Packet-Mode Drop and Byte-Mode Drop
+
+   Taking RED as a well-known example algorithm, a central question
+   addressed by this document is whether to recommend RED's packet-mode
+   drop variant and to deprecate byte-mode drop.  Table 1 compares how
+   packet-mode and byte-mode drop affect two flows of different size
+
+
+
+Briscoe & Manner          Best Current Practice                 [Page 7]
+
+RFC 7141         Byte and Packet Congestion Notification   February 2014
+
+
+   packets.  For each it gives the expected number of packets and of
+   bits dropped in one second.  Each example flow runs at the same bit
+   rate of 48 Mbps, but one is broken up into small 60 byte packets and
+   the other into large 1,500 byte packets.
+
+   To keep up the same bit rate, in one second there are about 25 times
+   more small packets because they are 25 times smaller.  As can be seen
+   from the table, the packet rate is 100,000 small packets versus 4,000
+   large packets per second (pps).
+
+     Parameter            Formula         Small packets Large packets
+     -------------------- --------------- ------------- -------------
+     Packet size          s/8                      60 B       1,500 B
+     Packet size          s                       480 b      12,000 b
+     Bit rate             x                     48 Mbps       48 Mbps
+     Packet rate          u = x/s              100 kpps        4 kpps
+
+     Packet-mode Drop
+     Pkt-loss probability p                        0.1%          0.1%
+     Pkt-loss rate        p*u                   100 pps         4 pps
+     Bit-loss rate        p*u*s                 48 kbps       48 kbps
+
+     Byte-mode Drop       MTU, M=12,000 b
+     Pkt-loss probability b = p*s/M              0.004%          0.1%
+     Pkt-loss rate        b*u                     4 pps         4 pps
+     Bit-loss rate        b*u*s               1.92 kbps       48 kbps
+
+         Table 1: Example Comparing Packet-Mode and Byte-Mode Drop
+
+   For packet-mode drop, we illustrate the effect of a drop probability
+   of 0.1%, which the algorithm applies to all packets irrespective of
+   size.  Because there are 25 times more small packets in one second,
+   it naturally drops 25 times more small packets, that is, 100 small
+   packets but only 4 large packets.  But if we count how many bits it
+   drops, there are 48,000 bits in 100 small packets and 48,000 bits in
+   4 large packets -- the same number of bits of small packets as large.
+
+      The packet-mode drop algorithm drops any bit with the same
+      probability whether the bit is in a small or a large packet.
+
+   For byte-mode drop, again we use an example drop probability of 0.1%,
+   but only for maximum size packets (assuming the link maximum
+   transmission unit (MTU) is 1,500 B or 12,000 b).  The byte-mode
+   algorithm reduces the drop probability of smaller packets
+   proportional to their size, making the probability that it drops a
+   small packet 25 times smaller at 0.004%.  But there are 25 times more
+   small packets, so dropping them with 25 times lower probability
+   results in dropping the same number of packets: 4 drops in both
+
+
+
+Briscoe & Manner          Best Current Practice                 [Page 8]
+
+RFC 7141         Byte and Packet Congestion Notification   February 2014
+
+
+   cases.  The 4 small dropped packets contain 25 times fewer bits than
+   the 4 large dropped packets: 1,920 compared to 48,000.
+
+      The byte-mode drop algorithm drops any bit with a probability
+      proportionate to the size of the packet it is in.
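+
+   As an informal cross-check of Table 1, the arithmetic can be
+   reproduced with the sketch below.  It is not part of any AQM
+   specification; the function name drop_rates and its argument names
+   are ours, chosen only for illustration, and exact fractions are
+   used so that the figures match the table without rounding.
+
```python
from fractions import Fraction


def drop_rates(bit_rate, size_bits, p, mtu_bits=None):
    """Return (pkt-loss probability, pkt-loss rate in pps, bit-loss rate in bps).

    With mtu_bits=None the probability p applies to every packet
    (packet-mode drop).  Otherwise p is scaled by size/MTU, as in the
    byte-mode row of Table 1 (b = p*s/M).
    """
    u = Fraction(bit_rate, size_bits)            # packet rate u = x/s
    prob = p if mtu_bits is None else p * Fraction(size_bits, mtu_bits)
    return prob, prob * u, prob * u * size_bits


X = 48_000_000          # bit rate x = 48 Mbps
P = Fraction(1, 1000)   # drop probability p = 0.1%
M = 12_000              # MTU in bits (1,500 B)

small_pkt = drop_rates(X, 480, P)            # packet-mode, 60 B packets
large_pkt = drop_rates(X, 12_000, P)         # packet-mode, 1,500 B packets
small_byte = drop_rates(X, 480, P, M)        # byte-mode, 60 B packets
large_byte = drop_rates(X, 12_000, P, M)     # byte-mode, 1,500 B packets
```
+
+   Under packet-mode drop, both flows lose 48 kbps; under byte-mode
+   drop, the flow of small packets loses only 1.92 kbps.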
+
+2.  Recommendations
+
+   This section gives recommendations related to network equipment in
+   Sections 2.1 and 2.2, and we discuss the implications on transport
+   protocols in Sections 2.3 and 2.4.
+
+2.1.  Recommendation on Queue Measurement
+
+   Ideally, an AQM would measure the service time of the queue to
+   measure congestion of a resource.  However, service time can only be
+   measured as packets leave the queue, where it is not always expedient
+   to implement a full AQM algorithm.  To predict the service time as
+   packets join the queue, an AQM algorithm needs to measure the length
+   of the queue.
+
+   In this case, if the resource is bit-congestible, the AQM
+   implementation SHOULD measure the length of the queue in bytes and,
+   if the resource is packet-congestible, the implementation SHOULD
+   measure the length of the queue in packets.  Subject to the
+   exceptions below, no other choice makes sense, because the number of
+   packets waiting in the queue isn't relevant if the resource gets
+   congested by bytes and vice versa.  For example, the length of the
+   queue into a transmission line would be measured in bytes, while the
+   length of the queue into a firewall would be measured in packets.
+
+   To avoid the pathological effects of tail drop, the AQM can then
+   transform this service time or queue length into the probability of
+   dropping or marking a packet (e.g., RED's piecewise linear function
+   between thresholds).
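+
+   The two steps above (choose the unit of queue measurement, then
+   map the measurement to a drop or mark probability) can be sketched
+   as follows.  This is a simplified illustration, not the RED
+   algorithm as specified: it omits RED's averaging of the queue
+   length and its count-based spacing of drops, and it assumes the
+   non-gentle variant in which everything is dropped once the maximum
+   threshold is reached.  The names queue_length, red_probability,
+   min_th, max_th, and max_p are assumptions made for this example.
+
```python
def queue_length(packet_sizes_bytes, bit_congestible):
    """Measure a queue in bytes if the resource is bit-congestible,
    otherwise in packets."""
    if bit_congestible:
        return sum(packet_sizes_bytes)
    return len(packet_sizes_bytes)


def red_probability(queue, min_th, max_th, max_p):
    """Piecewise linear mapping from queue length to drop probability.

    queue, min_th and max_th must share one unit (bytes or packets).
    """
    if queue < min_th:
        return 0.0
    if queue >= max_th:
        return 1.0  # assumption: non-gentle behaviour above max_th
    return max_p * (queue - min_th) / (max_th - min_th)


# Hypothetical queue into a transmission line (bit-congestible):
# twenty 1,500 B packets and ten 60 B packets.
queued = [1500] * 20 + [60] * 10
length_bytes = queue_length(queued, bit_congestible=True)

# Illustrative thresholds in bytes; the values are not recommendations.
p = red_probability(length_bytes, min_th=20_000, max_th=60_000, max_p=0.1)
```
+
+   In this sketch, the queue is measured in bytes because the line is
+   bit-congestible; the same queue feeding a packet-congestible
+   resource would be measured as 30 packets against thresholds that
+   are also expressed in packets.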
+
+   What this advice means for RED as a specific example:
+
+   1.  A RED implementation SHOULD use byte-mode queue measurement for
+       measuring the congestion of bit-congestible resources and packet-
+       mode queue measurement for packet-congestible resources.
+
+   2.  An implementation SHOULD NOT make it possible to configure the
+       way a queue measures itself, because whether a queue is bit-
+       congestible or packet-congestible is an inherent property of the
+       queue.
+
+
+
+
+
+Briscoe & Manner          Best Current Practice                 [Page 9]
+
+RFC 7141         Byte and Packet Congestion Notification   February 2014
+
+
+   Exceptions to these recommendations might be necessary, for instance
+   where a packet-congestible resource has to be configured as a proxy
+   bottleneck for a bit-congestible resource in an adjacent box that
+   does not support AQM.
+
+   The recommended approach in less straightforward scenarios, such as
+   fixed-size packet buffers, resources without a queue, and buffers
+   comprising a mix of packet and bit-congestible resources, is
+   discussed in Section 4.1.  For instance, Section 4.1.1 explains that
+   the queue into a line should be measured in bytes even if the queue
+   consists of fixed-size packet buffers, because the root cause of any
+   congestion is bytes arriving too fast for the line -- packets filling
+   buffers are merely a symptom of the underlying congestion of the
+   line.
+
+2.2.  Recommendation on Encoding Congestion Notification
+
+   When encoding congestion notification (e.g., by drop, ECN, or PCN),
+   the probability that network equipment drops or marks a particular
+   packet to notify congestion SHOULD NOT depend on the size of the
+   packet in question.  As the example in Section 1.2 illustrates, to
+   drop any bit with probability 0.1%, it is only necessary to drop
+   every packet with probability 0.1% without regard to the size of each
+   packet.
+
+   This approach ensures the network layer offers sufficient congestion
+   information for all known and future transport protocols and also
+   ensures no perverse incentives are created that would encourage
+   transports to use inappropriately small packet sizes.
+
+   What this advice means for RED as a specific example:
+
+   1.  The RED AQM algorithm SHOULD NOT use byte-mode drop, i.e., it
+       ought to use packet-mode drop.  Byte-mode drop is more complex,
+       it creates the perverse incentive to fragment segments into tiny
+       pieces and it is vulnerable to floods of small packets.
+
+   2.  If a vendor has implemented byte-mode drop, and an operator has
+       turned it on, it is RECOMMENDED that the operator use packet-mode
+       drop instead, after establishing if there are any implications on
+       the relative performance of applications using different packet
+       sizes.  The unlikely possibility of some application-specific
+       legacy use of byte-mode drop is the only reason that all the
+       above recommendations on encoding congestion notification are not
+       phrased more strongly.
+
+
+
+
+
+
+Briscoe & Manner          Best Current Practice                [Page 10]
+
+RFC 7141         Byte and Packet Congestion Notification   February 2014
+
+
+       RED as a whole SHOULD NOT be switched off.  Without RED, a tail-
+       drop queue biases against large packets and is vulnerable to
+       floods of small packets.
+
+   Note well that RED's byte-mode drop is completely orthogonal to
+   byte-mode queue measurement and should not be confused with it.  If a
+   RED implementation has a byte-mode but does not specify what sort of
+   byte-mode, it is most probably byte-mode queue measurement, which is
+   fine.  However, if in doubt, the vendor should be consulted.
+
+   A survey (Appendix A) showed that there appears to be little, if any,
+   installed base of the byte-mode drop variant of RED.  This suggests
+   that deprecating byte-mode drop will have little, if any, incremental
+   deployment impact.
+
+2.3.  Recommendation on Responding to Congestion
+
+   When a transport detects that a packet has been lost or congestion
+   marked, it SHOULD consider the strength of the congestion indication
+   as proportionate to the size in octets (bytes) of the missing or
+   marked packet.
+
+   In other words, when a packet indicates congestion (by being lost or
+   marked), it can be considered conceptually as if there is a
+   congestion indication on every octet of the packet, not just one
+   indication per packet.
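+
+   A minimal sketch of this interpretation follows.  It compares the
+   fraction of packets carrying a congestion indication with the
+   fraction of octets carrying one.  The function names and the
+   (size, indicated) tuple layout are assumptions for illustration
+   only; no transport is required to compute either quantity this
+   way.
+
```python
def per_packet_fraction(packets):
    """Fraction of packets that were lost or marked."""
    packets = list(packets)
    if not packets:
        return 0.0
    return sum(1 for _, indicated in packets if indicated) / len(packets)


def per_octet_fraction(packets):
    """Fraction of octets that sit in lost or marked packets.

    Each indication is weighted by the size of the packet carrying it,
    as if every octet of that packet carried its own indication.
    """
    packets = list(packets)
    total = sum(size for size, _ in packets)
    if total == 0:
        return 0.0
    return sum(size for size, indicated in packets if indicated) / total


# Hypothetical observation: one marked 1,500-octet packet and one
# unmarked 500-octet packet.
observed = [(1500, True), (500, False)]
by_packet = per_packet_fraction(observed)   # one of two packets
by_octet = per_octet_fraction(observed)     # 1,500 of 2,000 octets
```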
+
+   To be clear, the above recommendation solely describes how a
+   transport should interpret the meaning of a congestion indication, as
+   a long-term goal.  It makes no recommendation on whether a transport
+   should act differently based on this interpretation.  It merely aids
+   interoperability between transports, if they choose to make their
+   actions depend on the strength of congestion indications.
+
+   This definition will be useful as the IETF transport area continues
+   its programme of:
+
+   o  updating host-based congestion control protocols to take packet
+      size into account, and
+
+   o  making transports less sensitive to losing control packets like
+      SYNs and pure ACKs.
+
+
+
+
+
+
+
+
+
+Briscoe & Manner          Best Current Practice                [Page 11]
+
+RFC 7141         Byte and Packet Congestion Notification   February 2014
+
+
+   What this advice means for the case of TCP:
+
+   1.  If two TCP flows with different packet sizes are required to run
+       at equal bit rates under the same path conditions, this SHOULD be
+       done by altering TCP (Section 4.2.2), not network equipment (the
+       latter affects other transports besides TCP).
+
+   2.  If it is desired to improve TCP performance by reducing the
+       chance that a SYN or a pure ACK will be dropped, this SHOULD be
+       done by modifying TCP (Section 4.2.3), not network equipment.
+
+   To be clear, we are not recommending at all that TCPs under
+   equivalent conditions should aim for equal bit rates.  We are merely
+   saying that anyone trying to do such a thing should modify their TCP
+   algorithm, not the network.
+
+   These recommendations are phrased as 'SHOULD' rather than 'MUST',
+   because there may be cases where expediency dictates that
+   compatibility with pre-existing versions of a transport protocol
+   makes the recommendations impractical.
+
+2.4.  Recommendation on Handling Congestion Indications When Splitting
+      or Merging Packets
+
+   Packets carrying congestion indications may be split or merged in
+   some circumstances (e.g., at an RTP / RTP Control Protocol (RTCP)
+   transcoder or during IP fragment reassembly).  Splitting and merging
+   only make sense in the context of ECN, not loss.
+
+   The general rule to follow is that the number of octets in packets
+   with congestion indications SHOULD be equivalent before and after
+   merging or splitting.  This is based on the principle used above;
+   that an indication of congestion on a packet can be considered as an
+   indication of congestion on each octet of the packet.
+
+   The above rule is not phrased with the word 'MUST' to allow the
+   following exception.  There are cases in which pre-existing protocols
+   were not designed to conserve congestion-marked octets (e.g., IP
+   fragment reassembly [RFC3168] or loss statistics in RTCP receiver
+   reports [RFC3550] before ECN was added [RFC6679]).  When any such
+   protocol is updated, it SHOULD comply with the above rule to conserve
+   marked octets.  However, the rule may be relaxed if it would
+   otherwise become too complex to interoperate with pre-existing
+   implementations of the protocol.
+
+   One can think of a splitting or merging process as if all the
+   incoming congestion-marked octets increment a counter and all the
+   outgoing marked octets decrement the same counter.  In order to
+
+
+
+Briscoe & Manner          Best Current Practice                [Page 12]
+
+RFC 7141         Byte and Packet Congestion Notification   February 2014
+
+
+   ensure that congestion indications remain timely, even the smallest
+   positive remainder in the conceptual counter should trigger the next
+   outgoing packet to be marked (causing the counter to go negative).
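+
+   The conceptual counter can be sketched as below.  The class and
+   method names are hypothetical, and a real splitter or merger would
+   have to fit this bookkeeping into its own packet handling; the
+   sketch shows only the accounting rule described above.
+
```python
class MarkedOctetCounter:
    """Conceptual counter of congestion-marked octets (illustrative)."""

    def __init__(self):
        self.balance = 0  # marked octets received minus marked octets sent

    def packet_in(self, size_octets, marked):
        """Incoming marked octets increment the counter."""
        if marked:
            self.balance += size_octets

    def packet_out(self, size_octets):
        """Decide whether the next outgoing packet should be marked.

        Any positive remainder triggers a mark, which may drive the
        counter negative; the deficit is repaid by later marked input.
        """
        if self.balance > 0:
            self.balance -= size_octets
            return True
        return False


# Splitting: one marked 1,500-octet packet becomes three 500-octet ones.
splitter = MarkedOctetCounter()
splitter.packet_in(1500, marked=True)
split_marks = [splitter.packet_out(500) for _ in range(3)]

# Merging: a marked and an unmarked 500-octet packet become one
# 1,000-octet packet, which is marked and leaves a 500-octet deficit.
merger = MarkedOctetCounter()
merger.packet_in(500, marked=True)
merger.packet_in(500, marked=False)
merged_mark = merger.packet_out(1000)
```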
+
+3.  Motivating Arguments
+
+   This section is informative.  It justifies the recommendations made
+   in the previous section.
+
+3.1.  Avoiding Perverse Incentives to (Ab)use Smaller Packets
+
+   Increasingly, it is being recognised that a protocol design must take
+   care not to cause unintended consequences by giving the parties in
+   the protocol exchange perverse incentives [Evol_cc] [RFC3426].  Given
+   there are many good reasons why larger path maximum transmission
+   units (PMTUs) would help solve a number of scaling issues, we do not
+   want to create any bias against large packets that is greater than
+   their true cost.
+
+   Imagine a scenario where the same bit rate of packets will contribute
+   the same to bit congestion of a link irrespective of whether it is
+   sent as fewer larger packets or more smaller packets.  A protocol
+   design that caused larger packets to be more likely to be dropped
+   than smaller ones would be dangerous in both of the following cases:
+
+   Malicious transports:  A queue that gives an advantage to small
+      packets can be used to amplify the force of a flooding attack.  By
+      sending a flood of small packets, the attacker can get the queue
+      to discard more large-packet traffic, allowing more attack traffic
+      to get through to cause further damage.  Such a queue allows
+      attack traffic to have a disproportionately large effect on
+      regular traffic without the attacker having to do much work.
+
+   Non-malicious transports:  Even if an application designer is not
+      actually malicious, if over time it is noticed that small packets
+      tend to go faster, designers will act in their own interest and
+      use smaller packets.  Queues that give advantage to small packets
+      create an evolutionary pressure for applications or transports to
+      send at the same bit rate but break their data stream down into
+      tiny segments to reduce their drop rate.  Encouraging a high
+      volume of tiny packets might in turn unnecessarily overload a
+      completely unrelated part of the system, perhaps more limited by
+      header processing than bandwidth.
+
+   Imagine that two unresponsive flows arrive at a bit-congestible
+   transmission link each with the same bit rate, say 1 Mbps, but one
+   consists of 1,500 B and the other 60 B packets, which are 25x
+   smaller.  Consider a scenario where gentle RED [gentle_RED] is used,
+   along with the variant of RED we advise against, i.e., where the RED
+   algorithm is configured to adjust the drop probability of packets in
+   proportion to each packet's size (byte-mode packet drop).  In this
+   case, RED aims to drop 25x more of the larger packets than the
+   smaller ones.  Thus, for example, if RED drops 25% of the larger
+   packets, it will aim to drop 1% of the smaller packets (but, in
+   practice, it may drop more as congestion increases; see Appendix B.4
+   of [RFC4828]).  Even though both flows arrive with the same bit rate,
+   the bit rate the RED queue aims to pass to the line will be 750 kbps
+   for the flow of larger packets but 990 kbps for the smaller packets
+   (because of rate variations, it will actually be a little less than
+   this target).
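+
+   The arithmetic in this example can be checked with a short Python
+   sketch.  The function and variable names are illustrative only, and
+   the sketch assumes unresponsive flows and ignores the rate
+   variations mentioned above.
+
```python
def passed_rate_kbps(arrival_kbps, drop_probability):
    """Bit rate the queue aims to pass for an unresponsive flow."""
    return arrival_kbps * (1.0 - drop_probability)


LARGE_B, SMALL_B = 1500, 60        # packet sizes from the example
p_large = 0.25                     # drop probability for 1,500 B packets

# Byte-mode drop scales drop probability in proportion to packet size.
p_small = p_large * SMALL_B / LARGE_B

large_kbps = passed_rate_kbps(1000, p_large)   # flow of large packets
small_kbps = passed_rate_kbps(1000, p_small)   # flow of small packets
```
+
+   Under these assumptions, the flow of small packets keeps about 990
+   kbps while the flow of large packets keeps 750 kbps, even though
+   both arrive at 1 Mbps.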
+
+   Note that, although the byte-mode drop variant of RED amplifies
+   small-packet attacks, tail-drop queues amplify small-packet attacks
+   even more (see Security Considerations in Section 6).  Wherever
+   possible, neither should be used.
+
+3.2.  Small != Control
+
+   Dropping fewer control packets considerably improves performance.  It
+   is tempting to drop small packets with lower probability in order to
+   improve performance, because many control packets tend to be smaller
+   (TCP SYNs and ACKs, DNS queries and responses, SIP messages, HTTP
+   GETs, etc.).  However, we must not give control packets preference
+   purely by virtue of their smallness, otherwise it is too easy for any
+   data source to get the same preferential treatment simply by sending
+   data in smaller packets.  Again, we should not create perverse
+   incentives to favour small packets rather than to favour control
+   packets, which is what we intend.
+
+   Just because many control packets are small does not mean all small
+   packets are control packets.
+
+   So, rather than fix these problems in the network, we argue that the
+   transport should be made more robust against losses of control
+   packets (see Section 4.2.3).
+
+3.3.  Transport-Independent Network
+
+   TCP congestion control ensures that flows competing for the same
+   resource each maintain the same number of segments in flight,
+   irrespective of segment size.  So under similar conditions, flows
+   with different segment sizes will get different bit rates.
+
+   To counter this effect, it seems tempting not to follow our
+   recommendation, and instead for the network to bias congestion
+   notification by packet size in order to equalise the bit rates of
+   flows with different packet sizes.  However, in order to do this, the
+   queuing algorithm has to make assumptions about the transport, which
+   become embedded in the network.  Specifically:
+
+   o  The queuing algorithm has to assume how aggressively the transport
+      will respond to congestion (see Section 4.2.4).  If the network
+      assumes the transport responds as aggressively as TCP NewReno, it
+      will be wrong for Compound TCP and differently wrong for Cubic
+      TCP, etc.  To achieve equal bit rates, each transport then has to
+      guess what assumption the network made, and work out how to
+      replace this assumed aggressiveness with its own aggressiveness.
+
+   o  Also, if the network biases congestion notification by packet
+      size, it has to assume a baseline packet size -- all proposed
+      algorithms use the local MTU (for example, see the byte-mode loss
+      probability formula in Table 1).  Then if the non-Reno transports
+      mentioned above are trying to reverse engineer what the network
+      assumed, they also have to guess the MTU of the congested link.
+
+   Even though reducing the drop probability of small packets (e.g.,
+   RED's byte-mode drop) helps ensure TCP flows with different packet
+   sizes will achieve similar bit rates, we argue that this correction
+   should be made to any future transport protocols based on TCP, not to
+   the network in order to fix one transport, no matter how predominant
+   it is.  Effectively, favouring small packets is reverse engineering
+   of network equipment around one particular transport protocol (TCP),
+   contrary to the excellent advice in [RFC3426], which asks designers
+   to question "Why are you proposing a solution at this layer of the
+   protocol stack, rather than at another layer?"
+
+   In contrast, if the network never takes packet size into account, the
+   transport can be certain it will never need to guess any assumptions
+   that the network has made.  And the network passes two pieces of
+   information to the transport that are sufficient in all cases: i)
+   congestion notification on the packet and ii) the size of the packet.
+   Both are available for the transport to combine (by taking packet
+   size into account when responding to congestion) or not.  Appendix B
+   checks that these two pieces of information are sufficient for all
+   relevant scenarios.
+
+   When the network does not take packet size into account, it allows
+   transport protocols to choose whether or not to take packet size into
+   account.  However, if the network were to bias congestion
+   notification by packet size, transport protocols would have no
+   choice; those that did not take into account packet size themselves
+   would unwittingly become dependent on packet size, and those that
+   already took packet size into account would end up taking it into
+   account twice.
+
+3.4.  Partial Deployment of AQM
+
+   In overview, the argument in this section runs as follows:
+
+   o  Because the network does not and cannot always drop packets in
+      proportion to their size, it shouldn't be given the task of making
+      drop signals depend on packet size at all.
+
+   o  Transports on the other hand don't always want to make their rate
+      response proportional to the size of dropped packets, but if they
+      want to, they always can.
+
+   The argument is similar to the end-to-end argument that says "Don't
+   do X in the network if end systems can do X by themselves, and they
+   want to be able to choose whether to do X anyway".  Actually the
+   following argument is stronger; in addition it says "Don't give the
+   network task X that could be done by the end systems, if X is not
+   deployed on all network nodes, and end systems won't be able to tell
+   whether their network is doing X, or whether they need to do X
+   themselves."  In this case, the X in question is "making the response
+   to congestion depend on packet size".
+
+   We will now re-run this argument reviewing each step in more depth.
+   The argument applies solely to drop, not to ECN marking.
+
+   A queue drops packets for either of two reasons: a) to signal to host
+   congestion controls that they should reduce the load and b) because
+   there is no buffer left to store the packets.  Active queue
+   management tries to use drops as a signal for hosts to slow down
+   (case a) so that drops due to buffer exhaustion (case b) should not
+   be necessary.
+
+   AQM is not universally deployed in every queue in the Internet; many
+   cheap Ethernet bridges, software firewalls, NATs on consumer devices,
+   etc. implement simple tail-drop buffers.  Even if AQM were universal,
+   it has to be able to cope with buffer exhaustion (by switching to a
+   behaviour like tail drop), in order to cope with unresponsive or
+   excessive transports.  For these reasons networks will sometimes be
+   dropping packets as a last resort (case b) rather than under AQM
+   control (case a).
+
+   When buffers are exhausted (case b), they don't naturally drop
+   packets in proportion to their size.  The network can only reduce the
+   probability of dropping smaller packets if it has enough space to
+   store them somewhere while it waits for a larger packet that it can
+   drop.  If the buffer is exhausted, it does not have this choice.
+   Admittedly tail drop does naturally drop somewhat fewer small
+   packets, but exactly how few depends more on the mix of sizes than
+   the size of the packet in question.  Nonetheless, in general, if we
+   wanted networks to do size-dependent drop, we would need universal
+   deployment of (packet-size dependent) AQM code, which is currently
+   unrealistic.
+
+   A host transport cannot know whether any particular drop was a
+   deliberate signal from an AQM or a sign of a queue shedding packets
+   due to buffer exhaustion.  Therefore, because the network cannot
+   universally do size-dependent drop, it should not do it at all.
+
+   Whereas universality is desirable in the network, diversity is
+   desirable between different transport-layer protocols -- some, like
+   standards track TCP congestion control [RFC5681], may not choose to
+   make their rate response proportionate to the size of each dropped
+   packet, while others will (e.g., TCP-Friendly Rate Control for Small
+   Packets (TFRC-SP) [RFC4828]).
+
+3.5.  Implementation Efficiency
+
+   Biasing against large packets typically requires an extra multiply
+   and divide in the network (see the example byte-mode drop formula in
+   Table 1).  Taking packet size into account at the transport rather
+   than in the network ensures that neither the network nor the
+   transport needs to do a multiply operation -- multiplication by
+   packet size is effectively achieved as a repeated add when the
+   transport adds to its count of marked bytes as each congestion event
+   is fed to it.  Also, the work to do the biasing is spread over many
+   hosts, rather than concentrated in just the congested network
+   element.  These aren't principled reasons in themselves, but they are
+   a happy consequence of the other principled reasons.
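+
+   A minimal sketch of the repeated add at the transport is given
+   below in Python.  The class and its methods are hypothetical; the
+   sketch assumes the transport learns the size of each packet and
+   whether it was congestion marked (or dropped).
+
```python
class MarkedByteAccumulator:
    """Transport-side tally of congestion-marked bytes (illustrative)."""

    def __init__(self):
        self.total_bytes = 0
        self.marked_bytes = 0

    def on_feedback(self, size, marked):
        # One addition per packet; no multiply or divide is needed.
        self.total_bytes += size
        if marked:
            self.marked_bytes += size

    def marked_fraction(self):
        # Fraction of bytes that carried a congestion indication.
        if self.total_bytes == 0:
            return 0.0
        return self.marked_bytes / self.total_bytes


acc = MarkedByteAccumulator()
for size, marked in [(1500, True), (60, False), (60, True)]:
    acc.on_feedback(size, marked)
```
+
+   Adding each marked packet's size to a running total weights the
+   congestion signal by packet size without any per-packet
+   multiplication in the network.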
+
+4.  A Survey and Critique of Past Advice
+
+   This section is informative, not normative.
+
+   The original 1993 paper on RED [RED93] proposed two options for the
+   RED active queue management algorithm: packet mode and byte mode.
+   Packet mode measured the queue length in packets and dropped (or
+   marked) individual packets with a probability independent of their
+   size.  Byte mode measured the queue length in bytes and marked an
+   individual packet with probability in proportion to its size
+   (relative to the maximum packet size).  In the paper's outline of
+   further work, it was stated that no recommendation had been made on
+   whether the queue size should be measured in bytes or packets, but
+   noted that the difference could be significant.
+
+   When RED was recommended for general deployment in 1998 [RFC2309],
+   the two modes were mentioned implying the choice between them was a
+   question of performance, referring to a 1997 email [pktByteEmail] for
+   advice on tuning.  A later addendum to this email introduced the
+   insight that there are in fact two orthogonal choices:
+
+   o  whether to measure queue length in bytes or packets (Section 4.1),
+      and
+
+   o  whether the drop probability of an individual packet should depend
+      on its own size (Section 4.2).
+
+   The rest of this section is structured accordingly.
+
+4.1.  Congestion Measurement Advice
+
+   The choice of which metric to use to measure queue length was left
+   open in RFC 2309.  It is now well understood that queues for bit-
+   congestible resources should be measured in bytes, and queues for
+   packet-congestible resources should be measured in packets
+   [pktByteEmail].
+
+   Congestion in some legacy bit-congestible buffers is only measured in
+   packets not bytes.  In such cases, the operator has to take into
+   account a typical mix of packet sizes when setting the thresholds.
+   Any AQM algorithm on such a buffer will be oversensitive to high
+   proportions of small packets, e.g., a DoS attack, and under-sensitive
+   to high proportions of large packets.  However, there is no need to
+   make allowances for the possibility of such a legacy in future
+   protocol design.  This is safe because any under-sensitivity during
+   unusual traffic mixes cannot lead to congestion collapse given that
+   the buffer will eventually revert to tail drop, which discards
+   proportionately more large packets.
+
+4.1.1.  Fixed-Size Packet Buffers
+
+   The question of whether to measure queues in bytes or packets seems
+   to be well understood.  However, measuring congestion is confusing
+   when the resource is bit-congestible but the queue into the resource
+   is packet-congestible.  This section outlines the approach to take.
+
+   Some, mostly older, queuing hardware allocates fixed-size buffers in
+   which to store each packet in the queue.  This hardware forwards
+   packets to the line in one of two ways:
+
+   o  With some hardware, any fixed-size buffers not completely filled
+      by a packet are padded when transmitted to the wire.  This case
+      should clearly be treated as packet-congestible, because both
+      queuing and transmission are in fixed MTU-size units.  Therefore,
+      the queue length in packets is a good model of congestion of the
+      link.
+
+   o  More commonly, hardware with fixed-size packet buffers transmits
+      packets to the line without padding.  This implies a hybrid
+      forwarding system with transmission congestion dependent on the
+      size of packets but queue congestion dependent on the number of
+      packets, irrespective of their size.
+
+      Nonetheless, there would be no queue at all unless the line had
+      become congested -- the root cause of any congestion is too many
+      bytes arriving for the line.  Therefore, the AQM should measure
+      the queue length as the sum of all the packet sizes in bytes that
+      are queued up waiting to be serviced by the line, irrespective of
+      whether each packet is held in a fixed-size buffer.
+
+   In the (unlikely) first case where use of padding means the queue
+   should be measured in packets, further confusion is likely because
+   the fixed buffers are rarely all one size.  Typically, pools of
+   different-sized buffers are provided (Cisco uses the term 'buffer
+   carving' for the process of dividing up memory into these pools
+   [IOSArch]).  Usually, if the pool of small buffers is exhausted,
+   arriving small packets can borrow space in the pool of large buffers,
+   but not vice versa.  However, there is no need to consider all this
+   complexity, because the root cause of any congestion is still line
+   overload -- buffer consumption is only the symptom.  Therefore, the
+   length of the queue should be measured as the sum of the bytes in the
+   queue that will be transmitted to the line, including any padding.
+   In the (unusual) case of transmission with padding, this means the
+   sum of the sizes of the small buffers queued plus the sum of the
+   sizes of the large buffers queued.
+
+   We will return to borrowing of fixed-size buffers when we discuss
+   biasing the drop/marking probability of a specific packet because of
+   its size in Section 4.2.1.  But here, we can repeat the simple rule
+   for how to measure the length of queues of fixed buffers: no matter
+   how complicated the buffering scheme is, ultimately a transmission
+   line is nearly always bit-congestible so the number of bytes queued
+   up waiting for the line measures how congested the line is, and it is
+   rarely important to measure how congested the buffering system is.
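+
+   The rule can be illustrated with the following Python sketch of a
+   queue built from fixed-size buffers.  The class is hypothetical and
+   assumes a single buffer size; real hardware typically has several
+   pools.
+
```python
from collections import deque


class FixedBufferQueue:
    """Queue that stores each packet in one fixed-size buffer (sketch)."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size   # bytes per buffer (assumed uniform)
        self.packets = deque()           # sizes of queued packets, in bytes

    def enqueue(self, packet_size):
        if packet_size > self.buffer_size:
            raise ValueError("packet does not fit in one buffer")
        self.packets.append(packet_size)

    def length_in_packets(self):
        # Measures buffer consumption, which is only the symptom.
        return len(self.packets)

    def length_in_bytes(self, padded=False):
        # Bytes that will be transmitted to the line, including padding
        # only when the hardware pads buffers on transmission.
        if padded:
            return len(self.packets) * self.buffer_size
        return sum(self.packets)


q = FixedBufferQueue(buffer_size=2048)
for size in (1500, 60, 60):
    q.enqueue(size)
```
+
+   Here three buffers are occupied, but without padding only 1,620
+   bytes are waiting for the line, and that is the figure an AQM on a
+   bit-congestible link should use.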
+
+4.1.2.  Congestion Measurement without a Queue
+
+   AQM algorithms are nearly always described assuming there is a queue
+   for a congested resource and the algorithm can use the queue length
+   to determine the probability that it will drop or mark each packet.
+   But not all congested resources lead to queues.  For instance, power-
+   limited resources are usually bit-congestible if energy is primarily
+   required for transmission rather than header processing, but it is
+   rare for a link protocol to build a queue as it approaches maximum
+   power.
+
+   Nonetheless, AQM algorithms do not require a queue in order to work.
+   For instance, spectrum congestion can be modelled by signal quality
+   using the target bit-energy-to-noise-density ratio.  And, to model
+   radio power exhaustion, transmission-power levels can be measured and
+   compared to the maximum power available.  [ECNFixedWireless] proposes
+   a practical and theoretically sound way to combine congestion
+   notification for different bit-congestible resources at different
+   layers along an end-to-end path, whether wireless or wired, and
+   whether with or without queues.
+
+   In wireless protocols that use request to send / clear to send
+   (RTS / CTS) control, such as some variants of IEEE802.11, it is
+   reasonable to base an AQM on the time spent waiting for transmission
+   opportunities (TXOPs) even though the wireless spectrum is usually
+   regarded as congested by bits (for a given coding scheme).  This is
+   because requests for TXOPs queue up as the spectrum gets congested by
+   all the bits being transferred.  So the time that TXOPs are queued
+   directly reflects bit congestion of the spectrum.
+
+4.2.  Congestion Notification Advice
+
+4.2.1.  Network Bias When Encoding
+
+4.2.1.1.  Advice on Packet-Size Bias in RED
+
+   The previously mentioned email [pktByteEmail] referred to by
+   [RFC2309] advised that most scarce resources in the Internet were
+   bit-congestible, which is still believed to be true (Section 1.1).
+   But it went on to offer advice that is updated by this memo.  It said
+   that drop probability should depend on the size of the packet being
+   considered for drop if the resource is bit-congestible, but not if it
+   is packet-congestible.  The argument continued that if packet drops
+   were inflated by packet size (byte-mode dropping), "a flow's fraction
+   of the packet drops is then a good indication of that flow's fraction
+   of the link bandwidth in bits per second".  This was consistent with
+   a referenced policing mechanism being worked on at the time for
+   detecting unusually high bandwidth flows, eventually published in
+   1999 [pBox].  However, the problem could and should have been solved
+   by making the policing mechanism count the volume of bytes randomly
+   dropped, not the number of packets.
+
+   A few months before RFC 2309 was published, an addendum was added to
+   the above archived email referenced from the RFC, in which the final
+   paragraph seemed to partially retract what had previously been said.
+   It clarified that the question of whether the probability of
+   dropping/marking a packet should depend on its size was not related
+   to whether the resource itself was bit-congestible, but a completely
+   orthogonal question.  However, the only example given had the queue
+   measured in packets but packet drop depended on the size of the
+   packet in question.  No example was given the other way round.
+
+   In 2000, Cnodder et al. [REDbyte] pointed out that there was an error
+   in the part of the original 1993 RED algorithm that aimed to
+   distribute drops uniformly, because it didn't correctly take into
+   account the adjustment for packet size.  They recommended an
+   algorithm called RED_4 to fix this.  But they also recommended a
+   further change, RED_5, to adjust the drop rate dependent on the
+   square of the relative packet size.  This was indeed consistent with
+   one implied motivation behind RED's byte-mode drop -- that we should
+   reverse engineer the network to improve the performance of dominant
+   end-to-end congestion control mechanisms.  This memo makes
+   different recommendations in Section 2.
+
+   By 2003, a further change had been made to the adjustment for packet
+   size, this time in the RED algorithm of the ns2 simulator.  Instead
+   of taking each packet's size relative to a 'maximum packet size', it
+   was taken relative to a 'mean packet size', intended to be a static
+   value representative of the 'typical' packet size on the link.  We
+   have not been able to find a justification in the literature for this
+   change; however, Eddy and Allman conducted experiments [REDbias] that
+   assessed how sensitive RED was to this parameter, amongst other
+   things.  This changed algorithm can often lead to drop probabilities
+   of greater than 1 (which gives a hint that there is probably a
+   mistake in the theory somewhere).
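+
+   The following Python sketch shows why scaling relative to a mean
+   packet size can exceed 1.  It is a simplified linear scaling with
+   example values chosen for illustration, not the actual ns2 code.
+
```python
def byte_mode_drop_probability(p, packet_size, reference_size):
    """Linear byte-mode scaling of a base drop probability p (sketch)."""
    return p * packet_size / reference_size


# Relative to the maximum packet size, the result never exceeds p.
p_vs_max = byte_mode_drop_probability(0.5, 1500, reference_size=1500)

# Relative to an assumed 'mean packet size' of 500 B, a 1,500 B packet
# gets a 'probability' greater than 1.
p_vs_mean = byte_mode_drop_probability(0.5, 1500, reference_size=500)
```
+
+   With a base probability of 0.5, a packet three times the assumed
+   mean size yields a value of 1.5, which is not a valid probability.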
+
+   On 10-Nov-2004, this variant of byte-mode packet drop was made the
+   default in the ns2 simulator.  It seems unlikely that byte-mode drop
+   has ever been implemented in production networks (Appendix A);
+   therefore, any conclusions based on ns2 simulations that use RED
+   without disabling byte-mode drop are unlikely to reflect the
+   behaviour of RED in production networks.
+
+4.2.1.2.  Packet-Size Bias Regardless of AQM
+
+   The byte-mode drop variant of RED (or a similar variant of other AQM
+   algorithms) is not the only possible bias towards small packets in
+   queuing systems.  We have already mentioned that tail-drop queues
+   naturally tend to lock out large packets once they are full.
+
+   But also, queues with fixed-size buffers reduce the probability that
+   small packets will be dropped if (and only if) they allow small
+   packets to borrow buffers from the pools for larger packets (see
+   Section 4.1.1).  Borrowing effectively makes the maximum queue size
+   for small packets greater than that for large packets, because more
+   buffers can be used by small packets while fewer will fit large
+   packets.  Incidentally, the bias towards small packets from buffer
+   borrowing is nothing like as large as that of RED's byte-mode drop.
+
+   Nonetheless, fixed-buffer memory with tail drop is still prone to
+   lock out large packets, purely because of the tail-drop aspect.  So,
+   fixed-size packet buffers should be augmented with a good AQM
+   algorithm and packet-mode drop.  If an AQM is too complicated to
+   implement with multiple fixed buffer pools, the minimum necessary to
+   prevent large-packet lockout is to ensure that smaller packets never
+   use the last available buffer in any of the pools for larger packets.
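+
+   The minimum safeguard described above might be sketched in Python
+   as follows.  The two-pool model and all names are assumptions made
+   for illustration; real buffer management is more involved.
+
```python
class BufferPools:
    """Two fixed-size buffer pools with restricted borrowing (sketch)."""

    def __init__(self, small_free, large_free):
        self.small_free = small_free   # free buffers in the small pool
        self.large_free = large_free   # free buffers in the large pool

    def admit(self, is_small):
        """Return the pool used for the packet, or None if it is dropped."""
        if is_small:
            if self.small_free > 0:
                self.small_free -= 1
                return "small"
            # Borrow from the large pool, but never its last free buffer.
            if self.large_free > 1:
                self.large_free -= 1
                return "large"
            return None
        if self.large_free > 0:
            self.large_free -= 1
            return "large"
        return None


pools = BufferPools(small_free=1, large_free=2)
results = [pools.admit(True), pools.admit(True),
           pools.admit(True), pools.admit(False)]
```
+
+   In this run, the third small packet is dropped rather than being
+   given the last large buffer, so the large packet that follows is
+   still admitted.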
+
+4.2.2.  Transport Bias When Decoding
+
+   The above proposals to alter the network equipment to bias towards
+   smaller packets have largely carried on outside the IETF process.
+   Within the IETF, by contrast, there are many different proposals to
+   alter transport protocols to achieve the same goals, i.e., either to
+   make the flow bit rate take into account packet size, or to protect
+   control packets from loss.  This memo argues that altering transport
+   protocols is the more principled approach.
+
+   A recently approved experimental RFC adapts its transport-layer
+   protocol to take into account packet sizes relative to typical TCP
+   packet sizes.  This proposes a new small-packet variant of TCP-
+   friendly rate control (TFRC [RFC5348]), which is called TFRC-SP
+   [RFC4828].  Essentially, it proposes a rate equation that inflates
+   the flow rate by the ratio of a typical TCP segment size (1,500 B
+   including TCP header) over the actual segment size [PktSizeEquCC].
+   (There are also other important differences of detail relative to
+   TFRC, such as using virtual packets [CCvarPktSize] to avoid
+   responding to multiple losses per round trip and using a minimum
+   inter-packet interval.)
+
+   Section 4.5.1 of the TFRC-SP specification discusses the implications
+   of operating in an environment where queues have been configured to
+   drop smaller packets with proportionately lower probability than
+   larger ones.  But it only discusses TCP operating in such an
+   environment, only mentioning TFRC-SP briefly when discussing how to
+   define fairness with TCP.  And it only discusses the byte-mode
+   dropping version of RED as it was before Cnodder et al. pointed out
+   that it didn't sufficiently bias towards small packets to make TCP
+   independent of packet size.
+
+   So the TFRC-SP specification doesn't address the issue of whether the
+   network or the transport _should_ handle fairness between different
+   packet sizes.  In Appendix B.4 of RFC 4828, it discusses the
+   possibility of both TFRC-SP and some network buffers duplicating each
+   other's attempts to deliberately bias towards small packets.  But the
+   discussion is not conclusive, instead reporting simulations of many
+   of the possibilities in order to assess performance but not
+   recommending any particular course of action.
+
+   The paper originally proposing TFRC with virtual packets (VP-TFRC)
+   [CCvarPktSize] proposed that there should perhaps be two variants to
+   cater for the different variants of RED.  However, as the TFRC-SP
+   authors point out, there is no way for a transport to know whether
+   some queues on its path have deployed RED with byte-mode packet drop
+   (except if an exhaustive survey found that no one has deployed it! --
+   see Appendix A).  Incidentally, VP-TFRC also proposed that byte-mode
+   RED dropping should really square the packet-size compensation factor
+   (like that of Cnodder's RED_5, but apparently unaware of it).
+
+   Pre-congestion notification [RFC5670] is an IETF technology to use a
+   virtual queue for AQM marking for packets within one Diffserv class
+   in order to give early warning prior to any real queuing.  The PCN-
+   marking algorithms have been designed not to take into account packet
+   size when forwarding through queues.  Instead, the general principle
+   has been to take the sizes of marked packets into account when
+   monitoring the fraction of marking at the edge of the network, as
+   recommended here.
+
+4.2.3.  Making Transports Robust against Control Packet Losses
+
+   Recently, two RFCs have defined changes to TCP that make it more
+   robust against losing small control packets [RFC5562] [RFC5690].  In
+   both cases, they note that the case for these two TCP changes would
+   be weaker if RED were biased against dropping small packets.  We
+   argue here that these two proposals are a safer and more principled
+   way to achieve TCP performance improvements than reverse engineering
+   RED to benefit TCP.
+
+   Although there are no known proposals, it would also be possible and
+   perfectly valid to make control packets robust against drop by
+   requesting a scheduling class with lower drop probability, which
+   would be achieved by re-marking to a Diffserv code point [RFC2474]
+   within the same behaviour aggregate.
+
+   Although not brought to the IETF, a simple proposal from Wischik
+   [DupTCP] suggests that the first three packets of every TCP flow
+   should be routinely duplicated after a short delay.  It shows that
+   this would greatly improve the chances of short flows completing
+   quickly, but it would hardly increase traffic levels on the Internet,
+   because Internet bytes have always been concentrated in the large
+   flows.  It further shows that the performance of many typical
+   applications depends on completion of long serial chains of short
+   messages.  It argues that, given most of the value people get from
+   the Internet is concentrated within short flows, this simple
+   expedient would greatly increase the value of the best-effort
+   Internet at minimal cost.  A similar but more extensive approach has
+   been evaluated on Google servers [GentleAggro].
+
+   The proposals discussed in this sub-section are experimental
+   approaches that are not yet in wide operational use, but they are
+   existence proofs that transports can make themselves robust against
+   loss of control packets.  The examples are all TCP-based, but
+   applications over non-TCP transports could mitigate loss of control
+   packets by making similar use of Diffserv, data duplication, FEC,
+   etc.
+
+4.2.4.  Congestion Notification: Summary of Conflicting Advice
+
+   +-----------+-----------------+-----------------+-------------------+
+   | transport |  RED_1 (packet- |  RED_4 (linear  |   RED_5 (square   |
+   |        cc |    mode drop)   | byte-mode drop) |  byte-mode drop)  |
+   +-----------+-----------------+-----------------+-------------------+
+   |    TCP or |    s/sqrt(p)    |    sqrt(s/p)    |     1/sqrt(p)     |
+   |      TFRC |                 |                 |                   |
+   |   TFRC-SP |    1/sqrt(p)    |   1/sqrt(s*p)   |   1/(s*sqrt(p))   |
+   +-----------+-----------------+-----------------+-------------------+
+
+    Table 2: Dependence of flow bit rate per RTT on packet size, s, and
+     drop probability, p, when there is network and/or transport bias
+                 towards small packets to varying degrees
+
+   Table 2 aims to summarise the potential effects of all the advice
+   from different sources.  Each column shows a different possible AQM
+   behaviour in different queues in the network, using the terminology
+   of De Cnodder et al. outlined earlier (RED_1 is basic RED with
+   packet-mode drop).  Each row shows a different transport behaviour:
+   TCP
+   [RFC5681] and TFRC [RFC5348] on the top row with TFRC-SP [RFC4828]
+   below.  Each cell shows how the bits per round trip of a flow depends
+   on packet size, s, and drop probability, p.  In order to declutter
+   the formulae to focus on packet-size dependence, they are all given
+   per round trip, which removes any RTT term.
+
+   Let us assume that the goal is for the bit rate of a flow to be
+   independent of packet size.  Suppressing all inessential details, the
+   table shows that this should be achievable either by not altering the
+   TCP transport in a RED_5 network, or by using the small-packet TFRC-SP
+   transport (or similar) in a network without any byte-mode dropping
+   RED (top right and bottom left).  Top left is the 'do nothing'
+   scenario, while bottom right is the 'do both' scenario in which the
+   bit rate would become far too biased towards small packets.  Of
+   course, if any form of byte-mode dropping RED has been deployed on a
+   subset of queues that congest, each path through the network will
+   present a different hybrid scenario to its transport.
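+
+   The packet-size dependence in Table 2 can be checked numerically.  The
+   following Python sketch is only an illustration and is not part of
+   the original analysis; the names TABLE_2 and size_ratio are
+   hypothetical, and constants and the RTT term are omitted, as in the
+   table.  It reports the factor by which the per-round-trip bit rate
+   changes when packet size grows from 60 B to 1,500 B.
+
```python
import math

# Each entry mirrors one cell of Table 2: bits per round trip as a
# function of packet size s (in bits) and drop probability p.
# Constant factors and the RTT term are omitted, as in the table.
TABLE_2 = {
    ("TCP/TFRC", "RED_1"): lambda s, p: s / math.sqrt(p),
    ("TCP/TFRC", "RED_4"): lambda s, p: math.sqrt(s / p),
    ("TCP/TFRC", "RED_5"): lambda s, p: 1 / math.sqrt(p),
    ("TFRC-SP", "RED_1"): lambda s, p: 1 / math.sqrt(p),
    ("TFRC-SP", "RED_4"): lambda s, p: 1 / math.sqrt(s * p),
    ("TFRC-SP", "RED_5"): lambda s, p: 1 / (s * math.sqrt(p)),
}


def size_ratio(transport, aqm, small=480, large=12_000, p=0.001):
    """Factor by which the bit rate changes from small to large packets."""
    cell = TABLE_2[(transport, aqm)]
    return cell(large, p) / cell(small, p)


for transport, aqm in TABLE_2:
    print(f"{transport:9s} {aqm}: x{size_ratio(transport, aqm):.2f}")
```
+
+   A factor of 1 marks the two cells in which bit rate is independent of
+   packet size (top right and bottom left); factors above 1 favour large
+   packets and factors below 1 favour small packets.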
+
+   Whatever the case, we can see that the linear byte-mode drop column
+   in the middle would considerably complicate the Internet.  Even if
+   one believes the network should be doing the biasing, linear byte-
+   mode drop is a half-way house that doesn't bias enough towards small
+   packets.  Section 2 recommends that _all_ bias in network equipment
+   towards small packets should be turned off -- if indeed any equipment
+   vendors have implemented it -- leaving packet-size bias solely as the
+   preserve of the transport layer (solely the leftmost, packet-mode
+   drop column).
+
+   In practice, it seems that no deliberate bias towards small packets
+   has been implemented for production networks.  Of the 19% of vendors
+   who responded to a survey of 84 equipment vendors, none had
+   implemented byte-mode drop in RED (see Appendix A for details).
+
+5.  Outstanding Issues and Next Steps
+
+5.1.  Bit-congestible Network
+
+   For a connectionless network with nearly all resources being bit-
+   congestible, the recommended position is clear -- the network should
+   not make allowance for packet sizes and the transport should.  This
+   leaves two outstanding issues:
+
+   o  The question of how to handle any legacy AQM deployments using
+      byte-mode drop;
+
+   o  The need to start a programme to update transport congestion
+      control protocol standards to take packet size into account.
+
+   A survey of equipment vendors (Section 4.2.4) found no evidence that
+   byte-mode packet drop had been implemented, so deployment will be
+   sparse at best.  A migration strategy is not really needed to remove
+   an algorithm that may not even be deployed.
+
+   A programme of experimental updates to take packet size into account
+   in transport congestion control protocols has already started with
+   TFRC-SP [RFC4828].
+
+5.2.  Bit- and Packet-Congestible Network
+
+   The position is much less clear-cut if the Internet becomes populated
+   by a more even mix of both packet-congestible and bit-congestible
+   resources (see Appendix B.2).  This problem is not pressing, because
+   most Internet resources are designed to be bit-congestible before
+   packet processing starts to congest (see Section 1.1).
+
+   The IRTF's Internet Congestion Control Research Group (ICCRG) has set
+   itself the task of reaching consensus on generic forwarding
+   mechanisms that are necessary and sufficient to support the
+   Internet's future congestion control requirements (the first
+   challenge in [RFC6077]).  The research question of whether packet
+   congestion might become common and what to do if it does may in the
+   future be explored in the IRTF (the "Challenge 3: Packet Size" in
+   [RFC6077]).
+
+   Note that sometimes it seems that resources might be congested by
+   neither bits nor packets, e.g., where the queue for access to a
+   wireless medium is in units of transmission opportunities.  However,
+   the root cause of congestion of the underlying spectrum is overload
+   of bits (see Section 4.1.2).
+
+6.  Security Considerations
+
+   This memo recommends that queues do not bias drop probability
+   according to packet size.  For instance, dropping small packets less
+   often than large ones creates a perverse incentive for transports to
+   break down their flows into tiny segments.  One of the intended
+   benefits of implementing AQM was to remove this perverse incentive,
+   which tail-drop queues gave to small packets.
+
+   In practice, transports cannot all be trusted to respond to
+   congestion.  So another reason for recommending that queues not bias
+   drop probability towards small packets is to avoid the vulnerability
+   to small-packet DDoS attacks that would otherwise result.  One of the
+   intended benefits of implementing AQM was to remove tail drop's DoS
+   vulnerability to small packets, so we shouldn't add it back again.
+
+   If most queues implemented AQM with byte-mode drop, the resulting
+   network would amplify the potency of a small-packet DDoS attack.  At
+   the first queue, the stream of packets would push aside a greater
+   proportion of large packets, so more of the small packets would
+   survive to attack the next queue.  Thus a flood of small packets
+   would continue on towards the destination, pushing regular traffic
+   with large packets out of the way in one queue after the next, but
+   suffering much less drop itself.
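+
+   The amplification argument can be illustrated with a deliberately
+   simple model.  The Python sketch below is an illustration under
+   stated assumptions, not an analysis from this memo: it assumes a
+   hypothetical linear byte-mode drop in which a packet's drop
+   probability is the base probability scaled by its size relative to a
+   1,500 B maximum, and it assumes the same base probability at every
+   queue.  The function name survival and all parameter values are
+   invented for the example.
+
```python
def survival(size_bytes, hops, base_p, byte_mode, max_size=1500):
    """Fraction of packets of one size that survive `hops` congested queues.

    Assumption: byte-mode drop scales the base drop probability linearly
    with packet size; packet-mode drop ignores packet size.
    """
    drop_p = base_p * size_bytes / max_size if byte_mode else base_p
    return (1 - drop_p) ** hops


for byte_mode in (False, True):
    label = "byte-mode" if byte_mode else "packet-mode"
    small = survival(60, hops=5, base_p=0.1, byte_mode=byte_mode)
    large = survival(1500, hops=5, base_p=0.1, byte_mode=byte_mode)
    print(f"{label:11s} small: {small:.3f}  large: {large:.3f}")
```
+
+   Under these assumptions, packet-mode drop thins small and large
+   packets equally at each queue, whereas byte-mode drop lets nearly all
+   of the small packets through to the next queue.  The model ignores
+   the feedback by which the flood itself raises the drop probability
+   applied to regular traffic.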
+
+   Appendix C explains why the ability of networks to police the
+   response of _any_ transport to congestion depends on bit-congestible
+   network resources only doing packet-mode drop, not byte-mode drop.
+   In summary, it says that making drop probability depend on the size
+   of the packets that bits happen to be divided into simply encourages
+   the bits to be divided into smaller packets.  Byte-mode drop would
+   therefore irreversibly complicate any attempt to fix the Internet's
+   incentive structures.
+
+7.  Conclusions
+
+   This memo identifies the three distinct stages of the congestion
+   notification process where implementations need to decide whether to
+   take packet size into account.  The recommendations provided in
+   Section 2 of this memo are different in each case:
+
+   o  When network equipment measures the length of a queue, if it is
+      not feasible to use time, it is recommended to count in bytes if
+      the network resource is congested by bytes, or to count in packets
+      if it is congested by packets.
+
+   o  When network equipment decides whether to drop (or mark) a packet,
+      it is recommended that the size of the particular packet should
+      not be taken into account.
+
+   o  However, when a transport algorithm responds to a dropped or
+      marked packet, the size of the rate reduction should be
+      proportionate to the size of the packet.
+
+   In summary, the answers are 'it depends', 'no', and 'yes',
+   respectively.
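+
+   The three stages can be sketched in code.  The following Python
+   fragment is a minimal illustration under assumptions, not a
+   definitive implementation: the function names, the RED-like linear
+   ramp, its thresholds, and the 1,500 B reference size are all
+   hypothetical.  It covers only the fallback where measuring the queue
+   in units of time is not feasible.
+
```python
import random


def queue_measure(packet_sizes_bytes, bit_congestible=True):
    """Stage 1 ('it depends'): count in the unit that congests the resource."""
    if bit_congestible:
        return sum(packet_sizes_bytes)   # queue length in bytes
    return len(packet_sizes_bytes)       # queue length in packets


def drop_decision(queue_len, min_th, max_th, max_p, rng=random.random):
    """Stage 2 ('no'): RED-like ramp on queue length only.

    The size of the arriving packet is deliberately not a parameter,
    so it cannot bias the drop (or mark) decision.
    """
    if queue_len <= min_th:
        return False
    if queue_len >= max_th:
        return True
    drop_p = max_p * (queue_len - min_th) / (max_th - min_th)
    return rng() < drop_p


def rate_reduction(base_reduction, lost_packet_bytes, reference_bytes=1500):
    """Stage 3 ('yes'): the transport scales its response by packet size."""
    return base_reduction * lost_packet_bytes / reference_bytes


queue = [60, 1500, 1500]
print(queue_measure(queue), queue_measure(queue, bit_congestible=False))
print(rate_reduction(1.0, 60), rate_reduction(1.0, 1500))
```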
+
+   For the specific case of RED, this means that byte-mode queue
+   measurement will often be appropriate, but the use of byte-mode drop
+   is very strongly discouraged.
+
+   At the transport layer, the IETF should continue updating congestion
+   control protocols to take into account the size of each packet that
+   indicates congestion.  Also, the IETF should continue to make
+   protocols less sensitive to losing control packets like SYNs, pure
+   ACKs, and DNS exchanges.  Although many control packets happen to be
+   small, the alternative of network equipment favouring all small
+   packets would be dangerous.  That would create perverse incentives to
+   split data transfers into smaller packets.
+
+   The memo develops these recommendations from principled arguments
+   concerning scaling, layering, incentives, inherent efficiency,
+   security, and 'policeability'.  It also addresses practical issues
+   such as specific buffer architectures and incremental deployment.
+   Indeed, a limited survey of RED implementations is discussed, which
+   shows there appears to be little, if any, installed base of RED's
+   byte-mode drop.  Therefore, it can be deprecated with little, if any,
+   incremental deployment complications.
+
+   The recommendations have been developed on the well-founded basis
+   that most Internet resources are bit-congestible, not packet-
+   congestible.  We need to know the likelihood that this assumption
+   will prevail in the longer term and, if it might not, what protocol
+   changes will be needed to cater for a mix of the two.  The IRTF
+   Internet Congestion Control Research Group (ICCRG) is currently
+   working on these problems [RFC6077].
+
+8.  Acknowledgements
+
+   Thank you to Sally Floyd, who gave extensive and useful review
+   comments.  Also thanks for the reviews from Philip Eardley, David
+   Black, Fred Baker, David Taht, Toby Moncaster, Arnaud Jacquet, and
+   Mirja Kuehlewind, as well as helpful explanations of different
+   hardware approaches from Larry Dunn and Fred Baker.  We are grateful
+   to Bruce Davie and his colleagues for providing a timely and
+   efficient survey of RED implementation in Cisco's product range.
+   Also, grateful thanks to Toby Moncaster, Will Dormann, John Regnault,
+   Simon Carter, and Stefaan De Cnodder who further helped survey the
+   current status of RED implementation and deployment, and, finally,
+   thanks to the anonymous individuals who responded.
+
+   Bob Briscoe and Jukka Manner were partly funded by Trilogy and
+   Trilogy 2, research projects (ICT-216372, ICT-317756) supported by
+   the European Community under its Seventh Framework Programme.  The
+   views expressed here are those of the authors only.
+
+9.  References
+
+9.1.  Normative References
+
+   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
+              Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+   [RFC2309]  Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
+              S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G.,
+              Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,
+              S., Wroclawski, J., and L. Zhang, "Recommendations on
+              Queue Management and Congestion Avoidance in the
+              Internet", RFC 2309, April 1998.
+
+   [RFC2914]  Floyd, S., "Congestion Control Principles", BCP 41, RFC
+              2914, September 2000.
+
+   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
+              of Explicit Congestion Notification (ECN) to IP", RFC
+              3168, September 2001.
+
+9.2.  Informative References
+
+   [BLUE02]   Feng, W-c., Shin, K., Kandlur, D., and D. Saha, "The BLUE
+              active queue management algorithms", IEEE/ACM Transactions
+              on Networking 10(4) 513-528, August 2002,
+              <http://dx.doi.org/10.1109/TNET.2002.801399>.
+
+   [CCvarPktSize]
+              Widmer, J., Boutremans, C., and J-Y. Le Boudec, "End-to-
+              end congestion control for TCP-friendly flows with
+              variable packet size", ACM CCR 34(2) 137-151, April 2004,
+              <http://doi.acm.org/10.1145/997150.997162>.
+
+   [CHOKe_Var_Pkt]
+              Psounis, K., Pan, R., and B. Prabhakar, "Approximate Fair
+              Dropping for Variable-Length Packets", IEEE Micro
+              21(1):48-56, January-February 2001,
+              <http://ieeexplore.ieee.org/xpl/
+              articleDetails.jsp?arnumber=903061>.
+
+   [CoDel]    Nichols, K. and V. Jacobson, "Controlled Delay Active
+              Queue Management", Work in Progress, February 2013.
+
+   [DRQ]      Shin, M., Chong, S., and I. Rhee, "Dual-Resource TCP/AQM
+              for Processing-Constrained Networks", IEEE/ACM
+              Transactions on Networking Vol 16, issue 2, April 2008,
+              <http://dx.doi.org/10.1109/TNET.2007.900415>.
+
+   [DupTCP]   Wischik, D., "Short messages", Philosophical Transactions
+              of the Royal Society A 366(1872):1941-1953, June 2008,
+              <http://rsta.royalsocietypublishing.org/content/366/1872/
+              1941.full.pdf+html>.
+
+   [ECNFixedWireless]
+              Siris, V., "Resource Control for Elastic Traffic in CDMA
+              Networks", Proc. ACM MOBICOM'02 , September 2002,
+              <http://www.ics.forth.gr/netlab/publications/
+              resource_control_elastic_cdma.html>.
+
+   [Evol_cc]  Gibbens, R. and F. Kelly, "Resource pricing and the
+              evolution of congestion control", Automatica
+              35(12)1969-1985, December 1999,
+              <http://www.sciencedirect.com/science/article/pii/
+              S0005109899001351>.
+
+   [GentleAggro]
+              Flach, T., Dukkipati, N., Terzis, A., Raghavan, B.,
+              Cardwell, N., Cheng, Y., Jain, A., Hao, S., Katz-Bassett,
+              E., and R. Govindan, "Reducing web latency: the virtue of
+              gentle aggression", ACM SIGCOMM CCR 43(4)159-170, August
+              2013, <http://doi.acm.org/10.1145/2486001.2486014>.
+
+   [IOSArch]  Bollapragada, V., White, R., and C. Murphy, "Inside Cisco
+              IOS Software Architecture", Cisco Press: CCIE Professional
+              Development ISBN13: 978-1-57870-181-0, July 2000.
+
+   [PIE]      Pan, R., Natarajan, P., Piglione, C., Prabhu, M.,
+              Subramanian, V., Baker, F., and B. Steeg, "PIE: A
+              Lightweight Control Scheme To Address the Bufferbloat
+              Problem", Work in Progress, February 2014.
+
+   [PktSizeEquCC]
+              Vasallo, P., "Variable Packet Size Equation-Based
+              Congestion Control", ICSI Technical Report tr-00-008,
+              2000, <http://http.icsi.berkeley.edu/ftp/global/pub/
+              techreports/2000/tr-00-008.pdf>.
+
+   [RED93]    Floyd, S. and V. Jacobson, "Random Early Detection (RED)
+              gateways for Congestion Avoidance", IEEE/ACM Transactions
+              on Networking 1(4) 397--413, August 1993,
+              <http://ieeexplore.ieee.org/xpls/
+              abs_all.jsp?arnumber=251892>.
+
+   [REDbias]  Eddy, W. and M. Allman, "A Comparison of RED's Byte and
+              Packet Modes", Computer Networks 42(3) 261--280, June
+              2003,
+              <http://www.ir.bbn.com/documents/articles/redbias.ps>.
+
+   [REDbyte]  De Cnodder, S., Elloumi, O., and K. Pauwels, "Effect of
+              different packet sizes on RED performance", Proc. 5th IEEE
+              Symposium on Computers and Communications (ISCC) 793-799,
+              July 2000, <http://ieeexplore.ieee.org/xpls/
+              abs_all.jsp?arnumber=860741>.
+
+   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
+              "Definition of the Differentiated Services Field (DS
+              Field) in the IPv4 and IPv6 Headers", RFC 2474, December
+              1998.
+
+   [RFC3426]  Floyd, S., "General Architectural and Policy
+              Considerations", RFC 3426, November 2002.
+
+   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
+              Jacobson, "RTP: A Transport Protocol for Real-Time
+              Applications", STD 64, RFC 3550, July 2003.
+
+   [RFC3714]  Floyd, S. and J. Kempf, "IAB Concerns Regarding Congestion
+              Control for Voice Traffic in the Internet", RFC 3714,
+              March 2004.
+
+   [RFC4828]  Floyd, S. and E. Kohler, "TCP Friendly Rate Control
+              (TFRC): The Small-Packet (SP) Variant", RFC 4828, April
+              2007.
+
+   [RFC5348]  Floyd, S., Handley, M., Padhye, J., and J. Widmer, "TCP
+              Friendly Rate Control (TFRC): Protocol Specification", RFC
+              5348, September 2008.
+
+   [RFC5562]  Kuzmanovic, A., Mondal, A., Floyd, S., and K.
+              Ramakrishnan, "Adding Explicit Congestion Notification
+              (ECN) Capability to TCP's SYN/ACK Packets", RFC 5562, June
+              2009.
+
+   [RFC5670]  Eardley, P., "Metering and Marking Behaviour of PCN-
+              Nodes", RFC 5670, November 2009.
+
+   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
+              Control", RFC 5681, September 2009.
+
+   [RFC5690]  Floyd, S., Arcia, A., Ros, D., and J. Iyengar, "Adding
+              Acknowledgement Congestion Control to TCP", RFC 5690,
+              February 2010.
+
+   [RFC6077]  Papadimitriou, D., Welzl, M., Scharf, M., and B. Briscoe,
+              "Open Research Issues in Internet Congestion Control", RFC
+              6077, February 2011.
+
+   [RFC6679]  Westerlund, M., Johansson, I., Perkins, C., O'Hanlon, P.,
+              and K. Carlberg, "Explicit Congestion Notification (ECN)
+              for RTP over UDP", RFC 6679, August 2012.
+
+   [RFC6789]  Briscoe, B., Woundy, R., and A. Cooper, "Congestion
+              Exposure (ConEx) Concepts and Use Cases", RFC 6789,
+              December 2012.
+
+   [Rate_fair_Dis]
+              Briscoe, B., "Flow Rate Fairness: Dismantling a Religion",
+              ACM CCR 37(2)63-74, April 2007,
+              <http://portal.acm.org/citation.cfm?id=1232926>.
+
+   [gentle_RED]
+              Floyd, S., "Recommendation on using the "gentle_" variant
+              of RED", Web page , March 2000,
+              <http://www.icir.org/floyd/red/gentle.html>.
+
+   [pBox]     Floyd, S. and K. Fall, "Promoting the Use of End-to-End
+              Congestion Control", IEEE/ACM Transactions on Networking
+              7(4) 458--472, August 1999, <http://ieeexplore.ieee.org/
+              xpls/abs_all.jsp?arnumber=793002>.
+
+   [pktByteEmail]
+              Floyd, S., "RED: Discussions of Byte and Packet Modes",
+              email, March 1997,
+              <http://ee.lbl.gov/floyd/REDaveraging.txt>.
+
+Appendix A.  Survey of RED Implementation Status
+
+   This Appendix is informative, not normative.
+
+   In May 2007, a survey of 84 vendors was conducted to assess how
+   widely drop probability based on packet size had been implemented in
+   RED (Table 3).  About 19% of those surveyed replied, giving a sample
+   size of 16.  Although in most cases we do not have permission to
+   identify
+   the respondents, we can say that those that have responded include
+   most of the larger equipment vendors, covering a large fraction of
+   the market.  The two who gave permission to be identified were Cisco
+   and Alcatel-Lucent.  The others range across the large network
+   equipment vendors at L3 & L2, firewall vendors, wireless equipment
+   vendors, as well as large software businesses with a small selection
+   of networking products.  All those who responded confirmed that they
+   have not implemented the variant of RED with drop dependent on packet
+   size (2 were fairly sure they had not but needed to check more
+   thoroughly).  At the time the survey was conducted, Linux did not
+   implement RED with packet-size bias of drop, although we have not
+   investigated a wider range of open source code.
+
+     +-------------------------------+----------------+--------------+
+     |                      Response | No. of vendors | % of vendors |
+     +-------------------------------+----------------+--------------+
+     |               Not implemented |             14 |          17% |
+     |    Not implemented (probably) |              2 |           2% |
+     |                   Implemented |              0 |           0% |
+     |                   No response |             68 |          81% |
+     | Total companies/orgs surveyed |             84 |         100% |
+     +-------------------------------+----------------+--------------+
+
+    Table 3: Vendor Survey on byte-mode drop variant of RED (lower drop
+                      probability for small packets)
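+
+   The percentages in Table 3 follow directly from the response counts.
+   The short Python check below is only a convenience for the reader;
+   the variable names are hypothetical and the counts are those given in
+   the table.
+
```python
# Response counts as listed in Table 3.
responses = {
    "Not implemented": 14,
    "Not implemented (probably)": 2,
    "Implemented": 0,
    "No response": 68,
}

total = sum(responses.values())                 # vendors surveyed
replied = total - responses["No response"]      # sample size
percent = {k: round(100 * n / total) for k, n in responses.items()}

print(total, replied, round(100 * replied / total))
print(percent)
```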
+
+   Where reasons were given for why the byte-mode drop variant had not
+   been implemented, the extra complexity of packet-bias code was most
+   prevalent, though one vendor had a more principled reason for
+   avoiding it -- similar to the argument of this document.
+
+   Our survey was of vendor implementations, so we cannot be certain
+   about operator deployment.  But we believe many queues in the
+   Internet are still tail drop.  The company of one of the co-authors
+   (BT) has widely deployed RED; however, many tail-drop queues are
+   bound to still exist, particularly in access network equipment and on
+   middleboxes like firewalls, where RED is not always available.
+
+   Routers using a memory architecture based on fixed-size buffers with
+   borrowing may also still be prevalent in the Internet.  As explained
+   in Section 4.2.1, these also provide a marginal (but legitimate) bias
+   towards small packets.  So even though RED byte-mode drop is not
+   prevalent, it is likely there is still some bias towards small
+   packets in the Internet due to tail-drop and fixed-buffer borrowing.
+
+Appendix B.  Sufficiency of Packet-Mode Drop
+
+   This Appendix is informative, not normative.
+
+   Here we check that packet-mode drop (or marking) in the network gives
+   sufficiently generic information for the transport layer to use.  We
+   check against a 2x2 matrix of four scenarios that may occur now or in
+   the future (Table 4).  Checking the two scenarios in each of the
+   horizontal and vertical dimensions tests the extremes of sensitivity
+   to packet size in the transport and in the network respectively.
+
+   Note that this section does not consider byte-mode drop at all.
+   Having deprecated byte-mode drop, the goal here is to check that
+   packet-mode drop will be sufficient in all cases.
+
+   +-------------------------------+-----------------+-----------------+
+   |                  Transport -> |  a) Independent | b) Dependent on |
+   | ----------------------------- |  of packet size |  packet size of |
+   | Network                       |  of congestion  |    congestion   |
+   |                               |  notifications  |  notifications  |
+   +-------------------------------+-----------------+-----------------+
+   | 1) Predominantly bit-         |   Scenario a1)  |   Scenario b1)  |
+   | congestible network           |                 |                 |
+   | 2) Mix of bit-congestible and |   Scenario a2)  |   Scenario b2)  |
+   | pkt-congestible network       |                 |                 |
+   +-------------------------------+-----------------+-----------------+
+
+                Table 4: Four Possible Congestion Scenarios
+
+   Appendix B.1 focuses on the horizontal dimension of Table 4, checking
+   that packet-mode drop (or marking) gives sufficient information,
+   whether or not the transport uses it -- scenarios b) and a)
+   respectively.
+
+   Appendix B.2 focuses on the vertical dimension of Table 4, checking
+   that packet-mode drop gives sufficient information to the transport
+   whether resources in the network are bit-congestible or packet-
+   congestible (these terms are defined in Section 1.1).
+
+   Notation:  To be concrete, we will compare two flows with different
+      packet sizes, s_1 and s_2.  As an example, we will take
+      s_1 = 60 B = 480 b and s_2 = 1,500 B = 12,000 b.
+
+      A flow's bit rate, x [bps], is related to its packet rate, u
+      [pps], by
+
+         x(t) = s*u(t).
+
+      In the bit-congestible case, path congestion will be denoted by
+      p_b, and in the packet-congestible case by p_p.  When either case
+      is implied, the letter p alone will denote path congestion.
+
+B.1.  Packet-Size (In)Dependence in Transports
+
+   In all cases, we consider a packet-mode drop queue that indicates
+   congestion by dropping (or marking) packets with probability p
+   irrespective of packet size.  We use an example value of loss
+   (marking) probability, p=0.1%.
+
+   A transport like TCP as specified in RFC 5681 treats a congestion
+   notification on any packet whatever its size as one event.  However,
+   a network with just the packet-mode drop algorithm gives more
+   information if the transport chooses to use it.  We will use Table 5
+   to illustrate this.
+
+   We will set aside the last column until later.  The columns labelled
+   'Flow 1' and 'Flow 2' compare two flows consisting of 60 B and
+   1,500 B packets respectively.  The body of the table considers two
+   separate cases, one where the flows have an equal bit rate and the
+   other with equal packet rates.  In both cases, the two flows fill a
+   96 Mbps link.  Therefore, in the equal bit rate case, they each have
+   half the bit rate (48 Mbps).  With equal packet rates, by contrast,
+   Flow 1 uses packets 25 times smaller, so it gets 25 times less bit
+   rate -- it only gets 1/(1+25) of the link capacity (96 Mbps / 26 =
+   4 Mbps after rounding).  Flow 2 gets 25 times more bit rate
+   (92 Mbps) in the equal packet rate case because its packets are 25
+   times larger.  The packet rate shown for each flow could easily be
+   derived
+   once the bit rate was known by dividing the bit rate by packet size,
+   as shown in the column labelled 'Formula'.
+
+      Parameter               Formula       Flow 1   Flow 2 Combined
+      ----------------------- ----------- -------- -------- --------
+      Packet size             s/8             60 B  1,500 B    (Mix)
+      Packet size             s              480 b 12,000 b    (Mix)
+      Pkt loss probability    p               0.1%     0.1%     0.1%
+
+      EQUAL BIT RATE CASE
+      Bit rate                x            48 Mbps  48 Mbps  96 Mbps
+      Packet rate             u = x/s     100 kpps   4 kpps 104 kpps
+      Absolute pkt-loss rate  p*u          100 pps    4 pps  104 pps
+      Absolute bit-loss rate  p*u*s        48 kbps  48 kbps  96 kbps
+      Ratio of lost/sent pkts p*u/u           0.1%     0.1%     0.1%
+      Ratio of lost/sent bits p*u*s/(u*s)     0.1%     0.1%     0.1%
+
+      EQUAL PACKET RATE CASE
+      Bit rate                x             4 Mbps  92 Mbps  96 Mbps
+      Packet rate             u = x/s       8 kpps   8 kpps  15 kpps
+      Absolute pkt-loss rate  p*u            8 pps    8 pps   15 pps
+      Absolute bit-loss rate  p*u*s         4 kbps  92 kbps  96 kbps
+      Ratio of lost/sent pkts p*u/u           0.1%     0.1%     0.1%
+      Ratio of lost/sent bits p*u*s/(u*s)     0.1%     0.1%     0.1%
+
+    Table 5: Absolute Loss Rates and Loss Ratios for Flows of Small and
+                      Large Packets and Both Combined
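+
+   The per-flow entries of Table 5 can be reproduced from the formulas
+   in its second column.  The Python sketch below is illustrative only;
+   the function and constant names are hypothetical, while the packet
+   sizes, link rate, and loss probability are the example values used
+   above.  It works with unrounded values, so the table's figures appear
+   after rounding.
+
```python
P = 0.001             # loss (marking) probability, 0.1%
S1, S2 = 480, 12_000  # packet sizes in bits (60 B and 1,500 B)
LINK = 96e6           # link rate in bits per second


def flow_rows(bit_rate, size, p=P):
    """Rows of Table 5 for one flow, from its bit rate x and packet size s."""
    u = bit_rate / size                      # packet rate, u = x/s
    return {
        "bit_rate": bit_rate,
        "pkt_rate": u,
        "pkt_loss_rate": p * u,              # p*u
        "bit_loss_rate": p * u * size,       # p*u*s
    }


def equal_bit_rate():
    """Both flows get half of the link's bit rate."""
    return flow_rows(LINK / 2, S1), flow_rows(LINK / 2, S2)


def equal_packet_rate():
    """Both flows send the same number of packets per second."""
    u = LINK / (S1 + S2)
    return flow_rows(u * S1, S1), flow_rows(u * S2, S2)


for case in (equal_bit_rate, equal_packet_rate):
    for name, rows in zip(("Flow 1", "Flow 2"), case()):
        print(case.__name__, name, {k: round(v) for k, v in rows.items()})
```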
+
+   So far, we have merely set up the scenarios.  We now consider
+   congestion notification in these scenarios.  Two TCP flows with the
+   same
+   round-trip time aim to equalise their packet-loss rates over time;
+   that is, the number of packets lost in a second, which is the packets
+   per second (u) multiplied by the probability that each one is dropped
+   (p).  Thus, TCP converges on the case labelled 'Equal packet rate' in
+   the table, where both flows aim for the same absolute packet-loss
+   rate (both 8 pps in the table).
+
+   Packet-mode drop actually gives flows sufficient information to
+   measure their loss rate in bits per second, if they choose, not just
+   packets per second.  Each flow can count the size of a lost or marked
+   packet and scale its rate response in proportion (as TFRC-SP does).
+   The result is shown in the row entitled 'Absolute bit-loss rate',
+   where the bits lost in a second is the packets per second (u)
+   multiplied by the probability of losing a packet (p) multiplied by
+   the packet size (s).  Such an algorithm would try to remove any
+   imbalance in the bit-loss rate such as the wide disparity in the case
+   labelled 'Equal packet rate' (4 kbps vs. 92 kbps).  Instead, a
+   packet-size-dependent algorithm would aim for equal bit-loss rates
+   (both 48 kbps in this example), which would drive both flows towards
+   the case labelled 'Equal bit rate'.
+
+   The explanation so far has assumed that each flow consists of packets
+   of only one constant size.  Nonetheless, it extends naturally to
+   flows with mixed packet sizes.  In the right-most column of Table 5,
+   a flow of mixed-size packets is created simply by considering Flow 1
+   and Flow 2 as a single aggregated flow.  There is no need for a flow
+   to maintain an average packet size.  It is only necessary for the
+   transport to scale its response to each congestion indication by the
+   size of each individual lost (or marked) packet.  Taking, for
+   example, the case labelled 'Equal packet rate', in one second about 8
+   small packets and 8 large packets are lost (making closer to 15 than
+   16 losses per second due to rounding).  If the transport multiplies
+   each loss by its size, in one second it responds to about 8*480 and
+   8*12,000 lost bits, adding up to 96,000 lost bits in a second once
+   the unrounded loss rate of about 7.7 pps per flow is used.  This
+   double checks correctly, being the same as 0.1% of the total bit rate
+   of 96 Mbps.  For completeness, the formula for absolute bit-loss rate
+   is p(u1*s1+u2*s2).
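+
+   The per-loss scaling for a mixed-size flow can be sketched as
+   follows.  This Python fragment is an illustration under assumptions:
+   the function name is hypothetical, and a 13-second interval is chosen
+   only because each flow in the 'Equal packet rate' case then sends a
+   whole number of packets (100,000), of which p = 0.1% are lost.
+
```python
def bit_loss_rate(lost_packet_sizes_bits, interval_s):
    """Absolute bit-loss rate: sum the size of every lost (or marked)
    packet individually, then divide by the observation interval.
    No average packet size is maintained."""
    return sum(lost_packet_sizes_bits) / interval_s


# 'Equal packet rate' case: each flow sends 96e6 / (480 + 12,000) pps,
# which is 100,000 packets in 13 s, so 100 packets of each size are lost.
losses = [480] * 100 + [12_000] * 100
rate = bit_loss_rate(losses, interval_s=13)
print(rate)                                  # lost bits per second
print(0.001 * 96e6)                          # 0.1% of the 96 Mbps link
```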
+
+   Incidentally, a transport will always measure the loss probability
+   the same, irrespective of whether it measures in packets or in bytes.
+   In other words, the ratio of lost packets to sent packets will be the
+   same as the ratio of lost bytes to sent bytes.  (This is why TCP's
+   bit rate is still proportional to packet size, even when byte
+   counting is used, as recommended for TCP in [RFC5681], mainly for
+   orthogonal security reasons.)  This becomes intuitively obvious when
+   comparing two example flows: one with 60 B packets, the other with
+   1,500 B packets.  If both flows pass through a queue with drop
+   probability 0.1%, each flow will lose 1 in 1,000 packets.  In the
+   stream of 60 B packets, the ratio of lost bytes to sent bytes will be
+   60 B in every 60,000 B; and in the stream of 1,500 B packets, the
+   loss ratio will be 1,500 B out of 1,500,000 B.  Whether the transport
+   measures the ratio of lost to sent in packets or in bytes, it will
+   measure the same value: 0.1% in both cases.  This can also be seen in
+   Table 5, where the ratio of lost packets to sent packets and the
+   ratio of lost bytes to sent bytes are both 0.1% in all cases (recall
+   that the scenario was set up with p=0.1%).
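+
+   The two example flows can be worked through as a minimal Python
+   sketch.  It assumes constant-size packets and exactly 1 loss per
+   1,000 packets sent; the function name is hypothetical.
+

```python
from fractions import Fraction


def loss_ratios(packet_size_bytes, packets_sent, packets_lost):
    """Return (packet loss ratio, byte loss ratio) for a constant-size flow."""
    packet_ratio = Fraction(packets_lost, packets_sent)
    byte_ratio = Fraction(packets_lost * packet_size_bytes,
                          packets_sent * packet_size_bytes)
    return packet_ratio, byte_ratio


small = loss_ratios(60, 1_000, 1)      # 60 B lost in every 60,000 B
large = loss_ratios(1_500, 1_000, 1)   # 1,500 B lost in every 1,500,000 B

for name, (by_packets, by_bytes) in (("60 B", small), ("1,500 B", large)):
    print(f"{name} flow: {float(by_packets):.1%} by packets, "
          f"{float(by_bytes):.1%} by bytes")
```

+
+   The packet size cancels out of the byte ratio, which is why both
+   measurements agree for each flow.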
+
+   This discussion of how the ratio can be measured in packets or bytes
+   is only raised here to highlight that it is irrelevant to this memo!
+   Whether or not a transport depends on packet size depends on how this
+   ratio is used within the congestion control algorithm.
+
+   So far, we have shown that packet-mode drop passes sufficient
+   information to the transport layer so that the transport can take bit
+   congestion into account, by using the sizes of the packets that
+   indicate congestion.  We have also shown that the transport can
+   choose not to take packet size into account if it wishes.  We will
+   now consider whether the transport can know which to do.
+
+B.2.  Bit-Congestible and Packet-Congestible Indications
+
+   As a thought-experiment, imagine an idealised congestion notification
+   protocol that supports both bit-congestible and packet-congestible
+   resources.  It would require at least two ECN flags, one for each of
+   the bit-congestible and packet-congestible resources.
+
+   1.  A packet-congestible resource trying to code congestion level p_p
+       into a packet stream should mark the idealised 'packet
+       congestion' field in each packet with probability p_p
+       irrespective of the packet's size.  The transport should then
+       take a packet with the packet congestion field marked to mean
+       just one mark, irrespective of the packet size.
+
+   2.  A bit-congestible resource trying to code time-varying byte-
+       congestion level p_b into a packet stream should mark the 'byte
+       congestion' field in each packet with probability p_b, again
+       irrespective of the packet's size.  Unlike before, the transport
+       should take a packet with the byte congestion field marked to
+       count as a mark on each byte in the packet.
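+
+   The two rules above can be sketched in Python as follows.  This is
+   only an illustration of the thought experiment, not a proposed
+   mechanism: the Packet type, its two flag fields, and the function
+   names are all hypothetical.
+

```python
import random
from dataclasses import dataclass


@dataclass
class Packet:
    size_bytes: int
    packet_congestion: bool = False  # hypothetical 'packet congestion' field
    byte_congestion: bool = False    # hypothetical 'byte congestion' field


def mark_packet_congestion(packet, p_p, rng):
    """Packet-congestible resource: mark with probability p_p,
    irrespective of the packet's size."""
    if rng.random() < p_p:
        packet.packet_congestion = True
    return packet


def mark_byte_congestion(packet, p_b, rng):
    """Bit-congestible resource: mark with probability p_b,
    again irrespective of the packet's size."""
    if rng.random() < p_b:
        packet.byte_congestion = True
    return packet


def count_marks(packets):
    """Transport side: return (packet marks, marked bytes).

    A packet-congestion mark counts as one mark; a byte-congestion
    mark counts as a mark on every byte in the packet.
    """
    packet_marks = sum(1 for pkt in packets if pkt.packet_congestion)
    marked_bytes = sum(pkt.size_bytes for pkt in packets
                       if pkt.byte_congestion)
    return packet_marks, marked_bytes


received = [
    Packet(60, packet_congestion=True),
    Packet(1500, packet_congestion=True, byte_congestion=True),
    Packet(60, byte_congestion=True),
    Packet(1500),
]
print(count_marks(received))
```

+
+   Both marking functions ignore the packet size; only the transport's
+   interpretation of the byte congestion field depends on it.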
+
+   This hides a fundamental problem -- much more fundamental than
+   whether we can magically create header space for yet another ECN
+   flag, or whether it would work while being deployed incrementally.
+   Distinguishing drop from delivery naturally provides just one
+   implicit bit of congestion indication information -- the packet is
+   either dropped or not.  It is hard to drop a packet in two ways that
+   are distinguishable remotely.  This is a similar problem to that of
+   distinguishing wireless transmission losses from congestive losses.
+
+   This problem would not be solved, even if ECN were universally
+   deployed.  A congestion notification protocol must survive a
+   transition from low levels of congestion to high.  Marking two states
+   is feasible with explicit marking, but it is much harder if packets
+   are dropped.  Also, it will not always be cost-effective to implement
+   AQM at every low-level resource, so drop will often have to suffice.
+
+   We are not saying two ECN fields will be needed (and we are not
+   saying that somehow a resource should be able to drop a packet in one
+   of two different ways so that the transport can distinguish which
+   sort of drop it was!).  These two congestion notification channels
+   are a conceptual device to illustrate a dilemma we could face in the
+   future.  Section 3 gives four good reasons why it would be a bad idea
+   to allow for packet size by biasing drop probability in favour of
+   small packets within the network.  The impracticality of our thought
+   experiment shows that it will be hard to give transports a practical
+   way to know whether or not to take into account the size of
+   congestion indication packets.
+
+   Fortunately, this dilemma is not pressing because by design most
+   equipment becomes bit-congested before its packet processing becomes
+   congested (as already outlined in Section 1.1).  Therefore,
+   transports can be designed on the relatively sound assumption that a
+   congestion indication will usually imply bit congestion.
+
+   Nonetheless, although the above idealised protocol isn't intended for
+   implementation, we do want to emphasise that research is needed to
+   predict whether packet congestion is likely to become more common
+   and, if so, to find a way to distinguish between bit and packet
+   congestion [RFC3714].
+
+   Recently, the dual resource queue (DRQ) proposal [DRQ] has been made
+   on the premise that, as network processors become more cost-
+   effective, per-packet operations will become more complex
+   (irrespective of whether more function in the network is desirable)
+   and, consequently, that CPU congestion will become more common.  DRQ
+   is a proposed modification to the RED algorithm that folds both bit
+   congestion and packet congestion into one signal (either loss or
+   ECN).
+
+   Finally, we note one further complication.  Strictly, packet-
+   congestible resources are often cycle-congestible.  For instance, for
+   routing lookups, load depends on the complexity of each lookup and
+   whether or not the pattern of arrivals is amenable to caching.  This
+   also reminds us that any solution must not require a forwarding
+   engine to use excessive processor cycles in order to decide how to
+   say it has no spare processor cycles.
+
+Appendix C.  Byte-Mode Drop Complicates Policing Congestion Response
+
+   This section is informative, not normative.
+
+   There are two main classes of approach to policing congestion
+   response: (i) policing at each bottleneck link or (ii) policing at
+   the edges of networks.  Packet-mode drop in RED is compatible with
+   either, while byte-mode drop precludes edge policing.
+
+   The simplicity of an edge policer relies on one dropped or marked
+   packet being equivalent to another of the same size without having to
+   know which link the drop or mark occurred at.  However, the byte-mode
+   drop algorithm has to depend on the local MTU of the line -- it needs
+   to use some concept of a 'normal' packet size.  Therefore, one
+   dropped or marked packet from a byte-mode drop algorithm is not
+   necessarily equivalent to another from a different link.  A policing
+   function local to the link can know the local MTU where the
+   congestion occurred.  However, a policer at the edge of the network
+   cannot, at least not without a lot of complexity.
+
+   The early research proposals for type (i) policing at a bottleneck
+   link [pBox] used byte-mode drop, then detected flows that contributed
+   disproportionately to the number of packets dropped.  However, with
+   no extra complexity, later proposals used packet-mode drop and looked
+   for flows that contributed a disproportionate amount of dropped bytes
+   [CHOKe_Var_Pkt].
+
+   Work is progressing on the Congestion Exposure (ConEx) protocol
+   [RFC6789], which enables a type (ii) edge policer located at a user's
+   attachment point.  The idea is to be able to take an integrated view
+   of the effect of all of a user's traffic on any link in the
+   internetwork.  However, byte-mode drop would effectively preclude
+   such edge policing because of the MTU issue above.
+
+   Indeed, making drop probability depend on the size of the packets
+   that bits happen to be divided into would simply encourage the bits
+   to be divided into smaller packets in order to confuse policing.  In
+   contrast, as long as a dropped/marked packet is taken to mean that
+   all the bytes in the packet are dropped/marked, a policer can remain
+   robust against sequences of bits being re-divided into different size
+   packets or across different size flows [Rate_fair_Dis].
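+
+   The effect of re-dividing the same bytes into smaller packets can be
+   illustrated with a small Python sketch.  It assumes a byte-mode
+   variant that scales the drop probability by packet size relative to
+   the local MTU; that form, the example figures, and the function names
+   are assumptions made for illustration.
+

```python
def expected_dropped_bytes_packet_mode(total_bytes, packet_size, p):
    """Packet-mode drop: every packet is dropped with probability p."""
    packets = total_bytes / packet_size
    return packets * p * packet_size


def expected_dropped_bytes_byte_mode(total_bytes, packet_size, p, mtu):
    """Byte-mode drop (assumed form): probability scaled by size/MTU."""
    packets = total_bytes / packet_size
    return packets * (p * packet_size / mtu) * packet_size


TOTAL = 1_500_000  # bytes sent by the flow (example figure)
P = 0.001          # drop probability for an MTU-sized packet
MTU = 1_500        # assumed local MTU in bytes

for size in (1_500, 60):
    pm = expected_dropped_bytes_packet_mode(TOTAL, size, P)
    bm = expected_dropped_bytes_byte_mode(TOTAL, size, P, MTU)
    print(f"{size:>5} B packets: packet-mode {pm:.0f} B, "
          f"byte-mode {bm:.0f} B dropped")
```

+
+   With packet-mode drop, the expected number of dropped bytes is the
+   same however the bytes are packetised.  With the assumed byte-mode
+   form, it shrinks as packets get smaller and also depends on the MTU
+   of the link where the drop occurred.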
+
+Authors' Addresses
+
+   Bob Briscoe
+   BT
+   B54/77, Adastral Park
+   Martlesham Heath
+   Ipswich  IP5 3RE
+   UK
+
+   Phone: +44 1473 645196
+   EMail: bob.briscoe@bt.com
+   URI:   http://bobbriscoe.net/
+
+   Jukka Manner
+   Aalto University
+   Department of Communications and Networking (Comnet)
+   P.O. Box 13000
+   FIN-00076 Aalto
+   Finland
+
+   Phone: +358 9 470 22481
+   EMail: jukka.manner@aalto.fi
+   URI:   http://www.netlab.tkk.fi/~jmanner/
+