author     Thomas Voss <mail@thomasvoss.com>  2024-11-27 20:54:24 +0100
committer  Thomas Voss <mail@thomasvoss.com>  2024-11-27 20:54:24 +0100
commit     4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree       e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc8033.txt
parent     ea76e11061bda059ae9f9ad130a9895cc85607db (diff)
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc8033.txt')
-rw-r--r--   doc/rfc/rfc8033.txt   1683
1 file changed, 1683 insertions, 0 deletions
diff --git a/doc/rfc/rfc8033.txt b/doc/rfc/rfc8033.txt
new file mode 100644
index 0000000..952afa5
--- /dev/null
+++ b/doc/rfc/rfc8033.txt
@@ -0,0 +1,1683 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF) R. Pan
+Request for Comments: 8033 P. Natarajan
+Category: Experimental Cisco Systems
+ISSN: 2070-1721 F. Baker
+ Unaffiliated
+ G. White
+ CableLabs
+ February 2017
+
+
+ Proportional Integral Controller Enhanced (PIE):
+ A Lightweight Control Scheme to Address the Bufferbloat Problem
+
+Abstract
+
+ Bufferbloat is a phenomenon in which excess buffers in the network
+ cause high latency and latency variation. As more and more
+ interactive applications (e.g., voice over IP, real-time video
+ streaming, and financial transactions) run in the Internet, high
+ latency and latency variation degrade application performance. There
+ is a pressing need to design intelligent queue management schemes
+ that can control latency and latency variation, and hence provide
+ desirable quality of service to users.
+
+ This document presents a lightweight active queue management design
+ called "PIE" (Proportional Integral controller Enhanced) that can
+ effectively control the average queuing latency to a target value.
+ Simulation results, theoretical analysis, and Linux testbed results
+ have shown that PIE can ensure low latency and achieve high link
+ utilization under various congestion situations. The design does not
+ require per-packet timestamps, so it incurs very little overhead and
+ is simple enough to implement in both hardware and software.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 1]
+
+RFC 8033 PIE February 2017
+
+
+Status of This Memo
+
+ This document is not an Internet Standards Track specification; it is
+ published for examination, experimental implementation, and
+ evaluation.
+
+ This document defines an Experimental Protocol for the Internet
+ community. This document is a product of the Internet Engineering
+ Task Force (IETF). It represents the consensus of the IETF
+ community. It has received public review and has been approved for
+ publication by the Internet Engineering Steering Group (IESG). Not
+ all documents approved by the IESG are a candidate for any level of
+ Internet Standard; see Section 2 of RFC 7841.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ http://www.rfc-editor.org/info/rfc8033.
+
+Copyright Notice
+
+ Copyright (c) 2017 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (http://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Simplified BSD License text as described in Section 4.e of
+ the Trust Legal Provisions and are provided without warranty as
+ described in the Simplified BSD License.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 2]
+
+RFC 8033 PIE February 2017
+
+
+Table of Contents
+
+ 1. Introduction ....................................................3
+ 2. Terminology .....................................................5
+ 3. Design Goals ....................................................5
+ 4. The Basic PIE Scheme ............................................6
+ 4.1. Random Dropping ............................................7
+ 4.2. Drop Probability Calculation ...............................7
+ 4.3. Latency Calculation ........................................9
+ 4.4. Burst Tolerance ...........................................10
+ 5. Optional Design Elements of PIE ................................11
+ 5.1. ECN Support ...............................................11
+ 5.2. Dequeue Rate Estimation ...................................11
+ 5.3. Setting PIE Active and Inactive ...........................13
+ 5.4. Derandomization ...........................................14
+ 5.5. Cap Drop Adjustment .......................................15
+ 6. Implementation Cost ............................................15
+ 7. Scope of Experimentation .......................................17
+ 8. Incremental Deployment .........................................17
+ 9. Security Considerations ........................................18
+ 10. References ....................................................18
+ 10.1. Normative References .....................................18
+ 10.2. Informative References ...................................18
+ Appendix A. The Basic PIE Pseudocode ..............................21
+ Appendix B. Pseudocode for PIE with Optional Enhancement ..........24
+ Contributors ......................................................29
+ Authors' Addresses ................................................30
+
+1. Introduction
+
+ The explosion of smart phones, tablets, and video traffic in the
+ Internet brings about a unique set of challenges for congestion
+ control. To avoid packet drops, many service providers or
+ data-center operators require vendors to put in as much buffer as
+ possible. Because of the rapid decrease in memory chip prices, these
+ requests are easily accommodated to keep customers happy. While this
+ solution succeeds in assuring low packet loss and high TCP
+ throughput, it suffers from a major downside. TCP continuously
+ increases its sending rate and causes network buffers to fill up.
+ TCP cuts its rate only when it receives a packet drop or mark that is
+ interpreted as a congestion signal. However, drops and marks usually
+ occur when network buffers are full or almost full. As a result,
+ excess buffers, initially designed to avoid packet drops, would lead
+ to highly elevated queuing latency and latency variation. Designing
+ a queue management scheme is a delicate balancing act: it not only
+ should allow short-term bursts to smoothly pass but also should
+ control the average latency in the presence of long-running greedy
+ flows.
+
+
+
+Pan, et al. Experimental [Page 3]
+
+RFC 8033 PIE February 2017
+
+
+ Active Queue Management (AQM) schemes could potentially solve the
+ aforementioned problem. AQM schemes, such as Random Early Detection
+ (RED) [RED] as suggested in [RFC2309] (which is now obsoleted by
+ [RFC7567]), have been around for well over a decade. RED is
+ implemented in a wide variety of network devices, both in hardware
+   and software. Unfortunately, because RED needs careful tuning of its
+   parameters for various network conditions, most network operators
+   don't turn RED on. In addition, RED is designed to control
+ the queue length, which would affect latency implicitly. It does not
+ control latency directly. Hence, the Internet today still lacks an
+ effective design that can control buffer latency to improve the
+   quality of experience for latency-sensitive applications. The more
+ recently published RFC 7567 calls for new methods of controlling
+ network latency.
+
+ New algorithms are beginning to emerge to control queuing latency
+ directly to address the bufferbloat problem [CoDel]. Along these
+ lines, Proportional Integral controller Enhanced (PIE) also aims to
+ keep the benefits of RED, including easy implementation and
+ scalability to high speeds. Similar to RED, PIE randomly drops an
+ incoming packet at the onset of congestion. Congestion detection,
+ however, is based on the queuing latency instead of the queue length
+ (as with RED). Furthermore, PIE also uses the derivative (rate of
+ change) of the queuing latency to help determine congestion levels
+ and an appropriate response. The design parameters of PIE are chosen
+ via control theory stability analysis. While these parameters can be
+ fixed to work in various traffic conditions, they could be made
+ self-tuning to optimize system performance.
+
+ Separately, it is assumed that any latency-based AQM scheme would be
+ applied over a Fair Queuing (FQ) structure or one of its approximate
+   designs, Flow Queuing or Class-Based Queuing (CBQ). FQ has been one
+   of the most studied scheduling algorithms since it was first
+   proposed in 1985 [RFC970]. CBQ is a standard feature in most network
+   devices today [CBQ]. Any AQM scheme that is built on top of FQ or
+ CBQ could benefit from these advantages. Furthermore, these
+ advantages, such as per-flow or per-class fairness, are orthogonal to
+ the AQM design whose primary goal is to control latency for a given
+ queue. For flows that are classified into the same class and put
+ into the same queue, one needs to ensure that their latency is better
+ controlled and that their fairness is not worse than those under the
+ standard DropTail or RED design. More details about the relationship
+ between FQ and AQM can be found in [RFC7806].
+
+ In October 2013, CableLabs' Data-Over-Cable Service Interface
+ Specification 3.1 (DOCSIS 3.1) specification [DOCSIS_3.1] mandated
+ that cable modems implement a specific variant of the PIE design as
+ the active queue management algorithm. In addition to cable-specific
+
+
+
+Pan, et al. Experimental [Page 4]
+
+RFC 8033 PIE February 2017
+
+
+ improvements, the PIE design in DOCSIS 3.1 [RFC8034] has improved the
+ original design in several areas, including derandomization of coin
+ tosses and enhanced burst protection.
+
+ This document describes the design of PIE and separates it into basic
+ elements and optional components that may be implemented to enhance
+ the performance of PIE.
+
+2. Terminology
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in RFC 2119 [RFC2119].
+
+3. Design Goals
+
+ A queue management framework is designed to improve the performance
+ of interactive and latency-sensitive applications. It should follow
+ the general guidelines set by the AQM working group document "IETF
+ Recommendations Regarding Active Queue Management" [RFC7567]. More
+ specifically, the PIE design has the following basic criteria.
+
+ * First, queuing latency, instead of queue length, is controlled.
+ Queue sizes change with queue draining rates and various flows'
+ round-trip times. Latency bloat is the real issue that needs to
+ be addressed, as it impairs real-time applications. If latency
+ can be controlled, bufferbloat is not an issue. In fact, once
+ latency is under control, it frees up buffers for sporadic bursts.
+
+ * Secondly, PIE aims to attain high link utilization. The goal of
+ low latency shall be achieved without suffering link
+ underutilization or losing network efficiency. An early
+ congestion signal could cause TCP to back off and avoid queue
+ buildup. On the other hand, however, TCP's rate reduction could
+ result in link underutilization. There is a delicate balance
+ between achieving high link utilization and low latency.
+
+ * Furthermore, the scheme should be simple to implement and easily
+ scalable in both hardware and software. PIE strives to maintain
+ design simplicity similar to that of RED, which has been
+ implemented in a wide variety of network devices.
+
+ * Finally, the scheme should ensure system stability for various
+ network topologies and scale well across an arbitrary number of
+ streams. Design parameters shall be set automatically. Users
+ only need to set performance-related parameters such as target
+ queue latency, not design parameters.
+
+
+
+
+Pan, et al. Experimental [Page 5]
+
+RFC 8033 PIE February 2017
+
+
+ In the following text, the design of PIE and its operation are
+ described in detail.
+
+4. The Basic PIE Scheme
+
+   As illustrated in Figure 1, PIE comprises three simple basic
+   components: a) random dropping at enqueuing, b) periodic drop
+ probability updates, and c) latency calculation. When a packet
+ arrives, a random decision is made regarding whether to drop the
+ packet. The drop probability is updated periodically based on how
+ far the current latency is away from the target value and whether the
+ queuing latency is currently trending up or down. The queuing
+ latency can be obtained using direct measurements or using
+ estimations calculated from the queue length and the dequeue rate.
+
+ The detailed definition of parameters can be found in Appendix A of
+ this document ("The Basic PIE Pseudocode"). Any state variables that
+ PIE maintains are noted using "PIE->". For a full description of the
+ algorithm, one can refer to the full paper [HPSR-PIE].
+
+ Random Drop
+ / --------------
+ -------/ --------------> | | | | | -------------->
+ /|\ | | | | |
+ | --------------
+ | Queue Buffer \
+ | | \
+ | |Queue \
+ | |Length \
+ | | \
+ | \|/ \/
+ | ----------------- -------------------
+ | | Drop | | |
+ -----<-----| Probability |<---| Latency |
+ | Calculation | | Calculation |
+ ----------------- -------------------
+
+ Figure 1: The PIE Structure
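+
+   As an informal illustration (not part of this specification), the
+   per-queue state that the basic scheme in Figure 1 maintains can be
+   captured in a small C structure; the field names mirror the "PIE->"
+   variables used throughout this document:
+
+      /* Sketch of the per-queue PIE state; real implementations
+       * (e.g., Linux's sch_pie) keep additional bookkeeping. */
+      struct pie_state {
+          double drop_prob_;        /* current drop probability, 0..1 */
+          double qdelay_old_;       /* previous latency sample (sec)  */
+          double burst_allowance_;  /* remaining allowance (sec)      */
+      };
+
+      /* Defaults suggested in Section 4 and Appendix A */
+      #define QDELAY_REF 0.015      /* 15 ms latency target   */
+      #define MAX_BURST  0.150      /* 150 ms burst allowance */
+      #define T_UPDATE   0.015      /* 15 ms update interval  */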
+
+
+
+
+
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 6]
+
+RFC 8033 PIE February 2017
+
+
+4.1. Random Dropping
+
+ PIE randomly drops a packet upon its arrival to a queue according to
+ a drop probability, PIE->drop_prob_, that is obtained from the
+ drop-probability-calculation component. The random drop is triggered
+ by a packet's arrival before enqueuing into a queue.
+
+ * Upon a packet enqueue:
+
+ randomly drop the packet with a probability of PIE->drop_prob_.
+
+ To ensure that PIE is "work conserving", we bypass the random drop if
+ the latency sample, PIE->qdelay_old_, is smaller than half of the
+ target latency value (QDELAY_REF) when the drop probability is not
+ too high (i.e., PIE->drop_prob_ < 0.2), or if the queue has less than
+ a couple of packets.
+
+ * Upon a packet enqueue, PIE does the following:
+
+ //Safeguard PIE to be work conserving
+ if ( (PIE->qdelay_old_ < QDELAY_REF/2 && PIE->drop_prob_ < 0.2)
+ || (queue_.byte_length() <= 2 * MEAN_PKTSIZE) )
+ return ENQUE;
+ else
+ randomly drop the packet with a probability of
+ PIE->drop_prob_.
+
+ PIE optionally supports Explicit Congestion Notification (ECN); see
+ Section 5.1.
+
+4.2. Drop Probability Calculation
+
+ The PIE algorithm periodically updates the drop probability based on
+ the latency samples -- not only the current latency sample but also
+ whether the latency is trending up or down. This is the classical
+ Proportional Integral (PI) controller method, which is known for
+ eliminating steady-state errors. This type of controller has been
+ studied before for controlling the queue length [PI] [QCN]. PIE
+ adopts the PI controller for controlling latency. The algorithm also
+ auto-adjusts the control parameters based on how heavy the congestion
+ is, which is reflected in the current drop probability. Note that
+ the current drop probability is a direct measure of the current
+ congestion level; there is no need to measure the arrival rate and
+ dequeue rate mismatches.
+
+ When a congestion period ends, we might be left with a high drop
+ probability with light packet arrivals. Hence, the PIE algorithm
+ includes a mechanism by which the drop probability decays
+
+
+
+Pan, et al. Experimental [Page 7]
+
+RFC 8033 PIE February 2017
+
+
+ exponentially (rather than linearly) when the system is not
+ congested. This would help the drop probability converge to 0 more
+ quickly, while the PI controller ensures that it would eventually
+ reach zero. The decay parameter of 2% gives us a time constant
+ around 50 * T_UPDATE.
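+
+   As a quick check on the "50 * T_UPDATE" figure (an illustration
+   only, not part of the algorithm), each idle update multiplies the
+   drop probability by 0.98, so the time constant is the number of
+   updates needed for a factor-of-e decay:
+
+      #include <math.h>
+      #include <stdio.h>
+
+      int main(void) {
+          /* 0.98^n = 1/e  =>  n = -1/ln(0.98) ~= 49.5 updates */
+          double n = -1.0 / log(0.98);
+          printf("time constant ~= %.1f update intervals\n", n);
+          return 0;
+      }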
+
+ Specifically, the PIE algorithm periodically adjusts the drop
+ probability every T_UPDATE interval:
+
+ * calculate drop probability PIE->drop_prob_, and autotune it as
+ follows:
+
+ p = alpha * (current_qdelay - QDELAY_REF) +
+ beta * (current_qdelay - PIE->qdelay_old_);
+
+ if (PIE->drop_prob_ < 0.000001) {
+ p /= 2048;
+ } else if (PIE->drop_prob_ < 0.00001) {
+ p /= 512;
+ } else if (PIE->drop_prob_ < 0.0001) {
+ p /= 128;
+ } else if (PIE->drop_prob_ < 0.001) {
+ p /= 32;
+ } else if (PIE->drop_prob_ < 0.01) {
+ p /= 8;
+ } else if (PIE->drop_prob_ < 0.1) {
+ p /= 2;
+ } else {
+ p = p;
+ }
+ PIE->drop_prob_ += p;
+
+ * decay the drop probability exponentially:
+
+ if (current_qdelay == 0 && PIE->qdelay_old_ == 0) {
+ PIE->drop_prob_ = PIE->drop_prob_ * 0.98;
+ //1 - 1/64 is
+ //sufficient
+ }
+
+ * bound the drop probability:
+
+ if (PIE->drop_prob_ < 0)
+ PIE->drop_prob_ = 0.0
+ if (PIE->drop_prob_ > 1)
+ PIE->drop_prob_ = 1.0
+
+
+
+
+
+Pan, et al. Experimental [Page 8]
+
+RFC 8033 PIE February 2017
+
+
+ * store the current latency value:
+
+ PIE->qdelay_old_ = current_qdelay.
+
+   The update interval, T_UPDATE, defaults to 15 milliseconds.
+ It MAY be reduced on high-speed links in order to provide smoother
+ response. The target latency value, QDELAY_REF, SHOULD be set to 15
+ milliseconds. The variables current_qdelay and PIE->qdelay_old_
+ represent the current and previous samples of the queuing latency,
+ which are calculated by the "latency calculation" component (see
+ Section 4.3). The variable current_qdelay is actually a temporary
+ variable, while PIE->qdelay_old_ is a state variable that PIE keeps.
+ The drop probability is a value between 0 and 1. However,
+ implementations can certainly use integers.
+
+ The controller parameters, alpha and beta (expressed in Hz), are
+ designed using feedback loop analysis, where TCP's behaviors are
+ modeled using the results from well-studied prior art [TCP-Models].
+ Note that the above adjustment of 'p' effectively scales the alpha
+ and beta parameters based on the current congestion level indicated
+ by the drop probability.
+
+ The theoretical analysis of PIE can be found in [HPSR-PIE]. As a
+ rule of thumb, to keep the same feedback loop dynamics, if we cut
+ T_UPDATE in half, we should also cut alpha by half and increase beta
+ by alpha/4. If the target latency is reduced, e.g., for data-center
+ use, the values of alpha and beta should be increased by the same
+ order of magnitude by which the target latency is reduced. For
+ example, if QDELAY_REF is reduced and changed from 15 milliseconds to
+ 150 microseconds -- a reduction of two orders of magnitude -- then
+ alpha and beta values should be increased to alpha * 100 and
+ beta * 100.
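+
+   For illustration only, the following sketch applies both scaling
+   rules to the default parameters listed in Appendix A (alpha = 1/8,
+   beta = 1 + 1/4, both in Hz); the variable names are invented for
+   the example:
+
+      #include <stdio.h>
+
+      int main(void) {
+          double alpha = 0.125, beta = 1.25;       /* defaults (Hz) */
+
+          /* Halving T_UPDATE: halve alpha, add alpha/4 to beta.    */
+          double alpha_fast = alpha / 2.0;             /* 0.0625    */
+          double beta_fast  = beta + alpha / 4.0;      /* 1.28125   */
+
+          /* QDELAY_REF cut from 15 ms to 150 us (100x smaller):
+           * scale both gains up by the same factor of 100.         */
+          double alpha_dc = alpha * 100.0;             /* 12.5      */
+          double beta_dc  = beta  * 100.0;             /* 125       */
+
+          printf("halved T_UPDATE: alpha=%.5f beta=%.5f\n",
+                 alpha_fast, beta_fast);
+          printf("data center    : alpha=%.1f beta=%.1f\n",
+                 alpha_dc, beta_dc);
+          return 0;
+      }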
+
+4.3. Latency Calculation
+
+ The PIE algorithm uses latency to calculate drop probability in one
+ of two ways:
+
+ * It estimates the current queuing latency using Little's law (see
+ Section 5.2 for details):
+
+ current_qdelay = queue_.byte_length()/dequeue_rate;
+
+ * It may use other techniques for calculating queuing latency, e.g.,
+ time-stamp the packets at enqueue, and use the timestamps to
+ calculate latency during dequeue.
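+
+   A compact sketch of both options follows; the helper names
+   (qdelay_from_rate, enqueue_time) are invented for the illustration
+   and are not defined by this document:
+
+      #include <stdio.h>
+
+      /* Option 1: Little's law, from queue length and dequeue rate. */
+      static double qdelay_from_rate(double queue_bytes,
+                                     double dequeue_rate) {
+          return queue_bytes / dequeue_rate;         /* seconds */
+      }
+
+      /* Option 2: per-packet timestamps taken at enqueue time.      */
+      struct pkt { double enqueue_time; };
+      static double qdelay_from_timestamp(const struct pkt *p,
+                                          double now) {
+          return now - p->enqueue_time;              /* seconds */
+      }
+
+      int main(void) {
+          struct pkt p = { .enqueue_time = 10.000 };
+          printf("Little's law: %.3f s\n",
+                 qdelay_from_rate(30000.0, 1.25e6));
+          printf("timestamped : %.3f s\n",
+                 qdelay_from_timestamp(&p, 10.024));
+          return 0;
+      }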
+
+
+
+
+
+Pan, et al. Experimental [Page 9]
+
+RFC 8033 PIE February 2017
+
+
+4.4. Burst Tolerance
+
+ PIE does not penalize short-term packet bursts as suggested in
+ [RFC7567]. PIE allows bursts of traffic that create finite-duration
+ events in which current queuing latency exceeds QDELAY_REF without
+ triggering packet drops. This document introduces a parameter called
+ "MAX_BURST"; MAX_BURST defines the burst duration that will be
+ protected. By default, the parameter SHOULD be set to 150
+ milliseconds. For simplicity, the PIE algorithm MAY effectively
+ round MAX_BURST up to an integer multiple of T_UPDATE.
+
+ To implement the burst tolerance function, two basic components of
+ PIE are involved: "random dropping" and "drop probability
+ calculation". The PIE algorithm does the following:
+
+ * In the "random dropping" block and upon packet arrival, PIE checks
+ the following:
+
+ Upon a packet enqueue:
+ if PIE->burst_allowance_ > 0
+ enqueue packet;
+ else
+ randomly drop a packet with a probability of
+ PIE->drop_prob_.
+
+ if (PIE->drop_prob_ == 0 and current_qdelay < QDELAY_REF/2 and
+ PIE->qdelay_old_ < QDELAY_REF/2)
+ PIE->burst_allowance_ = MAX_BURST;
+
+ * In the "drop probability calculation" block, PIE additionally
+ calculates:
+
+ PIE->burst_allowance_ = max(0,PIE->burst_allowance_ - T_UPDATE);
+
+ The burst allowance, noted by PIE->burst_allowance_, is initialized
+ to MAX_BURST. As long as PIE->burst_allowance_ is above zero, an
+ incoming packet will be enqueued, bypassing the random drop process.
+ During each update instance, the value of PIE->burst_allowance_ is
+ decremented by the update period, T_UPDATE, and is bottomed at 0.
+ When the congestion goes away -- defined here as PIE->drop_prob_
+ equals 0 and both the current and previous samples of estimated
+ latency are less than half of QDELAY_REF -- PIE->burst_allowance_ is
+ reset to MAX_BURST.
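+
+   As a small worked example using the default values, a MAX_BURST of
+   150 milliseconds is consumed over exactly ten T_UPDATE intervals of
+   15 milliseconds each:
+
+      #include <stdio.h>
+
+      /* Times in integer milliseconds to avoid floating-point drift. */
+      #define T_UPDATE_MS  15
+      #define MAX_BURST_MS 150
+
+      int main(void) {
+          int allowance = MAX_BURST_MS, steps = 0;
+          while (allowance > 0) {
+              allowance -= T_UPDATE_MS;
+              if (allowance < 0) allowance = 0;  /* bottoms out at 0 */
+              steps++;
+          }
+          printf("burst allowance gone after %d updates\n", steps);
+          return 0;
+      }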
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 10]
+
+RFC 8033 PIE February 2017
+
+
+5. Optional Design Elements of PIE
+
+ There are several enhancements that are added to further augment the
+ performance of the basic algorithm. For purposes of clarity, they
+ are included in this section.
+
+5.1. ECN Support
+
+ PIE MAY support ECN by marking (rather than dropping) ECN-capable
+ packets [ECN]. This document introduces an additional threshold
+ called "mark_ecnth", which acts as a safeguard: if the calculated
+ drop probability exceeds mark_ecnth, PIE reverts to packet-dropping
+ for ECN-capable packets. The variable mark_ecnth SHOULD be set to
+ 0.1 (10%).
+
+ * To support ECN, the "random drop with a probability of
+ PIE->drop_prob_" function in the "random dropping" block is
+ changed to the following:
+
+ * Upon a packet enqueue:
+
+ if rand() < PIE->drop_prob_:
+ if PIE->drop_prob_ < mark_ecnth && ecn_capable_packet == TRUE:
+ mark packet;
+ else
+ drop packet;
+
+5.2. Dequeue Rate Estimation
+
+ Using timestamps, a latency sample can only be obtained when a packet
+ reaches the head of a queue. When a quick response time is desired
+ or a direct latency sample is not available, one may obtain latency
+ through measuring the dequeue rate. The draining rate of a queue in
+ the network often varies either because other queues are sharing the
+ same link or because the link capacity fluctuates. Rate fluctuation
+ is particularly common in wireless networks. One may measure
+ directly at the dequeue operation. Short, non-persistent bursts of
+ packets result in empty queues from time to time; this would make the
+ measurement less accurate. PIE only measures latency when there is
+ sufficient data in the buffer, i.e., when the queue length is over a
+
+
+
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 11]
+
+RFC 8033 PIE February 2017
+
+
+ certain threshold (DQ_THRESHOLD). PIE measures how long it takes to
+ drain DQ_THRESHOLD packets. More specifically, the rate estimation
+ can be implemented as follows:
+
+ current_qdelay = queue_.byte_length() *
+ PIE->avg_dq_time_/DQ_THRESHOLD;
+
+   * Upon a packet dequeue:
+
+     if PIE->in_measurement_ == FALSE and queue_.byte_length() >=
+        DQ_THRESHOLD:
+       PIE->in_measurement_ = TRUE;
+       PIE->measurement_start_ = now;
+       PIE->dq_count_ = 0;
+
+     if PIE->in_measurement_ == TRUE:
+       PIE->dq_count_ = PIE->dq_count_ + deque_pkt_size;
+       //A sample is complete once DQ_THRESHOLD bytes have departed;
+       //the measurement then stops until the queue is long enough
+       //again (see also the deque() pseudocode in Appendix B)
+       if PIE->dq_count_ >= DQ_THRESHOLD then
+         weight = DQ_THRESHOLD/2^16
+         PIE->avg_dq_time_ = (now - PIE->measurement_start_) *
+                             weight + PIE->avg_dq_time_ *
+                             (1 - weight);
+         PIE->dq_count_ = 0;
+         PIE->in_measurement_ = FALSE;
+
+ The parameter PIE->dq_count_ represents the number of bytes departed
+ since the last measurement. Once PIE->dq_count_ is over
+ DQ_THRESHOLD, a measurement sample is obtained. It is recommended
+ that the threshold be set to 16 KB, assuming a typical packet size of
+ around 1 KB or 1.5 KB. This threshold would allow sufficient data to
+ obtain an average draining rate but would also be fast enough (< 64
+ KB) to reflect sudden changes in the draining rate. If DQ_THRESHOLD
+ is smaller than 64 KB, a small weight is used to smooth out the
+ dequeue time and obtain PIE->avg_dq_time_. The dequeue rate is
+ simply DQ_THRESHOLD divided by PIE->avg_dq_time_. This threshold is
+ not crucial for the system's stability. Please note that the update
+ interval for calculating the drop probability is different from the
+ rate measurement cycle. The drop probability calculation is done
+ periodically per Section 4.2, and it is done even when the algorithm
+ is not in a measurement cycle; in this case, the previously latched
+ value of PIE->avg_dq_time_ is used.
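+
+   The numbers below only illustrate the smoothing described above:
+   with DQ_THRESHOLD set to 16 KB (2^14 bytes), the averaging weight
+   is 2^14/2^16 = 1/4, and the dequeue rate follows as DQ_THRESHOLD
+   divided by the smoothed dequeue time (the 4 ms, 2 ms, and
+   30000-byte figures are assumed):
+
+      #include <stdio.h>
+
+      #define DQ_THRESHOLD (1 << 14)        /* 16 KB, as recommended */
+
+      int main(void) {
+          double weight = (double)DQ_THRESHOLD / (1 << 16);  /* 0.25 */
+
+          /* One cycle: draining 16 KB took 2 ms this time, and the
+           * running average so far was 4 ms.                        */
+          double avg_dq_time = 0.004, dq_time = 0.002;
+          avg_dq_time = dq_time * weight + avg_dq_time * (1 - weight);
+
+          double rate   = DQ_THRESHOLD / avg_dq_time;    /* bytes/s  */
+          double qdelay = 30000.0 * avg_dq_time / DQ_THRESHOLD;
+
+          printf("weight=%.2f avg_dq_time=%.4f s\n",
+                 weight, avg_dq_time);
+          printf("rate=%.0f B/s qdelay=%.4f s\n", rate, qdelay);
+          return 0;
+      }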
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 12]
+
+RFC 8033 PIE February 2017
+
+
+ Random Drop
+ / --------------
+ -------/ --------------------> | | | | | -------------->
+ /|\ | | | | | |
+ | | --------------
+ | | Queue Buffer
+ | | |
+ | | |Queue
+ | | |Length
+ | | |
+ | \|/ \|/
+ | ------------------------------
+ | | Dequeue Rate |
+ -----<-----| & Drop Probability |
+ | Calculation |
+ ------------------------------
+
+ Figure 2: The Enqueue-Based PIE Structure
+
+ In some platforms, enqueuing and dequeuing functions belong to
+ different modules that are independent of each other. In such
+ situations, a pure enqueue-based design can be developed. An
+ enqueue-based design is depicted in Figure 2. The dequeue rate is
+ deduced from the number of packets enqueued and the queue length.
+ The design is based on the following key observation: over a certain
+ time interval, the number of dequeued packets = the number of
+ enqueued packets minus the number of remaining packets in the queue.
+ In this design, everything can be triggered by packet arrival,
+ including the background update process. The design complexity here
+ is similar to the original design.
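+
+   A sketch of that observation follows; the counters and the
+   measurement window are assumptions made for the illustration and
+   are not defined by this document:
+
+      #include <stdio.h>
+
+      /* Enqueue-side estimate of the dequeue rate over one window:
+       * bytes dequeued = bytes enqueued - growth of the queue.      */
+      static double est_dequeue_rate(double bytes_in, double q_start,
+                                     double q_end, double window_sec) {
+          double bytes_out = bytes_in - (q_end - q_start);
+          return bytes_out / window_sec;
+      }
+
+      int main(void) {
+          /* 100 KB arrived in 20 ms while the queue grew from 10 KB
+           * to 40 KB, so 70 KB departed: 3.5 MB/s.                  */
+          printf("rate = %.0f B/s\n",
+                 est_dequeue_rate(100e3, 10e3, 40e3, 0.020));
+          return 0;
+      }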
+
+5.3. Setting PIE Active and Inactive
+
+ Traffic naturally fluctuates in a network. It would be preferable
+ not to unnecessarily drop packets due to a spurious uptick in queuing
+ latency. PIE has an optional feature of automatically becoming
+ active/inactive. To implement this feature, PIE may choose to only
+ become active (from inactive) when the buffer occupancy is over a
+ certain threshold, which may be set to 1/3 of the tail drop
+   threshold. PIE becomes inactive when congestion ends; i.e., when the
+   drop probability reaches 0 and the current and previous latency
+   samples are all below half of QDELAY_REF.
+
+ Ideally, PIE should become active/inactive based on latency.
+ However, calculating latency when PIE is inactive would introduce
+ unnecessary packet-processing overhead. Weighing the trade-offs,
+ we decided to compare against the tail drop threshold to keep things
+ simple.
+
+
+
+Pan, et al. Experimental [Page 13]
+
+RFC 8033 PIE February 2017
+
+
+ When PIE optionally becomes active/inactive, the burst protection
+ logic described in Section 4.4 is modified as follows:
+
+ * "Random dropping" block: PIE adds the following:
+
+ Upon packet arrival:
+
+ if PIE->active_ == FALSE && queue_length >= TAIL_DROP/3:
+ PIE->active_ = TRUE;
+ PIE->burst_allowance_ = MAX_BURST;
+
+ if PIE->burst_allowance_ > 0
+ enqueue packet;
+ else
+ randomly drop a packet with a probability of
+ PIE->drop_prob_.
+
+ if (PIE->drop_prob_ == 0 and current_qdelay < QDELAY_REF/2 and
+ PIE->qdelay_old_ < QDELAY_REF/2)
+ PIE->active_ = FALSE;
+ PIE->burst_allowance_ = MAX_BURST;
+
+ * "Drop probability calculation" block: PIE does the following:
+ if PIE->active_ == TRUE:
+ PIE->burst_allowance_ =
+ max(0,PIE->burst_allowance_ - T_UPDATE);
+
+5.4. Derandomization
+
+ Although PIE adopts random dropping to achieve latency control,
+ independent coin tosses could introduce outlier situations where
+ packets are dropped too close to each other or too far from each
+ other. This would cause the real drop percentage to temporarily
+ deviate from the intended value PIE->drop_prob_. In certain
+ scenarios, such as a small number of simultaneous TCP flows, these
+ deviations can cause significant deviations in link utilization and
+ queuing latency. PIE may use a derandomization mechanism to avoid
+ such situations. A parameter called "PIE->accu_prob_" is reset to 0
+ after a drop. Upon packet arrival, PIE->accu_prob_ is incremented by
+ the amount of drop probability, PIE->drop_prob_. If PIE->accu_prob_
+ is less than a low threshold, e.g., 0.85, the arriving packet is
+ enqueued; on the other hand, if PIE->accu_prob_ is more than a high
+ threshold, e.g., 8.5, and the queue is congested, the arrival packet
+ is forced to be dropped. A packet is only randomly dropped if
+ PIE->accu_prob_ falls between the two thresholds. Since
+ PIE->accu_prob_ is reset to 0 after a drop, another drop will not
+ happen until 0.85/PIE->drop_prob_ packets later. This avoids packets
+ being dropped too close to each other. In the other extreme case
+
+
+
+Pan, et al. Experimental [Page 14]
+
+RFC 8033 PIE February 2017
+
+
+ where 8.5/PIE->drop_prob_ packets have been enqueued without
+ incurring a drop, PIE would force a drop in order to prevent the
+ drops from being spaced too far apart. Further analysis can be found
+ in [RFC8034].
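+
+   A sketch of the derandomization logic follows, using the example
+   thresholds of 0.85 and 8.5; rand01() and the congested flag are
+   assumptions made for the illustration:
+
+      #include <stdio.h>
+      #include <stdlib.h>
+
+      #define LOW_TH  0.85
+      #define HIGH_TH 8.5
+
+      static double accu_prob = 0.0;          /* PIE->accu_prob_ */
+
+      static double rand01(void) { return (double)rand() / RAND_MAX; }
+
+      /* Returns 1 to drop the arriving packet, 0 to enqueue it. */
+      static int derandomized_drop(double drop_prob, int congested) {
+          accu_prob += drop_prob;
+          if (accu_prob < LOW_TH)         /* too soon after a drop  */
+              return 0;
+          if (accu_prob >= HIGH_TH && congested) {
+              accu_prob = 0.0;            /* overdue: force a drop  */
+              return 1;
+          }
+          if (rand01() < drop_prob) {
+              accu_prob = 0.0;            /* ordinary random drop   */
+              return 1;
+          }
+          return 0;
+      }
+
+      int main(void) {
+          /* With drop_prob = 0.05, no two drops can be closer than
+           * 0.85/0.05 = 17 packets apart.                           */
+          int gap = 0, min_gap = 1 << 30;
+          for (int i = 0; i < 100000; i++) {
+              gap++;
+              if (derandomized_drop(0.05, 1)) {
+                  if (gap < min_gap) min_gap = gap;
+                  gap = 0;
+              }
+          }
+          printf("smallest gap between drops: %d packets\n", min_gap);
+          return 0;
+      }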
+
+5.5. Cap Drop Adjustment
+
+ In the case of a single TCP flow, during the slow-start phase the
+ queue could quickly increase, which could result in a very rapid
+ increase in drop probability. In order to prevent an excessive
+ ramp-up that could negatively impact the throughput in this scenario,
+ PIE can cap the maximum drop probability increase in each step.
+
+ * "Drop probability calculation" block: PIE adds the following:
+
+ if (PIE->drop_prob_ >= 0.1 && p > 0.02) {
+ p = 0.02;
+ }
+
+6. Implementation Cost
+
+ PIE can be applied to existing hardware or software solutions. There
+ are three steps involved in PIE, as discussed in Section 4. Their
+ complexities are examined below.
+
+ Upon packet arrival, the algorithm simply drops a packet randomly,
+ based on the drop probability. This step is straightforward and
+ requires no packet header examination and manipulation. If the
+ implementation doesn't rely on packet timestamps for calculating
+ latency, PIE does not require extra memory. Furthermore, the input
+ side of a queue is typically under software control while the output
+   side of a queue is hardware based. Hence, a drop at enqueuing can be
+   readily retrofitted into existing hardware or software
+   implementations.
+
+ The drop probability calculation is done in the background, and it
+ occurs every T_UPDATE interval. Given modern high-speed links, this
+ period translates into once every tens, hundreds, or even thousands
+ of packets. Hence, the calculation occurs at a much slower time
+ scale than the packet-processing time -- at least an order of
+ magnitude slower. The calculation of drop probability involves
+ multiplications using alpha and beta. Since PIE's control law is
+ robust to minor changes in alpha and beta values, an implementation
+   MAY round these values to the closest multiples of 2 or 1/2 (e.g.,
+ alpha = 1/8, beta = 1 + 1/4) such that the multiplications can be
+ done using simple adds and shifts. As no complicated functions are
+ required, PIE can be easily implemented in both hardware and
+
+
+
+
+
+Pan, et al. Experimental [Page 15]
+
+RFC 8033 PIE February 2017
+
+
+ software. The state requirement is only three variables per queue:
+ burst_allowance_, PIE->drop_prob_, and PIE->qdelay_old_. Hence, the
+ memory overhead is small.
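+
+   As one possible way to realize the "adds and shifts" observation
+   (an illustration, not a requirement of this document), alpha = 1/8
+   becomes a right shift by 3, and beta = 1 + 1/4 becomes an add plus
+   a right shift by 2, applied to a fixed-point latency error:
+
+      #include <stdint.h>
+      #include <stdio.h>
+
+      int main(void) {
+          /* Latencies in microseconds; example values kept positive
+           * so the shifts are well defined.  The result would still
+           * be scaled per Section 4.2 and converted into the
+           * implementation's probability units.                     */
+          int64_t qdelay     = 19000;     /* current sample: 19 ms   */
+          int64_t qdelay_old = 16000;     /* previous sample: 16 ms  */
+          int64_t qdelay_ref = 15000;     /* QDELAY_REF: 15 ms       */
+
+          /* p = alpha*(qdelay - ref) + beta*(qdelay - qdelay_old)   */
+          int64_t e1 = qdelay - qdelay_ref;
+          int64_t e2 = qdelay - qdelay_old;
+          int64_t p  = (e1 >> 3) + e2 + (e2 >> 2);
+
+          printf("p (microsecond-scaled) = %lld\n", (long long)p);
+          return 0;
+      }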
+
+ If one chooses to implement the departure rate estimation, PIE uses a
+ counter to keep track of the number of bytes departed for the current
+ interval. This counter is incremented per packet departure. Every
+ T_UPDATE, PIE calculates latency using the departure rate, which can
+ be implemented using a single multiply operation. Note that many
+ network devices keep track of an interface's departure rate. In this
+ case, PIE might be able to reuse this information and simply skip the
+ third step of the algorithm; hence, it would incur no extra cost. If
+ a platform already leverages packet timestamps for other purposes,
+ PIE can make use of these packet timestamps for latency calculation
+ instead of estimating the departure rate.
+
+ Flow queuing can also be combined with PIE to provide isolation
+ between flows. In this case, it is preferable to have an independent
+ value of drop probability per queue. This allows each flow to
+ receive the most appropriate level of congestion signal and ensures
+ that sparse flows are protected from experiencing packet drops.
+ However, running the entire PIE algorithm independently on each queue
+ in order to calculate the drop probability may be overkill.
+ Furthermore, in the case where departure rate estimation is used to
+ predict queuing latency, it is not possible to calculate an accurate
+ per-queue departure rate upon which to implement the PIE drop
+ probability calculation. Instead, it has been proposed [DOCSIS-AQM]
+ that a single implementation of the PIE drop probability calculation
+ based on the overall latency estimate be used, followed by a
+ per-queue scaling of drop probability based on the ratio of
+ queue depth between the queue in question and the current largest
+ queue. This scaling is reasonably simple and has a couple of nice
+ properties:
+
+ * If a packet is arriving to an empty queue, it is given immunity
+ from packet drops altogether, regardless of the state of the other
+ queues.
+
+ * In the situation where only a single queue is in use, the
+ algorithm behaves exactly like the single-queue PIE algorithm.
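+
+   A sketch of that per-queue scaling follows; the exact formula used
+   in [DOCSIS-AQM] may differ, so this should be read as one plausible
+   realization that exhibits the two properties above:
+
+      #include <stdio.h>
+
+      /* Scale the single PIE drop probability for one flow queue by
+       * the ratio of its depth to the largest queue's depth.        */
+      static double per_queue_drop_prob(double pie_prob,
+                                        double queue_bytes,
+                                        double largest_bytes) {
+          if (queue_bytes == 0.0)
+              return 0.0;             /* empty queue: never dropped  */
+          if (largest_bytes <= queue_bytes)
+              return pie_prob;        /* largest (or only) queue     */
+          return pie_prob * (queue_bytes / largest_bytes);
+      }
+
+      int main(void) {
+          double p = 0.04;   /* drop prob from the one PIE instance  */
+          printf("empty queue : %.3f\n",
+                 per_queue_drop_prob(p, 0.0, 60000.0));
+          printf("small queue : %.3f\n",
+                 per_queue_drop_prob(p, 15000.0, 60000.0));
+          printf("largest     : %.3f\n",
+                 per_queue_drop_prob(p, 60000.0, 60000.0));
+          return 0;
+      }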
+
+ In summary, PIE is simple enough to be implemented in both software
+ and hardware.
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 16]
+
+RFC 8033 PIE February 2017
+
+
+7. Scope of Experimentation
+
+ The design of the PIE algorithm is presented in this document. The
+ PIE algorithm effectively controls the average queuing latency to a
+ target value. The following areas can be used for further study and
+ experimentation:
+
+ * Autotuning of target latency without losing utilization.
+
+ * Autotuning for the average round-trip time of traffic.
+
+ * The proper threshold to transition smoothly between ECN marking
+ and dropping.
+
+ * The enhancements described in Section 5, which can be used in
+ experiments to see if they would be of more value in the real
+ world. If so, they will be incorporated into the basic PIE
+ algorithm.
+
+ * The PIE design, which is separated into the data path and the
+ control path. The control path can be implemented in software.
+ Field tests of other control laws can be performed to experiment
+ with further improvements to PIE's performance.
+
+   Although not all network nodes can be changed at once to adopt
+   latency-based AQM schemes such as PIE, a gradual adoption would
+ eventually lead to end-to-end low-latency service for all
+ applications.
+
+8. Incremental Deployment
+
+ From testbed experiments and large-scale simulations of PIE so far,
+ PIE has been shown to be effective across a diverse range of network
+ scenarios. There is no indication that PIE would be harmful to
+ deploy.
+
+ The PIE scheme can be independently deployed and managed without a
+ need for interoperability between different network devices. In
+ addition, any individual buffer queue can be incrementally upgraded
+ to PIE, as it can coexist with existing AQM schemes such as
+ Weighted RED (WRED).
+
+ PIE is intended to be self-configuring. Users should not need to
+ configure any design parameters. Upon installation, the two
+ user-configurable parameters -- QDELAY_REF and MAX_BURST -- will be
+ defaulted to 15 milliseconds and 150 milliseconds for non-data-center
+ network devices and to 15 microseconds and 150 microseconds for
+ data-center switches, respectively.
+
+
+
+Pan, et al. Experimental [Page 17]
+
+RFC 8033 PIE February 2017
+
+
+ Since the data path of the algorithm needs only a simple coin toss
+ and the control-path calculation happens in a much slower time scale,
+ we don't foresee any scaling issues associated with the algorithm as
+ the link speed scales up.
+
+9. Security Considerations
+
+ This document describes PIE, an active queue management algorithm
+ based on implementations in different products. The PIE algorithm
+ introduces no specific security exposures.
+
+10. References
+
+10.1. Normative References
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119,
+ DOI 10.17487/RFC2119, March 1997,
+ <http://www.rfc-editor.org/info/rfc2119>.
+
+10.2. Informative References
+
+ [RFC970] Nagle, J., "On Packet Switches With Infinite Storage",
+ RFC 970, DOI 10.17487/RFC0970, December 1985,
+ <http://www.rfc-editor.org/info/rfc970>.
+
+ [RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
+ S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G.,
+ Partridge, C., Peterson, L., Ramakrishnan, K., Shenker,
+ S., Wroclawski, J., and L. Zhang, "Recommendations on
+ Queue Management and Congestion Avoidance in the
+ Internet", RFC 2309, DOI 10.17487/RFC2309, April 1998,
+ <http://www.rfc-editor.org/info/rfc2309>.
+
+ [RFC7567] Baker, F., Ed., and G. Fairhurst, Ed., "IETF
+ Recommendations Regarding Active Queue Management",
+ BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015,
+ <http://www.rfc-editor.org/info/rfc7567>.
+
+ [RFC7806] Baker, F. and R. Pan, "On Queuing, Marking, and Dropping",
+ RFC 7806, DOI 10.17487/RFC7806, April 2016,
+ <http://www.rfc-editor.org/info/rfc7806>.
+
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 18]
+
+RFC 8033 PIE February 2017
+
+
+ [RFC8034] White, G. and R. Pan, "Active Queue Management (AQM) Based
+ on Proportional Integral Controller Enhanced (PIE) for
+ Data-Over-Cable Service Interface Specifications (DOCSIS)
+ Cable Modems", RFC 8034, DOI 10.17487/RFC8034,
+ February 2017, <http://www.rfc-editor.org/info/rfc8034>.
+
+ [CBQ] Cisco, "Class-Based Weighted Fair Queueing",
+ <http://www.cisco.com/en/US/docs/ios/12_0t/12_0t5/
+ feature/guide/cbwfq.html>.
+
+ [CoDel] Nichols, K. and V. Jacobson, "Controlling Queue Delay",
+ Communications of the ACM, Volume 55, Issue 7, pp. 42-50,
+ DOI 10.1145/2209249.2209264, July 2012.
+
+ [DOCSIS_3.1]
+ CableLabs, "MAC and Upper Layer Protocols Interface
+ Specification", DOCSIS 3.1, January 2017,
+ <https://apps.cablelabs.com/specification/
+ CM-SP-MULPIv3.1>.
+
+ [DOCSIS-AQM]
+ White, G., "Active Queue Management in DOCSIS 3.x Cable
+ Modems", May 2014, <http://www.cablelabs.com/wp-content/
+ uploads/2014/06/DOCSIS-AQM_May2014.pdf>.
+
+ [ECN] Briscoe, B., Kaippallimalil, J., and P. Thaler,
+ "Guidelines for Adding Congestion Notification to
+ Protocols that Encapsulate IP", Work in Progress,
+ draft-ietf-tsvwg-ecn-encap-guidelines-07, July 2016.
+
+ [HPSR-PIE] Pan, R., Natarajan, P., Piglione, C., Prabhu, M.S.,
+ Subramanian, V., Baker, F., and B. Ver Steeg, "PIE: A
+ lightweight control scheme to address the bufferbloat
+ problem", IEEE HPSR, DOI 10.1109/HPSR.2013.6602305, 2013,
+ <https://www.researchgate.net/publication/
+ 261134127_PIE_A_lightweight_control_scheme_to_address_
+ the_bufferbloat_problem?origin=mail>.
+
+ [PI] Hollot, C.V., Misra, V., Towsley, D., and W. Gong, "On
+ designing improved controllers for AQM routers supporting
+ TCP flows", INFOCOM 2001, DOI 10.1109/INFCOM.2001.916670,
+ April 2001.
+
+ [QCN] IEEE, "IEEE Standard for Local and Metropolitan Area
+ Networks--Virtual Bridged Local Area Networks -
+ Amendment: 10: Congestion Notification", IEEE 802.1Qau,
+ <http://www.ieee802.org/1/pages/802.1au.html>.
+
+
+
+
+Pan, et al. Experimental [Page 19]
+
+RFC 8033 PIE February 2017
+
+
+ [RED] Floyd, S. and V. Jacobson, "Random Early Detection (RED)
+ Gateways for Congestion Avoidance", IEEE/ACM Transactions
+ on Networking, Volume 1, Issue 4, DOI 10.1109/90.251892,
+ August 1993.
+
+ [TCP-Models]
+ Misra, V., Gong, W., and D. Towsley, "Fluid-based analysis
+ of a network of AQM routers supporting TCP flows with an
+ application to RED", SIGCOMM 2000, Volume 30, Issue 4,
+ pp. 151-160, DOI 10.1145/347057.347421, October 2000.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 20]
+
+RFC 8033 PIE February 2017
+
+
+Appendix A. The Basic PIE Pseudocode
+
+ Configurable parameters:
+ - QDELAY_REF. AQM Latency Target (default: 15 milliseconds)
+ - MAX_BURST. AQM Max Burst Allowance (default: 150 milliseconds)
+
+ Internal parameters:
+ - Weights in the drop probability calculation (1/s):
+ alpha (default: 1/8), beta (default: 1 + 1/4)
+ - T_UPDATE: a period to calculate drop probability
+ (default: 15 milliseconds)
+
+ Table that stores status variables (ending with "_"):
+ - burst_allowance_: current burst allowance
+ - drop_prob_: The current packet drop probability. Reset to 0
+ - qdelay_old_: The previous queue delay. Reset to 0
+
+ Public/system functions:
+ - queue_. Holds the pending packets
+ - drop(packet). Drops/discards a packet
+ - now(). Returns the current time
+ - random(). Returns a uniform r.v. in the range 0 ~ 1
+ - queue_.byte_length(). Returns current queue_ length in bytes
+ - queue_.enque(packet). Adds packet to tail of queue_
+ - queue_.deque(). Returns the packet from the head of queue_
+ - packet.size(). Returns size of packet
+ - packet.timestamp_delay(). Returns timestamped packet latency
+
+ ============================
+
+ //Called on each packet arrival
+ enque(Packet packet) {
+ if (PIE->drop_prob_ == 0 && current_qdelay < QDELAY_REF/2
+ && PIE->qdelay_old_ < QDELAY_REF/2) {
+ PIE->burst_allowance_ = MAX_BURST;
+ }
+ if (PIE->burst_allowance_ == 0 && drop_early() == DROP) {
+ drop(packet);
+ } else {
+ queue_.enque(packet);
+ }
+ }
+
+ ============================
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 21]
+
+RFC 8033 PIE February 2017
+
+
+ drop_early() {
+
+ //Safeguard PIE to be work conserving
+ if ( (PIE->qdelay_old_ < QDELAY_REF/2 && PIE->drop_prob_ < 0.2)
+ || (queue_.byte_length() <= 2 * MEAN_PKTSIZE) ) {
+ return ENQUE;
+ }
+
+ double u = random();
+ if (u < PIE->drop_prob_) {
+ return DROP;
+ } else {
+
+ return ENQUE;
+ }
+ }
+
+ ============================
+
+ //We choose the timestamp option of obtaining latency for clarity
+ //Rate estimation method can be found in the extended PIE pseudocode
+
+ deque(Packet packet) {
+
+ current_qdelay = packet.timestamp_delay();
+ }
+
+ ============================
+
+ //Update periodically, T_UPDATE = 15 milliseconds
+
+ calculate_drop_prob() {
+
+ //Can be implemented using integer multiply
+
+ p = alpha * (current_qdelay - QDELAY_REF) + \
+ beta * (current_qdelay - PIE->qdelay_old_);
+
+ if (PIE->drop_prob_ < 0.000001) {
+ p /= 2048;
+ } else if (PIE->drop_prob_ < 0.00001) {
+ p /= 512;
+ } else if (PIE->drop_prob_ < 0.0001) {
+ p /= 128;
+ } else if (PIE->drop_prob_ < 0.001) {
+ p /= 32;
+ } else if (PIE->drop_prob_ < 0.01) {
+ p /= 8;
+
+
+
+Pan, et al. Experimental [Page 22]
+
+RFC 8033 PIE February 2017
+
+
+ } else if (PIE->drop_prob_ < 0.1) {
+ p /= 2;
+ } else {
+ p = p;
+ }
+
+ PIE->drop_prob_ += p;
+
+ //Exponentially decay drop prob when congestion goes away
+ if (current_qdelay == 0 && PIE->qdelay_old_ == 0) {
+ PIE->drop_prob_ *= 0.98; //1 - 1/64 is
+ //sufficient
+ }
+
+ //Bound drop probability
+ if (PIE->drop_prob_ < 0)
+ PIE->drop_prob_ = 0.0
+ if (PIE->drop_prob_ > 1)
+ PIE->drop_prob_ = 1.0
+
+ PIE->qdelay_old_ = current_qdelay;
+
+ PIE->burst_allowance_ =
+ max(0,PIE->burst_allowance_ - T_UPDATE);
+   }
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 23]
+
+RFC 8033 PIE February 2017
+
+
+Appendix B. Pseudocode for PIE with Optional Enhancement
+
+ Configurable parameters:
+ - QDELAY_REF. AQM Latency Target (default: 15 milliseconds)
+ - MAX_BURST. AQM Max Burst Allowance (default: 150 milliseconds)
+ - MAX_ECNTH. AQM Max ECN Marking Threshold (default: 10%)
+
+ Internal parameters:
+ - Weights in the drop probability calculation (1/s):
+ alpha (default: 1/8), beta (default: 1 + 1/4)
+ - DQ_THRESHOLD: (in bytes, default: 2^14 (in a power of 2) )
+ - T_UPDATE: a period to calculate drop probability
+ (default: 15 milliseconds)
+ - TAIL_DROP: the tail drop threshold (max allowed queue depth)
+ for the queue
+
+ Table that stores status variables (ending with "_"):
+ - active_: INACTIVE/ACTIVE
+ - burst_allowance_: current burst allowance
+ - drop_prob_: The current packet drop probability. Reset to 0
+ - accu_prob_: Accumulated drop probability. Reset to 0
+ - qdelay_old_: The previous queue delay estimate. Reset to 0
+ - last_timestamp_: Timestamp of previous status update
+ - dq_count_, measurement_start_, in_measurement_, avg_dq_time_.
+ Variables for measuring average dequeue rate
+
+ Public/system functions:
+ - queue_. Holds the pending packets
+ - drop(packet). Drops/discards a packet
+ - mark(packet). Marks ECN for a packet
+ - now(). Returns the current time
+ - random(). Returns a uniform r.v. in the range 0 ~ 1
+ - queue_.byte_length(). Returns current queue_ length in bytes
+ - queue_.enque(packet). Adds packet to tail of queue_
+ - queue_.deque(). Returns the packet from the head of queue_
+ - packet.size(). Returns size of packet
+ - packet.ecn(). Returns whether packet is ECN capable or not
+
+ ============================
+
+
+
+
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 24]
+
+RFC 8033 PIE February 2017
+
+
+ //Called on each packet arrival
+ enque(Packet packet) {
+ if (queue_.byte_length() + packet.size() > TAIL_DROP) {
+ drop(packet);
+ PIE->accu_prob_ = 0;
+ } else if (PIE->active_ == TRUE && drop_early() == DROP
+ && PIE->burst_allowance_ == 0) {
+ if (PIE->drop_prob_ < MAX_ECNTH && packet.ecn() ==
+ TRUE)
+ mark(packet);
+ else
+ drop(packet);
+ PIE->accu_prob_ = 0;
+ } else {
+ queue_.enque(packet);
+ }
+
+ //If the queue is over a certain threshold, turn on PIE
+ if (PIE->active_ == INACTIVE
+ && queue_.byte_length() >= TAIL_DROP/3) {
+ PIE->active_ = ACTIVE;
+ PIE->qdelay_old_ = 0;
+ PIE->drop_prob_ = 0;
+ PIE->in_measurement_ = TRUE;
+ PIE->dq_count_ = 0;
+ PIE->avg_dq_time_ = 0;
+ PIE->last_timestamp_ = now;
+ PIE->burst_allowance_ = MAX_BURST;
+ PIE->accu_prob_ = 0;
+ PIE->measurement_start_ = now;
+ }
+
+ //If the queue has been idle for a while, turn off PIE
+ //Reset counters when accessing the queue after some idle
+ //period if PIE was active before
+ if ( PIE->drop_prob_ == 0 && PIE->qdelay_old_ == 0
+ && current_qdelay == 0) {
+ PIE->active_ = INACTIVE;
+ PIE->in_measurement_ = FALSE;
+ }
+
+ }
+
+ ============================
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 25]
+
+RFC 8033 PIE February 2017
+
+
+ drop_early() {
+
+ //PIE is active but the queue is not congested: return ENQUE
+ if ( (PIE->qdelay_old_ < QDELAY_REF/2 && PIE->drop_prob_ < 0.2)
+ || (queue_.byte_length() <= 2 * MEAN_PKTSIZE) ) {
+ return ENQUE;
+ }
+
+
+ if (PIE->drop_prob_ == 0) {
+ PIE->accu_prob_ = 0;
+ }
+
+ //For practical reasons, drop probability can be further scaled
+ //according to packet size, but one needs to set a bound to
+ //avoid unnecessary bias
+
+ //Random drop
+ PIE->accu_prob_ += PIE->drop_prob_;
+ if (PIE->accu_prob_ < 0.85)
+ return ENQUE;
+ if (PIE->accu_prob_ >= 8.5)
+ return DROP;
+ double u = random();
+ if (u < PIE->drop_prob_) {
+ PIE->accu_prob_ = 0;
+ return DROP;
+ } else {
+ return ENQUE;
+ }
+ }
+
+ ============================
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 26]
+
+RFC 8033 PIE February 2017
+
+
+ //Update periodically, T_UPDATE = 15 milliseconds
+ calculate_drop_prob() {
+ if ( (now - PIE->last_timestamp_) >= T_UPDATE &&
+ PIE->active_ == ACTIVE) {
+
+ //Can be implemented using integer multiply
+ //DQ_THRESHOLD is power of 2 value
+ current_qdelay = queue_.byte_length() *
+ PIE->avg_dq_time_/DQ_THRESHOLD;
+
+ p = alpha * (current_qdelay - QDELAY_REF) + \
+ beta * (current_qdelay - PIE->qdelay_old_);
+
+ if (PIE->drop_prob_ < 0.000001) {
+ p /= 2048;
+ } else if (PIE->drop_prob_ < 0.00001) {
+ p /= 512;
+ } else if (PIE->drop_prob_ < 0.0001) {
+ p /= 128;
+ } else if (PIE->drop_prob_ < 0.001) {
+ p /= 32;
+ } else if (PIE->drop_prob_ < 0.01) {
+ p /= 8;
+ } else if (PIE->drop_prob_ < 0.1) {
+ p /= 2;
+ } else {
+ p = p;
+ }
+
+ if (PIE->drop_prob_ >= 0.1 && p > 0.02) {
+ p = 0.02;
+ }
+ PIE->drop_prob_ += p;
+
+ //Exponentially decay drop prob when congestion goes away
+ if (current_qdelay < QDELAY_REF/2 && PIE->qdelay_old_ <
+ QDELAY_REF/2) {
+ PIE->drop_prob_ *= 0.98; //1 - 1/64 is
+ //sufficient
+ }
+
+
+
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 27]
+
+RFC 8033 PIE February 2017
+
+
+ //Bound drop probability
+ if (PIE->drop_prob_ < 0)
+ PIE->drop_prob_ = 0
+ if (PIE->drop_prob_ > 1)
+ PIE->drop_prob_ = 1
+
+ PIE->qdelay_old_ = current_qdelay;
+ PIE->last_timestamp_ = now;
+ PIE->burst_allowance_ = max(0,PIE->burst_allowance_ -
+ T_UPDATE);
+ }
+ }
+
+ ============================
+
+ //Called on each packet departure
+ deque(Packet packet) {
+
+ //Dequeue rate estimation
+ if (PIE->in_measurement_ == TRUE) {
+ PIE->dq_count_ = packet.size() + PIE->dq_count_;
+ //Start a new measurement cycle if we have enough packets
+ if ( PIE->dq_count_ >= DQ_THRESHOLD) {
+ dq_time = now - PIE->measurement_start_;
+ if (PIE->avg_dq_time_ == 0) {
+ PIE->avg_dq_time_ = dq_time;
+ } else {
+ weight = DQ_THRESHOLD/2^16
+ PIE->avg_dq_time_ = dq_time * weight +
+ PIE->avg_dq_time_ * (1 - weight);
+ }
+ PIE->in_measurement_ = FALSE;
+ }
+ }
+
+ //Start a measurement if we have enough data in the queue
+ if (queue_.byte_length() >= DQ_THRESHOLD &&
+ PIE->in_measurement_ == FALSE) {
+ PIE->in_measurement_ = TRUE;
+ PIE->measurement_start_ = now;
+ PIE->dq_count_ = 0;
+ }
+ }
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 28]
+
+RFC 8033 PIE February 2017
+
+
+Contributors
+
+ Bill Ver Steeg
+ Comcast Cable
+ Email: William_VerSteeg@comcast.com
+
+ Mythili Prabhu*
+ Akamai Technologies
+ 3355 Scott Blvd.
+ Santa Clara, CA 95054
+ United States of America
+ Email: mythili@akamai.com
+
+ Chiara Piglione*
+ Broadcom Corporation
+ 3151 Zanker Road
+ San Jose, CA 95134
+ United States of America
+ Email: chiara@broadcom.com
+
+ Vijay Subramanian*
+ PLUMgrid, Inc.
+ 350 Oakmead Parkway
+ Suite 250
+ Sunnyvale, CA 94085
+ United States of America
+ Email: vns@plumgrid.com
+ * Formerly at Cisco Systems
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 29]
+
+RFC 8033 PIE February 2017
+
+
+Authors' Addresses
+
+ Rong Pan
+ Cisco Systems
+ 3625 Cisco Way
+ San Jose, CA 95134
+ United States of America
+
+ Email: ropan@cisco.com
+
+
+ Preethi Natarajan
+ Cisco Systems
+ 725 Alder Drive
+ Milpitas, CA 95035
+ United States of America
+
+ Email: prenatar@cisco.com
+
+
+ Fred Baker
+ Santa Barbara, CA 93117
+ United States of America
+
+ Email: FredBaker.IETF@gmail.com
+
+
+ Greg White
+ CableLabs
+ 858 Coal Creek Circle
+ Louisville, CO 80027
+ United States of America
+
+ Email: g.white@cablelabs.com
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Pan, et al. Experimental [Page 30]
+