Diffstat (limited to 'doc/rfc/rfc5129.txt')
-rw-r--r-- doc/rfc/rfc5129.txt 1179
1 files changed, 1179 insertions, 0 deletions
diff --git a/doc/rfc/rfc5129.txt b/doc/rfc/rfc5129.txt
new file mode 100644
index 0000000..2f599d7
--- /dev/null
+++ b/doc/rfc/rfc5129.txt
@@ -0,0 +1,1179 @@
+
+
+
+
+
+
+Network Working Group                                           B. Davie
+Request for Comments: 5129                           Cisco Systems, Inc.
+Category: Standards Track                                     B. Briscoe
+                                                                  J. Tay
+                                                             BT Research
+                                                            January 2008
+
+
+ Explicit Congestion Marking in MPLS
+
+Status of This Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Abstract
+
+ RFC 3270 defines how to support the Diffserv architecture in MPLS
+ networks, including how to encode Diffserv Code Points (DSCPs) in an
+ MPLS header. DSCPs may be encoded in the EXP field, while other uses
+ of that field are not precluded. RFC 3270 makes no statement about
+ how Explicit Congestion Notification (ECN) marking might be encoded
+ in the MPLS header. This document defines how an operator might
+ define some of the EXP codepoints for explicit congestion
+ notification, without precluding other uses.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Davie, et al. Standards Track [Page 1]
+
+RFC 5129 ECN for MPLS January 2008
+
+
+Table of Contents
+
+ 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
+ 1.1. Background . . . . . . . . . . . . . . . . . . . . . . . . 3
+ 1.2. Intent . . . . . . . . . . . . . . . . . . . . . . . . . . 4
+ 1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 4
+ 2. Use of MPLS EXP Field for ECN . . . . . . . . . . . . . . . . 5
+ 3. Per-Domain ECT Checking . . . . . . . . . . . . . . . . . . . 7
+ 4. ECN-Enabled MPLS Domain . . . . . . . . . . . . . . . . . . . 8
+ 4.1. Pushing (Adding) One or More Labels to an IP Packet . . . 8
+ 4.2. Pushing One or More Labels onto an MPLS Labeled Packet . . 8
+ 4.3. Congestion Experienced in an Interior MPLS Node . . . . . 8
+ 4.4. Crossing a Diffserv Domain Boundary . . . . . . . . . . . 8
+ 4.5. Popping an MPLS Label (Not the End of the Stack) . . . . . 9
+ 4.6. Popping the Last MPLS Label in the Stack . . . . . . . . . 9
+ 4.7. Diffserv Tunneling Models . . . . . . . . . . . . . . . . 10
+ 5. ECN-Disabled MPLS Domain . . . . . . . . . . . . . . . . . . . 10
+ 6. The Use of More Codepoints with E-LSPs and L-LSPs . . . . . . 10
+ 7. Relationship to Tunnel Behavior in RFC 3168 . . . . . . . . . 11
+ 8. Deployment Considerations . . . . . . . . . . . . . . . . . . 11
+ 8.1. Marking Non-ECN-Capable Packets . . . . . . . . . . . . . 11
+ 8.2. Non-ECN-Capable Routers in an MPLS Domain . . . . . . . . 12
+ 9. Example Uses . . . . . . . . . . . . . . . . . . . . . . . . . 13
+ 9.1. RFC 3168-Style ECN . . . . . . . . . . . . . . . . . . . . 13
+ 9.2. ECN Co-Existence with Diffserv E-LSPs . . . . . . . . . . 13
+ 9.3. Congestion-Feedback-Based Traffic Engineering . . . . . . 14
+ 9.4. PCN Flow Admission Control and Flow Termination . . . . . 14
+ 10. Security Considerations . . . . . . . . . . . . . . . . . . . 14
+ 11. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 15
+ Appendix A. Extension to Pre-Congestion Notification . . . . . . 16
+ A.1. Label Push onto IP Packet . . . . . . . . . . . . . . . . . 16
+ A.2. Pushing Additional MPLS Labels . . . . . . . . . . . . . . 16
+ A.3. Admission Control or Flow Termination Marking Inside
+ MPLS Domain . . . . . . . . . . . . . . . . . . . . . . . . 17
+ A.4. Popping an MPLS Label (Not End of Stack) . . . . . . . . . 17
+ A.5. Popping the Last MPLS Label to Expose IP Header . . . . . . 17
+ Normative References . . . . . . . . . . . . . . . . . . . . . . . 18
+ Informative References . . . . . . . . . . . . . . . . . . . . . . 18
+
+1. Introduction
+
+1.1. Background
+
+ [RFC3168] defines Explicit Congestion Notification (ECN) for IP. The
+ primary purpose of ECN is to allow congestion to be signalled without
+ dropping packets.
+
+ [RFC3270] defines how to support the Diffserv architecture in MPLS
+ networks, including how to encode Diffserv Code Points (DSCPs) in an
+ MPLS header. DSCPs may be encoded in the EXP field, while other uses
+ of that field are not precluded. RFC 3270 makes no statement about
+ how Explicit Congestion Notification (ECN) marking might be encoded
+ in the MPLS header.
+
+ This document defines how an operator might define some of the EXP
+ codepoints for explicit congestion notification, without precluding
+ other uses. In parallel to the activity defining the addition of ECN
+ to IP [RFC3168], two proposals were made to add ECN to MPLS
+ [Floyd][Shayman]. These proposals, however, fell by the wayside.
+ With ECN for IP now being a proposed standard, and developing
+ interest in using pre-congestion notification (PCN) for admission
+ control and flow termination [PCN], there is consequent interest in
+ being able to support ECN across IP networks consisting of MPLS-
+ enabled domains. Therefore, it is necessary to specify the protocol
+ for including ECN in the MPLS shim header and the protocol behavior
+ of edge MPLS nodes.
+
+ We note that in [RFC3168], there are four codepoints used for ECN
+ marking, which are encoded using two bits of the IP header. The MPLS
+ EXP field is the logical place to encode ECN codepoints, but with
+ only 3 bits (8 codepoints) available, and with the same field being
+ used to convey DSCP information as well, there is a clear incentive
+ to conserve the number of codepoints consumed for ECN purposes.
+ Efficient use of the EXP field has been a focus of prior documents
+ [Floyd] [Shayman], and we draw on those efforts in this document as
+ well.
+
+ We also note that [RFC3168] defines default usage of the ECN field,
+ but it allows for the possibility that some Diffserv Per Hop
+ Behaviors (PHBs) might include different specifications on how the
+ ECN field is to be used. This document seeks to preserve that
+ capability.
+
+1.2. Intent
+
+ Our intent is to specify how the MPLS shim header [RFC3032] should
+ denote ECN marking and how MPLS nodes should understand whether the
+ transport for a packet will be ECN capable. We offer this as a
+ building block, from which to build different congestion-notification
+ systems. We do not intend to specify how the resulting congestion
+ notification is fed back to an upstream node that can mitigate
+ congestion. For instance, unlike [Shayman], we do not specify edge-
+ to-edge MPLS domain feedback, but we also do not preclude it.
+ Nonetheless, we do specify how the egress node of an MPLS domain
+ should copy congestion notification from the MPLS shim into the
+ encapsulated IP header if the ECN is to be carried onward towards the
+ IP receiver; but we do *not* mandate that MPLS congestion
+ notification must be copied into the IP header for onward
+ transmission. This document aims to be generic for any use of
+ congestion notification in MPLS. Support of [RFC3168] is our primary
+ motivation; some additional potential applications to illustrate the
+ flexibility of our approach are described in Section 9. In
+ particular, we aim to support possible future schemes that may use
+ more than one level of congestion marking.
+
+1.3. Terminology
+
+ This document draws freely on the terminology of ECN [RFC3168] and
+ MPLS [RFC3031]. For ease of reference, we have included some
+ definitions here, but refer the reader to the references above for
+ complete specifications of the relevant technologies:
+
+ o CE: Congestion Experienced. One of the states with which a packet
+ may be marked in a network supporting ECN. A packet is marked in
+ this state by an ECN-capable router to indicate that this router
+ was experiencing congestion at the time the packet arrived.
+
+ o ECT: ECN-capable Transport. One of the ECN states that a packet
+ may be in when it is sent by an end system. An end system marks a
+ packet with an ECT codepoint to indicate that the endpoints of the
+ transport protocol are ECN-capable. A router may not mark a
+ packet as CE unless the packet was marked ECT when it arrived.
+
+ o Not-ECT: Not ECN-capable transport. An end system marks a packet
+ with this codepoint to indicate that the endpoints of the
+ transport protocol are not ECN-capable. A congested router cannot
+ mark such packets as CE, and thus it can only drop them to
+ indicate congestion.
+
+ o EXP field. A 3-bit field in the MPLS label header [RFC3032] that
+ may be used to convey Diffserv information (and is also used in
+ this document to carry ECN information).
+
+ o PHP. Penultimate Hop Popping. An MPLS operation in which the
+ penultimate Label Switching Router (LSR) on a Label Switched Path
+ (LSP) removes the top label from the packet before forwarding the
+ packet to the final LSR on the LSP.
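+
+ As an illustration of where the EXP field sits, the following Python
+ sketch packs and unpacks a 32-bit label stack entry using the field
+ widths given in [RFC3032] (20-bit label, 3-bit EXP, 1-bit
+ bottom-of-stack flag, 8-bit TTL). The function names are
+ hypothetical and are not part of this specification.
+
```python
import struct


def build_label_stack_entry(label: int, exp: int, s: int, ttl: int) -> bytes:
    """Pack one MPLS label stack entry (field widths per RFC 3032)."""
    if not (0 <= label < 2**20 and 0 <= exp < 8 and s in (0, 1) and 0 <= ttl < 256):
        raise ValueError("field value out of range")
    word = (label << 12) | (exp << 9) | (s << 8) | ttl
    return struct.pack("!I", word)  # network byte order, 4 bytes


def parse_label_stack_entry(entry: bytes) -> dict:
    """Split one 4-byte label stack entry into its fields."""
    if len(entry) != 4:
        raise ValueError("a label stack entry is exactly 4 bytes")
    (word,) = struct.unpack("!I", entry)
    return {
        "label": word >> 12,        # top 20 bits
        "exp": (word >> 9) & 0x7,   # 3-bit EXP field (carries PHB and ECN state)
        "s": (word >> 8) & 0x1,     # bottom-of-stack flag
        "ttl": word & 0xFF,         # low 8 bits
    }
```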
+
+ Requirements Language
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in RFC 2119 [RFC2119].
+
+2. Use of MPLS EXP Field for ECN
+
+ We propose that LSRs configured for explicit congestion notification
+ should use the EXP field in the MPLS shim header. However, [RFC3270]
+ already defines use of codepoints in the EXP field for differentiated
+ services. Although it does not preclude other compatible uses of the
+ EXP field, this clearly seems to limit the space available for ECN,
+ given the field is only 3 bits (8 codepoints).
+
+ [RFC3270] defines two possible approaches for requesting
+ differentiated service treatment from an LSR:
+
+ o In the EXP-Inferred-PSC LSP (E-LSP) approach, different codepoints
+ of the EXP field in the MPLS shim header are used to indicate the
+ packet's per hop behavior (PHB).
+
+ o In the Label-Only-Inferred-PSC LSP (L-LSP) approach, an MPLS label
+ is assigned for each PHB scheduling class (PSC, as defined in
+ [RFC3260]), so that an LSR determines both its forwarding and its
+ scheduling behavior from the label.
+
+ If an MPLS domain uses the L-LSP approach, there is likely to be
+ space in the EXP field for ECN codepoint(s). Where the E-LSP
+ approach is used, codepoint space in the EXP field is likely to be
+ scarce. This document focuses on interworking ECN marking with the
+ E-LSP approach, as it is the tougher problem. Consequently, the same
+ approach can also be applied with L-LSPs.
+
+ We recommend that explicit congestion notification in MPLS should use
+ codepoints instead of bits in the EXP field. Since not every PHB
+ will necessarily require an associated ECN codepoint, it would be
+ wasteful to assign a dedicated bit for ECN. (There may also be cases
+ where a given PHB might need more than one ECN-like codepoint; see
+ Section 9.4 for an example).
+
+ For each PHB that uses ECN marking, we assume one EXP codepoint will
+ be defined as not congestion marked (Not-CM), and at least one other
+ codepoint will be defined as congestion marked (CM). Therefore, each
+ PHB that uses ECN marking will consume at least two EXP codepoints,
+ but PHBs that do not use ECN marking will only consume one.
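+
+ The codepoint budget implied by this rule can be sketched as
+ follows. This is only an illustrative calculation; the function
+ names and the way PHBs are described (a mapping from a PHB name to
+ its number of CM codepoints) are assumptions made for the example.
+
```python
EXP_CODEPOINTS = 8  # the EXP field is 3 bits wide


def exp_codepoints_needed(cm_levels_per_phb: dict) -> int:
    """Count the EXP codepoints consumed by a set of PHBs.

    cm_levels_per_phb maps a (hypothetical) PHB name to the number of
    congestion-marked (CM) codepoints it uses: 0 for a PHB without ECN
    marking, 1 or more for a PHB with ECN marking. Every PHB also needs
    one codepoint for its unmarked (Not-CM) state.
    """
    return sum(1 + cm_levels for cm_levels in cm_levels_per_phb.values())


def fits_in_exp(cm_levels_per_phb: dict) -> bool:
    """True if the PHB set can be encoded in the 8 available codepoints."""
    return exp_codepoints_needed(cm_levels_per_phb) <= EXP_CODEPOINTS
```
+
+ For example, four PHBs that each use one CM codepoint consume all
+ eight codepoints, while a fifth such PHB would not fit.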
+
+ Further, we wish to use minimal space in the MPLS shim header to tell
+ interior LSRs whether each packet will be received by an ECN-capable
+ transport (ECT). Nonetheless, we must ensure that an endpoint that
+ would not understand an ECN mark will not receive one, otherwise it
+ will not be able to respond to congestion as it should. In the past,
+ three solutions to this problem have been proposed:
+
+ o One possible approach is for congested LSRs to mark the ECN field
+ in the underlying IP header at the bottom of the label stack.
+ Although many commercial LSRs routinely access the IP header for
+ other reasons (e.g., equal-cost multipath, ECMP), there are numerous
+ drawbacks to attempting to find an IP header beneath an MPLS label
+ stack. Notably, there is the challenge of detecting the absence
+ of an IP header when non-IP packets are carried on an LSP.
+ Therefore, we will not consider this approach further.
+
+ o In the scheme suggested by [Floyd], ECT and CE are overloaded into
+ one bit, so that a 0 means ECT while a 1 might either mean Not-ECT
+ or it might mean CE. A packet that has been marked as having
+ experienced congestion upstream, and then is picked out for
+ marking at a second congested LSR, will be dropped by the second
+ LSR since it cannot determine whether the packet has previously
+ experienced congestion or if ECN is not supported by the
+ transport.
+
+ While such an approach seemed potentially palatable, we do not
+ recommend it here for the following reasons. In some cases, we
+ wish to be able to use ECN marking long before actual congestion
+ (e.g., pre-congestion notification). In these circumstances,
+ marking rates at each LSR might be non-negligible most of the
+ time, so the chances of a previously marked packet encountering an
+ LSR that wants to mark it again will also be non-negligible. In
+ the case where CE and Not-ECT are indistinguishable to core
+ routers, such a scenario could lead to unacceptable drop rates.
+ If the typical marking rate at every router or LSR is p, and the
+ typical diameter of the network of LSRs is d, then the probability
+ that a marked packet will be chosen for marking more than once is
+ 1 - [Pr(never marked) + Pr(marked at exactly one hop)]
+ = 1 - [(1-p)^d + dp(1-p)^(d-1)]. For instance, with 6 LSRs in a
+ row, each marking ECN with 1% probability, the chance of a packet
+ that is already marked being chosen for marking a second time is
+ approximately 0.15%.
+ The bit-overloading scheme would therefore introduce a drop rate
+ of 0.15% unnecessarily. Given that most modern core networks are
+ sized to introduce near-zero packet drop, it may be unacceptable
+ to drop over one in a thousand packets unnecessarily.
+
+ o A third possible approach was suggested by [Shayman]. In this
+ scheme, interior LSRs assume that the endpoints are ECN-capable,
+ but this assumption is checked when the final label is popped. If
+ an interior LSR has marked ECN in the EXP field of the shim
+ header, but the IP header says the endpoints are not ECN-capable,
+ the edge router (or penultimate router, if using penultimate hop
+ popping) drops the packet. We recommend this scheme, which we
+ call "per-domain ECT checking", and define it more precisely in
+ the following section. Its chief drawback is that it can cause
+ packets to be forwarded after encountering congestion only to be
+ dropped at the egress of the MPLS domain. The rationale for this
+ decision is given in Section 8.1.
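+
+ The drop-rate estimate given above for the [Floyd] bit-overloading
+ scheme can be reproduced with a few lines of Python. This is a
+ sketch of the stated formula only; it assumes, as the text does,
+ that every LSR marks independently with the same probability p. The
+ function name is hypothetical.
+
```python
def prob_marked_more_than_once(p: float, d: int) -> float:
    """Probability that a packet is chosen for marking at two or more
    of d hops, each marking independently with probability p."""
    if not 0.0 <= p <= 1.0 or d < 1:
        raise ValueError("need 0 <= p <= 1 and d >= 1")
    never_marked = (1 - p) ** d
    marked_exactly_once = d * p * (1 - p) ** (d - 1)
    return 1 - (never_marked + marked_exactly_once)


if __name__ == "__main__":
    # The example from the text: 6 LSRs in a row, 1% marking each.
    print(f"{prob_marked_more_than_once(0.01, 6):.4%}")
```
+
+ With p = 0.01 and d = 6, the function returns roughly 0.00146, which
+ matches the 0.15% figure quoted in the text.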
+
+3. Per-Domain ECT Checking
+
+ For the purposes of this discussion, we define the egress nodes of an
+ MPLS domain as the nodes that pop the last MPLS label from the label
+ stack, exposing the IP (or, potentially non-IP) header. Note that
+ such a node may be the ultimate or penultimate hop of an LSP,
+ depending on whether penultimate hop popping (PHP) is employed.
+
+ In the per-domain ECT checking approach, the egress nodes take
+ responsibility for checking whether the transport is ECN-capable.
+ This document does not specify how these nodes should pass on
+ congestion notification because different approaches are likely in
+ different scenarios. However, if congestion notification in the MPLS
+ header is copied into the IP header, the procedure MUST conform to
+ the specification given here.
+
+ If congestion notification is passed to the transport without first
+ passing it onward in the IP header, the approach used must take
+ similar care to check that the transport is ECN-capable before
+ passing it ECN markings. Specifically, if the transport for a
+ particular congestion marked MPLS packet is found not to be ECN-
+ capable, the packet MUST be dropped at this egress node.
+
+ In the per-domain ECT checking approach, only the egress nodes check
+ whether an IP packet is destined for an ECN-capable transport.
+ Therefore, any single LSR within an MPLS domain MUST NOT be
+ configured to enable ECN marking unless all the egress LSRs
+ surrounding it are already configured to handle ECN marking.
+
+ We call a domain surrounded by ECN-capable egress LSRs an ECN-enabled
+ MPLS domain. This term only implies that all the egress LSRs are
+ ECN-enabled; some interior LSRs may not be ECN-enabled. For
+ instance, it would be possible to use some legacy LSRs incapable of
+ supporting ECN in the interior of an MPLS domain as long as all the
+ egress LSRs were ECN-capable. Note that if PHP is used, the
+ "penultimate hop" routers that perform the pop operation do need to
+ be ECN-enabled since they are acting in this context as egress LSRs.
+
+4. ECN-Enabled MPLS Domain
+
+ In the following subsections, we describe various operations
+ affecting the ECN marking of a packet that may be performed at MPLS-
+ edge and core LSRs.
+
+4.1. Pushing (Adding) One or More Labels to an IP Packet
+
+ On encapsulating an IP packet with an MPLS label stack, the ECN field
+ must be translated from the IP packet into the MPLS EXP field. The
+ Not-CM (not congestion marked) state is set in the MPLS EXP field if
+ the ECN status of the IP packet is Not-ECT or ECT(1) or ECT(0). The
+ CM state is set if the ECN status of the IP packet is CE. If more
+ than one label is pushed at one time, the same value should be placed
+ in the EXP value of all label stack entries.
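+
+ A minimal sketch of this ingress translation is shown below. The
+ two-bit ECN codepoint values are those of [RFC3168]; the Not-CM and
+ CM EXP values are parameters, because each operator chooses its own
+ codepoints per PHB. The function name is hypothetical.
+
```python
# Two-bit ECN field values from RFC 3168.
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def exp_for_pushed_labels(ip_ecn: int, not_cm_exp: int, cm_exp: int,
                          num_labels: int = 1) -> list:
    """Return the EXP value for each label pushed onto an IP packet.

    not_cm_exp and cm_exp are the operator-chosen EXP codepoints for
    the packet's PHB (assumed inputs, not fixed by the specification).
    """
    if ip_ecn not in (NOT_ECT, ECT_1, ECT_0, CE):
        raise ValueError("ECN field is two bits")
    # Only CE maps to CM; Not-ECT, ECT(0), and ECT(1) all map to Not-CM.
    exp = cm_exp if ip_ecn == CE else not_cm_exp
    # Every label stack entry pushed at once carries the same value.
    return [exp] * num_labels
```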
+
+4.2. Pushing One or More Labels onto an MPLS Labeled Packet
+
+ The EXP field is copied directly from the topmost label before the
+ push to the newly added outer label. If more than one label is being
+ pushed, the same EXP value is copied to all label-stack entries.
+
+4.3. Congestion Experienced in an Interior MPLS Node
+
+ If the EXP codepoint of the packet maps to a PHB that uses ECN
+ marking, and the marking algorithm requires the packet to be marked,
+ the CM state is set (irrespective of whether it is already in the CM
+ state).
+
+ If the buffer is full, a packet is dropped.
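+
+ The interior-node behavior can be sketched as follows. The mapping
+ from an EXP codepoint to the CM codepoint of the same PHB is an
+ assumed, operator-configured table; the values used in the usage
+ note are borrowed from the example in Section 9.2. Codepoints that
+ belong to PHBs without ECN marking are simply left unchanged by this
+ sketch.
+
```python
from typing import Dict, Optional


def handle_packet(exp: int, cm_codepoint_for: Dict[int, int],
                  wants_mark: bool, buffer_full: bool) -> Optional[int]:
    """Return the outgoing EXP value, or None if the packet is dropped.

    cm_codepoint_for maps every EXP codepoint of an ECN-marking PHB
    (both its Not-CM and CM codepoints) to that PHB's CM codepoint.
    wants_mark is the decision of the local marking algorithm.
    """
    if buffer_full:
        return None  # no room to queue the packet: drop it
    if wants_mark and exp in cm_codepoint_for:
        # Set CM irrespective of whether the packet is already CM.
        return cm_codepoint_for[exp]
    return exp
```
+
+ With the table {0b010: 0b011, 0b011: 0b011}, a packet arriving with
+ EXP 010 leaves with EXP 011 when the marking algorithm selects it.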
+
+4.4. Crossing a Diffserv Domain Boundary
+
+ If an MPLS-encapsulated packet crosses a Diffserv domain boundary, it
+ may be the case that the two domains use different encodings of the
+ same PHB in the EXP field. In such cases, the EXP field must be
+ rewritten at the domain boundary. If the PHB is one that supports
+ ECN, then the appropriate ECN marking should also be preserved when
+ the EXP field is mapped at the boundary.
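+
+ One way to picture the rewrite at a boundary is as a table that
+ translates each (PHB, congestion state) pair from the upstream
+ encoding to the downstream encoding, so that the CM or Not-CM state
+ survives the remapping. The sketch below is illustrative only; the
+ data layout, the function name, and the codepoint values in the
+ usage note are assumptions.
+
```python
from typing import Dict, Tuple

Encoding = Dict[Tuple[str, str], int]  # (PHB name, "Not-CM" | "CM") -> EXP


def build_boundary_map(upstream: Encoding, downstream: Encoding) -> Dict[int, int]:
    """Build an EXP rewrite table that preserves PHB and ECN state.

    Raises KeyError if the downstream domain has no codepoint for a
    (PHB, state) pair used upstream; that case needs separate handling.
    """
    return {exp: downstream[key] for key, exp in upstream.items()}
```
+
+ For instance, if the upstream domain encodes an AF PHB as 010
+ (Not-CM) and 011 (CM) and the downstream domain uses 100 and 101,
+ the resulting table maps 010 to 100 and 011 to 101.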
+
+ If an MPLS-encapsulated packet that is in the CM state crosses from a
+ domain that is ECN-enabled (as defined in Section 3) to a domain that
+ is not ECN-enabled, then it is necessary to perform the egress
+ checking procedures at the egress LSR of the ECN-enabled domain.
+ This means that if the encapsulated packet is not ECN-capable, the
+ packet MUST be dropped. Note that this implies the egress LSR must
+ be able to look beneath the MPLS header without popping the label
+ stack.
+
+ The related issue of Diffserv tunnel models is discussed in
+ Section 4.7.
+
+4.5. Popping an MPLS Label (Not the End of the Stack)
+
+ When a packet has more than one MPLS label in the stack and the top
+ label is popped, another MPLS label is exposed. In this case, the
+ ECN information should be transferred from the outer EXP field to the
+ inner MPLS label in the following manner. If the inner EXP field is
+ Not-CM, the inner EXP field is set to the same CM or Not-CM state as
+ the outer EXP field. If the inner EXP field is CM, it remains
+ unchanged whatever the outer EXP field. Note that an inner value of
+ CM and an outer value of Not-CM should be considered anomalous, and
+ SHOULD be logged in some way by the LSR.
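+
+ The merge rule above amounts to a logical OR of the two congestion
+ states, plus a log entry for the anomalous combination. A minimal
+ sketch, with a hypothetical function name and logger:
+
```python
import logging

log = logging.getLogger("mpls_ecn_sketch")  # hypothetical logger name


def inner_state_after_pop(outer_cm: bool, inner_cm: bool) -> bool:
    """Return True if the exposed inner label should be CM after the pop."""
    if inner_cm and not outer_cm:
        # Inner CM with outer Not-CM should not normally occur.
        log.warning("anomalous ECN state: inner CM, outer Not-CM")
    # Inner Not-CM takes the outer state; inner CM stays CM.
    return inner_cm or outer_cm
```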
+
+4.6. Popping the Last MPLS Label in the Stack
+
+ When the last MPLS label is popped from the packet, its payload is
+ exposed. If that packet is not IP, and does not have any capability
+ equivalent to ECT, it is assumed Not-ECT, and it is treated as such.
+ That means that if the EXP value of the MPLS header is CM, the packet
+ MUST be dropped.
+
+ Assuming an IP packet was exposed, we have to examine whether or not
+ that packet is ECT. A Not-ECT packet MUST be dropped if the EXP
+ field is CM.
+
+ For the remainder of this section, we describe the behavior that is
+ required if the ECN information is to be transferred from the MPLS
+ header into the exposed IP header for onward transmission. As noted
+ in Section 1.2, such behavior is not mandated by this document, but
+ may be selected by an operator.
+
+ If the EXP field is Not-CM, the ECN field of the inner IP packet
+ remains unchanged, whether it is Not-ECT, ECT(0), ECT(1), or CE. If
+ the EXP field is CM and the inner packet is ECT(0), ECT(1), or CE,
+ the ECN field is set to CE. (A Not-ECT packet whose EXP field is CM
+ has already been dropped, as required above.) Note that an inner
+ value of CE and an outer value of Not-CM should be considered
+ anomalous, and SHOULD be logged in some way by the LSR.
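+
+ The egress procedure of this section can be sketched as below, for
+ the case where an operator chooses to copy the marking into the IP
+ header. The ECN codepoint values are those of [RFC3168]; the
+ function name, return convention, and logger are assumptions made
+ for the example.
+
```python
import logging
from typing import Optional, Tuple

NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11  # RFC 3168 values
log = logging.getLogger("mpls_ecn_sketch")  # hypothetical logger name


def pop_last_label(exp_is_cm: bool,
                   ip_ecn: Optional[int]) -> Tuple[str, Optional[int]]:
    """Decide what happens when the last label is popped.

    ip_ecn is the ECN field of the exposed IP header, or None if the
    payload is not IP and has no capability equivalent to ECT.
    Returns ("drop", None) or ("forward", new_ecn_field).
    """
    if ip_ecn is None or ip_ecn == NOT_ECT:
        # A transport that is not ECN-capable must never see a mark.
        if exp_is_cm:
            return ("drop", None)
        return ("forward", ip_ecn)
    if exp_is_cm:
        return ("forward", CE)  # copy the congestion mark onward
    if ip_ecn == CE:
        log.warning("anomalous ECN state: inner CE, outer Not-CM")
    return ("forward", ip_ecn)  # Not-CM leaves the ECN field unchanged
```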
+
+4.7. Diffserv Tunneling Models
+
+ [RFC3270] describes three tunneling models for Diffserv support
+ across MPLS Domains, referred to as the uniform, short pipe, and pipe
+ models. The differences between these models lie in whether the
+ Diffserv treatment that applies to a packet while it travels along a
+ particular LSP is carried to the ingress of the last hop, to the
+ egress of the last hop, or beyond the last hop. Depending on which
+ mode is preferred by an operator, the EXP value or DSCP value of an
+ exposed header following a label pop may or may not be dependent on
+ the EXP value of the label that is removed by the pop operation. We
+ believe that, in the case of ECN marking, the use of these models
+ should only apply to the encoding of the Diffserv PHB in the EXP
+ value, and that the choice of codepoint for ECN should always be made
+ based on the procedures described above, independent of the tunneling
+ model.
+
+5. ECN-Disabled MPLS Domain
+
+ If ECN is not enabled on all the egress LSRs of a domain, ECN MUST
+ NOT be enabled on any LSRs throughout the domain. If congestion is
+ experienced on any LSR in an ECN-disabled MPLS domain, packets MUST
+ be dropped; they MUST NOT be marked. The exact algorithm for
+ deciding when to drop packets during congestion (e.g., tail-drop,
+ RED, etc.) is a local matter for the operator of the domain.
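+
+ A configuration tool could enforce the first rule with a check along
+ the following lines. This is a sketch only; the inventory format (a
+ mapping from egress LSR name to a flag) and the function name are
+ assumptions, and treating an empty inventory as "not allowed" is a
+ conservative choice made for the example.
+
```python
from typing import Dict


def may_enable_ecn_marking(egress_ecn_enabled: Dict[str, bool]) -> bool:
    """True only if every egress LSR of the domain is ECN-enabled.

    egress_ecn_enabled maps each egress LSR (including penultimate-hop
    LSRs that pop the last label) to whether it handles ECN marking.
    """
    if not egress_ecn_enabled:
        return False  # unknown egress set: refuse to enable marking
    return all(egress_ecn_enabled.values())
```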
+
+6. The Use of More Codepoints with E-LSPs and L-LSPs
+
+ [RFC3270] gives different options with E-LSPs and L-LSPs, and some of
+ those could potentially provide ample EXP codepoints for ECN.
+ However, deploying L-LSPs vs. E-LSPs has many implications, such as
+ platform support and operational complexity. The above ECN MPLS
+ solution should provide some flexibility. If the operator has
+ deployed one L-LSP per PHB scheduling class, then EXP space will be a
+ non-issue, and it could be used to achieve more sophisticated ECN
+ behavior if required. If the operator wants to stick to E-LSPs and
+ uses a handful of EXP codepoints for Diffserv, it may be desirable to
+ operate with a minimum number of extra ECN codepoints, even if this
+ comes with some compromise on ECN optimality. See Section 9 for
+ discussion of some possible deployment scenarios.
+
+ We note that in a network where L-LSPs are used, ECN marking SHOULD
+ NOT cause packets from the same microflow, but with different ECN
+ markings, to be sent on different LSPs. As discussed in [RFC3270],
+ packets of a single microflow should always travel on the same LSP to
+ avoid possible misordering. Thus, ECN marking of packets on L-LSPs
+ SHOULD only affect the EXP value of the packets.
+
+7. Relationship to Tunnel Behavior in RFC 3168
+
+ [RFC3168] defines two modes of encapsulating ECN-marked IP packets
+ inside additional IP headers when tunnels are used. The two modes
+ are the "full functionality" and "limited functionality" modes. In
+ the full functionality mode, the ECT information from the inner
+ header is copied to the outer header at the tunnel ingress, but the
+ CE information is not. In the limited functionality mode, neither
+ ECT nor CE information is copied to the outer header, and thus ECN
+ cannot be applied to the encapsulated packet.
+
+ The behavior that is specified in Section 4 of this document
+ resembles the "full functionality" mode in the sense that it conveys
+ some information from inner to outer header, and in the sense that it
+ enables full ECN support along the MPLS LSP (which is analogous to an
+ IP tunnel in this context). However, it differs in one respect, which
+ is that the CE information is conveyed from the inner header to the
+ outer header. Our original reason for this different design choice
+ was to give interior routers and LSRs more information about upstream
+ marking in multi-bottleneck cases. For instance, the flow
+ termination marking mechanism proposed for PCN works by only
+ considering packets for marking that have not already been marked
+ upstream. Unless existing flow termination marking is copied from
+ the inner to the outer header at tunnel ingress, the mechanism
+ doesn't terminate enough traffic in cases where anomalous events hit
+ multiple domains at once. [RFC3168] does not give any reasons
+ against conveying CE information from the inner header to the outer
+ in the "full functionality" mode. Furthermore, [RFC4301] specifies
+ that the ECN marking should be copied from inner header to outer
+ header in IPsec tunnels, consistent with the approach defined here.
+ [BRISCOE-ECN] discusses this issue in more detail. In summary, the
+ approach described in Section 4 appears to be both a sound technical
+ choice and consistent with the current state of thinking in the IETF.
+
+8. Deployment Considerations
+
+8.1. Marking Non-ECN-Capable Packets
+
+ What are the consequences of marking a packet that is not ECN-
+ capable? Even if it will be dropped before leaving the domain,
+ doesn't this consume resources unnecessarily?
+
+ The problem only arises if there is congestion downstream of an
+ earlier congested queue in the same MPLS domain. Congested LSRs
+ downstream might forward packets already marked, even though they
+ will be dropped later when the inner IP header is found to be Not-ECT
+ on decapsulation. Such packets might cause the downstream LSRs to
+ mark (or drop) other packets that they would otherwise not have had
+ to.
+
+ We expect congestion will typically be rare in MPLS networks, but it
+ might not be. The extra unnecessary load at downstream LSRs will not
+ be more than the fraction of marked packets from upstream LSRs, even
+ in the worst case where no transports are ECN-capable. Therefore,
+ the amount of unnecessary marking (or drop) on an LSR will not be
+ more than the product of its local marking rate and the marking rate
+ due to upstream LSRs within the same domain -- typically the product
+ of two small (often zero) probabilities.
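+
+ The bound described above is a simple product, shown here as a small
+ Python sketch. The function name and the example rates are
+ hypothetical.
+
```python
def max_unnecessary_marking(local_rate: float, upstream_rate: float) -> float:
    """Upper bound on the unnecessary marking (or drop) rate at an LSR.

    local_rate is the LSR's own marking rate; upstream_rate is the
    marking rate due to upstream LSRs in the same domain. The bound is
    reached only in the worst case where no transports are ECN-capable.
    """
    for rate in (local_rate, upstream_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates are probabilities between 0 and 1")
    return local_rate * upstream_rate
```
+
+ For example, a local marking rate of 1% combined with an upstream
+ marking rate of 1% bounds the unnecessary marking at 0.01%.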
+
+ This is why we decided to use the per-domain ECT checking approach --
+ because the most likely effect would be a very slightly increased
+ marking rate, which would result in very slightly higher drop only
+ for non-ECN-capable transports. We chose not to use the [Floyd]
+ alternative, which introduced a low but persistent level of
+ unnecessary packet drop for all time, even for ECN-capable
+ transports. Although that scheme did not carry traffic to the edge
+ of the MPLS domain only to be dropped on decapsulation, we felt our
+ minor inefficiency was a small price to pay; and it would get smaller
+ still if ECN deployment widened.
+
+ A partial solution would be to preferentially drop packets arriving
+ at a congested router that were already marked. There is no solution
+ to the problem of marking a packet when congestion is caused by
+ another packet that should have been dropped. However, the chance of
+ such an occurrence is very low, and the consequences are not
+ significant. It merely causes an application to very occasionally
+ slow down its rate when it did not have to.
+
+8.2. Non-ECN-Capable Routers in an MPLS Domain
+
+ What if an MPLS domain wants to use ECN, but not all legacy routers
+ are able to support it?
+
+ If the legacy router(s) are used in the interior, this is not a
+ problem. They will simply have to drop the packets if they are
+ congested, rather than mark them, which is the standard behavior for
+ IP routers that are not ECN-enabled.
+
+ If the legacy router were used as an egress router, it would not be
+ able to check the ECN-capability of the transport correctly. An
+ operator in this position would not be able to use this solution and
+ therefore MUST NOT enable ECN unless all egress routers are ECN-
+ capable.
+
+9. Example Uses
+
+9.1. RFC 3168-Style ECN
+
+ [RFC3168] proposes the use of ECN in TCP, and it introduces the use
+ of ECN-Echo and Congestion Window Reduced (CWR) flags in the TCP
+ header for initialization. When the TCP sender receives an ECN-Echo
+ (ECE) ACK packet (that is, an ACK packet with the ECN-Echo flag set
+ in the TCP header), it knows that congestion was encountered in the
+ network on the path from the sender to the receiver, and it responds
+ accordingly (such as by not increasing the congestion window).
+
+ It would be possible to enable ECN in an MPLS domain for Diffserv
+ PHBs like AF and best effort that are expected to be used by TCP and
+ similar transports (e.g., DCCP [RFC4340]). Then, end-to-end
+ congestion control in transports capable of understanding ECN would
+ be able to respond to approaching congestion on LSRs without having
+ to rely on packet discard to signal congestion.
+
+9.2. ECN Co-Existence with Diffserv E-LSPs
+
+ Many operators today have deployed Diffserv using the E-LSP approach
+ of [RFC3270]. In many cases, the number of PHBs used is less than 8,
+ and hence there remain available codepoints in the EXP space. If an
+ operator wished to support ECN for a single PHB, this could be
+ accomplished by simply allocating a second codepoint to the PHB for
+ the CM state of that PHB and retaining the old codepoint for the
+ Not-CM state. An operator with only four deployed PHBs could, of
+ course, enable ECN marking on all those PHBs. It is easy to imagine
+ cases where some PHBs might benefit more from ECN than others -- for
+ example, an operator might use ECN on a premium data service but not
+ on a PHB used for best-effort Internet traffic.
+
+ As an illustrative example of how the EXP field might be used in this
+ case, consider the example of an operator who is using the aggregated
+ service classes proposed in [TSVWG]. The operator may choose to
+ support ECN only for the Assured Elastic Treatment Aggregate, using
+ the EXP codepoint 010 for the Not-CM state and 011 for the CM state.
+ All other codepoints could be the same as in [TSVWG]. Of course, any
+ other combination of EXP values can be used according to the specific
+ set of PHBs and marking conventions used within that operator's
+ network.
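+
+ As a non-normative sketch of the allocation just described, the
+ following Python fragment models the single Assured Elastic pair
+ (010 for not-CM, 011 for CM). The table name CM_CODEPOINT and the
+ functions is_cm() and mark_congestion() are illustrative assumptions,
+ not part of this specification; PHBs without a table entry are
+ assumed to have no CM codepoint.
+
```python
# Hypothetical per-domain table: not-CM EXP codepoint -> CM EXP codepoint.
# Only the Assured Elastic pair (010 / 011) comes from the text above.
CM_CODEPOINT = {0b010: 0b011}


def is_cm(exp: int) -> bool:
    """True if the EXP value is a congestion-marked (CM) codepoint."""
    return exp in CM_CODEPOINT.values()


def mark_congestion(exp: int):
    """Return the EXP value after an LSR signals congestion.

    Returns None when the codepoint's PHB has no CM state in the table,
    meaning congestion cannot be signalled by marking for that PHB.
    """
    if is_cm(exp):
        return exp  # already marked; the marking is never cleared
    return CM_CODEPOINT.get(exp)
```
+
+ An operator adding ECN to a second PHB would add one more
+ not-CM/CM pair to the table, consuming one further EXP codepoint.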
+
+9.3. Congestion-Feedback-Based Traffic Engineering
+
+ Shayman's traffic engineering [Shayman] presents another example
+ application of ECN feedback in an MPLS domain. Shayman proposed the
+ use of ECN by an egress LSR feeding back congestion to an ingress LSR
+ to mitigate congestion by employing dynamic traffic engineering
+ techniques, such as shifting flows to an alternate path. The proposal
+ defined a new Resource Reservation Protocol (RSVP) message, sent by
+ the egress LSR to the ingress LSR (and ignored by transit LSRs), to
+ indicate congestion along the path. Thus, rather than providing the
+ same style of congestion notification to endpoints as defined in
+ [RFC3168], [Shayman] limits its scope to the MPLS domain only. This
+ application of ECN in an MPLS domain could make use of the ECN
+ encoding in the MPLS header that is defined in this document.
+
+9.4. PCN Flow Admission Control and Flow Termination
+
+ [PCN] proposes using pre-congestion notification (PCN) on routers
+ within an edge-to-edge Diffserv region to control admission of new
+ flows to the region and, if necessary, to terminate existing flows in
+ response to disasters and other anomalous routing events. In this
+ approach, the current level of PCN marking is picked up by the
+ signaling used to initiate each flow in order to inform the admission
+ control decision for the whole region at once. For example,
+ extensions to RSVP [LEFAUCHEUR] and Next Steps in Signaling (NSIS)
+ [NSIS], [ARUMAITHURAI] have been proposed.
+
+ If LSRs are able to mark packets to signify congestion in MPLS, PCN
+ marking could be used for admission control and flow termination
+ across a Diffserv region, irrespective of whether it contained pure
+ IP routers, MPLS LSRs, or both. Indeed, the solution could be
+ somewhat more efficient to implement if aggregates could identify
+ themselves by their MPLS label. Appendix A describes the mechanisms
+ by which the necessary markings for PCN could be carried in the MPLS
+ header.
+
+10. Security Considerations
+
+ We believe no new vulnerabilities are introduced by this document.
+
+ We have considered whether malicious sources might be able to exploit
+ the fact that interior LSRs will mark packets that are Not-ECT,
+ relying on their egress LSR to drop them. Although this might allow
+ sources to engineer a situation where more traffic is carried across
+ an MPLS domain than should be, we concluded that even if we had not
+ introduced this feature, these sources would have been able to
+ prevent these LSRs from dropping this traffic anyway, simply by
+ setting ECT in the first place.
+
+ An ECN sender can use the ECN nonce [RFC3540] to detect a misbehaving
+ receiver. The ECN nonce works correctly across an MPLS domain
+ without requiring any specific support from the proposal in this
+ document. The nonce does not need to be present in the MPLS shim
+ header to detect a misbehaving receiver. As long as the nonce is
+ present in the IP header when the ECN information is copied from the
+ last MPLS shim header, it will be overwritten if congestion has been
+ experienced by an LSR. This is all that is necessary for the sender
+ to detect a misbehaving receiver. If there were a need for an ECN
+ nonce in the MPLS shim header (e.g., to detect if one LSR were
+ erasing the markings of an upstream LSR in the same domain), we
+ believe this proposal does not preclude the later addition of an ECN
+ nonce capability for specific DSCPs, just as it does not preclude any
+ other use of the EXP codepoints.
+
+11. Acknowledgments
+
+ Thanks to K.K. Ramakrishnan and Sally Floyd for getting us thinking
+ about this in the first place and for providing advice on tunneling
+ of ECN packets, and to Sally Floyd, Joe Babiarz, Ben Niven-Jenkins,
+ Phil Eardley, Ruediger Geib, and Magnus Westerlund for their comments
+ on the document.
+
+Appendix A. Extension to Pre-Congestion Notification
+
+ This appendix describes how the mechanisms described in the body of
+ the document can be extended to support PCN [PCN]. Our intent here
+ is to show that the mechanisms are readily extended to more complex
+ scenarios than ECN, particularly in the case where more codepoints
+ are needed, but this appendix may be safely ignored if one is
+ interested only in supporting ECN. Note that the PCN standards are
+ still very much under development at the time of writing; hence, the
+ precise details contained in this appendix may be subject to change,
+ and we stress that this appendix is for illustrative purposes only.
+
+ The relevant aspects of PCN for the purposes of this discussion are:
+
+ o PCN uses 3 states rather than the 2 used by ECN -- these are
+ referred to as the admission marked (AM), termination marked (TM),
+ and not marked (NM) states. (See Section 9.4 for further
+ discussion of PCN and the possibility of using fewer codepoints.)
+
+ o A packet can go from NM to AM, from NM to TM, or from AM to TM,
+ but no other transition is possible.
+
+ o The determination of whether a packet is subject to PCN is based
+ on the PHB of the packet.
+
+ Thus, to support PCN fully in an MPLS domain for a particular PHB, a
+ total of 3 codepoints need to be allocated for that PHB. These 3
+ codepoints represent the admission marked (AM), termination marked
+ (TM), and not marked (NM) states. The procedures described in
+ Section 4 above need to be slightly modified to support this
+ scenario. The following procedures are invoked when the topmost DSCP
+ or EXP value indicates a PHB that supports PCN.
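+
+ The three states and the permitted transitions listed above can be
+ sketched as follows. This is an illustration only: the names Pcn,
+ ALLOWED, PCN_EXP, and can_transition() are assumptions, and the EXP
+ values in PCN_EXP are arbitrary placeholders rather than codepoints
+ assigned by this document.
+
```python
from enum import Enum


class Pcn(Enum):
    NM = "not marked"
    AM = "admission marked"
    TM = "termination marked"


# The only state changes a packet may undergo, per the list above.
ALLOWED = {(Pcn.NM, Pcn.AM), (Pcn.NM, Pcn.TM), (Pcn.AM, Pcn.TM)}

# Hypothetical allocation of 3 EXP codepoints to one PCN-capable PHB.
PCN_EXP = {Pcn.NM: 0b010, Pcn.AM: 0b011, Pcn.TM: 0b100}


def can_transition(old: Pcn, new: Pcn) -> bool:
    """True if a packet may stay in `old` or move from `old` to `new`."""
    return old == new or (old, new) in ALLOWED
```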
+
+A.1. Label Push onto IP Packet
+
+ If the IP packet header indicates AM, set the EXP value of all
+ entries in the label stack to AM. If the IP packet header indicates
+ TM, set the EXP value of all entries in the label stack to TM. For
+ any other marking of the IP header, set the EXP value of all entries
+ in the label stack to NM.
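+
+ A minimal sketch of this rule, assuming the pushed label stack is
+ modelled as a Python list of marking states and the PCN marking of
+ the IP header is available as a string. The function push_labels()
+ and its parameters are hypothetical names used for illustration.
+
```python
def push_labels(ip_marking: str, depth: int) -> list:
    """Return the EXP marking state for each of `depth` pushed labels.

    `ip_marking` is the marking read from the IP header; any value
    other than "AM" or "TM" results in the NM state.
    """
    state = ip_marking if ip_marking in ("AM", "TM") else "NM"
    # Every entry pushed onto the stack carries the same state.
    return [state] * depth
```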
+
+A.2. Pushing Additional MPLS Labels
+
+ The procedures of Section 4.2 apply.
+
+A.3. Admission Control or Flow Termination Marking Inside MPLS Domain
+
+ The EXP value can be set to AM or TM according to the same procedures
+ as described in [BRISCOE-CL]. For the purposes of this document, it
+ does not matter exactly which algorithms are used to decide when to
+ set AM or TM; all that matters is that if a router would have marked
+ AM (or TM) in the IP header, it should set the EXP value in the MPLS
+ header to the AM (or TM) codepoint.
+
+A.4. Popping an MPLS Label (Not End of Stack)
+
+ When popping an MPLS label exposes another MPLS label, the AM or TM
+ marking should be transferred to the exposed EXP field in the
+ following manner:
+
+ o If the inner EXP value is NM, then it should be set to the same
+ marking state as the EXP value of the popped label stack entry.
+
+ o If the inner EXP value is AM, it should be unchanged if the popped
+ EXP value was AM, and it should be set to TM if the popped EXP
+ value was TM. If the popped EXP value was NM, this should be
+ logged in some way, and the inner EXP value should be unchanged.
+
+ o If the inner EXP value is TM, it should be unchanged whatever the
+ popped EXP value was, but any EXP value other than TM should be
+ logged.
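+
+ The three cases above can be sketched as a single function. This is
+ an illustration under stated assumptions: marking states are
+ represented as the strings "NM", "AM", and "TM"; pop_to_label() is a
+ hypothetical name; and "logged in some way" is modelled with
+ Python's standard logging module, which is only one possible choice.
+
```python
import logging

log = logging.getLogger("pcn_pop")  # logger name is an arbitrary choice

STATES = ("NM", "AM", "TM")


def pop_to_label(popped: str, inner: str) -> str:
    """Return the new marking state of the exposed (inner) EXP field."""
    if popped not in STATES or inner not in STATES:
        raise ValueError("marking state must be NM, AM or TM")
    if inner == "NM":
        # Inherit whatever state the popped entry carried.
        return popped
    if inner == "AM":
        if popped == "TM":
            return "TM"
        if popped == "NM":
            log.warning("popped EXP was NM but inner EXP was AM")
        return "AM"
    # inner is TM: never downgraded, but a mismatch is logged.
    if popped != "TM":
        log.warning("popped EXP was %s but inner EXP was TM", popped)
    return "TM"
```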
+
+A.5. Popping the Last MPLS Label to Expose IP Header
+
+ When popping the last MPLS label exposes the IP header, there are two
+ cases to consider:
+
+ o the popping LSR is *not* the egress router of the PCN region, in
+ which case AM or TM marking should be transferred to the exposed
+ IP header field; or
+
+ o the popping LSR *is* the egress router of the PCN region.
+
+ In the latter case, the behavior of the egress LSR is defined in
+ [PCN] and is beyond the scope of this document. In the former case,
+ the marking should be transferred from the popped MPLS header to the
+ exposed IP header as follows:
+
+ o If the inner IP header value is neither AM nor TM, and the popped
+ EXP value was NM, then the IP header should be unchanged. For any
+ other EXP value, the IP header should be set to the same marking
+ state as the EXP value of the popped label stack entry.
+
+ o If the inner IP header value is AM, it should be unchanged if the
+ popped EXP value was AM, and it should be set to TM if the popped
+ EXP value was TM. If the popped EXP value was NM, this should be
+ logged in some way and the inner IP header value should be
+ unchanged.
+
+ o If the IP header value is TM, it should be unchanged whatever the
+ popped EXP value was, but any EXP value other than TM should be
+ logged.
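+
+ One way to read the three cases for the non-egress LSR is that the
+ exposed IP header ends up with the more severe of the two markings
+ (NM < AM < TM), and a log entry is made whenever the popped EXP
+ value is less severe than the marking already in the IP header.
+ The sketch below takes that reading; it is an interpretation, not
+ normative text. SEVERITY and pop_to_ip() are hypothetical names,
+ and any IP marking other than "AM" or "TM" is treated as unmarked.
+
```python
import logging

log = logging.getLogger("pcn_pop_ip")  # logger name is an arbitrary choice

SEVERITY = {"NM": 0, "AM": 1, "TM": 2}


def pop_to_ip(popped_exp: str, ip_marking: str) -> str:
    """Return the marking of the exposed IP header (non-egress case).

    `popped_exp` must be "NM", "AM" or "TM". `ip_marking` may be any
    string; values other than "AM" and "TM" rank as unmarked.
    """
    exp_rank = SEVERITY[popped_exp]  # KeyError on an unknown EXP state
    ip_rank = SEVERITY.get(ip_marking, 0)
    if exp_rank < ip_rank:
        log.warning("popped EXP %s is less severe than IP marking %s",
                    popped_exp, ip_marking)
    # Only a strictly more severe EXP state overwrites the IP header.
    return popped_exp if exp_rank > ip_rank else ip_marking
```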
+
+Normative References
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+ [RFC3031] Rosen, E., Viswanathan, A., and R. Callon,
+ "Multiprotocol Label Switching Architecture",
+ RFC 3031, January 2001.
+
+ [RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y.,
+ Farinacci, D., Li, T., and A. Conta, "MPLS Label
+ Stack Encoding", RFC 3032, January 2001.
+
+ [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The
+ Addition of Explicit Congestion Notification (ECN) to
+ IP", RFC 3168, September 2001.
+
+ [RFC3270] Le Faucheur, F., Wu, L., Davie, B., Davari, S.,
+ Vaananen, P., Krishnan, R., Cheval, P., and J.
+ Heinanen, "Multi-Protocol Label Switching (MPLS)
+ Support of Differentiated Services", RFC 3270,
+ May 2002.
+
+ [RFC4301] Kent, S. and K. Seo, "Security Architecture for the
+ Internet Protocol", RFC 4301, December 2005.
+
+Informative References
+
+ [ARUMAITHURAI] Arumaithurai, M., "NSIS PCN-QoSM: A Quality of
+ Service Model for Pre-Congestion Notification (PCN)",
+ Work in Progress, September 2007.
+
+ [BRISCOE-CL] Briscoe, B., "Pre-Congestion Notification Marking",
+ Work in Progress, October 2006.
+
+ [BRISCOE-ECN] Briscoe, B., "Layered Encapsulation of Congestion
+ Notification", Work in Progress, July 2007.
+
+ [Floyd] Ramakrishnan, K., Floyd, S., and B. Davie, "A
+ Proposal to Incorporate ECN in MPLS", Work in
+ Progress, June 1999.
+
+ [LEFAUCHEUR] Le Faucheur, F., Charny, A., Briscoe, B., Eardley, P.,
+ Babiarz, J., and K. Chan, "RSVP Extensions for
+ Admission Control over Diffserv using Pre-congestion
+ Notification (PCN)", Work in Progress, June 2006.
+
+ [NSIS] Bader, A., Westberg, L., Karagiannis, G., Cornelia,
+ C., and T. Phelan, "RMD-QOSM - The Resource
+ Management in Diffserv QOS Model", Work in Progress,
+ November 2007.
+
+ [PCN] Eardley, P., "Pre-Congestion Notification
+ Architecture", Work in Progress, November 2007.
+
+ [RFC3260] Grossman, D., "New Terminology and Clarifications for
+ Diffserv", RFC 3260, April 2002.
+
+ [RFC3540] Spring, N., Wetherall, D., and D. Ely, "Robust
+ Explicit Congestion Notification (ECN) Signaling with
+ Nonces", RFC 3540, June 2003.
+
+ [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram
+ Congestion Control Protocol (DCCP)", RFC 4340,
+ March 2006.
+
+ [Shayman] Shayman, M. and R. Jaeger, "Using ECN to Signal
+ Congestion Within an MPLS Domain", Work in Progress,
+ November 2000.
+
+ [TSVWG] Chan, K., Babiarz, J., and F. Baker, "Aggregation of
+ DiffServ Service Classes", Work in Progress,
+ November 2007.
+
+Authors' Addresses
+
+ Bruce Davie
+ Cisco Systems, Inc.
+ 1414 Mass. Ave.
+ Boxborough, MA 01719
+ USA
+
+ EMail: bsd@cisco.com
+
+
+ Bob Briscoe
+ BT Research
+ B54/77, Sirius House
+ Adastral Park
+ Martlesham Heath
+ Ipswich
+ Suffolk IP5 3RE
+ United Kingdom
+
+ EMail: bob.briscoe@bt.com
+
+
+ June Tay
+ BT Research
+ B54/77, Sirius House
+ Adastral Park
+ Martlesham Heath
+ Ipswich
+ Suffolk IP5 3RE
+ United Kingdom
+
+ EMail: june.tay@bt.com
+
+Full Copyright Statement
+
+ Copyright (C) The IETF Trust (2008).
+
+ This document is subject to the rights, licenses and restrictions
+ contained in BCP 78, and except as set forth therein, the authors
+ retain all their rights.
+
+ This document and the information contained herein are provided on an
+ "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
+ OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
+ THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
+ OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
+ THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
+ WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Intellectual Property
+
+ The IETF takes no position regarding the validity or scope of any
+ Intellectual Property Rights or other rights that might be claimed to
+ pertain to the implementation or use of the technology described in
+ this document or the extent to which any license under such rights
+ might or might not be available; nor does it represent that it has
+ made any independent effort to identify any such rights. Information
+ on the procedures with respect to rights in RFC documents can be
+ found in BCP 78 and BCP 79.
+
+ Copies of IPR disclosures made to the IETF Secretariat and any
+ assurances of licenses to be made available, or the result of an
+ attempt made to obtain a general license or permission for the use of
+ such proprietary rights by implementers or users of this
+ specification can be obtained from the IETF on-line IPR repository at
+ http://www.ietf.org/ipr.
+
+ The IETF invites any interested party to bring to its attention any
+ copyrights, patents or patent applications, or other proprietary
+ rights that may cover technology that may be required to implement
+ this standard. Please address the information to the IETF at
+ ietf-ipr@ietf.org.
+