From 4bfd864f10b68b71482b35c818559068ef8d5797 Mon Sep 17 00:00:00 2001 From: Thomas Voss Date: Wed, 27 Nov 2024 20:54:24 +0100 Subject: doc: Add RFC documents --- doc/rfc/rfc6391.txt | 1067 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 1067 insertions(+) create mode 100644 doc/rfc/rfc6391.txt (limited to 'doc/rfc/rfc6391.txt') diff --git a/doc/rfc/rfc6391.txt b/doc/rfc/rfc6391.txt new file mode 100644 index 0000000..b5879e7 --- /dev/null +++ b/doc/rfc/rfc6391.txt @@ -0,0 +1,1067 @@ + + + + + + +Internet Engineering Task Force (IETF) S. Bryant, Ed. +Request for Comments: 6391 C. Filsfils +Category: Standards Track Cisco Systems +ISSN: 2070-1721 U. Drafz + Deutsche Telekom + V. Kompella + J. Regan + Alcatel-Lucent + S. Amante + Level 3 Communications, LLC + November 2011 + + +Flow-Aware Transport of Pseudowires over an MPLS Packet Switched Network + +Abstract + + Where the payload of a pseudowire comprises a number of distinct + flows, it can be desirable to carry those flows over the Equal Cost + Multiple Paths (ECMPs) that exist in the packet switched network. + Most forwarding engines are able to generate a hash of the MPLS label + stack and use this mechanism to balance MPLS flows over ECMPs. + + This document describes a method of identifying the flows, or flow + groups, within pseudowires such that Label Switching Routers can + balance flows at a finer granularity than individual pseudowires. + The mechanism uses an additional label in the MPLS label stack. + +Status of This Memo + + This is an Internet Standards Track document. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Further information on + Internet Standards is available in Section 2 of RFC 5741. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + http://www.rfc-editor.org/info/rfc6391. + + + + + + + + + + +Bryant, et al. Standards Track [Page 1] + +RFC 6391 FAT-PW November 2011 + + +Copyright Notice + + Copyright (c) 2011 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (http://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. Code Components extracted from this document must + include Simplified BSD License text as described in Section 4.e of + the Trust Legal Provisions and are provided without warranty as + described in the Simplified BSD License. + +Table of Contents + + 1. Introduction ....................................................3 + 1.1. Requirements Language ......................................4 + 1.2. ECMP in Label Switching Routers ............................4 + 1.3. Flow Label .................................................4 + 2. Native Service Processing Function ..............................5 + 3. Pseudowire Forwarder ............................................6 + 3.1. Encapsulation ..............................................7 + 4. Signalling the Presence of the Flow Label .......................8 + 4.1. Structure of Flow Label Sub-TLV ............................9 + 5. Static Pseudowires ..............................................9 + 6. Multi-Segment Pseudowires .......................................9 + 7. Operations, Administration, and Maintenance (OAM) ..............10 + 8. Applicability of PWs Using Flow Labels .........................11 + 8.1. Equal Cost Multiple Paths .................................12 + 8.2. Link Aggregation Groups ...................................13 + 8.3. Multiple RSVP-TE Paths ....................................13 + 8.4. The Single Large Flow Case ................................14 + 8.5. Applicability to MPLS-TP ..................................15 + 8.6. Asymmetric Operation ......................................15 + 9. Applicability to MPLS LSPs .....................................15 + 10. Security Considerations .......................................16 + 11. IANA Considerations ...........................................16 + 12. Congestion Considerations .....................................16 + 13. Acknowledgements ..............................................17 + 14. References ....................................................17 + 14.1. Normative References .....................................17 + 14.2. Informative References ...................................18 + + + + + + + +Bryant, et al. Standards Track [Page 2] + +RFC 6391 FAT-PW November 2011 + + +1. Introduction + + A pseudowire (PW) [RFC3985] is normally transported over one single + network path, even if multiple Equal Cost Multiple Paths (ECMPs) + exist between the ingress and egress PW provider edge (PE) equipment + [RFC4385] [RFC4928]. This is required to preserve the + characteristics of the emulated service (e.g., to avoid misordering + Structure-Agnostic Time Division Multiplexing over Packet (SAToP) PW + packets [RFC4553] or subjecting the packets to unusable inter-arrival + times). The use of a single path to preserve order remains the + default mode of operation of a PW. The new capability proposed in + this document is an OPTIONAL mode that may be used when the use of + ECMPs is known to be beneficial (and not harmful) to the operation of + the PW. + + Some PWs are used to transport large volumes of IP traffic between + routers. One example of this is the use of an Ethernet PW to create + a virtual direct link between a pair of routers. Such PWs may carry + from hundreds of Mbps to Gbps of traffic. These PWs only require + packet ordering to be preserved within the context of each individual + transported IP flow. They do not require packet ordering to be + preserved between all packets of all IP flows within the pseudowire. + + The ability to explicitly configure such a PW to leverage the + availability of multiple ECMPs allows for better capacity planning, + as the statistical multiplexing of a larger number of smaller flows + is more efficient than with a smaller set of larger flows. + + Typically, forwarding hardware can deduce that an IP payload is being + directly carried by an MPLS label stack, and it is capable of looking + at some fields in packets to construct hash buckets for conversations + or flows. However, when the MPLS payload is a PW, an intermediate + node has no information on the type of PW being carried in the + packet. This limits the forwarder at the intermediate node to only + being able to make an ECMP choice based on a hash of the MPLS label + stack. In the case of a PW emulating a high-bandwidth trunk, the + granularity obtained by hashing the label stack is inadequate for + satisfactory load balancing. The ingress node, however, is in the + special position of being able to understand the unencapsulated + packet header to assist with spreading flows among any available + ECMPs, or even any Loop-Free Alternates [RFC5286]. This document + defines a method to introduce granularity on the hashing of traffic + running over PWs by introducing an additional label, chosen by the + ingress node, and placed at the bottom of the label stack. + + + + + + + +Bryant, et al. Standards Track [Page 3] + +RFC 6391 FAT-PW November 2011 + + + In addition to providing an indication of the flow structure for use + in ECMP forwarding decisions, the mechanism described in the document + may also be used to select flows for distribution over an IEEE + 802.1AX-2008 (originally specified as IEEE 802.3ad-2000) Link + Aggregation Group (LAG) that has been used in an MPLS network. + + NOTE: Although Ethernet is frequently referenced as a use case in + this RFC, the mechanisms described in this document are general + mechanisms that may be applied to any PW type in which there are + identifiable flows, and in which there is no requirement to preserve + the order between those flows. + +1.1. Requirements Language + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in RFC 2119 [RFC2119]. + +1.2. ECMP in Label Switching Routers + + Label Switching Routers (LSRs) commonly generate a hash of the label + stack or some elements of the label stack as a method of + discriminating between flows and use this to distribute those flows + over the available ECMPs that exist in the network. Since the label + at the bottom of the stack is usually the label most closely + associated with the flow, this normally provides the greatest + entropy, and hence is usually included in the hash. This document + describes a method of adding an additional Label Stack Entry (LSE) at + the bottom of the stack in order to facilitate the load balancing of + the flows within a PW over the available ECMPs. A similar design for + general MPLS use has also been proposed [MPLS-ENTROPY]; see Section 9 + of this document. + + An alternative method of load balancing by creating a number of PWs + and distributing the flows amongst them was considered, but was + rejected because: + + o It did not introduce as much entropy as can be introduced by + adding an additional LSE. + + o It required additional PWs to be set up and maintained. + +1.3. Flow Label + + An additional LSE [RFC3032] is interposed between the PW LSE and the + control word, or if the control word is not present, between the PW + LSE and the PW payload. This additional LSE is called the flow LSE, + and the label carried by the flow LSE is called the flow label. + + + +Bryant, et al. Standards Track [Page 4] + +RFC 6391 FAT-PW November 2011 + + + Indivisible flows within the PW MUST be mapped to the same flow label + by the ingress PE. The flow label stimulates the correct ECMP load- + balancing behaviour in the packet switched network (PSN). On receipt + of the PW packet at the egress PE (which knows a flow LSE is + present), the flow LSE is discarded without processing. + + Note that the flow label MUST NOT be an MPLS reserved label (values + in the range 0..15) [RFC3032], but is otherwise unconstrained by the + protocol. + + It is useful to give consideration to the choice of Time to Live + (TTL) value in the flow LSE [RFC3032]. The flow LSE is at the bottom + of the label stack; therefore, even when penultimate hop popping is + employed, it will always be preceded by the PW label on arrival at + the PE. If, due to an error condition, the flow LSE becomes the top + of the stack, it might be examined as if it were a normal LSE, and + the packet might then be forwarded. This can be prevented by setting + the flow LSE TTL to 1, thereby forcing the packet to be discarded by + the forwarder. Note that setting the TTL to 1 regardless of the + payload may be considered a departure from the TTL procedures defined + in [RFC3032] that apply to the general MPLS case. + + This document does not define a use for the Traffic Class (TC) field + [RFC5462] (formerly known as the Experimental Use (EXP) bits + [RFC3032]) in the flow label. Future documents may define a use for + these bits; therefore, implementations conforming to this + specification MUST set the TC field to zero at the ingress and MUST + ignore them at the egress. + +2. Native Service Processing Function + + The Native Service Processing (NSP) function [RFC3985] is a component + of a PE that has knowledge of the structure of the emulated service + and is able to take action on the service outside the scope of the + PW. In this case, it is REQUIRED that the NSP in the ingress PE + identify flows, or groups of flows within the service, and indicate + the flow (group) identity of each packet as it is passed to the + pseudowire forwarder. As an example, where the PW type is an + Ethernet, the NSP might parse the ingress Ethernet traffic and + consider all of the IP traffic. This traffic could then be + categorised into flows by considering all traffic with the same + source and destination address pair to be a single indivisible flow. + Since this is an NSP function, by definition, the method used to + identify a flow is outside the scope of the PW design. Similarly, + since the NSP is internal to the PE, the method of flow indication to + the PW forwarder is outside the scope of this document. + + + + + +Bryant, et al. Standards Track [Page 5] + +RFC 6391 FAT-PW November 2011 + + +3. Pseudowire Forwarder + + The PW forwarder must be provided with a method of mapping flows to + load-balanced paths. + + The forwarder must generate a label for the flow or group of flows. + How the flow label values are determined is outside the scope of this + document; however, the flow label allocated to a flow MUST NOT be an + MPLS reserved label and SHOULD remain constant for the life of the + flow. It is RECOMMENDED that the method chosen to generate the load- + balancing labels introduce a high degree of entropy in their values, + to maximise the entropy presented to the ECMP selection mechanism in + the LSRs in the PSN, and hence distribute the flows as evenly as + possible over the available PSN ECMP. The forwarder at the ingress + PE prepends the PW control word (if applicable), and then pushes the + flow label, followed by the PW label. + + NOTE: Although this document does not attempt to specify any hash + algorithms, it is suggested that any such algorithm should be based + on the assumption that there will be a high degree of entropy in the + values assigned to the flow labels. + + The forwarder at the egress PE uses the pseudowire label to identify + the pseudowire. From the context associated with the pseudowire + label, the egress PE can determine whether a flow LSE is present. If + a flow LSE is present, it MUST be checked to determine whether it + carries a reserved label. If it is a reserved label, the packet is + processed according to the rules associated with that reserved label; + otherwise, the LSE is discarded. + + All other PW forwarding operations are unmodified by the inclusion of + the flow LSE. + + + + + + + + + + + + + + + + + + + +Bryant, et al. Standards Track [Page 6] + +RFC 6391 FAT-PW November 2011 + + +3.1. Encapsulation + + The PWE3 Protocol Stack Reference Model modified to include flow LSE + is shown in Figure 1. + + +-------------+ +-------------+ + | Emulated | | Emulated | + | Ethernet | | Ethernet | + | (including | Emulated Service | (including | + | VLAN) |<==============================>| VLAN) | + | Services | | Services | + +-------------+ +-------------+ + | Flow | | Flow | + +-------------+ Pseudowire +-------------+ + |Demultiplexer|<==============================>|Demultiplexer| + +-------------+ +-------------+ + | PSN | PSN Tunnel | PSN | + | MPLS |<==============================>| MPLS | + +-------------+ +-------------+ + | Physical | | Physical | + +-----+-------+ +-----+-------+ + + Figure 1: PWE3 Protocol Stack Reference Model + + The encapsulation of a PW with a flow LSE is shown in Figure 2. + + +---------------------------+ + | | + | Payload | + | | n octets + | | + +---------------------------+ + | Optional Control Word | 4 octets + +---------------------------+ + | Flow LSE | 4 octets + +---------------------------+ + | PW LSE | 4 octets + +---------------------------+ + | MPLS Tunnel LSE (s) | n*4 octets (four octets per LSE) + +---------------------------+ + + Figure 2: Encapsulation of a Pseudowire with a Pseudowire Flow LSE + + + + + + + + + +Bryant, et al. Standards Track [Page 7] + +RFC 6391 FAT-PW November 2011 + + +4. Signalling the Presence of the Flow Label + + When using the signalling procedures in [RFC4447], a new Pseudowire + Interface Parameter Sub-TLV, the Flow Label Sub-TLV (FL Sub-TLV), is + used to synchronise the flow label states between the ingress and + egress PEs. + + The absence of an FL Sub-TLV indicates that the PE is unable to + process flow labels. An ingress PE that is using PW signalling and + that does not send an FL Sub-TLV MUST NOT include a flow label in the + PW packet. An ingress PE that is using PW signalling and that does + not receive an FL Sub-TLV from its egress peer MUST NOT include a + flow label in the PW packet. This preserves backwards compatibility + with existing PW specifications. + + A PE that wishes to send a flow label in a PW packet MUST include in + its label mapping message an FL Sub-TLV with T = 1 (see Section 4.1). + + A PE that is willing to receive a flow label MUST include in its + label mapping message an FL Sub-TLV with R = 1 (see Section 4.1). + + A PE that receives a label mapping message containing an FL Sub-TLV + with R = 0 MUST NOT include a flow label in the PW packet. + + Thus, a PE sending an FL Sub-TLV with T = 1 and receiving an FL + Sub-TLV with R = 1 MUST include a flow label in the PW packet. Under + all other combinations of FL Sub-TLV signalling, a PE MUST NOT + include a flow label in the PW packet. + + The signalling procedures in [RFC4447] state that "Processing of the + interface parameters should continue when unknown interface + parameters are encountered, and they MUST be silently ignored". The + signalling procedure described here is therefore backwards compatible + with existing implementations. + + Note that what is signalled is the desire to include the flow LSE in + the label stack. The value of the flow label is a local matter for + the ingress PE, and the label value itself is not signalled. + + + + + + + + + + + + + +Bryant, et al. Standards Track [Page 8] + +RFC 6391 FAT-PW November 2011 + + +4.1. Structure of Flow Label Sub-TLV + + The structure of the Flow Label Sub-TLV is shown in Figure 3. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | FL=0x17 | Length |T|R| Reserved | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Figure 3: Flow Label Sub-TLV + + Where: + + o FL (value 0x17) is the Flow Label Sub-TLV identifier assigned by + IANA (see Section 11). + + o Length is the length of the Sub-TLV in octets and is 4. + + o When T = 1, the PE is requesting the ability to send a PW packet + that includes a flow label. When T = 0, the PE is indicating that + it will not send a PW packet containing a flow label. + + o When R = 1, the PE is able to receive a PW packet with a flow + label present. When R = 0, the PE is unable to receive a PW + packet with the flow label present. + + o Reserved bits MUST be zero on transmit and MUST be ignored on + receive. + +5. Static Pseudowires + + If PWE3 signalling [RFC4447] is not in use for a PW, then whether the + flow label is used MUST be identically provisioned in both PEs at the + PW endpoints. If there is no provisioning support for this option, + the default behaviour is not to include the flow label. + +6. Multi-Segment Pseudowires + + The flow label mechanism described in this document works on + multi-segment PWs without requiring modification to the Switching PEs + (S-PEs). This is because the flow LSE is transparent to the label + swap operation, and because interface parameter Sub-TLV signalling is + transitive. + + + + + + + +Bryant, et al. Standards Track [Page 9] + +RFC 6391 FAT-PW November 2011 + + +7. Operations, Administration, and Maintenance (OAM) + + The following OAM considerations apply to this method of load + balancing. + + Where the OAM is only to be used to perform a basic test to verify + that the PWs have been configured at the PEs, Virtual Circuit + Connectivity Verification (VCCV) [RFC5085] messages may be sent using + any load balance PW path, i.e., using any value for the flow label. + + Where it is required to verify that a pseudowire is fully functional + for all flows, a VCCV [RFC5085] connectivity verification message + MUST be sent over each ECMP path to the pseudowire egress PE. This + solution may be difficult to achieve and scales poorly. Under these + circumstances, it may be sufficient to send VCCV messages using any + load balance pseudowire path, because if a failure occurs within the + PSN, the failure will normally be detected and repaired by the PSN. + That is, the PSN's Interior Gateway Protocol (IGP) link/node failure + detection mechanism (loss of light, bidirectional forwarding + detection [RFC5880], or IGP hello detection) and the IGP convergence + will naturally modify the ECMP set of network paths between the + ingress and egress PEs. Hence, the PW is only impacted during the + normal IGP convergence time. Note that this period may be reduced if + a fast re-route or fast convergence technology is deployed in the + network [RFC4090] [RFC5286]. + + If the failure is related to the individual corruption of a Label + Forwarding Information Base (LFIB) entry in a router, then only the + network path using that specific entry is impacted. If the PW is + load-balanced over multiple network paths, then this failure can only + be detected if, by chance, the transported OAM flow is mapped onto + the impacted network path, or if all paths are tested. Since testing + all paths may present problems as noted above, other mechanisms to + detect this type of error may need to be developed, such as a Label + Switched Path (LSP) self-test technology. + + To troubleshoot the MPLS PSN, including multiple paths, the + techniques described in [RFC4378] and [RFC4379] can be used. + + Where the PW OAM is carried out of band (VCCV Type 2) [RFC5085], it + is necessary to insert an "MPLS Router Alert Label" in the label + stack. The resultant label stack is as follows: + + + + + + + + + +Bryant, et al. Standards Track [Page 10] + +RFC 6391 FAT-PW November 2011 + + + +-------------------------------+ + | | + | VCCV Message | n octets + | | + +-------------------------------+ + | Optional Control Word | 4 octets + +-------------------------------+ + | Flow LSE | 4 octets + +-------------------------------+ + | PW LSE | 4 octets + +-------------------------------+ + | Router Alert LSE | 4 octets + +-------------------------------+ + | MPLS Tunnel LSE(s) | n*4 octets (four octets per label) + +-------------------------------+ + + Figure 4: Use of Router Alert Label + + Note that, depending on the number of labels hashed by the LSR, the + inclusion of the Router Alert label may cause the OAM packet to be + load-balanced to a different path from that taken by the data packets + with identical flow and PW labels. + +8. Applicability of PWs Using Flow Labels + + A node within the PSN is not able to perform deep packet inspection + (DPI) of the PW, as the PW technology is not self-describing: the + structure of the PW payload is only known to the ingress and egress + PE devices. The method proposed in this document provides a + statistical mitigation of the problem of load balance in those cases + where a PE is able to discern flows embedded in the traffic received + on the attachment circuit. + + The methods described in this document are transparent to the PSN and + as such do not require any new capability from the PSN. + + The requirement to load-balance over multiple PSN paths occurs when + the ratio between the PW access speed and the PSN's core link + bandwidth is large (e.g., >= 10%). ATM and Frame Relay are unlikely + to meet this property. Ethernet may have this property, and for that + reason this document focuses on Ethernet. Applications for other + high-access-bandwidth PWs may be defined in the future. + + This design applies to MPLS PWs where it is meaningful to + de-construct the packets presented to the ingress PE into flows. The + mechanism described in this document promotes the distribution of + flows within the PW over different network paths. In turn, this + means that whilst packets within a flow are delivered in order + + + +Bryant, et al. Standards Track [Page 11] + +RFC 6391 FAT-PW November 2011 + + + (subject to normal IP delivery perturbations due to topology + variation), order is no longer maintained for all packets sent over + the PW. It is not proposed to associate a different sequence number + with each flow. If sequence number support is required, the flow + label mechanism MUST NOT be used. + + Where it is known that the traffic carried by the Ethernet PW is IP, + the flows can be identified and mapped to an ECMP. Such methods + typically include hashing on the source and destination addresses, + the protocol ID and higher-layer flow-dependent fields such as + TCP/UDP ports, Layer 2 Tunneling Protocol version 3 (L2TPv3) Session + IDs, etc. + + Where it is known that the traffic carried by the Ethernet PW is + non-IP, techniques used for link bundling between Ethernet switches + may be reused. In this case, however, the latency distribution would + be larger than is found in the link bundle case. The acceptability + of the increased latency is for further study. Of particular + importance, the Ethernet control frames SHOULD always be mapped to + the same PSN path to ensure in-order delivery. + +8.1. Equal Cost Multiple Paths + + ECMP in packet switched networks is statistical in nature. The + mapping of flows to a particular path does not take into account the + bandwidth of the flow being mapped or the current bandwidth usage of + the members of the ECMP set. This simplification works well when the + distribution of flows is evenly spread over the ECMP set and there + are a large number of flows that have low bandwidth relative to the + paths. The random allocation of a flow to a path provides a good + approximation to an even spread of flows, provided that polarisation + effects are avoided. The method defined in this document has the + same statistical properties as an IP PSN. + + ECMP is a load-sharing mechanism that is based on sharing the load + over a number of layer 3 paths through the PSN. Often, however, + multiple links exist between a pair of LSRs that are considered by + the IGP to be a single link. These are known as link bundles. The + mechanism described in this document can also be used to distribute + the flows within a PW over the members of the link bundle by using + the flow label value to identify candidate flows. How that mapping + takes place is outside the scope of this specification. Similar + considerations apply to Link Aggregation Groups. + + There is no mechanism currently defined to indicate the bandwidths in + use by specific flows using the fields of the MPLS shim header. + Furthermore, since the semantics of the MPLS shim header are fully + defined in [RFC3032] and [RFC5462], those fields cannot be assigned + + + +Bryant, et al. Standards Track [Page 12] + +RFC 6391 FAT-PW November 2011 + + + semantics to carry this information. This document does not define + any semantic for use in the TTL or TC fields of the label entry that + carries the flow label, but requires that the flow label itself be + selected with a high degree of entropy suggesting that the label + value should not be overloaded with additional meaning in any + subsequent specification. + + A different type of load balancing is the desire to carry a PW over a + set of PSN links in which the bandwidth of members of the link set is + less than the bandwidth of the PW. Proposals to address this problem + have been made in the past [PWBONDING]. Such a mechanism can be + considered complementary to this mechanism. + +8.2. Link Aggregation Groups + + A Link Aggregation Group (LAG) is used to bond together several + physical circuits between two adjacent nodes so they appear to + higher-layer protocols as a single, higher-bandwidth "virtual" pipe. + These may coexist in various parts of a given network. An advantage + of LAGs is that they reduce the number of routing and signalling + protocol adjacencies between devices, reducing control plane + processing overhead. As with ECMP, the key problem related to LAGs + is that due to inefficiencies in LAG load-distribution algorithms, a + particular component of a LAG may experience congestion. The + mechanism proposed here may be able to assist in producing a more + uniform flow distribution. + + The same considerations requiring a flow to go over a single member + of an ECMP set apply to a member of a LAG. + +8.3. Multiple RSVP-TE Paths + + In some networks, it is desirable for a Label Edge Router (LER) to be + able to load-balance a PW across multiple Resource Reservation + Protocol - Traffic Engineering (RSVP-TE) tunnels. The flow label + mechanism described in this document may be used to provide the LER + with the required flow information and necessary entropy to provide + this type of load balancing. An example of such a case is the use of + the flow label mechanism in networks using a link bundle with the all + ones component [RFC4201]. + + Methods by which the LER is configured to apply this type of ECMP are + outside the scope of this document. + + + + + + + + +Bryant, et al. Standards Track [Page 13] + +RFC 6391 FAT-PW November 2011 + + +8.4. The Single Large Flow Case + + Clearly, the operator should make sure that the service offered using + PW technology and the method described in this document do not exceed + the maximum planned link capacity, unless it can be guaranteed that + they conform to the Internet traffic profile of a very large number + of small flows. + + If the NSP cannot access sufficient information to distinguish flows, + perhaps because the protocol stack required parsing further into the + packet than it is able, then the functionality described in this + document does not give any benefits. The most common case where a + single flow dominates the traffic on a PW is when it is used to + transport enterprise traffic. Enterprise traffic may well consist of + a single, large TCP flow, or encrypted flows that cannot be handled + by the methods described in this document. + + An operator has four options under these circumstances: + + 1. The operator can choose to do nothing, and the system will work + as it does without the flow label. + + 2. The operator can make the customer aware that the service + offering has a restriction on flow bandwidth and police flows to + that restriction. This would allow customers offering multiple + flows to use a larger fraction of their access bandwidth, whilst + preventing a single flow from consuming a fraction of internal + link bandwidth that the operator considered excessive. + + 3. The operator could configure the ingress PE to assign a constant + flow label to all high-bandwidth flows so that only one path was + affected by these flows. + + 4. The operator could configure the ingress PE to assign a random + flow label to all high-bandwidth flows so as to minimise the + disruption to the network at the cost of out-of-order traffic to + the user. + + The issues described above are mitigated by the following two + factors: + + o Firstly, the customer of a high-bandwidth PW service has an + incentive to get the best transport service, because an + inefficient use of the PSN leads to jitter and eventually to loss + to the PW's payload. + + + + + + +Bryant, et al. Standards Track [Page 14] + +RFC 6391 FAT-PW November 2011 + + + o Secondly, the customer is usually able to tailor their + applications to generate many flows in the PSN. A well-known + example is massive data transport between servers that use many + parallel TCP sessions. This same technique can be used by any + transport protocol: multiple UDP ports, multiple L2TPv3 Session + IDs, or multiple Generic Routing Encapsulation (GRE) keys may be + used to decompose a large flow into smaller components. This + approach may be applied to IPsec [RFC4301] where multiple Security + Parameter Indexes (SPIs) may be allocated to the same security + association. + +8.5. Applicability to MPLS-TP + + The MPLS Transport Profile (MPLS-TP) [RFC5654] Requirement 44 states + that "MPLS-TP MUST support mechanisms that ensure the integrity of + the transported customer's service traffic as required by its + associated Service Level Agreement (SLA). Loss of integrity may be + defined as packet corruption, reordering, or loss during normal + network conditions". In addition, MPLS-TP makes extensive use of the + fate sharing between OAM and data packets, which is defeated by the + flow LSE. The flow-aware transport of a PW reorders packets and + therefore MUST NOT be deployed in a network conforming to MPLS-TP, + unless these integrity requirements specified in the SLA can be + satisfied. + +8.6. Asymmetric Operation + + The protocol defined in this document supports the asymmetric + inclusion of the flow LSE. Asymmetric operation can be expected when + there is asymmetry in the bandwidth requirements making it + unprofitable for one PE to perform the flow classification, or when + that PE is otherwise unable to perform the classification but is able + to receive flow labeled packets from its peer. Asymmetric operation + of the PW may also be required when one PE has a high transmission + bandwidth requirement, but has a need to receive the entire PW on a + single interface in order to perform a processing operation that + requires the context of the complete PW (for example, policing of the + egress traffic). + +9. Applicability to MPLS LSPs + + An extension of this technique is to create a basis for hash + diversity without having to peek below the label stack for IP traffic + carried over Label Distribution Protocol (LDP) LSPs. The + generalisation of this extension to MPLS has been described in + [MPLS-ENTROPY]. This generalisation can be regarded as a + + + + + +Bryant, et al. Standards Track [Page 15] + +RFC 6391 FAT-PW November 2011 + + + complementary, but distinct, approach from the technique described in + this document. While similar consideration may apply to the + identification of flows and the allocation of flow label values, the + flow labels are imposed by different network components, and the + associated signalling mechanisms are different. + +10. Security Considerations + + The PW generic security considerations described in [RFC3985] and the + security considerations applicable to a specific PW type (for + example, in the case of an Ethernet PW [RFC4448]) apply. The + security considerations in [RFC5920] also apply. + + Section 1.3 describes considerations that apply to the TTL value used + in the flow LSE. The use of a TTL value of one prevents the + accidental forwarding of a packet based on the label value in the + flow LSE. + +11. IANA Considerations + + IANA maintains the registry "Pseudowire Name Spaces (PWE3)" with + sub-registry "Pseudowire Interface Parameters Sub-TLV type Registry". + IANA has registered the Flow Label Sub-TLV type in this registry. + + Parameter ID Length Description Reference + ------------------------------------------------------ + 0x17 4 Flow Label RFC 6391 + +12. Congestion Considerations + + The congestion considerations applicable to PWs as described in + [RFC3985] apply to this design. + + The ability to explicitly configure a PW to leverage the availability + of multiple ECMPs is beneficial to capacity planning as, all other + parameters being constant, the statistical multiplexing of a larger + number of smaller flows is more efficient than with a smaller number + of larger flows. + + Note that if the classification into flows is only performed on IP + packets, the behaviour of those flows in the face of congestion will + be as already defined by the IETF for packets of that type, and no + additional congestion processing is required. + + Where flows that are not IP are classified, PW congestion avoidance + must be applied to each non-IP load balance group. + + + + + +Bryant, et al. Standards Track [Page 16] + +RFC 6391 FAT-PW November 2011 + + +13. Acknowledgements + + The authors wish to thank Mary Barnes, Eric Grey, Kireeti Kompella, + Joerg Kuechemann, Wilfried Maas, Luca Martini, Mark Townsley, Rolf + Winter, and Lucy Yong for valuable comments on this document. + +14. References + +14.1. Normative References + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC3032] Rosen, E., Tappan, D., Fedorkow, G., Rekhter, Y., + Farinacci, D., Li, T., and A. Conta, "MPLS Label Stack + Encoding", RFC 3032, January 2001. + + [RFC4379] Kompella, K. and G. Swallow, "Detecting Multi-Protocol + Label Switched (MPLS) Data Plane Failures", RFC 4379, + February 2006. + + [RFC4385] Bryant, S., Swallow, G., Martini, L., and D. McPherson, + "Pseudowire Emulation Edge-to-Edge (PWE3) Control Word + for Use over an MPLS PSN", RFC 4385, February 2006. + + [RFC4447] Martini, L., Ed., Rosen, E., El-Aawar, N., Smith, T., and + G. Heron, "Pseudowire Setup and Maintenance Using the + Label Distribution Protocol (LDP)", RFC 4447, April 2006. + + [RFC4448] Martini, L., Ed., Rosen, E., El-Aawar, N., and G. Heron, + "Encapsulation Methods for Transport of Ethernet over + MPLS Networks", RFC 4448, April 2006. + + [RFC4553] Vainshtein, A., Ed., and YJ. Stein, Ed., "Structure- + Agnostic Time Division Multiplexing (TDM) over Packet + (SAToP)", RFC 4553, June 2006. + + [RFC4928] Swallow, G., Bryant, S., and L. Andersson, "Avoiding + Equal Cost Multipath Treatment in MPLS Networks", + BCP 128, RFC 4928, June 2007. + + [RFC5085] Nadeau, T., Ed., and C. Pignataro, Ed., "Pseudowire + Virtual Circuit Connectivity Verification (VCCV): A + Control Channel for Pseudowires", RFC 5085, + December 2007. + + + + + + +Bryant, et al. Standards Track [Page 17] + +RFC 6391 FAT-PW November 2011 + + +14.2. Informative References + + [MPLS-ENTROPY] + Kompella, K., Drake, J., Amante, S., Henderickx, W., and + L. Yong, "The Use of Entropy Labels in MPLS Forwarding", + Work in Progress, October 2011. + + [PWBONDING] Stein, Y(J)., Mendelsohn, I., and R. Insler, "PW + Bonding", Work in Progress, November 2008. + + [RFC3985] Bryant, S., Ed., and P. Pate, Ed., "Pseudo Wire Emulation + Edge-to-Edge (PWE3) Architecture", RFC 3985, March 2005. + + [RFC4090] Pan, P., Ed., Swallow, G., Ed., and A. Atlas, Ed., "Fast + Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090, + May 2005. + + [RFC4201] Kompella, K., Rekhter, Y., and L. Berger, "Link Bundling + in MPLS Traffic Engineering (TE)", RFC 4201, + October 2005. + + [RFC4301] Kent, S. and K. Seo, "Security Architecture for the + Internet Protocol", RFC 4301, December 2005. + + [RFC4378] Allan, D., Ed., and T. Nadeau, Ed., "A Framework for + Multi-Protocol Label Switching (MPLS) Operations and + Management (OAM)", RFC 4378, February 2006. + + [RFC5286] Atlas, A., Ed., and A. Zinin, Ed., "Basic Specification + for IP Fast Reroute: Loop-Free Alternates", RFC 5286, + September 2008. + + [RFC5462] Andersson, L. and R. Asati, "Multiprotocol Label + Switching (MPLS) Label Stack Entry: "EXP" Field Renamed + to "Traffic Class" Field", RFC 5462, February 2009. + + [RFC5654] Niven-Jenkins, B., Ed., Brungard, D., Ed., Betts, M., + Ed., Sprecher, N., and S. Ueno, "Requirements of an MPLS + Transport Profile", RFC 5654, September 2009. + + [RFC5880] Katz, D. and D. Ward, "Bidirectional Forwarding Detection + (BFD)", RFC 5880, June 2010. + + [RFC5920] Fang, L., Ed., "Security Framework for MPLS and GMPLS + Networks", RFC 5920, July 2010. + + + + + + +Bryant, et al. Standards Track [Page 18] + +RFC 6391 FAT-PW November 2011 + + +Authors' Addresses + + Stewart Bryant (editor) + Cisco Systems + 250 Longwater Ave. + Reading RG2 6GB + United Kingdom + + Phone: +44-208-824-8828 + EMail: stbryant@cisco.com + + + Clarence Filsfils + Cisco Systems + Brussels + Belgium + + EMail: cfilsfil@cisco.com + + + Ulrich Drafz + Deutsche Telekom + Muenster + Germany + + EMail: Ulrich.Drafz@telekom.de + + + Vach Kompella + Alcatel-Lucent + + EMail: vach.kompella@alcatel-lucent.com + + + Joe Regan + Alcatel-Lucent + + EMail: joe.regan@alcatel-lucent.com + + + Shane Amante + Level 3 Communications, LLC + 1025 Eldorado Blvd. + Broomfield, CO 80021 + USA + + EMail: shane@level3.net + + + + +Bryant, et al. Standards Track [Page 19] + -- cgit v1.2.3