diff options
Diffstat (limited to 'doc/rfc/rfc9524.txt')
-rw-r--r-- | doc/rfc/rfc9524.txt | 1160 |
1 files changed, 1160 insertions, 0 deletions
diff --git a/doc/rfc/rfc9524.txt b/doc/rfc/rfc9524.txt new file mode 100644 index 0000000..983e25e --- /dev/null +++ b/doc/rfc/rfc9524.txt @@ -0,0 +1,1160 @@ + + + + +Internet Engineering Task Force (IETF) D. Voyer, Ed. +Request for Comments: 9524 Bell Canada +Category: Standards Track C. Filsfils +ISSN: 2070-1721 R. Parekh + Cisco Systems, Inc. + H. Bidgoli + Nokia + Z. Zhang + Juniper Networks + February 2024 + + + Segment Routing Replication for Multipoint Service Delivery + +Abstract + + This document describes the Segment Routing Replication segment for + multipoint service delivery. A Replication segment allows a packet + to be replicated from a replication node to downstream nodes. + +Status of This Memo + + This is an Internet Standards Track document. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Further information on + Internet Standards is available in Section 2 of RFC 7841. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + https://www.rfc-editor.org/info/rfc9524. + +Copyright Notice + + Copyright (c) 2024 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (https://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. Code Components extracted from this document must + include Revised BSD License text as described in Section 4.e of the + Trust Legal Provisions and are provided without warranty as described + in the Revised BSD License. + +Table of Contents + + 1. Introduction + 1.1. Terminology + 1.2. Use Cases + 2. Replication Segment + 2.1. SR-MPLS Data Plane + 2.2. SRv6 Data Plane + 2.2.1. End.Replicate: Replicate and/or Decapsulate + 2.2.2. OAM Operations + 2.2.3. ICMPv6 Error Messages + 3. IANA Considerations + 4. Security Considerations + 5. References + 5.1. Normative References + 5.2. Informative References + Appendix A. Illustration of a Replication Segment + A.1. SR-MPLS + A.2. SRv6 + A.2.1. Pinging a Replication-SID + Acknowledgements + Contributors + Authors' Addresses + +1. Introduction + + The Replication segment is a new type of segment for Segment Routing + (SR) [RFC8402], which allows a node (henceforth called a "replication + node") to replicate packets to a set of other nodes (called + "downstream nodes") in an SR domain. A Replication segment can + replicate packets to directly connected nodes or to downstream nodes + (without the need for state on the transit routers). This document + focuses on specifying the behavior of a Replication segment for both + Segment Routing with Multiprotocol Label Switching (SR-MPLS) + [RFC8660] and Segment Routing with IPv6 (SRv6) [RFC8986]. The + examples in Appendix A illustrate the behavior of a Replication + Segment in an SR domain. The use of two or more Replication segments + stitched together to form a tree using a control plane is left to be + specified in other documents. The management of IP multicast groups, + building IP multicast trees, and performing multicast congestion + control are out of scope of this document. + +1.1. Terminology + + This section defines terms introduced and used frequently in this + document. Refer to the Terminology sections of [RFC8402], [RFC8754], + and [RFC8986] for other terms used in SR. + + Replication segment: A segment in an SR domain that replicates + packets. See Section 2 for details. + + Replication node: A node in an SR domain that replicates packets + based on a Replication segment. + + Downstream nodes: A Replication segment replicates packets to a set + of nodes. These nodes are downstream nodes. + + Replication state: State held for a Replication segment at a + replication node. It is conceptually a list of Replication + branches to downstream nodes. The list can be empty. + + Replication-SID: Data plane identifier of a Replication segment. + This is an SR-MPLS label or SRv6 Segment Identifier (SID). + + SRH: IPv6 Segment Routing Header [RFC8754]. + + Point-to-Multipoint (P2MP) Service: A service that has one ingress + node and one or more egress nodes. A packet is delivered to all + the egress nodes. + + Root node: An ingress node of a P2MP service. + + Leaf node: An egress node of a P2MP service. + + Bud node: A node that is both a replication node and a leaf node. + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in BCP + 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + +1.2. Use Cases + + In the simplest use case, a single Replication segment includes the + ingress node of a multipoint service and the egress nodes of the + service as all the downstream nodes. This achieves Ingress + Replication [RFC7988] that has been widely used for Multicast VPN + (MVPN) [RFC6513] and Ethernet VPN (EVPN) [RFC7432] bridging of + Broadcast, Unknown Unicast, and Multicast (BUM) traffic. This + Replication segment on ingress and egress nodes can either be + provisioned locally or using dynamic autodiscovery procedures for + MVPN and EVPN. Note SRv6 [RFC8986] has End.DT2M replication behavior + for EVPN BUM traffic. + + Replication segments can also be used to form trees by stitching + Replication segments on a root node, intermediate replication nodes, + and leaf nodes for efficient delivery of MVPN and EVPN BUM traffic. + +2. Replication Segment + + In an SR domain, a Replication segment is a logical construct that + connects a replication node to a set of downstream nodes. A + Replication segment is a local segment instantiated at a Replication + node. It can be either provisioned locally on a node or programmed + by a control plane. + + Replication segments can be stitched together to form a tree by + either local provisioning on nodes or using a control plane. The + procedures for doing this are out of scope of this document. One + such control plane using a PCE with the SR P2MP policy is specified + in [P2MP-POLICY]. However, if local provisioning is used to stitch + Replication segments, then a chain of Replication segments SHOULD NOT + form a loop. If a control plane is used to stitch Replication + segments, the control plane specification MUST prevent loops or + detect and mitigate loops in steady state. + + A Replication segment is identified by the tuple <Replication-ID, + Node-ID>, where: + + Replication-ID: An identifier for a Replication segment that is + unique in context of the replication node. + + Node-ID: The address of the replication node for the Replication + segment. Note that the root of a multipoint service is also a + Replication node. + + Replication-ID is a variable-length field. In the simplest case, it + can be a 32-bit number, but it can be extended or modified as + required based on the specific use of a Replication segment. This is + out of scope for this document. The length of the Replication-ID is + specified in the signaling mechanism used for the Replication + segment. Examples of such signaling and extensions are described in + [P2MP-POLICY]. When the PCE signals a Replication segment to its + node, the <Replication-ID, Node-ID> tuple identifies the segment. + + A Replication segment includes the following elements: + + Replication-SID: The Segment Identifier of a Replication segment. + This is an SR-MPLS label or an SRv6 SID [RFC8402]. + + Downstream nodes: Set of nodes in an SR domain to which a packet is + replicated by the Replication segment. + + Replication state: See below. + + The downstream nodes and Replication state (RS) of a Replication + segment can change over time, depending on the network state and leaf + nodes of a multipoint service that the segment is part of. + + The Replication-SID identifies the Replication segment in the + forwarding plane. At a replication node, the Replication-SID + operates on the RS of the Replication segment. + + RS is a list of Replication branches to the downstream nodes. In + this document, each branch is abstracted to a <downstream node, + downstream Replication-SID> tuple. <downstream node> represents the + reachability from the replication node to the downstream node. In + its simplest form, this MAY be specified as an interface or next-hop + if the downstream node is adjacent to the replication node. The + reachability may be specified in terms of a Flexible Algorithm path + (including the default algorithm) [RFC9350] or specified by an SR- + explicit path represented either by a SID list (of one or more SIDs) + or by a Segment Routing Policy [RFC9256]. The downstream + Replication-SID is the Replication-SID of the Replication segment at + the downstream node. + + A packet is steered into a Replication segment at a replication node + in two ways: + + * When the active segment [RFC8402] is a locally instantiated + Replication-SID. + + * By the root of a multipoint service based on local configuration + that is outside the scope of this document. + + In either case, the packet is replicated to each downstream node in + the associated RS. + + If a downstream node is an egress (leaf) of the multipoint service, + no further replication is needed. The leaf node's Replication + segment has an indicator for the leaf role, and it does not have any + RS (i.e., the list of Replication branches is empty). The + Replication-SID at a leaf node MAY be used to identify the multipoint + service. Notice that the segment on the leaf node is still referred + to as a "Replication segment" for the purpose of generalization. + + A node can be a bud node (i.e., it is a replication node and a leaf + node of a multipoint service [P2MP-POLICY]). The Replication segment + of a bud node has a list of Replication branches as well as a leaf + role indicator. + + In principle, it is possible for different Replication segments to + replicate packets to the same Replication segment on a downstream + node. However, such usage is intentionally left out of scope of this + document. + +2.1. SR-MPLS Data Plane + + When the active segment is a Replication-SID, the processing results + in a POP [RFC8402] operation and the lookup of the associated RS. + For each replication in the RS, the operation is a PUSH [RFC8402] of + the downstream Replication-SID and an optional segment list onto the + packet to steer the packet to the downstream node. + + The operation performed on the incoming Replication-SID is NEXT + [RFC8402] at a leaf or bud node where delivery of payload off the + tree is per local configuration. For some usages, this may involve + looking at the next SID, for example, to get the necessary context. + + When the root of a multipoint service steers a packet to a + Replication segment, it results in a replication to each downstream + node in the associated RS. The operation is a PUSH of the + Replication-SID and an optional segment list onto the packet, which + is forwarded to the downstream node. + + The following applies to a Replication-SID in MPLS encapsulation: + + * SIDs MAY be inserted before the downstream SR-MPLS Replication-SID + in order to guide a packet from a non-adjacent SR node to a + replication node. + + * A replication node MAY replicate a packet to a non-adjacent + downstream node using SIDs it inserts in the copy preceding the + downstream Replication-SID. The downstream node may be a leaf + node of the Replication segment, another replication node, or both + in the case of a bud node. + + * A replication node MAY use an Anycast-SID or a Border Gateway + Protocol (BGP) PeerSet-SID in the segment list to send a + replicated packet to one downstream replication node in a set of + Anycast nodes. This occurs if and only if all nodes in the set + have an identical Replication-SID and reach the same set of + receivers. + + * For some use cases, there MAY be SIDs after the Replication-SID in + the segment list of a packet. These SIDs are used only by the + leaf and bud nodes to forward a packet off the tree independent of + the Replication-SID. Coordination regarding the absence or + presence and value of context information for leaf and bud nodes + is outside the scope of this document. + +2.2. SRv6 Data Plane + + For SRv6 [RFC8986], this document specifies "Endpoint with + replication and/or decapsulate" behavior (End.Replicate for short) to + replicate a packet and forward the replicas according to an RS. + + When processing a packet destined to a local Replication-SID, the + packet is replicated according to the associated RS to downstream + nodes and/or locally delivered off the tree when this is a leaf or + bud node. For replication, the outer header is reused, and the + downstream Replication-SID, from RS, is written into the outer IPv6 + header Destination Address (DA). If required, an optional segment + list may be used on some branches using H.Encaps.Red [RFC8986] (while + some other branches may not need that). Note that this H.Encaps.Red + is independent of the Replication segment: it is just used to steer + the replicated packet on a traffic-engineered path to a downstream + node. The penultimate segment in the encapsulating IPv6 header will + execute the Ultimate Segment Decapsulation (USD) flavor [RFC8986] of + End/End.X behavior and forward the inner (replicated) packet to the + downstream node. If H.Encaps.Red is used to steer a replicated + packet to a downstream node, the operator must ensure the MTU on path + to the downstream node is sufficient to account for additional SRv6 + encapsulation. This also applies when the Replication segment is for + the root node, whose upstream node has placed the Replication-SID in + the header. + + A local application on root (e.g., MVPN [RFC6513] or EVPN [RFC7432]) + may also apply H.Encaps.Red and then steer the resulting traffic into + the Replication segment. Again, note that H.Encaps.Red is + independent of the Replication segment: it is the action of the + application (e.g. MVPN or EVPN service). If the service is on a + root node, then the two H.Encaps mentioned, one for the service and + the other in the previous paragraph for replication to the downstream + node, SHOULD be combined for optimization (to avoid extra IPv6 + encapsulation). + + When processing a packet destined to a local Replication-SID, the + IPv6 Hop Limit MUST be decremented and MUST be non-zero to replicate + the packet. A root node that encapsulates a payload can set the IPv6 + Hop Limit based on a local policy. This local policy SHOULD set the + IPv6 Hop Limit so that a replicated packet can reach the furthest + leaf node. A root node can also have a local policy to set the IPv6 + Hop Limit from the payload. In this case, the IPv6 Hop Limit may not + be sufficient to get the replicated packet to all the leaf nodes. + Non-replication nodes (i.e., nodes that forward replicated packets + based on the IPv6 locator unicast prefix) can decrement the IPv6 Hop + Limit to zero and originate ICMPv6 error packets to the root node. + This can result in a storm of ICMPv6 packets (see Section 2.2.3) to + the root node. To avoid this, a Replication segment has an optional + IPv6 Hop Limit Threshold. If this threshold is set, a replication + node MUST discard an incoming packet with a local Replication-SID if + the IPv6 Hop Limit in the packet is less than the threshold and log + this in a rate-limited manner. The IPv6 Hop Limit Threshold SHOULD + be set so that an incoming packet can be replicated to the furthest + leaf node. + + For leaf and bud nodes, local delivery off the tree is per + Replication-SID or the next SID (if present in the SRH). For some + usages, this may involve getting the necessary context either from + the next SID (e.g., MVPN with a shared tree) or from the Replication- + SID itself (e.g., MVPN with a non-shared tree). In both cases, the + context association is achieved with signaling and is out of scope of + this document. + + The following applies to a Replication-SID in SRv6 encapsulation: + + * There MAY be SIDs preceding the SRv6 Replication-SID in order to + guide a packet from a non-adjacent SR node to a replication node + via an explicit path. + + * A replication node MAY steer a replicated packet on an explicit + path to a non-adjacent downstream node using SIDs it inserts in + the copy preceding the downstream Replication-SID. The downstream + node may be a leaf node of the Replication segment, another + replication node, or both in the case of a bud node. + + * For SRv6, as described in above paragraphs, the insertion of SIDs + prior to the Replication-SID entails a new IPv6 encapsulation with + the SRH. However, this can be optimized on the root node or for + compressed SRv6 SIDs. + + * The locator of the Replication-SID is sufficient to guide a packet + on the shortest path between non-adjacent nodes for default or + Flexible Algorithms. + + * A replication node MAY use an Anycast-SID or a BGP PeerSet-SID in + the segment list to send a replicated packet to one downstream + replication node in an Anycast set. This occurs if and only if + all nodes in the set have an identical Replication-SID and reach + the same set of receivers. + + * There MAY be SIDs after the Replication-SID in the SRH of a + packet. These SIDs are used to provide additional context for + processing a packet locally at the node where the Replication-SID + is the active segment. Coordination regarding the absence or + presence and value of context information for leaf and bud nodes + is outside the scope of this document. + +2.2.1. End.Replicate: Replicate and/or Decapsulate + + The "Endpoint with replication and/or decapsulate" (End.Replicate for + short) is a variant of End behavior. The pseudocode in this section + follows the convention introduced in [RFC8986]. + + An RS conceptually contains the following elements: + + Replication state: + { + Node-Role: {Head, Transit, Leaf, Bud}; + IPv6 Hop Limit Threshold; # default is zero + # On Leaf, replication list is zero length + Replication-List: + { + downstream node: <Node-Identifier>; + downstream Replication-SID: R-SID; + # Segment-List may be empty + Segment-List: [SID-1, .... SID-N]; + } + } + + Below is the Replicate function on a packet for Replication state + (RS). + + S01. Replicate(RS, packet) + S02. { + S03. For each Replication R in RS.Replication-List { + S04. Make a copy of the packet + S05. Set IPv6 DA = RS.R-SID + S06. If RS.Segment-List is not empty { + S07. # Head node may optimize below encapsulation and + S08. # the encapsulation of packet in a single encapsulation + S09. Execute H.Encaps or H.Encaps.Red with RS.Segment-List + on packet copy #RFC 8986, Sections 5.1 and 5.2 + S10. } + S11. Submit the packet to the egress IPv6 FIB lookup and + transmission to the new destination + S12. } + S13. } + + Notes: + + * The IPv6 DA in the copy of a packet is set from the local state + and not from the SRH. + + When N receives a packet whose IPv6 DA is S and S is a local + End.Replicate SID, N does: + + S01. Lookup FUNCT portion of S to get Replication state (RS) + S02. If (IPv6 Hop Limit <= 1) { + S03. Discard the packet + S04. # ICMPv6 Time Exceeded is not permitted + (see Section 2.2.3) + S05. } + S06. If RS is not found { + S07. Discard the packet + S08. } + S09. If (IPv6 Hop Limit < RS.IPv6 Hop Limit Threshold) { + S10. Discard the packet + S11. # Rate-limited logging + S12. } + S13. Decrement IPv6 Hop Limit by 1 + S14. If (IPv6 NH == SRH and SRH TLVs present) { + S15. Process SRH TLVs if allowed by local configuration + S16. } + S17. Call Replicate(RS, packet) + S18. If (RS.Node-Role == Leaf OR RS.Node-Role == bud) { + S19. If (IPv6 NH == SRH and Segments Left > 0) { + S20. Derive packet processing context (PPC) from Segment List + S21. If (Segments Left != 0) { + S22. Discard the packet + S23. # ICMPv6 Parameter Problem message with Code 0 + S24. # (Erroneous header field encountered) + S25. # is not permitted (Section 2.2.3) + S26. } + S27. } Else { + S28. Derive packet processing context (PPC) + from FUNCT of Replicatio-SID + S29. } + S30. Process the next header + S31. } + + The processing of the Upper-Layer header of a packet matching the + End.Replicate SID at a leaf or bud node is as follows: + + S01. If (Upper-Layer header type == 4(IPv4) OR + Upper-Layer header type == 41(IPv6) ) { + S02. Remove the outer IPv6 header with all its extension headers + S03. Process the packet in context of PPC + S04. } Else If (Upper-Layer header type == 143(Ethernet) ) { + S05. Remove the outer IPv6 header with all its extension headers + S06. Process the Ethernet Frame in context of PPC + S07. } Else If (Upper-Layer header type is allowed + by local configuration) { + S08. Proceed to process the Upper-Layer header + S09. } Else { + S10. Discard the packet + S11. # ICMPv6 Parameter Problem message with Code 4 + S12. # (SR Upper-Layer header Error) + S13. # is not permitted (Section 2.2.3) + S14. } + + Notes: + + * The behavior above MAY result in a packet with a partially + processed segment list in the SRH under some circumstances. For + example, a head node may encode a context-SID in an SRH. As per + the pseudocode above, a replication node that receives a packet + with a local Replication-SID will not process the SRH segment list + and will just forward a copy with an unmodified SRH to downstream + nodes. + + * The packet processing context is usually a FIB table "T". + + If configured to process TLVs, processing the Replication-SID may + modify the "variable-length data" of TLV types that change en route. + Therefore, TLVs that change en route are mutable. The remainder of + the SRH (Segments Left, Flags, Tag, Segment List, and TLVs that do + not change en route) are immutable while processing this SID. + +2.2.1.1. Hashed Message Authentication Code (HMAC) SRH TLV + + If a root node encodes a context-SID in an SRH with an optional HMAC + SRH TLV [RFC8754], it MUST set the 'D' bit as defined in + Section 2.1.2 of [RFC8754] because the Replication-SID is not part of + the segment list in the SRH. + + HMAC generation and verification is as specified in [RFC8754]. + Verification of an HMAC TLV is determined by local configuration. If + verification fails, an implementation of a Replication-SID MUST NOT + originate an ICMPv6 Parameter Problem message with code 0. The + failure SHOULD be logged (rate-limited) and the packet SHOULD be + discarded. + +2.2.2. OAM Operations + + [RFC9259] specifies procedures for Operations, Administration, and + Maintenance (OAM) like ping and traceroute on SRv6 SIDs. + + Assuming the source node knows the Replication-SID a priori, it is + possible to ping a Replication-SID of a leaf or bud node directly by + putting it in the IPv6 DA without an SRH or in an SRH as the last + segment. While it is not possible to ping a Replication-SID of a + transit node because transit nodes do not process Upper-Layer + headers, it is still possible to ping a Replication-SID of a leaf or + bud node of a tree via the Replication-SID of intermediate transit + nodes. The source of the ping MUST compute the ICMPv6 Echo Request + checksum using the Replication-SID of the leaf or bud node as the DA. + The source can then send the Echo Request packet to a transit node's + Replication-SID. The transit node replicates the packet by replacing + the IPv6 DA until the packet reaches the leaf or bud node, which + responds with an ICMPv6 Echo Reply. Note that a transit replication + node may replicate Echo Request packets to other leaf or bud nodes. + These nodes will drop the Echo Request due to an incorrect checksum. + Procedures to prevent the misdelivery of an Echo Request may be + addressed in a future document. Appendix A.2.1 illustrates examples + of a ping to a Replication-SID. + + Traceroute to a leaf or bud node Replication-SID is not possible due + to restrictions prohibiting the origination of the ICMPv6 Time + Exceeded error message for a Replication-SID as described in + Section 2.2.3. + +2.2.3. ICMPv6 Error Messages + + Section 2.4 of [RFC4443] states an ICMPv6 error message MUST NOT be + originated as a result of receiving a packet destined to an IPv6 + multicast address. This is to prevent a source node from being + overwhelmed by a storm of ICMPv6 error messages resulting from + replicated IPv6 packets. There are two exceptions: + + 1. The Packet Too Big message for Path MTU discovery, and + + 2. The ICMPv6 Parameter Problem message with Code 2 reporting an + unrecognized IPv6 option. + + An implementation of a Replication segment for SRv6 MUST enforce + these same restrictions and exceptions. + +3. IANA Considerations + + IANA has assigned the following codepoint for End.Replicate behavior + in the "SRv6 Endpoint Behaviors" registry in the "Segment Routing" + registry group. + + +=======+========+===================+===========+============+ + | Value | Hex | Endpoint Behavior | Reference | Change | + | | | | | Controller | + +=======+========+===================+===========+============+ + | 75 | 0x004B | End.Replicate | RFC 9524 | IETF | + +-------+--------+-------------------+-----------+------------+ + + Table 1: SRv6 Endpoint Behavior + +4. Security Considerations + + The SID behaviors defined in this document are deployed within an SR + domain [RFC8402]. An SR domain needs protection from outside + attackers (as described in [RFC8754]). The following is a brief + reminder of the same: + + * For SR-MPLS deployments: + + - Disable MPLS on external interfaces of each edge node or any + other technique to filter labeled traffic ingress on these + interfaces. + + * For SRv6 deployments: + + - Allocate all the SIDs from an IPv6 prefix block S/s and + configure each external interface of each edge node of the + domain with an inbound Infrastructure Access Control List + (IACL) that drops any incoming packet with a DA in S/s. + + - Additionally, an IACL may be applied to all nodes (k) + provisioning SIDs as defined in this specification: + + o Assign all interface addresses from within IPv6 prefix A/a. + At node k, all SIDs local to k are assigned from prefix Sk/ + sk. Configure each internal interface of each SR node k in + the SR domain with an inbound IACL that drops any incoming + packet with a DA in Sk/sk if the source address is not in A/ + a. + + - Deny traffic with spoofed source addresses by implementing + recommendations in BCP 84 [RFC3704]. + + - Additionally, the block S/s from which SIDs are allocated may + be an address that is not globally routable such as a Unique + Local Address (ULA) or the prefix defined in [SIDS-SRv6]. + + Failure to protect the SR-MPLS domain by correctly provisioning MPLS + support per interface permits attackers from outside the domain to + send packets that use the replication services provisioned within the + domain. + + Failure to protect the SRv6 domain with IACLs on external interfaces + combined with failure to implement the recommendations of BCP 38 + [RFC2827] or apply IACLs on nodes provisioning SIDs permits attackers + from outside the SR domain to send packets that use the replication + services provisioned within the domain. + + Given the definition of the Replication segment in this document, an + attacker subverting the ingress filters above cannot take advantage + of a stack of Replication segments to perform amplification attacks + nor link exhaustion attacks. Replication segment trees always + terminate at a leaf or bud node resulting in a decapsulation. + However, this does allow an attacker to inject traffic to the + receivers within a P2MP service. + + This document introduces an SR segment endpoint behavior that + replicates and decapsulates an inner payload for both the MPLS and + IPv6 data planes. Similar to any MPLS end-of-stack label, or SRv6 + END.D* behavior, if the protections described above are not + implemented, an attacker can perform an attack via the decapsulating + segment (including the one described in this document). + + Incorrect provisioning of Replication segments can result in a chain + of Replication segments forming a loop. This can happen if + Replication segments are provisioned on SR nodes without using a + control plane. In this case, replicated packets can create a storm + until MPLS TTL (for SR-MPLS) or IPv6 Hop Limit (for SRv6) decrements + to zero. A control plane such as PCE can be used to prevent loops. + The control plane protocols (like Path Computation Element + Communication Protocol (PCEP), BGP, etc.) used to instantiate + Replication segments can leverage their own security mechanisms such + as encryption, authentication filtering, etc. + + For SRv6, Section 2.2.3 describes an exception for the ICMPv6 + Parameter Problem message with Code 2. If an attacker sends a packet + destined to a Replication-SID with the source address of a node and + with an extension header using the unknown option type marked as + mandatory, then a large number of ICMPv6 Parameter Problem messages + can cause a denial-of-service attack on the source node. Although + this document does not specify any extension headers, any future + extension of this document that does so is susceptible to this + security concern. + + If an attacker can forge an IPv6 packet with: + + * the source address of a node, + + * a Replication-SID as the DA, and + + * an IPv6 Hop Limit such that nodes that forward replicated packets + on an IPv6 locator unicast prefix, decrement the Hop Limit to + zero, + + then these nodes can cause a storm of ICMPv6 error packets to + overwhelm the source node under attack. The IPv6 Hop Limit Threshold + check described in Section 2.2 can help mitigate such attacks. + +5. References + +5.1. Normative References + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + <https://www.rfc-editor.org/info/rfc2119>. + + [RFC4443] Conta, A., Deering, S., and M. Gupta, Ed., "Internet + Control Message Protocol (ICMPv6) for the Internet + Protocol Version 6 (IPv6) Specification", STD 89, + RFC 4443, DOI 10.17487/RFC4443, March 2006, + <https://www.rfc-editor.org/info/rfc4443>. + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, <https://www.rfc-editor.org/info/rfc8174>. + + [RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., + Decraene, B., Litkowski, S., and R. Shakir, "Segment + Routing Architecture", RFC 8402, DOI 10.17487/RFC8402, + July 2018, <https://www.rfc-editor.org/info/rfc8402>. + + [RFC8754] Filsfils, C., Ed., Dukes, D., Ed., Previdi, S., Leddy, J., + Matsushima, S., and D. Voyer, "IPv6 Segment Routing Header + (SRH)", RFC 8754, DOI 10.17487/RFC8754, March 2020, + <https://www.rfc-editor.org/info/rfc8754>. + + [RFC8986] Filsfils, C., Ed., Camarillo, P., Ed., Leddy, J., Voyer, + D., Matsushima, S., and Z. Li, "Segment Routing over IPv6 + (SRv6) Network Programming", RFC 8986, + DOI 10.17487/RFC8986, February 2021, + <https://www.rfc-editor.org/info/rfc8986>. + + [RFC9259] Ali, Z., Filsfils, C., Matsushima, S., Voyer, D., and M. + Chen, "Operations, Administration, and Maintenance (OAM) + in Segment Routing over IPv6 (SRv6)", RFC 9259, + DOI 10.17487/RFC9259, June 2022, + <https://www.rfc-editor.org/info/rfc9259>. + +5.2. Informative References + + [P2MP-POLICY] + Voyer, D., Ed., Filsfils, C., Parekh, R., Bidgoli, H., and + Z. J. Zhang, "Segment Routing Point-to-Multipoint Policy", + Work in Progress, Internet-Draft, draft-ietf-pim-sr-p2mp- + policy-07, 11 October 2023, + <https://datatracker.ietf.org/doc/html/draft-ietf-pim-sr- + p2mp-policy-07>. + + [PGM-ILLUSTRATION] + Filsfils, C., Camarillo, P., Ed., Li, Z., Matsushima, S., + Decraene, B., Steinberg, D., Lebrun, D., Raszuk, R., and + J. Leddy, "Illustrations for SRv6 Network Programming", + Work in Progress, Internet-Draft, draft-filsfils-spring- + srv6-net-pgm-illustration-04, 30 March 2021, + <https://datatracker.ietf.org/doc/html/draft-filsfils- + spring-srv6-net-pgm-illustration-04>. + + [RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: + Defeating Denial of Service Attacks which employ IP Source + Address Spoofing", BCP 38, RFC 2827, DOI 10.17487/RFC2827, + May 2000, <https://www.rfc-editor.org/info/rfc2827>. + + [RFC3704] Baker, F. and P. Savola, "Ingress Filtering for Multihomed + Networks", BCP 84, RFC 3704, DOI 10.17487/RFC3704, March + 2004, <https://www.rfc-editor.org/info/rfc3704>. + + [RFC6513] Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/ + BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February + 2012, <https://www.rfc-editor.org/info/rfc6513>. + + [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., + Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based + Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February + 2015, <https://www.rfc-editor.org/info/rfc7432>. + + [RFC7988] Rosen, E., Ed., Subramanian, K., and Z. Zhang, "Ingress + Replication Tunnels in Multicast VPN", RFC 7988, + DOI 10.17487/RFC7988, October 2016, + <https://www.rfc-editor.org/info/rfc7988>. + + [RFC8660] Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S., + Decraene, B., Litkowski, S., and R. Shakir, "Segment + Routing with the MPLS Data Plane", RFC 8660, + DOI 10.17487/RFC8660, December 2019, + <https://www.rfc-editor.org/info/rfc8660>. + + [RFC9256] Filsfils, C., Talaulikar, K., Ed., Voyer, D., Bogdanov, + A., and P. Mattes, "Segment Routing Policy Architecture", + RFC 9256, DOI 10.17487/RFC9256, July 2022, + <https://www.rfc-editor.org/info/rfc9256>. + + [RFC9350] Psenak, P., Ed., Hegde, S., Filsfils, C., Talaulikar, K., + and A. Gulko, "IGP Flexible Algorithm", RFC 9350, + DOI 10.17487/RFC9350, February 2023, + <https://www.rfc-editor.org/info/rfc9350>. + + [SIDS-SRv6] + Krishnan, S., "SRv6 Segment Identifiers in the IPv6 + Addressing Architecture", Work in Progress, Internet- + Draft, draft-ietf-6man-sids-06, 15 February 2024, + <https://datatracker.ietf.org/doc/html/draft-ietf-6man- + sids-06>. + +Appendix A. Illustration of a Replication Segment + + This section illustrates an example of a single Replication segment. + Examples showing Replication segments stitched together to form a + P2MP tree (based on SR P2MP policy) are in [P2MP-POLICY]. + + Consider the following topology: + + R3------R6 + / \ + R1----R2----R5-----R7 + \ / + +--R4---+ + + Figure 1: Topology for Illustration of a Replication Segment + +A.1. SR-MPLS + + In this example, the Node-SID of a node Rn is N-SIDn and the Adj-SID + from node Rm to node Rn is A-SIDmn. The interface between Rm and Rn + is Lmn. The state representation uses "R-SID->Lmn" to represent a + packet replication with outgoing Replication-SID R-SID sent on + interface Lmn. + + Assume a Replication segment identified with R-ID at Replication node + R1 and downstream nodes R2, R6, and R7. The Replication-SID at node + n is R-SIDn. A packet replicated from R1 to R7 has to traverse R4. + + The Replication segments at nodes R1, R2, R6, and R7 are shown below. + Note nodes R3, R4, and R5 do not have a Replication segment. + + Replication segment at R1: + + Replication segment + <R-ID,R1>: Replication-SID: R-SID1 Replication state: R2: + <R-SID2->L12> R6: <N-SID6, R-SID6> R7: <N-SID4, + A-SID47, R-SID7> + + Replication to R2 steers the packet directly to R2 on interface L12. + Replication to R6, using N-SID6, steers the packet via the shortest + path to that node. Replication to R7 is steered via R4, using N-SID4 + and then adjacency SID A-SID47 to R7. + + Replication segment at R2: + + Replication segment + <R-ID,R2>: Replication-SID: R-SID2 Replication state: R2: + <Leaf> + + Replication segment at R6: + + Replication segment + <R-ID,R6>: Replication-SID: R-SID6 Replication state: R6: + <Leaf> + + Replication segment at R7: + + Replication segment + <R-ID,R7>: Replication-SID: R-SID7 Replication state: R7: + <Leaf> + + When a packet is steered into the Replication segment at R1: + + * R1 performs the PUSH operation with just the <R-SID2> label for + the replicated copy and sends it to R2 on interface L12, since R1 + is directly connected to R2. R2, as leaf, performs the NEXT + operation, pops the R-SID2 label, and delivers the payload. + + * R1 performs the PUSH operation with the <N-SID6, R-SID6> label + stack for the replicated copy to R6 and sends it to R2, which is + the nexthop on the shortest path to R6. R2 performs the CONTINUE + operation on N-SID6 and forwards it to R3. R3 is the penultimate + hop for N-SID6; it performs penultimate hop popping, which + corresponds to the NEXT operation. The packet is then sent to R6 + with <R-SID6> in the label stack. R6, as leaf, performs the NEXT + operation, pops the R-SID6 label, and delivers the payload. + + * R1 performs the PUSH operation with the <N-SID4, A-SID47, R-SID7> + label stack for the replicated copy to R7 and sends it to R2, + which is the nexthop on the shortest path to R4. R2 is the + penultimate hop for N-SID4; it performs penultimate hop popping, + which corresponds to the NEXT operation. The packet is then sent + to R4 with <A-SID47, R-SID1> in the label stack. R4 performs the + NEXT operation, pops A-SID47, and delivers the packet to R7 with + <R-SID7> in the label stack. R7, as leaf, performs the NEXT + operation, pops the R-SID7 label, and delivers the payload. + +A.2. SRv6 + + For SRv6, we use the SID allocation scheme, reproduced below, from + "Illustrations for SRv6 Network Programming" [PGM-ILLUSTRATION]: + + * 2001:db8::/32 is an IPv6 block allocated by a Regional Internet + Registry (RIR) to the operator. + + * 2001:db8:0::/48 is dedicated to the internal address space. + + * 2001:db8:cccc::/48 is dedicated to the internal SRv6 SID space. + + * We assume a location expressed in 64 bits and a function expressed + in 16 bits. + + * Node k has a classic IPv6 loopback address 2001:db8::k/128, which + is advertised in the Interior Gateway Protocol (IGP). + + * Node k has 2001:db8:cccc:k::/64 for its local SID space. Its SIDs + will be explicitly assigned from that block. + + * Node k advertises 2001:db8:cccc:k::/64 in its IGP. + + * Function :1:: (function 1, for short) represents the End function + with the Penultimate Segment Pop (PSP) of the SRH [RFC8986] and + USD support. + + * Function :Cn:: (function Cn, for short) represents the End.X + function from to Node n with PSP and USD support. + + Each node k has: + + * An explicit SID instantiation 2001:db8:cccc:k:1::/128 bound to an + End function with additional support for PSP and USD. + + * An explicit SID instantiation 2001:db8:cccc:k:Cj::/128 bound to an + End.X function to neighbor J with additional support for PSP and + USD. + + * An explicit SID instantiation 2001:db8:cccc:k:Fk::/128 bound to an + End.Replicate function. + + Assume a Replication segment identified with R-ID at Replication node + R1 and downstream nodes R2, R6, and R7. The Replication-SID at node + k, bound to an End.Replicate function, is 2001:db8:cccc:k:Fk::/128. + A packet replicated from R1 to R7 has to traverse R4. + + The Replication segments at nodes R1, R2, R6, and R7 are shown below. + Note nodes R3, R4, and R5 do not have a Replication segment. The + state representation uses "R-SID->Lmn" to represent a packet + replication with outgoing Replication-SID R-SID sent on interface + Lmn. "SL" represents an optional segment list used to steer a + replicated packet on a specific path to a downstream node. + + Replication segment at R1: + + Replication segment + <R-ID,R1>: Replication-SID: 2001:db8:cccc:1:F1::0 Replication + state: R2: <2001:db8:cccc:2:F2::0->L12> R6: + <2001:db8:cccc:6:F6::0> R7: <2001:db8:cccc:4:C7::0>, SL: + <2001:db8:cccc:7:F7::0> + + Replication to R2 steers the packet directly to R2 on interface L12. + Replication to R6, using 2001:db8:cccc:6:F6::0, steers the packet via + the shortest path to that node. Replication to R7 is steered via R4, + using H.Encaps.Red with End.X SID 2001:db8:cccc:4:C7::0 at R4 to R7. + + Replication segment at R2: + + Replication segment + <R-ID,R2>: Replication-SID: 2001:db8:cccc:2:F2::0 Replication + state: R2: <Leaf> + + Replication segment at R6: + + Replication segment + <R-ID,R6>: Replication-SID: 2001:db8:cccc:6:F6::0 Replication + state: R6: <Leaf> + + Replication segment at R7: + + Replication segment + <R-ID,R7>: Replication-SID: 2001:db8:cccc:7:F7::0 Replication + state: R7: <Leaf> + + When a packet, (A,B2), is steered into the Replication segment at R1: + + * R1 creates an encapsulated replicated copy (2001:db8::1, + 2001:db8:cccc:2:F2::0) (A, B2), and sends it to R2 on interface + L12, since R1 is directly connected to R2. R2, as leaf, removes + the outer IPv6 header and delivers the payload. + + * R1 creates an encapsulated replicated copy (2001:db8::1, + 2001:db8:cccc:6:F6::0) (A, B2) then forwards the resulting packet + on the shortest path to 2001:db8:cccc:6::/64. R2 and R3 forward + the packet using 2001:db8:cccc:6::/64. R6, as leaf, removes the + outer IPv6 header and delivers the payload. + + * R1 has to steer the packet to downstream node R7 via node R4. It + can do this in one of two ways: + + - R1 creates an encapsulated replicated copy (2001:db8::1, + 2001:db8:cccc:7:F7::0) (A, B2) and then performs H.Encaps.Red + using the SL to create the (2001:db8::1, 2001:db8:cccc:4:C7::0) + (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) packet. It sends + this packet to R2, which is the nexthop on the shortest path to + 2001:db8:cccc:4::/64. R2 forwards the packet to R4 using + 2001:db8:cccc:4::/64. R4 executes the End.X function on + 2001:db8:cccc:4:C7::0, performs a USD action, removes the outer + IPv6 encapsulation, and sends the resulting packet + (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) to R7. R7, as + leaf, removes the outer IPv6 header and delivers the payload. + + - R1 is the root of the Replication segment. Therefore, it can + combine above encapsulations to create an encapsulated + replicated copy (2001:db8::1, 2001:db8:cccc:4:C7::0) + (2001:db8:cccc:7:F7::0; SL=1) (A, B2) and sends it to R2, which + is the nexthop on the shortest path to 2001:db8:cccc:4::/64. + R2 forwards the packet to R4 using 2001:db8:cccc:4::/64. R4 + executes the End.X function on 2001:db8:cccc:4:C7::0, performs + a PSP action, removes the SRH, and sends the resulting packet + (2001:db8::1, 2001:db8:cccc:7:F7::0) (A, B2) to R7. R7, as + leaf, removes the outer IPv6 header and delivers the payload. + +A.2.1. Pinging a Replication-SID + + This section illustrates the ping of a Replication-SID. + + Node R1 pings the Replication-SID of node R6 directly by sending the + following packet: + + 1. R1 to R6: (2001:db8::1, 2001:db8:cccc:6:F6::0; NH=ICMPv6) (ICMPv6 + Echo Request). + + 2. Node R6 as a leaf processes the upper-layer ICMPv6 Echo Request + and responds with an ICMPv6 Echo Reply. + + Node R1 pings the Replication-SID of R7 via R4 by sending the + following packet with the SRH: + + 1. R1 to R4: (2001:db8::1, 2001:db8:cccc:4:C7::0) + (2001:db8:cccc:7:F7::0; SL=1; NH=ICMPV6) (ICMPv6 Echo Request). + + 2. R4 to R7: (2001:db8::1, 2001:db8:cccc:7:F7::0; NH=ICMPv6) (ICMPv6 + Echo Request). + + 3. Node R7 as a leaf processes the upper-layer ICMPv6 Echo Request + and responds with an ICMPv6 Echo Reply. + + Assume node R4 is a transit replication node with Replication-SID + 2001:db8:cccc:4:F4::0 replicating to R7. Node R1 pings the + Replication-SID of R7 via the Replication-SID of R4 as follows: + + 1. R1 to R4: (2001:db8::1, 2001:db8:cccc:4:F4::0; NH=ICMPv6) (ICMPv6 + Echo Request). + + 2. R4 replicates to R7 by replacing the IPv6 DA with the + Replication-SID of R7 from its Replication state. + + 3. R4 to R7: (2001:db8::1, 2001:db8:cccc:7:F7::0; NH=ICMPv6) (ICMPv6 + Echo Request). + + 4. Node R7 as a leaf processes the upper-layer ICMPv6 Echo Request + and responds with an ICMPv6 Echo Reply. + +Acknowledgements + + The authors would like to acknowledge Siva Sivabalan, Mike Koldychev, + Vishnu Pavan Beeram, Alexander Vainshtein, Bruno Decraene, Thierry + Couture, Joel Halpern, Ketan Talaulikar, Darren Dukes and Jingrong + Xie for their valuable inputs. + +Contributors + + Clayton Hassen + Bell Canada + Vancouver + Canada + Email: clayton.hassen@bell.ca + + + Kurtis Gillis + Bell Canada + Halifax + Canada + Email: kurtis.gillis@bell.ca + + + Arvind Venkateswaran + Cisco Systems, Inc. + San Jose, CA + United States of America + Email: arvvenka@cisco.com + + + Zafar Ali + Cisco Systems, Inc. + United States of America + Email: zali@cisco.com + + + Swadesh Agrawal + Cisco Systems, Inc. + San Jose, CA + United States of America + Email: swaagraw@cisco.com + + + Jayant Kotalwar + Nokia + Mountain View, CA + United States of America + Email: jayant.kotalwar@nokia.com + + + Tanmoy Kundu + Nokia + Mountain View, CA + United States of America + Email: tanmoy.kundu@nokia.com + + + Andrew Stone + Nokia + Ottawa + Canada + Email: andrew.stone@nokia.com + + + Tarek Saad + Cisco Systems, Inc. + Canada + Email: tsaad@cisco.com + + + Kamran Raza + Cisco Systems, Inc. + Canada + Email: skraza@cisco.com + + + Jingrong Xie + Huawei Technologies + Beijing + China + Email: xiejingrong@huawei.com + + +Authors' Addresses + + Daniel Voyer (editor) + Bell Canada + Montreal + Canada + Email: daniel.voyer@bell.ca + + + Clarence Filsfils + Cisco Systems, Inc. + Brussels + Belgium + Email: cfilsfil@cisco.com + + + Rishabh Parekh + Cisco Systems, Inc. + San Jose, CA + United States of America + Email: riparekh@cisco.com + + + Hooman Bidgoli + Nokia + Ottawa + Canada + Email: hooman.bidgoli@nokia.com + + + Zhaohui Zhang + Juniper Networks + Email: zzhang@juniper.net |