author Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committer Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit 4bfd864f10b68b71482b35c818559068ef8d5797
tree e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc9574.txt
parent ea76e11061bda059ae9f9ad130a9895cc85607db
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc9574.txt')
-rw-r--r-- doc/rfc/rfc9574.txt 1580
1 files changed, 1580 insertions, 0 deletions
diff --git a/doc/rfc/rfc9574.txt b/doc/rfc/rfc9574.txt
new file mode 100644
index 0000000..5f7e3b3
--- /dev/null
+++ b/doc/rfc/rfc9574.txt
@@ -0,0 +1,1580 @@
+
+
+
+
+Internet Engineering Task Force (IETF)                   J. Rabadan, Ed.
+Request for Comments: 9574                                  S. Sathappan
+Category: Standards Track                                          Nokia
+ISSN: 2070-1721                                                   W. Lin
+                                                        Juniper Networks
+                                                              M. Katiyar
+                                                          Versa Networks
+                                                              A. Sajassi
+                                                           Cisco Systems
+                                                                May 2024
+
+
+ Optimized Ingress Replication Solution for Ethernet VPNs (EVPNs)
+
+Abstract
+
+ Network Virtualization Overlay (NVO) networks using Ethernet VPNs
+ (EVPNs) as their control plane may use trees based on ingress
+ replication or Protocol Independent Multicast (PIM) to convey the
+ overlay Broadcast, Unknown Unicast, or Multicast (BUM) traffic. PIM
+ provides an efficient solution that prevents sending multiple copies
+ of the same packet over the same physical link; however, it may not
+ always be deployed in the NVO network core. Ingress replication
+ avoids the dependency on PIM in the NVO network core. While ingress
+ replication provides a simple multicast transport, some NVO networks
+ with demanding multicast applications require a more efficient
+ solution without PIM in the core. This document describes a solution
+ to optimize the efficiency of ingress replication trees.
+
+Status of This Memo
+
+ This is an Internet Standards Track document.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Further information on
+ Internet Standards is available in Section 2 of RFC 7841.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ https://www.rfc-editor.org/info/rfc9574.
+
+Copyright Notice
+
+ Copyright (c) 2024 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (https://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Revised BSD License text as described in Section 4.e of the
+ Trust Legal Provisions and are provided without warranty as described
+ in the Revised BSD License.
+
+Table of Contents
+
+ 1. Introduction
+ 2. Terminology and Conventions
+ 3. Solution Requirements
+ 4. EVPN BGP Attributes for Optimized Ingress Replication
+ 5. Non-selective Assisted Replication (AR) Solution Description
+ 5.1. Non-selective AR-REPLICATOR Procedures
+ 5.2. Non-selective AR-LEAF Procedures
+ 5.3. RNVE Procedures
+ 6. Selective Assisted Replication (AR) Solution Description
+ 6.1. Selective AR-REPLICATOR Procedures
+ 6.2. Selective AR-LEAF Procedures
+ 7. Pruned Flooding Lists (PFLs)
+ 7.1. Example of a Pruned Flooding List
+ 8. AR Procedures for Single-IP AR-REPLICATORS
+ 9. AR Procedures and EVPN All-Active Multihoming Split-Horizon
+ 9.1. Ethernet Segments on AR-LEAF Nodes
+ 9.2. Ethernet Segments on AR-REPLICATOR Nodes
+ 10. Security Considerations
+ 11. IANA Considerations
+ 12. References
+ 12.1. Normative References
+ 12.2. Informative References
+ Acknowledgements
+ Contributors
+ Authors' Addresses
+
+1. Introduction
+
+ Ethernet Virtual Private Networks (EVPNs) may be used as the control
+ plane for a Network Virtualization Overlay (NVO) network [RFC8365].
+ Network Virtualization Edge (NVE) and Provider Edge (PE) devices that
+ are part of the same EVPN Broadcast Domain (BD) use Ingress
+ Replication (IR) or PIM-based trees to transport the tenant's
+ Broadcast, Unknown Unicast, or Multicast (BUM) traffic.
+
+ In the ingress replication approach, the ingress NVE receiving a BUM
+ frame from the Tenant System (TS) will create as many copies of the
+ frame as the number of remote NVEs/PEs that are attached to the BD.
+ Each of those copies will be encapsulated into an IP packet where the
+ outer IP Destination Address (IP DA) identifies the loopback of the
+ egress NVE/PE. The IP fabric core nodes (also known as spines) will
+ simply route the IP-encapsulated BUM frames based on the outer IP DA.
+ If PIM-based trees are used instead of ingress replication, the NVEs/
+ PEs attached to the same BD will join a PIM-based tree. The ingress
+ NVE receiving a BUM frame will send a single copy of the frame,
+ encapsulated into an IP packet where the outer IP DA is the multicast
+ address that represents the PIM-based tree. The IP fabric core nodes
+ are part of the PIM tree and keep multicast state for the multicast
+ group, so that IP-encapsulated BUM frames can be routed to all the
+ NVEs/PEs that joined the tree.
+
+ The two approaches are illustrated in Figure 1. On the left-hand
+ side of the diagram, NVE1 uses ingress replication to send a BUM
+ frame (originated from Tenant System TS1) to the remote nodes
+ attached to the BD, i.e., NVE2, NVE3, and PE1. On the right-hand
+ side, the same example is depicted but using a PIM-based tree, i.e.,
+ (S1,G1), instead of ingress replication. While a single copy of the
+ tunneled BUM frame is generated in the latter approach, all the
+ routers in the fabric need to keep multicast state, e.g., the spine
+ keeps a PIM routing entry for (S1,G1) with an Incoming Interface
+ (IIF) and three Outgoing Interfaces (OIFs).
+
+ To WAN To WAN
+ ^ ^
+ | |
+ +-----+ +-----+
+ +----------| PE1 |-----------+ +----------| PE1 |-----------+
+ | +--^--+ | | +--^--+ |
+ | | IP Fabric | | | IP Fabric |
+ | PE | | (S1,G1) |OIF to G1 |
+ | +----PE->+-----+ No State | | IIF +-----+ OIF to G1 |
+ | | +---2->|Spine|------+ | | +------>Spine|------+ |
+ | | | +-3->+-----+ | | | | +-----+ | |
+ | | | | 2 3 | | |PIM |OIF to G1| |
+ | | | |IR | | | | |tree | | |
+ |+-----+ +--v--+ +--v--+ | |+-----+ +--v--+ +--v--+ |
+ +| NVE1|---| NVE2|---| NVE3|-+ +| NVE1|---| NVE2|---| NVE3|-+
+ +--^--+ +-----+ +-----+ +--^--+ +-----+ +-----+
+ | | | | | |
+ | v v | v v
+ TS1 TS2 TS3 TS1 TS2 TS3
+
+ Figure 1: Ingress Replication vs. PIM-Based Trees in NVO Networks
+
+ In NVO networks where PIM-based trees cannot be used, ingress
+ replication is the only option. Examples of these situations are NVO
+ networks where the core nodes do not support PIM or the network
+ operator does not want to run PIM in the core.
+
+ In some use cases, the amount of replication for BUM traffic is kept
+ under control on the NVEs due to the following fairly common
+ assumptions:
+
+ a. Broadcast traffic is greatly reduced due to the proxy Address
+ Resolution Protocol (ARP) and proxy Neighbor Discovery (ND)
+ capabilities supported by EVPNs [RFC9161] on the NVEs. Some NVEs
+ can even provide Dynamic Host Configuration Protocol (DHCP)
+ server functions for the attached TSs, reducing the broadcast
+ traffic even further.
+
+ b. Unknown unicast traffic is greatly reduced in NVO networks where
+ all the Media Access Control (MAC) and IP addresses from the TSs
+ are learned in the control plane.
+
+ c. Multicast applications are not used.
+
+ If the above assumptions are true for a given NVO network, then
+ ingress replication provides a simple solution for multi-destination
+ traffic. However, statement c. above is not always true, and
+ multicast applications are required in many use cases.
+
+ When the multicast sources are attached to NVEs residing in
+ hypervisors or low-performance-replication Top-of-Rack (ToR)
+ switches, the ingress replication of a large amount of multicast
+ traffic to a significant number of remote NVEs/PEs can seriously
+ degrade the performance of the NVE and impact the application.
+
+ This document describes a solution that makes use of two ingress
+ replication optimizations:
+
+ 1. Assisted Replication (AR)
+
+ 2. Pruned Flooding Lists (PFLs)
+
+ Assisted Replication consists of a set of procedures that allows the
+ ingress NVE/PE to send a single copy of a broadcast or multicast
+ frame received from a TS to the BD without the need for PIM in the
+ underlay. Assisted Replication defines the roles of AR-REPLICATOR
+ and AR-LEAF routers. The AR-LEAF is the ingress NVE/PE attached to
+ the TS. The AR-LEAF sends a single copy of a broadcast or multicast
+ packet to a selected AR-REPLICATOR that replicates the packet
+ multiple times to remote AR-LEAF or AR-REPLICATOR routers and is
+ therefore "assisting" the ingress AR-LEAF in delivering the broadcast
+ or multicast traffic to the remote NVEs/PEs attached to the same BD.
+ Assisted Replication can use a single AR-REPLICATOR or two AR-
+ REPLICATOR routers in the path between the ingress AR-LEAF and the
+ remote destination NVEs/PEs. The procedures that use a single AR-
+ REPLICATOR (the non-selective Assisted Replication solution) are
+ specified in Section 5, whereas Section 6 describes how multi-stage
+ replication, i.e., two AR-REPLICATOR routers in the path between the
+ ingress AR-LEAF and destination NVEs/PEs, is accomplished (the
+ selective Assisted Replication solution). The procedures for
+ Assisted Replication do not impact unknown unicast traffic, which
+ follows the same forwarding procedures as known unicast traffic so
+ that packet reordering does not occur.
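
As an illustration outside the RFC text, the replication load described above can be sketched as a simple copy count. The function name `copies_sent` and its parameters are hypothetical. The model assumes a single AR-REPLICATOR that is itself attached to the BD, and it counts only overlay copies (local ACs are ignored).

```python
def copies_sent(bd_nodes: int, assisted: bool) -> dict:
    """Count overlay copies of one BM frame sent by each node role.

    bd_nodes is the total number of NVEs/PEs attached to the BD,
    including the ingress node and (if used) the AR-REPLICATOR.
    """
    if bd_nodes < 2:
        raise ValueError("need at least an ingress and one remote node")
    if not assisted:
        # Plain ingress replication: one copy per remote NVE/PE.
        return {"ingress": bd_nodes - 1, "replicator": 0}
    # Non-selective AR: the AR-LEAF sends a single copy to the
    # AR-REPLICATOR, which replicates to every other node in the BD
    # except the source.
    return {"ingress": 1, "replicator": bd_nodes - 2}
```

With five nodes in the BD, for instance, the ingress node sends four copies under plain ingress replication but only one under Assisted Replication; the remaining three copies move to the AR-REPLICATOR.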
+
+ PFLs provide a method for the ingress NVE/PE to prune or remove
+ certain destination NVEs/PEs from a flooding list, depending on the
+ interest of those NVEs/PEs in receiving BUM traffic. As specified in
+ [RFC8365], an NVE/PE builds a flooding list for BUM traffic based on
+ the next hops of the received EVPN Inclusive Multicast Ethernet Tag
+ routes for the BD. While [RFC8365] states that the flooding list is
+ used for all BUM traffic, this document allows pruning certain next
+ hops from the list. As an example, suppose an ingress NVE creates a
+ flooding list with next hops PE1, PE2, and PE3. If PE2 and PE3 did
+ not signal any interest in receiving unknown unicast traffic in their
+ Inclusive Multicast Ethernet Tag routes, when the ingress NVE
+ receives an unknown unicast frame from a TS, it will replicate it
+ only to PE1. That is, PE2 and PE3 are "pruned" from the NVE's
+ flooding list for unknown unicast traffic. PFLs can be used with
+ ingress replication or Assisted Replication and are described in
+ Section 7.
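
The PE1/PE2/PE3 example above can be sketched as follows; this is an illustration outside the RFC text. The `ImetRoute` class and `flooding_list` function are hypothetical names, and the two boolean fields stand in for the BM and U flags that Section 4 defines in the PMSI Tunnel Attribute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImetRoute:
    """Hypothetical view of a received Inclusive Multicast Ethernet Tag route."""
    next_hop: str
    bm_prune: bool = False  # BM = 1: "prune me from the BM flooding list"
    u_prune: bool = False   # U = 1: "prune me from the Unknown flooding list"


def flooding_list(routes, traffic):
    """Return the next hops that should receive the given traffic type.

    traffic is "bm" (broadcast/multicast) or "unknown" (unknown unicast).
    """
    if traffic not in ("bm", "unknown"):
        raise ValueError("traffic must be 'bm' or 'unknown'")
    kept = []
    for route in routes:
        pruned = route.bm_prune if traffic == "bm" else route.u_prune
        if not pruned:
            kept.append(route.next_hop)
    return kept


routes = [
    ImetRoute("PE1"),
    ImetRoute("PE2", u_prune=True),
    ImetRoute("PE3", u_prune=True),
]
```

Here `flooding_list(routes, "unknown")` keeps only PE1, while `flooding_list(routes, "bm")` keeps all three next hops.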
+
+ Both optimizations -- Assisted Replication and PFLs -- may be used
+ together or independently so that the performance and efficiency of
+ the network to transport multicast can be improved. Both solutions
+ require some extensions to the BGP attributes used in [RFC7432]; see
+ Section 4 for details.
+
+ The Assisted Replication solution described in this document is
+ focused on NVO networks (hence its use of IP tunnels). MPLS
+ transport networks are out of scope for this document. The PFLs
+ solution MAY be used in NVO and MPLS transport networks.
+
+ Section 3 lists the requirements of the combined optimized ingress
+ replication solution, whereas Sections 5 and 6 describe the Assisted
+ Replication solution for non-selective and selective procedures,
+ respectively. Section 7 provides the PFLs solution.
+
+2. Terminology and Conventions
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
+ "OPTIONAL" in this document are to be interpreted as described in
+ BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
+ capitals, as shown here.
+
+ The following terminology is used throughout this document:
+
+ AR-IP: Assisted Replication - IP. Refers to an IP address owned by
+ the AR-REPLICATOR and used to differentiate the incoming traffic
+ that must follow the AR procedures. The AR-IP is also used in the
+ Tunnel Identifier and Next Hop fields of the Replicator-AR route.
+
+ AR-LEAF: Assisted Replication - LEAF. Refers to an NVE/PE that
+ sends all the BM traffic to an AR-REPLICATOR that can replicate
+ the traffic further on its behalf. An AR-LEAF is typically an
+ NVE/PE with poor replication performance capabilities.
+
+ AR-REPLICATOR: Assisted Replication - REPLICATOR. Refers to an NVE/
+ PE that can replicate broadcast or multicast traffic received on
+ overlay tunnels to other overlay tunnels and local Attachment
+ Circuits (ACs). This document defines the control and data plane
+ procedures that an AR-REPLICATOR needs to follow.
+
+ AR-VNI: Assisted Replication - VNI. Refers to a Virtual eXtensible
+ Local Area Network (VXLAN) Network Identifier (VNI) advertised by
+ the AR-REPLICATOR along with the Replicator-AR route. It is used
+ to identify the incoming packets that must follow the AR
+ procedures ONLY in the single-IP AR-REPLICATOR case (see
+ Section 8).
+
+ Assisted Replication forwarding mode: In the case of an AR-LEAF,
+ sending an AC Broadcast and Multicast (BM) packet to a single AR-
+ REPLICATOR with a tunnel destination address AR-IP. In the case
+ of an AR-REPLICATOR, this means sending a BM packet to a selected
+ number of, or all of, the overlay tunnels when the packet was
+ previously received from an overlay tunnel.
+
+ BD: Broadcast Domain, as defined in [RFC7432].
+
+ BD label: Defined as the MPLS label that identifies the BD and is
+ advertised in Regular-IR or Replicator-AR routes, when the
+ encapsulation is MPLS over GRE (MPLSoGRE) or MPLS over UDP
+ (MPLSoUDP).
+
+ BM traffic: Refers to broadcast and multicast frames (excluding
+ unknown unicast frames).
+
+ DF and NDF: Designated Forwarder and Non-Designated Forwarder.
+ These are roles defined in NVEs/PEs attached to multihomed TSs, as
+ per [RFC7432] and [RFC8365].
+
+ ES and ESI: Ethernet Segment and Ethernet Segment Identifier. EVPN
+ multihoming concepts as specified in [RFC7432].
+
+ EVI: EVPN Instance. A group of Provider Edge (PE) devices
+ participating in the same EVPN service, as specified in [RFC7432].
+
+ GRE: Generic Routing Encapsulation [RFC4023].
+
+ Ingress Replication forwarding mode: Refers to the ingress
+ replication behavior explained in [RFC7432]. In this mode, an AC
+ BM packet copy is sent to each remote PE/NVE in the BD, and an
+ overlay BM packet is sent only to the ACs and not to other overlay
+ tunnels.
+
+ IR-IP: Ingress Replication - IP. Refers to the local IP address of
+ an NVE/PE that is used for the ingress replication signaling and
+ procedures provided in [RFC7432]. Encapsulated incoming traffic
+ with an outer destination IP address matching the IR-IP will
+ follow the procedures for ingress replication and not the
+ procedures for Assisted Replication. The IR-IP is also used in
+ the Tunnel Identifier and Next Hop fields of the Regular-IR route.
+
+ IR-VNI: Ingress Replication - VNI. Refers to a VNI advertised along
+ with the Inclusive Multicast Ethernet Tag route for the ingress
+ replication tunnel type.
+
+ MPLS: Multi-Protocol Label Switching.
+
+ NVE: Network Virtualization Edge [RFC8365].
+
+ NVGRE: Network virtualization using Generic Routing Encapsulation
+ [RFC7637].
+
+ PE: Provider Edge.
+
+ PMSI: P-Multicast Service Interface. A conceptual interface for a
+ PE to send customer multicast traffic to all or some PEs in the
+ same VPN [RFC6513].
+
+ RD: Route Distinguisher.
+
+ Regular-IR route: An EVPN Inclusive Multicast Ethernet Tag route
+ [RFC7432] that uses the ingress replication tunnel type.
+
+ Replicator-AR route: An EVPN Inclusive Multicast Ethernet Tag route
+ that is advertised by an AR-REPLICATOR to signal its capabilities,
+ as described in Section 4.
+
+ RNVE: Regular NVE. Refers to an NVE that supports the procedures
+ provided in [RFC8365] and does not support the procedures provided
+ in this document. However, this document defines procedures to
+ interoperate with RNVEs.
+
+ ToR switch: Top-of-Rack switch.
+
+ TS and VM: Tenant System and Virtual Machine. In this document, TSs
+ and VMs are the devices connected to the ACs of the PEs and NVEs.
+
+ VNI: VXLAN Network Identifier. Used in VXLAN tunnels.
+
+ VSID: Virtual Segment Identifier. Used in NVGRE tunnels.
+
+ VXLAN: Virtual eXtensible Local Area Network [RFC7348].
+
+3. Solution Requirements
+
+ The ingress replication optimization solution specified in this
+ document meets the following requirements:
+
+ a. The solution provides an ingress replication optimization for BM
+ traffic without the need for PIM while preserving the packet
+ order for unicast applications, i.e., unknown unicast traffic
+ should follow the same path as known unicast traffic. This
+ optimization is required in low-performance NVEs.
+
+ b. The solution reduces the flooded traffic in NVO networks where
+ some NVEs do not need broadcast/multicast and/or unknown unicast
+ traffic.
+
+ c. The solution is compatible with [RFC7432] and [RFC8365] and has
+ no impact on the Customer Edge (CE) procedures for BM traffic.
+ In particular, the solution supports the following EVPN
+ functions:
+
+ * All-active multihoming, including the split-horizon and DF
+ functions.
+
+ * Single-active multihoming, including the DF function.
+
+ * Handling of multi-destination traffic and processing of BM
+ traffic as per [RFC7432].
+
+ d. The solution is backward compatible with existing NVEs using a
+ non-optimized version of ingress replication. A given BD can
+ have NVEs/PEs supporting regular ingress replication and
+ optimized ingress replication.
+
+ e. The solution is independent of the NVO-specific data plane
+ encapsulation and the virtual identifiers being used, e.g., VXLAN
+ VNIs, NVGRE VSIDs, or MPLS labels, as long as the tunnel is IP
+ based.
+
+4. EVPN BGP Attributes for Optimized Ingress Replication
+
+ The ingress replication optimization solution specified in this
+ document extends the Inclusive Multicast Ethernet Tag routes and
+ attributes described in [RFC7432] so that an NVE/PE can signal its
+ optimized ingress replication capabilities.
+
+ The Network Layer Reachability Information (NLRI) of the Inclusive
+ Multicast Ethernet Tag route [RFC7432] is shown in Figure 2 and is
+ used in this document without any modifications to its format. The
+ PMSI Tunnel Attribute's general format as provided in [RFC7432]
+ (which takes it from [RFC6514]) is used in this document; only a new
+ tunnel type and new flags are specified, as shown in Figure 3.
+
+    +------------------------------------+
+    | RD (8 octets)                      |
+    +------------------------------------+
+    | Ethernet Tag ID (4 octets)         |
+    +------------------------------------+
+    | IP Address Length (1 octet)        |
+    +------------------------------------+
+    | Originating Router's IP Address    |
+    | (4 or 16 octets)                   |
+    +------------------------------------+
+
+ Figure 2: EVPN Inclusive Multicast Ethernet Tag Route's NLRI
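
As an illustration outside the RFC text, the NLRI layout in Figure 2 can be serialized as shown below. The function name `encode_imet_nlri` is hypothetical. The sketch treats the RD as an opaque 8-octet value and assumes that the IP Address Length field carries the address length in bits (32 or 128); that convention is an assumption here and should be checked against [RFC7432].

```python
import ipaddress
import struct


def encode_imet_nlri(rd: bytes, ethernet_tag: int, originator: str) -> bytes:
    """Serialize the route-type-specific NLRI fields shown in Figure 2."""
    if len(rd) != 8:
        raise ValueError("RD must be exactly 8 octets")
    addr = ipaddress.ip_address(originator)
    # Assumption: length is expressed in bits (32 for IPv4, 128 for IPv6).
    length_bits = addr.max_prefixlen
    # "!IB" = network byte order, 4-octet Ethernet Tag ID, 1-octet length.
    return rd + struct.pack("!IB", ethernet_tag, length_bits) + addr.packed
```

For an IPv4 originator the result is 17 octets long (8 + 4 + 1 + 4); for IPv6 it is 29 octets.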
+
+                                            0  1  2  3  4  5  6  7
+    +---------------------------------+    +--+--+--+--+--+--+--+--+
+    | Flags (1 octet)                 | -> |x |E |x |  T  |BM|U |L |
+    +---------------------------------+    +--+--+--+--+--+--+--+--+
+    | Tunnel Type (1 octet)           |    T = Assisted Replication Type
+    +---------------------------------+    BM = Broadcast and Multicast
+    | MPLS Label (3 octets)           |    U = Unknown (unknown unicast)
+    +---------------------------------+    x = unassigned
+    | Tunnel Identifier (variable)    |
+    +---------------------------------+
+
+ Figure 3: PMSI Tunnel Attribute
+
+ The Flags field in Figure 3 is 8 bits long as per [RFC7902]. The
+ Extension (E) flag was allocated by [RFC7902], and the Leaf
+ Information Required (L) flag was allocated by [RFC6514]. This
+ document defines the use of 4 bits of this Flags field:
+
+ * Bits 3 and 4, which together form the Assisted Replication Type
+ (T) field
+
+ * Bit 5, called the Broadcast and Multicast (BM) flag
+
+ * Bit 6, called the Unknown (U) flag
+
+ Bits 5 and 6 are collectively referred to as the Pruned Flooding
+ Lists (PFLs) flags.
+
+ The T field and PFLs flags are defined as follows:
+
+ * T is the Assisted Replication Type field (2 bits), which defines
+ the AR role of the advertising router:
+
+ - 00 (decimal 0) = RNVE (non-AR support)
+
+ - 01 (decimal 1) = AR-REPLICATOR
+
+ - 10 (decimal 2) = AR-LEAF
+
+ - 11 (decimal 3) = RESERVED
+
+ * The PFLs flags define the desired behavior of the advertising
+ router for the different types of traffic:
+
+ - Broadcast and Multicast (BM) flag. BM = 1 means "prune me from
+ the BM flooding list". BM = 0 indicates regular behavior.
+
+ - Unknown (U) flag. U = 1 means "prune me from the Unknown
+ flooding list". U = 0 indicates regular behavior.
+
+ * The L flag (bit 7) is defined in [RFC6514] and will be used only
+ in the selective AR solution.
+
+ Please refer to Section 11 for the IANA considerations related to the
+ PMSI Tunnel Attribute flags.
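
As an illustration outside the RFC text, the Flags octet described above can be packed and unpacked as shown below. The names `ARType`, `encode_flags`, and `decode_flags` are hypothetical. The masks assume that bit 0 in Figure 3 is the most significant bit of the octet, which is the usual convention in IETF bit diagrams.

```python
from enum import IntEnum


class ARType(IntEnum):
    RNVE = 0           # 00: non-AR support
    AR_REPLICATOR = 1  # 01
    AR_LEAF = 2        # 10
    RESERVED = 3       # 11


# Assumption: bit 0 of Figure 3 is the most significant bit (0x80).
T_SHIFT = 3
T_MASK = 0x18   # bits 3 and 4: Assisted Replication Type
BM_FLAG = 0x04  # bit 5: Broadcast and Multicast
U_FLAG = 0x02   # bit 6: Unknown
L_FLAG = 0x01   # bit 7: Leaf Information Required


def encode_flags(ar_type, bm=False, u=False, leaf_info=False):
    """Build the Flags octet from the T field and the BM, U, and L flags."""
    octet = (int(ar_type) << T_SHIFT) & T_MASK
    if bm:
        octet |= BM_FLAG
    if u:
        octet |= U_FLAG
    if leaf_info:
        octet |= L_FLAG
    return octet


def decode_flags(octet):
    """Split a Flags octet into the fields used in this document."""
    return {
        "ar_type": ARType((octet & T_MASK) >> T_SHIFT),
        "bm": bool(octet & BM_FLAG),
        "u": bool(octet & U_FLAG),
        "leaf_info": bool(octet & L_FLAG),
    }
```

Under these assumptions, an AR-LEAF that asks to be pruned from the unknown unicast flooding list would advertise 0x12, and an AR-REPLICATOR with the L flag set would advertise 0x09.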
+
+ In this document, the above Inclusive Multicast Ethernet Tag route
+ (Figure 2) and PMSI Tunnel Attribute (Figure 3) can be used in two
+ different modes for the same BD:
+
+ Regular-IR route: In this route, Originating Router's IP Address,
+ Tunnel Type (0x06), MPLS Label, and Tunnel Identifier MUST be used
+ as described in [RFC7432] when ingress replication is in use. The
+ NVE/PE that advertises the route will set the Next Hop to an IP
+ address that we denominate IR-IP in this document. When
+ advertised by an AR-LEAF node, the Regular-IR route MUST be
+ advertised with the T field set to 10 (AR-LEAF).
+
+ Replicator-AR route: This route is used by the AR-REPLICATOR to
+ advertise its AR capabilities, with the fields set as follows:
+
+ * Originating Router's IP Address MUST be set to an IP address of
+ the advertising router that is common to all the EVIs on the PE
+ (usually this is a loopback address of the PE).
+
+ - The Tunnel Identifier and Next Hop fields SHOULD be set to
+ the same IP address as the Originating Router's IP Address
+ field when the NVE/PE originates the route -- that is, when
+ the NVE/PE is not an ASBR; see Section 10.2 of [RFC8365].
+ Irrespective of the values in the Tunnel Identifier and
+ Originating Router's IP Address fields, the ingress NVE/PE
+ will process the received Replicator-AR route and will use
+ the IP address setting in the Next Hop field to create IP
+ tunnels to the AR-REPLICATOR.
+
+ - The Next Hop address is referred to as the AR-IP and MUST be
+ different from the IR-IP for a given PE/NVE, unless the
+ procedures provided in Section 8 are followed.
+
+ * Tunnel Type MUST be set to Assisted Replication Tunnel.
+ Section 11 provides the allocated type value.
+
+ * T (Assisted Replication type) MUST be set to 01 (AR-
+ REPLICATOR).
+
+ * L (Leaf Information Required) MUST be set to 0 for non-
+ selective AR and MUST be set to 1 for selective AR.
+
+ An NVE/PE configured as an AR-REPLICATOR for a BD MUST advertise a
+ Replicator-AR route for the BD and MAY advertise a Regular-IR route.
+ The advertisement of the Replicator-AR route will indicate to the AR-
+ LEAFs which outer IP DA, i.e., which AR-IP, they need to use for IP-
+ encapsulated BM frames that use Assisted Replication forwarding mode.
+ The AR-REPLICATOR will forward an IP-encapsulated BM frame in
+ Assisted Replication forwarding mode if the outer IP DA matches its
+ AR-IP but will forward in Ingress Replication forwarding mode if the
+ outer IP DA matches its IR-IP.
+
+ In addition, this document also uses the Leaf Auto-Discovery (Leaf
+ A-D) route defined in [RFC9572] in cases where the selective AR mode
+ is used. An AR-LEAF MAY send a Leaf A-D route in response to
+ reception of a Replicator-AR route whose L flag is set. The Leaf A-D
+ route is only used for selective AR, and the fields of such a route
+ are set as follows:
+
+ * Originating Router's IP Address is set to the advertising router's
+ IP address (the same IP address used by the AR-LEAF in Regular-IR
+ routes). The Next Hop address is set to the IR-IP, which SHOULD
+ be the same IP address as the advertising router's IP address,
+ when the NVE/PE originates the route, i.e., when the NVE/PE is not
+ an ASBR; see Section 10.2 of [RFC8365].
+
+ * Route Key [RFC9572] is the "Route Type Specific" NLRI of the
+ Replicator-AR route for which this Leaf A-D route is generated.
+
+ * The AR-LEAF constructs an IP-address-specific Route Target,
+ analogously to [RFC9572], by placing the IP address carried in the
+ Next Hop field of the received Replicator-AR route in the Global
+ Administrator field of the extended community, with the Local
+ Administrator field of this extended community set to 0, and
+ setting the Extended Communities attribute of the Leaf A-D route
+ to that extended community. The same IP-address-specific import
+ Route Target is auto-configured by the AR-REPLICATOR that sent the
+ Replicator-AR route, in order to control the acceptance of the
+ Leaf A-D routes.
+
+ * The Leaf A-D route MUST include the PMSI Tunnel Attribute with
+ Tunnel Type set to Assisted Replication Tunnel (Section 11), T
+ (Assisted Replication type) set to AR-LEAF, and Tunnel Identifier
+ set to the IP address of the advertising AR-LEAF. The PMSI Tunnel
+ Attribute MUST carry a downstream-assigned MPLS label or VNI that
+ is used by the AR-REPLICATOR to send traffic to the AR-LEAF.
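
As an illustration outside the RFC text, the IP-address-specific Route Target described in the list above can be built as shown below for an IPv4 AR-IP. The function name `ar_leaf_route_target` is hypothetical. The type and sub-type values (0x01 and 0x02) are the IPv4-Address-Specific Extended Community and Route Target code points from RFC 4360; an IPv6 Next Hop would need a different encoding, which this sketch does not cover.

```python
import ipaddress
import struct

IPV4_ADDRESS_SPECIFIC = 0x01  # extended community type (RFC 4360)
ROUTE_TARGET_SUBTYPE = 0x02   # Route Target sub-type (RFC 4360)


def ar_leaf_route_target(replicator_next_hop: str) -> bytes:
    """Encode the 8-octet Route Target for a Leaf A-D route.

    The Global Administrator field carries the Next Hop (AR-IP) of the
    received Replicator-AR route; the Local Administrator field is 0.
    """
    # Raises ipaddress.AddressValueError for anything that is not IPv4.
    addr = ipaddress.IPv4Address(replicator_next_hop)
    return struct.pack(
        "!BB4sH",
        IPV4_ADDRESS_SPECIFIC,
        ROUTE_TARGET_SUBTYPE,
        addr.packed,
        0,
    )
```

The AR-REPLICATOR would derive the same value from its own AR-IP and use it as an import Route Target, so only Leaf A-D routes addressed to that replicator are accepted.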
+
+ Each AR-enabled node understands and processes the T (Assisted
+ Replication type) field in the PMSI Tunnel Attribute (Flags field) of
+ the routes and MUST signal the corresponding type (AR-REPLICATOR or
+ AR-LEAF type) according to its administrative choice. An NVE/PE
+ following this specification is not expected to set the Assisted
+ Replication Type field to decimal 3 (which is a RESERVED value). If
+ a route with the Assisted Replication Type field set to decimal 3 is
+ received by an AR-REPLICATOR or AR-LEAF, the router will process the
+ route as a Regular-IR route advertised by an RNVE.
+
+ Each node attached to the BD may understand and process the BM/U
+ flags (PFLs flags). Note that these BM/U flags may be used to
+ optimize the delivery of multi-destination traffic; their use SHOULD
+ be an administrative choice and independent of the AR role. When the
+ PFL capability is enabled, the BM/U flags can be used with the
+ Regular-IR, Replicator-AR, and Leaf A-D routes.
+
+ Non-optimized ingress replication NVEs/PEs will be unaware of the new
+ PMSI Tunnel Attribute flag definition as well as the new tunnel type
+ (AR), i.e., non-upgraded NVEs/PEs will ignore the information
+ contained in the Flags field or an unknown tunnel type (type AR in
+ this case) for any Inclusive Multicast Ethernet Tag route.
+
+5. Non-selective Assisted Replication (AR) Solution Description
+
+ Figure 4 illustrates an example NVO network where the non-selective
+ AR function is enabled. Three different roles are defined for a
+ given BD: AR-REPLICATOR, AR-LEAF, and RNVE. The solution is called
+ "non-selective" because the chosen AR-REPLICATOR for a given flow
+ MUST replicate the BM traffic to all the NVEs/PEs in the BD except
+ for the source NVE/PE. NVO tunnels, i.e., IP tunnels, exist among
+ all the PEs and NVEs in the diagram. The PEs and NVEs in the diagram
+ have TSs or VMs connected to their ACs.
+
+                           (     )
+                         (_  WAN  _)
+                    +---(_         _)----+
+                    |       (_ _)        |
+              PE1   |               PE2  |
+             +------+----+          +----+------+
+        TS1--+  (BD-1)   |          |   (BD-1)  +--TS2
+             |REPLICATOR |          |REPLICATOR |
+             +--------+--+          +--+--------+
+                      |                |
+                   +--+----------------+--+
+                   |                      |
+                   |                      |
+              +----+ VXLAN/NVGRE/MPLSoGRE +----+
+              |    |      IP Fabric       |    |
+              |    |                      |    |
+    NVE1      |    +-----------+----------+    |      NVE3
+    Hypervisor|            ToR | NVE2          |Hypervisor
+    +---------+-+        +-----+-----+       +-+---------+
+    |  (BD-1)   |        |  (BD-1)   |       |  (BD-1)   |
+    |   LEAF    |        |   RNVE    |       |   LEAF    |
+    +--+-----+--+        +--+-----+--+       +--+-----+--+
+       |     |              |     |             |     |
+      VM11  VM12           TS3   TS4           VM31  VM32
+
+ Figure 4: Non-selective AR Scenario
+
+ In AR BDs, such as BD-1 in Figure 4, BM traffic between two NVEs may
+ follow a different path than unicast traffic. This solution
+ recommends the replication of BM traffic through the AR-REPLICATOR
+ node, whereas unknown/known unicast traffic will be delivered
+ directly from the source node to the destination node without being
+ replicated by any intermediate node.
+
+ Note that known unicast forwarding is not impacted by this solution,
+ i.e., unknown unicast traffic SHALL follow the same path as known
+ unicast traffic.
+
+5.1. Non-selective AR-REPLICATOR Procedures
+
+ An AR-REPLICATOR is defined as an NVE/PE capable of replicating
+ incoming BM traffic received on an overlay tunnel to other overlay
+ tunnels and local ACs. The AR-REPLICATOR signals its role in the
+ control plane and understands where the other roles (AR-LEAF nodes,
+ RNVEs, and other AR-REPLICATORs) are located. A given AR-enabled BD
+ service may have zero, one, or more AR-REPLICATORs. In our example
+ in Figure 4, PE1 and PE2 are defined as AR-REPLICATORs. The
+ following considerations apply to the AR-REPLICATOR role:
+
+ a. The AR-REPLICATOR role SHOULD be an administrative choice in any
+ NVE/PE that is part of an AR-enabled BD. This administrative
+ option to enable AR-REPLICATOR capabilities MAY be implemented as
+ a system-level option as opposed to a per-BD option.
+
+ b. An AR-REPLICATOR MUST advertise a Replicator-AR route and MAY
+ advertise a Regular-IR route. The AR-REPLICATOR MUST NOT
+ generate a Regular-IR route if it does not have local ACs. If
+ the Regular-IR route is advertised, the Assisted Replication Type
+ field of the Regular-IR route MUST be set to 0.
+
+ c. The Replicator-AR and Regular-IR routes are generated according
+ to Section 4. The AR-IP and IR-IP are different IP addresses
+ owned by the AR-REPLICATOR.
+
+ d. When a node defined as an AR-REPLICATOR receives a BM packet on
+ an overlay tunnel, it will do a tunnel destination IP address
+ lookup and apply the following procedures:
+
+ * If the destination IP address is the AR-REPLICATOR IR-IP
+ address, the node will process the packet normally as
+ discussed in [RFC7432].
+
+ * If the destination IP address is the AR-REPLICATOR AR-IP
+ address, the node MUST replicate the packet to local ACs and
+ overlay tunnels (excluding the overlay tunnel to the source of
+ the packet). When replicating to remote AR-REPLICATORs, the
+ tunnel destination IP address will be an IR-IP. This will
+ indicate to the remote AR-REPLICATOR that it MUST NOT
+ replicate to overlay tunnels. The tunnel source IP address
+ used by the AR-REPLICATOR MUST be its IR-IP when replicating
+ to AR-REPLICATOR or AR-LEAF nodes.
+
+ An AR-REPLICATOR MUST follow a data path implementation compatible
+ with the following rules:
+
+ * The AR-REPLICATORs will build a flooding list composed of ACs and
+ overlay tunnels to remote nodes in the BD. Some of those overlay
+ tunnels MAY be flagged as non-BM receivers based on the BM flag
+ received from the remote nodes in the BD.
+
+ * When an AR-REPLICATOR receives a BM packet on an AC, it will
+ forward the BM packet to its flooding list (including local ACs
+ and remote NVEs/PEs), skipping the non-BM overlay tunnels.
+
+ * When an AR-REPLICATOR receives a BM packet on an overlay tunnel,
+ it will check the destination IP address of the underlay IP header
+ and:
+
+ - If the destination IP address matches its IR-IP, the AR-
+ REPLICATOR will skip all the overlay tunnels from the flooding
+ list, i.e., it will only replicate to local ACs. This is the
+ regular ingress replication behavior described in [RFC7432].
+
+ - If the destination IP address matches its AR-IP, the AR-
+ REPLICATOR MUST forward the BM packet to its flooding list (ACs
+ and overlay tunnels), excluding the non-BM overlay tunnels.
+ The AR-REPLICATOR will ensure that the traffic is not sent back
+ to the originating AR-LEAF.
+
+ - If the encapsulation is MPLSoGRE or MPLSoUDP and the received
+ BD label that the AR-REPLICATOR advertised in the Replicator-AR
+ route is not at the bottom of the stack, the AR-REPLICATOR MUST
+ copy all the labels below the BD label and propagate them when
+ forwarding the packet to the egress overlay tunnels.
+
+ * The AR-REPLICATOR/LEAF nodes will build an unknown unicast
+ flooding list composed of ACs and overlay tunnels to the IR-IP
+ addresses of the remote nodes in the BD. Some of those overlay
+ tunnels MAY be flagged as non-U (unknown unicast) receivers based
+ on the U flag received from the remote nodes in the BD.
+
+ - When an AR-REPLICATOR/LEAF receives an unknown unicast packet
+ on an AC, it will forward the unknown unicast packet to its
+ flooding list, skipping the non-U overlay tunnels.
+
+ - When an AR-REPLICATOR/LEAF receives an unknown unicast packet
+ on an overlay tunnel, it will forward the unknown unicast
+ packet to its local ACs and never to an overlay tunnel. This
+ is the regular ingress replication behavior described in
+ [RFC7432].
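+
+ The BM data path rules above can be summarized with a short,
+ non-normative Python sketch. The class, field, and method names are
+ hypothetical and are not defined by this document. The sketch also
+ assumes that each overlay tunnel is identified by the IR-IP of the
+ remote node and that the originating node uses its IR-IP as the
+ tunnel source IP address. Label handling for MPLSoGRE/MPLSoUDP is
+ omitted.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tunnel:
    """Overlay tunnel to a remote node in the BD (hypothetical model)."""
    remote_ir_ip: str        # IR-IP of the remote node
    prune_bm: bool = False   # True if the remote node signaled BM = 1


@dataclass
class Replicator:
    """Non-selective AR-REPLICATOR state for one BD (hypothetical model)."""
    ir_ip: str
    ar_ip: str
    acs: list
    tunnels: list = field(default_factory=list)

    def bm_targets(self, ingress_ac=None, dst_ip=None, src_ip=None):
        """Return (local ACs, remote IR-IPs) that receive a copy.

        For a packet received on an AC, pass ingress_ac only.
        For a packet received on an overlay tunnel, pass the underlay
        destination and source IP addresses.
        """
        acs = [ac for ac in self.acs if ac != ingress_ac]
        if dst_ip == self.ir_ip:
            # Regular ingress replication: local ACs only.
            return acs, []
        if dst_ip is not None and dst_ip != self.ar_ip:
            raise ValueError("packet is not addressed to this node")
        # Received on an AC or addressed to the AR-IP: use the flooding
        # list, skipping non-BM tunnels and the tunnel to the source.
        remotes = [t.remote_ir_ip for t in self.tunnels
                   if not t.prune_bm and t.remote_ir_ip != src_ip]
        return acs, remotes
```

+ Note that copies sent to remote AR-REPLICATORs in this sketch are
+ always addressed to an IR-IP, which is what prevents the remote
+ AR-REPLICATOR from replicating to overlay tunnels again.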
+
+5.2. Non-selective AR-LEAF Procedures
+
+ An AR-LEAF is defined as an NVE/PE that, given its poor replication
+ performance, sends all the BM traffic to an AR-REPLICATOR that can
+ replicate the traffic further on its behalf. It MAY signal its AR-
+ LEAF capability in the control plane and understands where the other
+ roles are located (AR-REPLICATORs and RNVEs). A given service can
+ have zero, one, or more AR-LEAF nodes. In Figure 4, NVE1 and NVE3
+ (both residing in hypervisors) act as AR-LEAF nodes. The following
+ considerations apply to the AR-LEAF role:
+
+ a. The AR-LEAF role SHOULD be an administrative choice in any NVE/PE
+ that is part of an AR-enabled BD. This administrative option to
+ enable AR-LEAF capabilities MAY be implemented as a system-level
+ option as opposed to a per-BD option.
+
+ b. In this non-selective AR solution, the AR-LEAF MUST advertise a
+ single Regular-IR Inclusive Multicast Ethernet Tag route as
+ described in [RFC7432]. The AR-LEAF SHOULD set the Assisted
+ Replication Type field to AR-LEAF. Note that although this field
+ does not affect the remote nodes when creating an EVPN
+ destination to the AR-LEAF, this field is useful from the
+ standpoint of ease of operation and troubleshooting of the BD.
+
+ c. In a BD where there are no AR-REPLICATORs due to the AR-
+ REPLICATORs being down or reconfigured, the AR-LEAF MUST use
+ regular ingress replication based on the remote Regular-IR
+ Inclusive Multicast Ethernet Tag routes as described in
+ [RFC7432]. This may happen in the following cases:
+
+ * The AR-LEAF has a list of AR-REPLICATORs for the BD, but it
+ detects that all the AR-REPLICATORs for the BD are down (via
+ next-hop tracking in the IGP or some other detection
+ mechanism).
+
+ * The AR-LEAF receives updates from all the former AR-
+ REPLICATORs containing a non-REPLICATOR AR type in the
+ Inclusive Multicast Ethernet Tag routes.
+
+ * The AR-LEAF never discovered an AR-REPLICATOR for the BD.
+
+ d. In a service where there are one or more AR-REPLICATORs (based on
+ the received Replicator-AR routes for the BD), the AR-LEAF can
+ locally select which AR-REPLICATOR it sends the BM traffic to:
+
+ * A single AR-REPLICATOR MAY be selected for all the BM packets
+ received on the AR-LEAF ACs for a given BD. This selection is
+ a local decision and does not have to match other AR-LEAFs'
+ selections within the same BD.
+
+ * An AR-LEAF MAY select more than one AR-REPLICATOR and do
+ either per-flow or per-BD load balancing.
+
+ * In the case of failure of the selected AR-REPLICATOR, another
+ AR-REPLICATOR SHOULD be selected by the AR-LEAF.
+
+ * When an AR-REPLICATOR is selected for a given flow or BD, the
+ AR-LEAF MUST send all the BM packets targeted to that AR-
+ REPLICATOR using the forwarding information given by the
+ Replicator-AR route for the chosen AR-REPLICATOR, with Tunnel
+ Type = 0x0A (AR tunnel). The underlay destination IP address
+ MUST be the AR-IP advertised by the AR-REPLICATOR in the
+ Replicator-AR route.
+
+ * An AR-LEAF MAY change the selection of AR-REPLICATOR(s)
+ dynamically due to an administrative or policy configuration
+ change.
+
+ * AR-LEAF nodes SHALL send service-level BM control plane
+ packets following the procedures for regular ingress
+ replication. Examples are IGMP, Multicast Listener Discovery
+ (MLD), and PIM packets and, in general, any packets sent to
+ link-local scope IPv4 or IPv6 multicast addresses. The
+ AR-REPLICATORs MUST NOT replicate these control plane packets
+ to other overlay tunnels, since the packets are addressed to
+ the IR-IP address.
+
+ e. The use of an AR-REPLICATOR-activation-timer (in seconds, with a
+ default value of 3) on the AR-LEAF nodes is RECOMMENDED. Upon
+ receiving a new Replicator-AR route where the AR-REPLICATOR is
+ selected, the AR-LEAF will run a timer before programming the new
+ AR-REPLICATOR. In the case of a newly added AR-REPLICATOR or if
+ an AR-REPLICATOR reboots, this timer will give the AR-REPLICATOR
+ some time to program the AR-LEAF nodes before the AR-LEAF sends
+ BM traffic. The AR-REPLICATOR-activation-timer SHOULD be
+ configurable in seconds, and its value needs to account for the
+ time it takes for the AR-LEAF Regular-IR Inclusive Multicast
+ Ethernet Tag route to get to the AR-REPLICATOR and be programmed.
+ While the AR-REPLICATOR-activation-timer is running, the AR-LEAF
+ node will use regular ingress replication.
+
+ f. If the AR-LEAF has selected an AR-REPLICATOR, whether or not to
+ change to a new preferred AR-REPLICATOR for the existing BM
+ traffic flows is a matter of local policy.
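+
+ The effect of the AR-REPLICATOR-activation-timer in item e can be
+ illustrated with the following non-normative Python sketch. The
+ function name, its parameters, and the "IR"/"AR" return strings are
+ hypothetical; timestamps are assumed to be in seconds from a
+ monotonic clock.

```python
DEFAULT_ACTIVATION_TIMER = 3.0  # seconds, the default given in item e


def replication_mode(route_learned_at, now,
                     activation_timer=DEFAULT_ACTIVATION_TIMER):
    """Return "IR" or "AR" for BM traffic sent by an AR-LEAF.

    route_learned_at is the time at which the Replicator-AR route of
    the selected AR-REPLICATOR was received, or None if no
    AR-REPLICATOR is currently selected.
    """
    if route_learned_at is None:
        # No AR-REPLICATOR in the BD: regular ingress replication.
        return "IR"
    if now - route_learned_at < activation_timer:
        # Timer still running: keep using regular ingress replication.
        return "IR"
    # Timer expired: the AR-REPLICATOR can be programmed and used.
    return "AR"
```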
+
+ An AR-LEAF MUST follow a data path implementation compatible with the
+ following rules:
+
+ * The AR-LEAF nodes will build two flooding lists:
+
+ Flooding list #1: Composed of ACs and an AR-REPLICATOR-set of
+ overlay tunnels. The AR-REPLICATOR-set is defined as one or
+ more overlay tunnels to the AR-IP addresses of the remote AR-
+ REPLICATOR(s) in the BD. The selection of more than one AR-
+ REPLICATOR is described in item d. above and is a local AR-LEAF
+ decision.
+
+ Flooding list #2: Composed of ACs and overlay tunnels to the
+ remote IR-IP addresses.
+
+ * When an AR-LEAF receives a BM packet on an AC, it will check the
+ AR-REPLICATOR-set:
+
+ - If the AR-REPLICATOR-set is empty, the AR-LEAF MUST send the
+ packet to flooding list #2.
+
+ - If the AR-REPLICATOR-set is NOT empty, the AR-LEAF MUST send
+ the packet to flooding list #1, where only one of the overlay
+ tunnels of the AR-REPLICATOR-set is used.
+
+ * When an AR-LEAF receives a BM packet on an overlay tunnel, it will
+ forward the BM packet to its local ACs and never to an overlay
+ tunnel. This is the regular ingress replication behavior
+ described in [RFC7432].
+
+ * AR-LEAF nodes process unknown unicast traffic in the same way AR-
+ REPLICATORs do, as described in Section 5.1.
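+
+ As a non-normative illustration of the two flooding lists, the
+ following Python sketch shows how an AR-LEAF might choose the
+ destinations for a BM packet received on an AC. The function and
+ parameter names are hypothetical, and the CRC-based per-flow choice
+ is only one possible local policy; this document does not mandate
+ any particular load-balancing method.

```python
import zlib


def leaf_bm_targets(other_acs, ar_replicator_set, remote_ir_ips,
                    flow_key=b""):
    """Return (ACs, tunnel destination IPs) for a BM packet from an AC.

    other_acs         -- local ACs other than the one the packet came in on
    ar_replicator_set -- list of AR-IPs of the selected AR-REPLICATOR(s)
    remote_ir_ips     -- IR-IPs of all remote nodes in the BD
    flow_key          -- bytes identifying the flow (hypothetical input)
    """
    if not ar_replicator_set:
        # Flooding list #2: regular ingress replication.
        return list(other_acs), list(remote_ir_ips)
    # Flooding list #1: exactly one tunnel of the AR-REPLICATOR-set.
    index = zlib.crc32(flow_key) % len(ar_replicator_set)
    return list(other_acs), [ar_replicator_set[index]]


def leaf_bm_targets_from_tunnel(local_acs):
    """BM packet received on an overlay tunnel: local ACs only."""
    return list(local_acs), []
```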
+
+5.3. RNVE Procedures
+
+ An RNVE is defined as an NVE/PE without AR-REPLICATOR or AR-LEAF
+ capabilities that does ingress replication as described in [RFC7432].
+ The RNVE does not signal any AR role and is unaware of the AR-
+ REPLICATOR/LEAF roles in the BD. The RNVE will ignore the flags in
+ the Regular-IR routes and will ignore the Replicator-AR routes (due
+ to an unknown tunnel type in the PMSI Tunnel Attribute) and the Leaf
+ A-D routes (due to the IP-address-specific Route Target).
+
+ This role provides EVPNs with the backward compatibility required in
+ optimized ingress replication BDs. In Figure 4, NVE2 acts as an
+ RNVE.
+
+6. Selective Assisted Replication (AR) Solution Description
+
+ Figure 5 is used to describe the selective AR solution.
+
+ ( )
+ (_ WAN _)
+ +---(_ _)----+
+ | (_ _) |
+ PE1 | PE2 |
+ +------+----+ +----+------+
+ TS1--+ (BD-1) | | (BD-1) +--TS2
+ |REPLICATOR | |REPLICATOR |
+ +--------+--+ +--+--------+
+ | |
+ +--+----------------+--+
+ | |
+ | |
+ +----+ VXLAN/NVGRE/MPLSoGRE +----+
+ | | IP Fabric | |
+ | | | |
+ NVE1 | +-----------+----------+ | NVE3
+ Hypervisor| ToR | NVE2 |Hypervisor
+ +---------+-+ +-----+-----+ +-+---------+
+ | (BD-1) | | (BD-1) | | (BD-1) |
+ |LEAF-set-1 | |LEAF-set-1 | |LEAF-set-2 |
+ +--+-----+--+ +--+-----+--+ +--+-----+--+
+ | | | | | |
+ VM11 VM12 TS3 TS4 VM31 VM32
+
+ Figure 5: Selective AR Scenario
+
+ The solution is called "selective" because a given AR-REPLICATOR MUST
+ replicate the BM traffic to only the AR-LEAFs that requested the
+ replication (as opposed to all the AR-LEAF nodes) and MUST replicate
+ the BM traffic to the RNVEs (if there are any). The same AR roles as
+ those defined in Sections 4 and 5 are used here; however, the
+ procedures are different.
+
+ The selective AR procedures create multiple AR-LEAF-sets in the EVPN
+ BD and build single-hop trees among AR-LEAFs of the same set (AR-
+ LEAF->AR-REPLICATOR->AR-LEAF) and two-hop trees among AR-LEAFs of
+ different sets (AR-LEAF->AR-REPLICATOR->AR-REPLICATOR->AR-LEAF).
+ Compared to the selective solution, the non-selective AR method
+ assumes that all the AR-LEAFs of the BD are in the same set and
+ always creates single-hop trees among AR-LEAFs. While the selective
+ solution is more efficient than the non-selective solution in multi-
+ stage IP fabrics, the trade-off is additional signaling and an
+ additional outer source IP address lookup.
+
+ The following subsections describe the differences in the procedures
+ for AR-REPLICATORs/LEAFs compared to the non-selective AR solution.
+ There are no changes applicable to RNVEs.
+
+6.1. Selective AR-REPLICATOR Procedures
+
+ In our example in Figure 5, PE1 and PE2 are defined as selective AR-
+ REPLICATORs. The following considerations apply to the selective AR-
+ REPLICATOR role:
+
+ a. The selective AR-REPLICATOR role SHOULD be an administrative
+ choice in any NVE/PE that is part of an AR-enabled BD. This
+ administrative option MAY be implemented as a system-level option
+ as opposed to a per-BD option.
+
+ b. Each AR-REPLICATOR will build a list of AR-REPLICATOR, AR-LEAF,
+ and RNVE nodes. In spite of the "selective" administrative
+ option, an AR-REPLICATOR MUST NOT behave as a selective AR-
+ REPLICATOR if at least one of the AR-REPLICATORs has the L flag
+ NOT set. If at least one AR-REPLICATOR sends a Replicator-AR
+ route with L = 0 (in the BD context), the rest of the AR-
+ REPLICATORs will fall back to non-selective AR mode.
+
+ c. The selective AR-REPLICATOR MUST follow the procedures described
+ in Section 5.1, except for the following differences:
+
+ * The AR-REPLICATOR MUST have the L flag set to 1 when
+ advertising the Replicator-AR route. This flag is used by the
+ AR-REPLICATORs to advertise their "selective" AR-REPLICATOR
+ capabilities. In addition, the AR-REPLICATOR auto-configures
+ its IP-address-specific import Route Target as described in
+ the third bullet of the procedures for Leaf A-D routes in
+ Section 4.
+
+ * The AR-REPLICATOR will build a "selective" AR-LEAF-set with
+ the list of nodes that requested replication to its own AR-IP.
+ For instance, assuming that NVE1 and NVE2 advertise a Leaf A-D
+ route with PE1's IP-address-specific Route Target and NVE3
+ advertises a Leaf A-D route with PE2's IP-address-specific
+ Route Target, PE1 will only add NVE1/NVE2 to its selective AR-
+ LEAF-set for BD-1 and exclude NVE3. Likewise, PE2 will only
+ add NVE3 to its selective AR-LEAF-set for BD-1 and exclude
+ NVE1/NVE2.
+
+ * When a node defined and operating as a selective AR-REPLICATOR
+ receives a packet on an overlay tunnel, it will do a tunnel
+ destination IP lookup, and if the destination IP address is
+ the AR-REPLICATOR AR-IP address, the node MUST replicate the
+ packet to:
+
+ - Local ACs.
+
+ - Overlay tunnels in the selective AR-LEAF-set, excluding the
+ overlay tunnel to the source AR-LEAF.
+
+ - Overlay tunnels to the RNVEs if the tunnel source IP
+ address is the IR-IP of an AR-LEAF. In any other case, the
+ AR-REPLICATOR MUST NOT replicate the BM traffic to remote
+ RNVEs. In other words, only the first-hop selective AR-
+ REPLICATOR will replicate to all the RNVEs.
+
+ - Overlay tunnels to the remote selective AR-REPLICATORs if
+ the tunnel source IP address (of the encapsulated packet
+ that arrived on the overlay tunnel) is an IR-IP of its own
+ AR-LEAF-set. In any other case, the AR-REPLICATOR MUST NOT
+ replicate the BM traffic to remote AR-REPLICATORs. When
+ doing this replication, the tunnel destination IP address
+ is the AR-IP of the remote selective AR-REPLICATOR. The
+ tunnel destination address AR-IP will indicate to the
+ remote selective AR-REPLICATOR that the packet needs
+ further replication to its AR-LEAFs.
+
+ A selective AR-REPLICATOR data path implementation MUST be compatible
+ with the following rules:
+
+ * The selective AR-REPLICATORs will build two flooding lists:
+
+ Flooding list #1: Composed of ACs and overlay tunnels to the
+ remote nodes in the BD, always using the IR-IPs in the tunnel
+ destination IP addresses.
+
+ Flooding list #2: Composed of ACs, a selective AR-LEAF-set, and a
+ selective AR-REPLICATOR-set, where:
+
+ - The selective AR-LEAF-set is composed of the overlay tunnels
+ to the AR-LEAFs that advertise a Leaf A-D route for the
+ local AR-REPLICATOR. This set is updated with every Leaf
+ A-D route received from or withdrawn by an AR-LEAF.
+
+ - The selective AR-REPLICATOR-set is composed of the overlay
+ tunnels to all the AR-REPLICATORs that send a Replicator-AR
+ route with L = 1. The AR-IP addresses are used as tunnel
+ destination IP addresses.
+
+ * Some of the overlay tunnels in the flooding lists MAY be flagged
+ as non-BM receivers based on the BM flag received from the remote
+ nodes in the routes.
+
+ * When a selective AR-REPLICATOR receives a BM packet on an AC, it
+ MUST forward the BM packet to its flooding list #1, skipping the
+ non-BM overlay tunnels.
+
+ * When a selective AR-REPLICATOR receives a BM packet on an overlay
+ tunnel, it will check the destination and source IPs of the
+ underlay IP header and:
+
+ - If the destination IP address matches its AR-IP and the source
+ IP address matches an IP of its own selective AR-LEAF-set, the
+ AR-REPLICATOR MUST forward the BM packet to its flooding list
+ #2, unless some AR-REPLICATOR within the BD has advertised L =
+ 0. In the latter case, the node reverts to Non-selective mode,
+ and flooding list #1 MUST be used. Non-BM overlay tunnels are
+ skipped when sending BM packets.
+
+ - If the destination IP address matches its AR-IP and the source
+ IP address does not match any IP address of its selective AR-
+ LEAF-set, the AR-REPLICATOR MUST forward the BM packet to
+ flooding list #2, skipping the AR-REPLICATOR-set. Non-BM
+ overlay tunnels are skipped when sending BM packets.
+
+ - If the destination IP address matches its IR-IP, the AR-
+ REPLICATOR MUST use flooding list #1 but MUST skip all the
+ overlay tunnels from the flooding list, i.e., it will only
+ replicate to local ACs. This is the regular ingress
+ replication behavior described in [RFC7432]. Non-BM overlay
+ tunnels are skipped when sending BM packets.
+
+ * In any case, the AR-REPLICATOR ensures that the traffic is not
+ sent back to the originating source. If the encapsulation is
+ MPLSoGRE or MPLSoUDP and the received BD label (the label that the
+ AR-REPLICATOR advertised in the Replicator-AR route) is not at the
+ bottom of the stack, the AR-REPLICATOR MUST copy the rest of the
+ labels when forwarding the packet to the egress overlay tunnels.
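+
+ The selective rules can be illustrated with the following
+ non-normative Python sketch for BM packets received on an overlay
+ tunnel. All names are hypothetical. The sketch assumes that AR-LEAFs
+ use their IR-IP as the tunnel source IP address, that tunnels are
+ identified by their destination IP address, and that RNVEs are
+ served only by the first-hop AR-REPLICATOR, following item c of the
+ considerations above. Label handling is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class SelectiveReplicator:
    """Selective AR-REPLICATOR state for one BD (hypothetical model)."""
    ir_ip: str
    ar_ip: str
    acs: list
    leaf_set: set            # IR-IPs of AR-LEAFs that sent us a Leaf A-D route
    replicator_ar_ips: set   # AR-IPs of remote AR-REPLICATORs with L = 1
    rnve_ir_ips: set         # IR-IPs of the RNVEs in the BD
    all_remote_ir_ips: set   # IR-IPs of every remote node (flooding list #1)
    pruned: set = field(default_factory=set)  # destinations that set BM = 1
    selective: bool = True   # False once any AR-REPLICATOR advertises L = 0

    def bm_targets(self, dst_ip, src_ip):
        """Return (local ACs, sorted tunnel destination IPs)."""
        if dst_ip == self.ir_ip:
            # Regular ingress replication: local ACs only.
            return list(self.acs), []
        if dst_ip != self.ar_ip:
            raise ValueError("packet is not addressed to this node")
        if not self.selective:
            # Fallback to non-selective mode: flooding list #1.
            tunnels = set(self.all_remote_ir_ips)
        else:
            # Flooding list #2: always the local selective AR-LEAF-set.
            tunnels = set(self.leaf_set)
            if src_ip in self.leaf_set:
                # First hop: also reach remote AR-REPLICATORs and RNVEs.
                tunnels |= self.replicator_ar_ips | self.rnve_ir_ips
        # Skip non-BM tunnels and never send back to the source.
        tunnels -= self.pruned
        tunnels.discard(src_ip)
        return list(self.acs), sorted(tunnels)
```

+ With the Leaf A-D routes of the example in item c (NVE1 and NVE2
+ joining PE1, NVE3 joining PE2), a packet from NVE1 reaches NVE2 and
+ PE2 through PE1, and PE2 then replicates it only to NVE3.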
+
+6.2. Selective AR-LEAF Procedures
+
+ A selective AR-LEAF chooses a single selective AR-REPLICATOR per BD
+ and:
+
+ * Sends all the BD's BM traffic to that AR-REPLICATOR and
+
+ * Expects to receive all the BM traffic for a given BD from the same
+ AR-REPLICATOR (except for the BM traffic from the RNVEs, which
+ comes directly from the RNVEs)
+
+ In the example in Figure 5, we consider NVE1/NVE2/NVE3 as selective
+ AR-LEAFs. NVE1 selects PE1 as its selective AR-REPLICATOR. If that
+ is so, NVE1 will send all its BM traffic for BD-1 to PE1. If other
+ AR-LEAFs/REPLICATORs send BM traffic, NVE1 will receive that traffic
+ from PE1. A selective AR-LEAF and a non-selective AR-LEAF behave
+ differently, as follows:
+
+ a. The selective AR-LEAF role SHOULD be an administrative choice in
+ any NVE/PE that is part of an AR-enabled BD. This administrative
+ option to enable AR-LEAF capabilities MAY be implemented as a
+ system-level option as opposed to a per-BD option.
+
+ b. The AR-LEAF MAY advertise a Regular-IR route if there are RNVEs
+ in the BD. The selective AR-LEAF MUST advertise a Leaf A-D route
+ after receiving a Replicator-AR route with L = 1. It is
+ RECOMMENDED that the selective AR-LEAF wait for a period
+ specified by an AR-LEAF-join-wait-timer (in seconds, with a
+ default value of 3) before sending the Leaf A-D route, so that
+ the AR-LEAF can collect all the Replicator-AR routes for the BD
+ before advertising the Leaf A-D route. If the Replicator-AR
+ route with L = 1 is withdrawn, the corresponding Leaf A-D route
+ is withdrawn too.
+
+ c. In a service where there is more than one selective AR-
+ REPLICATOR, the selective AR-LEAF MUST locally select a single
+ selective AR-REPLICATOR for the BD. Once selected:
+
+ * The selective AR-LEAF MUST send a Leaf A-D route, including
+ the route key and IP-address-specific Route Target of the
+ selected AR-REPLICATOR.
+
+ * The selective AR-LEAF MUST send all the BM packets received on
+ the ACs for a given BD to that AR-REPLICATOR.
+
+ * In the case of failure of the selected AR-REPLICATOR (detected
+ when the Replicator-AR route becomes infeasible as a result of
+ any of the underlying BGP mechanisms), another AR-REPLICATOR
+ will be selected and a new Leaf A-D update will be issued for
+ the new AR-REPLICATOR. This new route will update the
+ selective list in the new selective AR-REPLICATOR. In the
+ case of failure of the active selective AR-REPLICATOR, it is
+ RECOMMENDED that the selective AR-LEAF revert to ingress
+ replication behavior for an AR-REPLICATOR-activation-timer (in
+ seconds, with a default value of 3) to mitigate the traffic
+ impact. When the timer expires, the selective AR-LEAF will
+ resume its AR mode with the new selective AR-REPLICATOR. The
+ AR-REPLICATOR-activation-timer MAY be the same configurable
+ parameter as the parameter discussed in Section 5.2.
+
+ * A selective AR-LEAF MAY change the selection of AR-
+ REPLICATOR(s) dynamically due to an administrative or policy
+ configuration change.
+
+ All the AR-LEAFs in a BD are expected to be configured as either
+ selective or non-selective. A mix of selective and non-selective AR-
+ LEAFs SHOULD NOT coexist in the same BD. If a non-selective AR-LEAF
+ is present, its BM traffic sent to a selective AR-REPLICATOR will not
+ be replicated to other AR-LEAFs that are not in its selective AR-
+ LEAF-set.
+
+ A selective AR-LEAF MUST follow a data path implementation compatible
+ with the following rules:
+
+ * The selective AR-LEAF nodes will build two flooding lists:
+
+ Flooding list #1: Composed of ACs and the overlay tunnel to the
+ selected AR-REPLICATOR (using the AR-IP as the tunnel
+ destination IP address).
+
+ Flooding list #2: Composed of ACs and overlay tunnels to the
+ remote IR-IP addresses.
+
+ * Some of the overlay tunnels in the flooding lists MAY be flagged
+ as non-BM receivers based on the BM flag received from the remote
+ nodes in the routes.
+
+ * When an AR-LEAF receives a BM packet on an AC, it will check to
+ see if an AR-REPLICATOR was selected; if one is found, flooding
+ list #1 MUST be used. Otherwise, flooding list #2 MUST be used.
+ Non-BM overlay tunnels are skipped when sending BM packets.
+
+ * When an AR-LEAF receives a BM packet on an overlay tunnel, it MUST
+ forward the BM packet to its local ACs and never to an overlay
+ tunnel. This is the regular ingress replication behavior
+ described in [RFC7432].
+
+7. Pruned Flooding Lists (PFLs)
+
+ In addition to AR, the second optimization supported by the ingress
+ replication optimization solution specified in this document is the
+ ability of all the BD nodes to signal PFLs. As described in
+ Section 4, an EVPN node can signal a given value for the BM and U
+ PFLs flags in the Regular-IR, Replicator-AR, or Leaf A-D routes,
+ where:
+
+ * BM is the Broadcast and Multicast flag. BM = 1 means "prune me
+ from the BM flooding list". BM = 0 indicates regular behavior.
+
+ * U is the Unknown flag. U = 1 means "prune me from the unknown
+ unicast flooding list". U = 0 indicates regular behavior.
+
+ The ability to signal and process these PFLs flags SHOULD be an
+ administrative choice. If a node is configured to process the PFLs
+ flags, upon receiving a non-zero PFLs flag for a route, an NVE/PE
+ will add the corresponding flag to the created overlay tunnel in the
+ flooding list. When replicating a BM packet in the context of a
+ flooding list, the NVE/PE will skip the overlay tunnels marked with
+ the flag BM = 1, since the NVEs/PEs at the end of those tunnels are
+ not expecting BM packets. Similarly, when replicating unknown
+ unicast packets, the NVE/PE will skip the overlay tunnels marked with
+ U = 1.
+
+ An NVE/PE not following this document or not configured for this
+ optimization will ignore any of the received PFLs flags. An AR-LEAF
+ or RNVE receiving BUM traffic on an overlay tunnel MUST replicate the
+ traffic to its local ACs, regardless of the BM/U flags on the overlay
+ tunnels.
+
+ This optimization MAY be used along with the Assisted Replication
+ solution.
+
+7.1. Example of a Pruned Flooding List
+
+ In order to illustrate the use of the PFLs solution, we will assume
+ that BD-1 in Figure 4 has optimized ingress replication enabled and:
+
+ * PE1 and PE2 are administratively configured as AR-REPLICATORs due
+ to their high-performance replication capabilities. PE1 and PE2
+ will send a Replicator-AR route with BM/U flags = 00.
+
+ * NVE1 and NVE3 are administratively configured as AR-LEAF nodes due
+ to their low-performance software-based replication capabilities.
+ They will advertise a Regular-IR route with type AR-LEAF.
+ Assuming that both NVEs advertise all of the attached VMs' MAC and
+ IP addresses in EVPNs as soon as they come up and these NVEs do
+ not have any VMs interested in multicast applications, they will
+ be configured to signal BM/U flags = 11 for BD-1. That is,
+ neither NVE1 nor NVE3 is interested in receiving BM or unknown
+ unicast traffic, since:
+
+ - Their attached VMs (VM11, VM12, VM31, VM32) do not support
+ multicast applications.
+
+ - Their attached VMs will not receive ARP Requests. Proxy ARP
+ [RFC9161] on the remote NVEs/PEs will reply to ARP Requests
+ locally, and no other broadcast traffic is expected.
+
+ - Their attached VMs will not receive unknown unicast traffic,
+ since the VMs' MAC and IP addresses are always advertised by
+ EVPNs as long as the VMs are active.
+
+ * NVE2 is unaware of optimized ingress replication; therefore, it
+ takes on the RNVE role in BD-1.
+
+ Based on the above assumptions, the following forwarding behavior
+ will take place:
+
+ 1. Any BM packets sent from VM11 will be sent to VM12 and PE1. PE1
+ will then forward the BM packets on to TS1, the WAN link, PE2,
+ and NVE2 but not to NVE3. PE2 and NVE2 will replicate the BM
+ packets to their local ACs, but NVE3 will be prevented from
+ having to replicate those BM packets to VM31 and VM32
+ unnecessarily.
+
+ 2. Any BM packets received on PE2 from the WAN will be sent to PE1
+ and NVE2 but not to NVE1 and NVE3, sparing the two hypervisors
+ from replicating unnecessarily to their local VMs. PE1 and NVE2
+ will replicate to their local ACs only.
+
+ 3. Any unknown unicast packet sent from VM31 will be forwarded by
+ NVE3 to NVE2, PE1, and PE2 but not to NVE1. The solution
+ prevents unnecessary replication to NVE1, since the destination
+ of the unknown traffic cannot be NVE1.
+
+ 4. Any unknown unicast packet sent from TS1 will be forwarded by PE1
+ to the WAN link, PE2, and NVE2 but not to NVE1 and NVE3, since
+ the target of the unknown traffic cannot be NVE1 or NVE3.
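+
+ The pruning step used in this example can be expressed as the
+ following non-normative Python sketch. The function name and the
+ dictionary layout are hypothetical; nodes that do not appear in the
+ dictionary (such as the RNVE) are treated as having signaled
+ BM/U flags = 00.

```python
def prune(flooding_list, flags, traffic):
    """Drop the remote nodes that asked to be pruned for this traffic.

    flooding_list -- names of the remote nodes behind the overlay tunnels
    flags         -- dict mapping node name to its (BM, U) flag pair
    traffic       -- "BM" or "U" (unknown unicast)
    """
    index = {"BM": 0, "U": 1}[traffic]
    return [node for node in flooding_list
            if flags.get(node, (0, 0))[index] == 0]


# Flags signaled in the example above; NVE2 is an RNVE and signals none.
FLAGS = {"PE1": (0, 0), "PE2": (0, 0), "NVE1": (1, 1), "NVE3": (1, 1)}

# Item 1: PE1 replicating a BM packet that arrived from NVE1.
item_1 = prune(["PE2", "NVE2", "NVE3"], FLAGS, "BM")

# Item 3: NVE3 flooding an unknown unicast packet from VM31.
item_3 = prune(["NVE2", "PE1", "PE2", "NVE1"], FLAGS, "U")
```

+ In this sketch, item_1 contains PE2 and NVE2 but not NVE3, and
+ item_3 contains NVE2, PE1, and PE2 but not NVE1, which matches the
+ forwarding behavior listed above.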
+
+8. AR Procedures for Single-IP AR-REPLICATORS
+
+ The procedures explained in Sections 5 and 6 assume that the AR-
+ REPLICATOR can use two local routable IP addresses to terminate and
+ originate NVO tunnels, i.e., IR-IP and AR-IP addresses. This is
+ usually the case for PE-based AR-REPLICATOR nodes.
+
+ In some cases, the AR-REPLICATOR node does not support more than one
+ IP address to terminate and originate NVO tunnels, i.e., the IR-IP
+ and AR-IP are the same IP addresses. This may be the case in some
+ software-based or low-end AR-REPLICATOR nodes. If this is the case,
+ the procedures provided in Sections 5 and 6 MUST be modified in the
+ following way:
+
+ * The Replicator-AR routes generated by the AR-REPLICATOR use an AR-
+ IP that will match its IR-IP. In order to differentiate the data
+ plane packets that need to use ingress replication from the
+ packets that must use Assisted Replication forwarding mode, the
+ Replicator-AR route MUST advertise a different VNI/VSID than the
+ one used by the Regular-IR route. For instance, the AR-REPLICATOR
+ will advertise an AR-VNI along with the Replicator-AR route and an
+ IR-VNI along with the Regular-IR route. Since both routes have
+ the same key, different Route Distinguishers are needed in each
+ route.
+
+ * An AR-REPLICATOR will perform Ingress Replication forwarding mode
+ or Assisted Replication forwarding mode for the incoming overlay
+ packets based on an ingress VNI lookup as opposed to the tunnel IP
+ DA lookup. Note that when replicating to remote AR-REPLICATOR
+ nodes, the use of the IR-VNI or AR-VNI advertised by the egress
+ node will determine whether Ingress Replication forwarding mode or
+ Assisted Replication forwarding mode is used at the subsequent AR-
+ REPLICATOR.
+
+ The rest of the procedures will follow those described in Sections 5
+ and 6.
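+
+ For a single-IP AR-REPLICATOR, the forwarding mode decision can be
+ sketched as follows. This Python fragment is non-normative, and the
+ function name, parameter names, and "AR"/"IR" return strings are
+ hypothetical.

```python
def forwarding_mode(rx_vni, ir_vni, ar_vni):
    """Select the forwarding mode from the ingress VNI/VSID.

    rx_vni -- VNI carried by the packet received on the overlay tunnel
    ir_vni -- VNI advertised with the Regular-IR route
    ar_vni -- VNI advertised with the Replicator-AR route
    Returns "AR", "IR", or None if the VNI belongs to neither route.
    """
    if ir_vni == ar_vni:
        # The two routes must advertise different VNIs/VSIDs.
        raise ValueError("AR-VNI and IR-VNI must differ")
    if rx_vni == ar_vni:
        return "AR"   # Assisted Replication forwarding mode
    if rx_vni == ir_vni:
        return "IR"   # Ingress Replication forwarding mode
    return None
```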
+
+9. AR Procedures and EVPN All-Active Multihoming Split-Horizon
+
+ This section extends the procedures for the cases where two or more
+ AR-LEAF nodes are attached to the same ES and two or more AR-
+ REPLICATOR nodes are attached to the same ES in the BD. The mixed
+ case -- where an AR-LEAF node and an AR-REPLICATOR node are attached
+ to the same ES -- would require extended procedures that are out of
+ scope for this document.
+
+9.1. Ethernet Segments on AR-LEAF Nodes
+
+ If a VXLAN or NVGRE is used and if the split-horizon is based on the
+ tunnel source IP address and "local bias" as described in [RFC8365],
+ the split-horizon check will not work when an ES is shared between
+ two AR-LEAF nodes and the AR-REPLICATOR replaces the tunnel source
+ IP address of the packets with its own AR-IP.
+
+ In order to be compatible with the source IP address split-horizon
+ check, the AR-REPLICATOR MAY keep the original received tunnel source
+ IP address when replicating packets to a remote AR-LEAF or RNVE.
+ This will allow AR-LEAF nodes to apply split-horizon check procedures
+ for BM packets before sending them to the local ES. Even if the AR-
+ LEAF's source IP address is preserved when replicating to AR-LEAFs or
+ RNVEs, the AR-REPLICATOR MUST always use its IR-IP as the source IP
+ address when replicating to other AR-REPLICATORs.
+
+ When EVPNs are used for MPLSoGRE or MPLSoUDP, the ESI-label-based
+ split-horizon procedure provided in [RFC7432] will not work for
+ multihomed ESs defined on AR-LEAF nodes. Local bias is recommended
+ in this case, as it is in the case of a VXLAN or NVGRE as explained
+ above. The local-bias and tunnel source IP address preservation
+ mechanisms provide the required split-horizon behavior in non-
+ selective or selective AR.
+
+ Note that if the AR-REPLICATOR implementation keeps the received
+ tunnel source IP address, the use of unicast Reverse Path Forwarding
+ (uRPF) checks in the IP fabric based on the tunnel source IP address
+ MUST be disabled.
+
+9.2. Ethernet Segments on AR-REPLICATOR Nodes
+
+ AR-REPLICATOR nodes attached to the same all-active ES will follow
+ local-bias procedures [RFC8365] as follows:
+
+ a. For BUM traffic received on a local AR-REPLICATOR's AC, local-
+ bias procedures as provided in [RFC8365] MUST be followed.
+
+ b. For BUM traffic received on an AR-REPLICATOR overlay tunnel with
+ AR-IP as the IP DA, local bias MUST also be followed. That is,
+ traffic received with AR-IP as the IP DA will be treated as
+ though it had been received on a local AC that is part of the ES
+ and will be forwarded to all local ESs, irrespective of their DF
+ or NDF state.
+
+ c. BUM traffic received on an AR-REPLICATOR overlay tunnel with IR-
+ IP as the IP DA will follow regular local-bias rules [RFC8365]
+ and will not be forwarded to local ESs that are shared with the
+ AR-LEAF or AR-REPLICATOR originating the traffic.
+
+ d. In cases where the AR-REPLICATOR supports a single IP address,
+ the IR-IP and the AR-IP are the same IP address, as discussed in
+ Section 8. The received BUM traffic will be treated as specified
+ in item b above if the received VNI is the AR-VNI and as
+ specified in item c if the VNI is the IR-VNI.
+
+10. Security Considerations
+
+ The security considerations in [RFC7432] and [RFC8365] apply to this
+ document. The security considerations related to the Leaf A-D route
+ in [RFC9572] apply too.
+
+ In addition, the Assisted Replication method introduced by this
+ document may introduce some new risks that could affect the
+ successful delivery of BM traffic. Unicast traffic is not affected
+ by Assisted Replication (although unknown unicast traffic is affected
+ by the procedures for PFLs). The forwarding of BM traffic is
+ modified, and BM traffic from the AR-LEAF nodes will be drawn toward
+ AR-REPLICATORs in the BD. An AR-LEAF will forward BM traffic to its
+ selected AR-REPLICATOR; therefore, an attack on the AR-REPLICATOR
+ could impact the delivery of the BM traffic using that node. Also,
+ an attack on the AR-REPLICATOR and any change to the advertised AR
+ type will modify the selections made by the AR-LEAF nodes. If no
+ other AR-REPLICATOR is selected, the AR-LEAF nodes will be forced to
+ use Ingress Replication forwarding mode, which will impact their
+ performance, since the AR-LEAF nodes are usually NVEs/PEs with poor
+ replication performance.
+
+ This document introduces the ability of the AR-REPLICATOR to forward
+ traffic received on an overlay tunnel to another overlay tunnel. The
+ reader may determine that this introduces the risk of BM loops --
+ that is, an AR-LEAF receiving a BM-encapsulated packet that the AR-
+ LEAF originated in the first place due to one or two AR-REPLICATORs
+ "looping" the BM traffic back to the AR-LEAF. Following the
+ procedures provided in this document will prevent these BM loops,
+ since the AR-REPLICATOR will always forward the BM traffic using the
+ correct tunnel IP DA (or the correct VNI in the case of single-IP AR-
+ REPLICATORs), which instructs the remote nodes regarding how to
+ forward the traffic. This is true for both the Non-selective and
+ Selective modes defined in this document. However, incorrect
+ implementation of the procedures provided in this document may lead
+ to those unexpected BM loops.
+
+ The Selective mode provides a multi-stage replication solution, where
+ proper configuration of all the AR-REPLICATORs will prevent any
+ issues. A mix of mistakenly configured selective and non-selective
+ AR-REPLICATORs in the same BD could theoretically create packet
+ duplication in some AR-LEAFs; however, this document specifies a
+ fallback solution -- falling back to Non-selective mode in cases
+ where the AR-REPLICATORs advertise inconsistent AR modes.
+
+ This document allows the AR-REPLICATOR to preserve the tunnel source
+ IP address of the AR-LEAF (as an option) when forwarding BM packets
+ from an overlay tunnel to another overlay tunnel. Preserving the AR-
+ LEAF source IP address makes the local-bias filtering procedures
+ possible for AR-LEAF nodes that are attached to the same ES. If the
+ AR-REPLICATOR does not preserve the AR-LEAF source IP address, AR-
+ LEAF nodes attached to all-active ESs will cause packet duplication
+ on the multihomed CE.
+
+ The AR-REPLICATOR nodes are, by design, using more bandwidth than PEs
+ [RFC7432] or NVEs [RFC8365] would use. Certain network events or
+ unexpectedly low performance may push traffic beyond the
+ AR-REPLICATOR's local bandwidth and cause service disruption.
+
+ Finally, PFLs (Section 7) should be used with care. Intentional or
+ unintentional misconfiguration of the BDs on a given leaf node may
+ result in the leaf not receiving the required BM or unknown unicast
+ traffic.
+
+11. IANA Considerations
+
+ IANA has allocated the following Border Gateway Protocol (BGP)
+ parameters:
+
+ * Allocation in the "P-Multicast Service Interface Tunnel (PMSI
+ Tunnel) Tunnel Types" registry:
+
+ +=======+=============================+===========+
+ | Value | Meaning                     | Reference |
+ +=======+=============================+===========+
+ | 0x0A  | Assisted Replication Tunnel | RFC 9574  |
+ +-------+-----------------------------+-----------+
+
+ Table 1
+
+ * Allocations in the "P-Multicast Service Interface (PMSI) Tunnel
+ Attribute Flags" registry:
+
+ +=======+===============================+===========+
+ | Value | Name                          | Reference |
+ +=======+===============================+===========+
+ | 3-4   | Assisted Replication Type (T) | RFC 9574  |
+ +-------+-------------------------------+-----------+
+ | 5     | Broadcast and Multicast (BM)  | RFC 9574  |
+ +-------+-------------------------------+-----------+
+ | 6     | Unknown (U)                   | RFC 9574  |
+ +-------+-------------------------------+-----------+
+
+ Table 2
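
As an illustration of the bit positions in Table 2, the following minimal sketch extracts the three fields from the one-octet Flags field of the PMSI Tunnel Attribute. It is not part of the RFC. The function name `decode_ar_flags` and the mask constants are hypothetical, and the masks assume the usual IETF numbering in which bit 0 is the most significant bit of the octet. The AR type is returned as a raw 2-bit value; its meaning is defined elsewhere in the document.

```python
# Illustrative sketch only; names are hypothetical, not taken from the RFC.
# Assumption: bit 0 is the most significant bit of the one-octet Flags field.
AR_TYPE_MASK = 0x18   # bits 3-4: Assisted Replication Type (T)
AR_TYPE_SHIFT = 3
BM_FLAG = 0x04        # bit 5: Broadcast and Multicast (BM)
U_FLAG = 0x02         # bit 6: Unknown (U)


def decode_ar_flags(flags: int) -> dict:
    """Split a PMSI Tunnel Attribute Flags octet into the Table 2 fields."""
    if not 0 <= flags <= 0xFF:
        raise ValueError("flags must fit in one octet")
    return {
        "ar_type": (flags & AR_TYPE_MASK) >> AR_TYPE_SHIFT,  # raw 2-bit value
        "bm": bool(flags & BM_FLAG),
        "u": bool(flags & U_FLAG),
    }


if __name__ == "__main__":
    # 0x14 is binary 0001 0100: T bits are "10" and the BM bit is set.
    print(decode_ar_flags(0x14))
```

Keeping the AR type as a raw integer avoids hard-coding an interpretation here; a caller would map it to a role using the definitions given earlier in the document.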
+
+12. References
+
+12.1. Normative References
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119,
+ DOI 10.17487/RFC2119, March 1997,
+ <https://www.rfc-editor.org/info/rfc2119>.
+
+ [RFC6513] Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/
+ BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February
+ 2012, <https://www.rfc-editor.org/info/rfc6513>.
+
+ [RFC6514] Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
+ Encodings and Procedures for Multicast in MPLS/BGP IP
+ VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012,
+ <https://www.rfc-editor.org/info/rfc6514>.
+
+ [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
+ Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
+ Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
+ 2015, <https://www.rfc-editor.org/info/rfc7432>.
+
+ [RFC7902] Rosen, E. and T. Morin, "Registry and Extensions for
+ P-Multicast Service Interface Tunnel Attribute Flags",
+ RFC 7902, DOI 10.17487/RFC7902, June 2016,
+ <https://www.rfc-editor.org/info/rfc7902>.
+
+ [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
+ 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
+ May 2017, <https://www.rfc-editor.org/info/rfc8174>.
+
+ [RFC8365] Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R.,
+ Uttaro, J., and W. Henderickx, "A Network Virtualization
+ Overlay Solution Using Ethernet VPN (EVPN)", RFC 8365,
+ DOI 10.17487/RFC8365, March 2018,
+ <https://www.rfc-editor.org/info/rfc8365>.
+
+ [RFC9572] Zhang, Z., Lin, W., Rabadan, J., Patel, K., and A.
+ Sajassi, "Updates to EVPN Broadcast, Unknown Unicast, or
+ Multicast (BUM) Procedures", RFC 9572,
+ DOI 10.17487/RFC9572, May 2024,
+ <https://www.rfc-editor.org/info/rfc9572>.
+
+12.2. Informative References
+
+ [RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed.,
+ "Encapsulating MPLS in IP or Generic Routing Encapsulation
+ (GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005,
+ <https://www.rfc-editor.org/info/rfc4023>.
+
+ [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
+ L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
+ eXtensible Local Area Network (VXLAN): A Framework for
+ Overlaying Virtualized Layer 2 Networks over Layer 3
+ Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014,
+ <https://www.rfc-editor.org/info/rfc7348>.
+
+ [RFC7637] Garg, P., Ed. and Y. Wang, Ed., "NVGRE: Network
+ Virtualization Using Generic Routing Encapsulation",
+ RFC 7637, DOI 10.17487/RFC7637, September 2015,
+ <https://www.rfc-editor.org/info/rfc7637>.
+
+ [RFC9161] Rabadan, J., Ed., Sathappan, S., Nagaraj, K., Hankins, G.,
+ and T. King, "Operational Aspects of Proxy ARP/ND in
+ Ethernet Virtual Private Networks", RFC 9161,
+ DOI 10.17487/RFC9161, January 2022,
+ <https://www.rfc-editor.org/info/rfc9161>.
+
+Acknowledgements
+
+ The authors would like to thank Neil Hart, David Motz, Dai Truong,
+ Thomas Morin, Jeffrey Zhang, Shankar Murthy, and Krzysztof Szarkowicz
+ for their valuable feedback and contributions. Also, thanks to John
+ Scudder for his thorough review, which improved the quality of the
+ document significantly.
+
+Contributors
+
+ In addition to the authors listed on the front page, the following
+ people also contributed to this document and should be considered
+ coauthors:
+
+ Wim Henderickx
+ Nokia
+
+
+ Kiran Nagaraj
+ Nokia
+
+
+ Ravi Shekhar
+ Juniper Networks
+
+
+ Nischal Sheth
+ Juniper Networks
+
+
+ Aldrin Isaac
+ Juniper
+
+
+ Mudassir Tufail
+ Citibank
+
+
+Authors' Addresses
+
+ Jorge Rabadan (editor)
+ Nokia
+ 777 Middlefield Road
+ Mountain View, CA 94043
+ United States of America
+ Email: jorge.rabadan@nokia.com
+
+
+ Senthil Sathappan
+ Nokia
+ Email: senthil.sathappan@nokia.com
+
+
+ Wen Lin
+ Juniper Networks
+ Email: wlin@juniper.net
+
+
+ Mukul Katiyar
+ Versa Networks
+ Email: mukul@versa-networks.com
+
+
+ Ali Sajassi
+ Cisco Systems
+ Email: sajassi@cisco.com