Diffstat (limited to 'doc/rfc/rfc4428.txt')
-rw-r--r-- | doc/rfc/rfc4428.txt | 2635 |
1 files changed, 2635 insertions, 0 deletions
diff --git a/doc/rfc/rfc4428.txt b/doc/rfc/rfc4428.txt new file mode 100644 index 0000000..2cf0284 --- /dev/null +++ b/doc/rfc/rfc4428.txt @@ -0,0 +1,2635 @@ + + + + + + +Network Working Group D. Papadimitriou, Ed. +Request for Comments: 4428 Alcatel +Category: Informational E. Mannie, Ed. + Perceval + March 2006 + + + Analysis of Generalized Multi-Protocol Label Switching (GMPLS)-based + Recovery Mechanisms (including Protection and Restoration) + +Status of This Memo + + This memo provides information for the Internet community. It does + not specify an Internet standard of any kind. Distribution of this + memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2006). + +Abstract + + This document provides an analysis grid to evaluate, compare, and + contrast the Generalized Multi-Protocol Label Switching (GMPLS) + protocol suite capabilities with the recovery mechanisms currently + proposed at the IETF CCAMP Working Group. A detailed analysis of + each of the recovery phases is provided using the terminology defined + in RFC 4427. This document focuses on transport plane survivability + and recovery issues and not on control plane resilience and related + aspects. + +Table of Contents + + 1. Introduction ....................................................3 + 2. Contributors ....................................................4 + 3. Conventions Used in this Document ...............................5 + 4. Fault Management ................................................5 + 4.1. Failure Detection ..........................................5 + 4.2. Failure Localization and Isolation .........................8 + 4.3. Failure Notification .......................................9 + 4.4. Failure Correlation .......................................11 + 5. Recovery Mechanisms ............................................11 + 5.1. Transport vs. Control Plane Responsibilities ..............11 + 5.2. Technology-Independent and Technology-Dependent + Mechanisms ................................................12 + 5.2.1. OTN Recovery .......................................12 + 5.2.2. Pre-OTN Recovery ...................................13 + 5.2.3. SONET/SDH Recovery .................................13 + + + +Papadimitriou & Mannie Informational [Page 1] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + 5.3. Specific Aspects of Control Plane-Based Recovery + Mechanisms ................................................14 + 5.3.1. In-Band vs. Out-Of-Band Signaling ..................14 + 5.3.2. Uni- vs. Bi-Directional Failures ...................15 + 5.3.3. Partial vs. Full Span Recovery .....................17 + 5.3.4. Difference between LSP, LSP Segment and + Span Recovery ......................................18 + 5.4. Difference between Recovery Type and Scheme ...............19 + 5.5. LSP Recovery Mechanisms ...................................21 + 5.5.1. Classification .....................................21 + 5.5.2. LSP Restoration ....................................23 + 5.5.3. Pre-Planned LSP Restoration ........................24 + 5.5.4. LSP Segment Restoration ............................25 + 6. Reversion ......................................................26 + 6.1. Wait-To-Restore (WTR) .....................................26 + 6.2. Revertive Mode Operation ..................................26 + 6.3. Orphans ...................................................27 + 7. Hierarchies ....................................................27 + 7.1. 
Horizontal Hierarchy (Partitioning) .......................28 + 7.2. Vertical Hierarchy (Layers) ...............................28 + 7.2.1. Recovery Granularity ...............................30 + 7.3. Escalation Strategies .....................................30 + 7.4. Disjointness ..............................................31 + 7.4.1. SRLG Disjointness ..................................32 + 8. Recovery Mechanisms Analysis ...................................33 + 8.1. Fast Convergence (Detection/Correlation and + Hold-off Time) ............................................34 + 8.2. Efficiency (Recovery Switching Time) ......................34 + 8.3. Robustness ................................................35 + 8.4. Resource Optimization .....................................36 + 8.4.1. Recovery Resource Sharing ..........................37 + 8.4.2. Recovery Resource Sharing and SRLG Recovery ........39 + 8.4.3. Recovery Resource Sharing, SRLG + Disjointness and Admission Control .................40 + 9. Summary and Conclusions ........................................42 + 10. Security Considerations .......................................43 + 11. Acknowledgements ..............................................43 + 12. References ....................................................44 + 12.1. Normative References .....................................44 + 12.2. Informative References ...................................44 + + + + + + + + + + + +Papadimitriou & Mannie Informational [Page 2] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + +1. Introduction + + This document provides an analysis grid to evaluate, compare, and + contrast the Generalized MPLS (GMPLS) protocol suite capabilities + with the recovery mechanisms proposed at the IETF CCAMP Working + Group. The focus is on transport plane survivability and recovery + issues and not on control-plane-resilience-related aspects. Although + the recovery mechanisms described in this document impose different + requirements on GMPLS-based recovery protocols, the protocols' + specifications will not be covered in this document. Though the + concepts discussed are technology independent, this document + implicitly focuses on SONET [T1.105]/SDH [G.707], Optical Transport + Networks (OTN) [G.709], and pre-OTN technologies, except when + specific details need to be considered (for instance, in the case of + failure detection). + + A detailed analysis is provided for each of the recovery phases as + identified in [RFC4427]. These phases define the sequence of generic + operations that need to be performed when a LSP/Span failure (or any + other event generating such failures) occurs: + + - Phase 1: Failure Detection + - Phase 2: Failure Localization (and Isolation) + - Phase 3: Failure Notification + - Phase 4: Recovery (Protection or Restoration) + - Phase 5: Reversion (Normalization) + + Together, failure detection, localization, and notification phases + are referred to as "fault management". Within a recovery domain, the + entities involved during the recovery operations are defined in + [RFC4427]; these entities include ingress, egress, and intermediate + nodes. The term "recovery mechanism" is used to cover both + protection and restoration mechanisms. Specific terms such as + "protection" and "restoration" are used only when differentiation is + required. Likewise the term "failure" is used to represent both + signal failure and signal degradation. 
+ + In addition, when analyzing the different hierarchical recovery + mechanisms including disjointness-related issues, a clear distinction + is made between partitioning (horizontal hierarchy) and layering + (vertical hierarchy). In order to assess the current GMPLS protocol + capabilities and the potential need for further extensions, the + dimensions for analyzing each of the recovery mechanisms detailed in + this document are introduced. This document concludes by detailing + the applicability of the current GMPLS protocol building blocks for + recovery purposes. + + + + + +Papadimitriou & Mannie Informational [Page 3] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + +2. Contributors + + This document is the result of the CCAMP Working Group Protection and + Restoration design team joint effort. Besides the editors, the + following are the authors that contributed to the present memo: + + Deborah Brungard (AT&T) + 200 S. Laurel Ave. + Middletown, NJ 07748, USA + + EMail: dbrungard@att.com + + + Sudheer Dharanikota + + EMail: sudheer@ieee.org + + + Jonathan P. Lang (Sonos) + 506 Chapala Street + Santa Barbara, CA 93101, USA + + EMail: jplang@ieee.org + + + Guangzhi Li (AT&T) + 180 Park Avenue, + Florham Park, NJ 07932, USA + + EMail: gli@research.att.com + + + Eric Mannie + Perceval + Rue Tenbosch, 9 + 1000 Brussels + Belgium + + Phone: +32-2-6409194 + EMail: eric.mannie@perceval.net + + + Dimitri Papadimitriou (Alcatel) + Francis Wellesplein, 1 + B-2018 Antwerpen, Belgium + + EMail: dimitri.papadimitriou@alcatel.be + + + + +Papadimitriou & Mannie Informational [Page 4] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + Bala Rajagopalan + Microsoft India Development Center + Hyderabad, India + + EMail: balar@microsoft.com + + + Yakov Rekhter (Juniper) + 1194 N. Mathilda Avenue + Sunnyvale, CA 94089, USA + + EMail: yakov@juniper.net + +3. Conventions Used in this Document + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in [RFC2119]. + + Any other recovery-related terminology used in this document conforms + to that defined in [RFC4427]. The reader is also assumed to be + familiar with the terminology developed in [RFC3945], [RFC3471], + [RFC3473], [RFC4202], and [RFC4204]. + +4. Fault Management + +4.1. Failure Detection + + Transport failure detection is the only phase that cannot be achieved + by the control plane alone because the latter needs a hook to the + transport plane in order to collect the related information. It has + to be emphasized that even if failure events themselves are detected + by the transport plane, the latter, upon a failure condition, must + trigger the control plane for subsequent actions through the use of + GMPLS signaling capabilities (see [RFC3471] and [RFC3473]) or Link + Management Protocol capabilities (see [RFC4204], Section 6). + + Therefore, by definition, transport failure detection is transport + technology dependent (and so exceptionally, we keep here the + "transport plane" terminology). In transport fault management, + distinction is made between a defect and a failure. Here, the + discussion addresses failure detection (persistent fault cause). In + the technology-dependent descriptions, a more precise specification + will be provided. 
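   As an illustration of the defect/failure distinction, the following
   Python sketch shows how a raw defect indication could be integrated
   ("soaked") into a persistent failure indication before the control
   plane is triggered.  This is a simplified model: the class, the
   timer values, and the soak logic are assumptions of this example,
   not behavior taken from any ITU-T Recommendation.

      import time

      # Illustrative soak intervals (assumed values; actual integration
      # timers are specified per technology by the relevant standards).
      FAILURE_SOAK_S = 2.5   # defect must persist this long -> failure
      CLEAR_SOAK_S = 10.0    # defect must stay absent this long -> clear

      class DefectIntegrator:
          """Integrates a raw defect (e.g., dLOS) into a persistent
          failure state that would trigger the control plane."""

          def __init__(self):
              self.defect_since = None  # when the defect was last raised
              self.clear_since = None   # when the defect last disappeared
              self.failed = False

          def update(self, defect_active, now=None):
              now = time.monotonic() if now is None else now
              if defect_active:
                  if self.defect_since is None:
                      self.defect_since = now
                  self.clear_since = None
                  if (not self.failed
                          and now - self.defect_since >= FAILURE_SOAK_S):
                      self.failed = True   # persistent fault: failure
              else:
                  if self.clear_since is None:
                      self.clear_since = now
                  self.defect_since = None
                  if (self.failed
                          and now - self.clear_since >= CLEAR_SOAK_S):
                      self.failed = False  # failure condition cleared
              return self.failed

   Only transitions of the "failed" flag would be reported upward, so
   that transient defects never reach the recovery process.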
+ + As an example, SONET/SDH (see [G.707], [G.783], and [G.806]) provides + supervision capabilities covering: + + + + +Papadimitriou & Mannie Informational [Page 5] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + - Continuity: SONET/SDH monitors the integrity of the continuity of a + trail (i.e., section or path). This operation is performed by + monitoring the presence/absence of the signal. Examples are Loss + of Signal (LOS) detection for the physical layer, Unequipped (UNEQ) + Signal detection for the path layer, Server Signal Fail Detection + (e.g., AIS) at the client layer. + + - Connectivity: SONET/SDH monitors the integrity of the routing of + the signal between end-points. Connectivity monitoring is needed + if the layer provides flexible connectivity, either automatically + (e.g., cross-connects) or manually (e.g., fiber distribution + frame). An example is the Trail (i.e., section or path) Trace + Identifier used at the different layers and the corresponding Trail + Trace Identifier Mismatch detection. + + - Alignment: SONET/SDH checks that the client and server layer frame + start can be correctly recovered from the detection of loss of + alignment. The specific processes depend on the signal/frame + structure and may include: (multi-)frame alignment, pointer + processing, and alignment of several independent frames to a common + frame start in case of inverse multiplexing. Loss of alignment is + a generic term. Examples are loss of frame, loss of multi-frame, + or loss of pointer. + + - Payload type: SONET/SDH checks that compatible adaptation functions + are used at the source and the destination. Normally, this is done + by adding a payload type identifier (referred to as the "signal + label") at the source adaptation function and comparing it with the + expected identifier at the destination. A mismatch between the + received and the expected identifiers raises the corresponding + payload mismatch defect. + + - Signal Quality: SONET/SDH monitors the performance of a signal. + For instance, if the performance falls below a certain threshold, a + defect -- excessive errors (EXC) or degraded signal (DEG) -- is + detected. + + The most important point is that the supervision processes and the + corresponding failure detection (used to initiate the recovery + phase(s)) result in either: + + - Signal Degrade (SD): A signal indicating that the associated data + has degraded in the sense that a degraded defect condition is + active (for instance, a dDEG declared when the Bit Error Rate + exceeds a preset threshold). Or + + + + + + +Papadimitriou & Mannie Informational [Page 6] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + - Signal Fail (SF): A signal indicating that the associated data has + failed in the sense that a signal interrupting near-end defect + condition is active (as opposed to the degraded defect). + + In Optical Transport Networks (OTN), equivalent supervision + capabilities are provided at the optical/digital section layers + (i.e., Optical Transmission Section (OTS), Optical Multiplex Section + (OMS) and Optical channel Transport Unit (OTU)) and at the + optical/digital path layers (i.e., Optical Channel (OCh) and Optical + channel Data Unit (ODU)). Interested readers are referred to the + ITU-T Recommendations [G.798] and [G.709] for more details. + + The above are examples that illustrate cases where the failure + detection and reporting entities (see [RFC4427]) are co-located. 
The + following example illustrates the scenario where the failure + detecting and reporting entities (see [RFC4427]) are not co-located. + + In pre-OTN networks, a failure may be masked by an intermediate O-E-O + based Optical Line System (OLS), preventing a Photonic Cross-Connect + (PXC) from detecting upstream failures. In such cases, failure + detection may be assisted by an out-of-band communication channel, + and the failure condition may be reported to the PXC control plane. + This can be provided by using [RFC4209] extensions that deliver IP + message-based communication between the PXC and the OLS control + plane. Also, since PXCs are independent of the framing format, + failure conditions can only be triggered either by detecting the + absence of the optical signal or by measuring its quality. These + mechanisms are generally less reliable than electrical (digital) + ones. Both types of detection mechanisms are outside the scope of + this document. If the intermediate OLS supports electrical (digital) + mechanisms, using the LMP communication channel, these failure + conditions are reported to the PXC, and subsequent recovery actions + are performed as described in + Section 5. As such, from the control plane viewpoint, this mechanism + turns the OLS-PXC-composed system into a single logical entity, thus + having the same failure management mechanisms as any other O-E-O + capable device. + + More generally, the following are typical failure conditions in + SONET/SDH and pre-OTN networks: + + - Loss of Light (LOL)/Loss of Signal (LOS): Signal Failure (SF) + condition where the optical signal is not detected any longer on + the receiver of a given interface. + + - Signal Degrade (SD): detection of the signal degradation over + a specific period of time. + + + +Papadimitriou & Mannie Informational [Page 7] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + - For SONET/SDH payloads, all of the above-mentioned supervision + capabilities can be used, resulting in SD or SF conditions. + + In summary, the following cases apply when considering the + communication between the detecting and reporting entities: + + - Co-located detecting and reporting entities: both the detecting and + reporting entities are on the same node (e.g., SONET/SDH equipment, + Opaque cross-connects, and, with some limitations, Transparent + cross-connects, etc.) + + - Non-co-located detecting and reporting entities: + + o with in-band communication between entities: entities are + physically separated, but the transport plane provides in-band + communication between them (e.g., Server Signal Failures such as + Alarm Indication Signal (AIS), etc.) + + o with out-of-band communication between entities: entities are + physically separated, but an out-of-band communication channel is + provided between them (e.g., using [RFC4204]). + +4.2. Failure Localization and Isolation + + Failure localization provides information to the deciding entity + about the location (and so the identity) of the transport plane + entity that detects the LSP(s)/span(s) failure. The deciding entity + can then make an accurate decision to achieve finer grained recovery + switching action(s). Note that this information can also be included + as part of the failure notification (see Section 4.3). + + In some cases, this accurate failure localization information may be + less urgent to determine if it requires performing more time- + consuming failure isolation (see also Section 4.4). 
This is + particularly the case when edge-to-edge LSP recovery is performed + based on a simple failure notification (including the identification + of the working LSPs under failure condition). Note that "edge" + refers to a sub-network end-node, for instance. In this case, a more + accurate localization and isolation can be performed after recovery + of these LSPs. + + Failure localization should be triggered immediately after the fault + detection phase. This operation can be performed at the transport + plane and/or (if the operation is unavailable via the transport + plane) the control plane level where dedicated signaling messages can + be used. When performed at the control plane level, a protocol such + as LMP (see [RFC4204], Section 6) can be used for failure + localization purposes. + + + +Papadimitriou & Mannie Informational [Page 8] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + +4.3. Failure Notification + + Failure notification is used 1) to inform intermediate nodes that an + LSP/span failure has occurred and has been detected and 2) to inform + the deciding entities (which can correspond to any intermediate or + end-point of the failed LSP/span) that the corresponding service is + not available. In general, these deciding entities will be the ones + making the appropriate recovery decision. When co-located with the + recovering entity, these entities will also perform the corresponding + recovery action(s). + + Failure notification can be provided either by the transport or by + the control plane. As an example, let us first briefly describe the + failure notification mechanism defined at the SONET/SDH transport + plane level (also referred to as maintenance signal supervision): + + - AIS (Alarm Indication Signal) occurs as a result of a failure + condition such as Loss of Signal and is used to notify downstream + nodes (of the appropriate layer processing) that a failure has + occurred. AIS performs two functions: 1) inform the intermediate + nodes (with the appropriate layer monitoring capability) that a + failure has been detected and 2) notify the connection end-point + that the service is no longer available. + + For a distributed control plane supporting one (or more) failure + notification mechanism(s), regardless of the mechanism's actual + implementation, the same capabilities are needed with more (or less) + information provided about the LSPs/spans under failure condition, + their detailed statuses, etc. + + The most important difference between these mechanisms is related to + the fact that transport plane notifications (as defined today) would + directly initiate either a certain type of protection switching (such + as those described in [RFC4427]) via the transport plane or + restoration actions via the management plane. + + On the other hand, using a failure notification mechanism through the + control plane would provide the possibility of triggering either a + protection or a restoration action via the control plane. This has + the advantage that a control-plane-recovery-responsible entity does + not necessarily have to be co-located with a transport + maintenance/recovery domain. A control plane recovery domain can be + defined at entities not supporting a transport plane recovery. + + Moreover, as specified in [RFC3473], notification message exchanges + through a GMPLS control plane may not follow the same path as the + LSP/spans for which these messages carry the status. 
In turn, this + ensures a fast, reliable (through acknowledgement and the use of + + + +Papadimitriou & Mannie Informational [Page 9] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + either a dedicated control plane network or disjoint control + channels), and efficient (through the aggregation of several LSP/span + statuses within the same message) failure notification mechanism. + + The other important properties to be met by the failure notification + mechanism are mainly the following: + + - Notification messages must provide enough information such that the + most efficient subsequent recovery action will be taken at the + recovering entities (in most of the recovery types and schemes this + action is even deterministic). Remember here that these entities + can be either intermediate or end-points through which normal + traffic flows. Based on local policy, intermediate nodes may not + use this information for subsequent recovery actions (see for + instance the APS protocol phases as described in [RFC4427]). In + addition, fast notification is a mechanism that runs in + collaboration with the existing GMPLS signaling (see [RFC3473]) and + also allows intermediate nodes to stay informed about the + status of the working LSPs/spans under failure condition. + + The trade-off here arises when defining what information the + LSP/span end-points (more precisely, the deciding entities) need in + order for the recovering entity to take the best recovery action: + If not enough information is provided, the decision cannot be + optimal (note that in this eventuality, the important issue is to + quantify the level of sub-optimality). If too much information is + provided, the control plane may be overloaded with unnecessary + information and the aggregation/correlation of this notification + information will be more complex and time-consuming to achieve. + Note that a more detailed quantification of the amount of + information to be exchanged and processed is strongly dependent on + the failure notification protocol. + + - If the failure localization and isolation are not performed by one + of the LSP/span end-points or some intermediate points, the points + should receive enough information from the notification message in + order to locate the failure. Otherwise, they would need to (re-) + initiate a failure localization and isolation action. + + - Avoiding so-called notification storms implies that 1) the failure + detection output is correlated (i.e., alarm correlation) and + aggregated at the node detecting the failure(s), 2) the failure + notifications are directed to a restricted set of destinations (in + general the end-points), and 3) failure notification suppression + (i.e., alarm suppression) is provided in order to limit flooding in + case of multiple and/or correlated failures detected at several + locations in the network. + + + + +Papadimitriou & Mannie Informational [Page 10] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + - Alarm correlation and aggregation (at the failure-detecting node) + implies a consistent decision based on the conditions for which a + trade-off between fast convergence (at detecting node) and fast + notification (implying that correlation and aggregation occurs at + receiving end-points) can be found. + +4.4. Failure Correlation + + A single failure event (such as a span failure) can cause multiple + failure conditions (such as individual LSP failures) to be reported. 
+ These can be grouped (i.e., correlated) to reduce the number of + failure conditions communicated on the reporting channel, for both + in-band and out-of-band failure reporting. + + In such a scenario, it can be important to wait for a certain period + of time, typically called failure correlation time, and gather all + the failures to report them as a group of failures (or simply group + failure). For instance, this approach can be provided using LMP-WDM + for pre-OTN networks (see [RFC4209]) or when using Signal + Failure/Degrade Group in the SONET/SDH context. + + Note that a default average time interval during which failure + correlation operation can be performed is difficult to provide since + it is strongly dependent on the underlying network topology. + Therefore, providing a per-node configurable failure correlation time + can be advisable. The detailed selection criteria for this time + interval are outside of the scope of this document. + + When failure correlation is not provided, multiple failure + notification messages may be sent out in response to a single failure + (for instance, a fiber cut). Each failure notification message + contains a set of information on the failed working resources (for + instance, the individual lambda LSP flowing through this fiber). + This allows for a more prompt response, but can potentially overload + the control plane due to a large amount of failure notifications. + +5. Recovery Mechanisms + +5.1. Transport vs. Control Plane Responsibilities + + When applicable, recovery resources are provisioned, for both + protection and restoration, using GMPLS signaling capabilities. + Thus, these are control plane-driven actions (topological and + resource-constrained) that are always performed in this context. + + The following tables give an overview of the responsibilities taken + by the control plane in case of LSP/span recovery: + + + + +Papadimitriou & Mannie Informational [Page 11] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + 1. LSP/span Protection + + - Phase 1: Failure Detection Transport plane + - Phase 2: Failure Localization/Isolation Transport/Control plane + - Phase 3: Failure Notification Transport/Control plane + - Phase 4: Protection Switching Transport/Control plane + - Phase 5: Reversion (Normalization) Transport/Control plane + + Note: in the context of LSP/span protection, control plane actions + can be performed either for operational purposes and/or + synchronization purposes (vertical synchronization between transport + and control plane) and/or notification purposes (horizontal + synchronization between end-nodes at control plane level). This + suggests the selection of the responsible plane (in particular for + protection switching) during the provisioning phase of the + protected/protection LSP. + + 2. LSP/span Restoration + + - Phase 1: Failure Detection Transport plane + - Phase 2: Failure Localization/Isolation Transport/Control plane + - Phase 3: Failure Notification Control plane + - Phase 4: Recovery Switching Control plane + - Phase 5: Reversion (Normalization) Control plane + + Therefore, this document primarily focuses on provisioning of LSP + recovery resources, failure notification mechanisms, recovery + switching, and reversion operations. Moreover, some additional + considerations can be dedicated to the mechanisms associated to the + failure localization/isolation phase. + +5.2. 
Technology-Independent and Technology-Dependent Mechanisms + + The present recovery mechanisms analysis applies to any circuit- + oriented data plane technology with discrete bandwidth increments + (like SONET/SDH, G.709 OTN, etc.) being controlled by a GMPLS-based + distributed control plane. + + The following sub-sections are not intended to favor one technology + versus another. They list the pros and cons for each technology in order + to determine the mechanisms that GMPLS-based recovery must deliver to + overcome their cons and make use of their pros in their respective + applicability context. + +5.2.1. OTN Recovery + + OTN recovery specifics are left for further consideration. + + + + +Papadimitriou & Mannie Informational [Page 12] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + +5.2.2. Pre-OTN Recovery + + Pre-OTN recovery specifics (also referred to as "lambda switching") + present mainly the following advantages: + + - They benefit from a simpler architecture, making it more suitable + for mesh-based recovery types and schemes (on a per-channel basis). + + - Failure suppression at intermediate node transponders, e.g., use of + squelching, implies that failures (such as LoL) will propagate to + edge nodes. Thus, edge nodes will have the possibility to initiate + recovery actions driven by upper layers (vs. use of non-standard + masking of upstream failures). + + The main disadvantage is the lack of interworking due to the large + amount of failure management (in particular failure notification + protocols) and recovery mechanisms currently available. + + Note also that for all-optical networks, the combination of recovery + with optical physical impairments is left for a future release of + this document because corresponding detection technologies are under + specification. + +5.2.3. SONET/SDH Recovery + + Some of the advantages of SONET [T1.105]/SDH [G.707], and more + generically any Time Division Multiplexing (TDM) transport plane + recovery, are that they provide: + + - Protection types operating at the data plane level that are + standardized (see [G.841]) and can operate across protected domains + and interwork (see [G.842]). + + - Failure detection, notification, and path/section Automatic + Protection Switching (APS) mechanisms. + + - Greater control over the granularity of the TDM LSPs/links that can + be recovered with respect to coarser optical channel (or whole + fiber content) recovery switching. + + Some of the limitations of the SONET/SDH recovery are: + + - Limited topological scope: Inherently, the use of ring topologies, + typically dedicated Sub-Network Connection Protection (SNCP) or + shared protection rings, has reduced flexibility and resource + efficiency with respect to the (somewhat more complex) meshed + recovery. + + + + +Papadimitriou & Mannie Informational [Page 13] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + - Inefficient use of spare capacity: SONET/SDH protection is largely + applied to ring topologies, where spare capacity often remains + idle, making the efficiency of bandwidth usage a real issue. + + - Support of meshed recovery requires intensive network management + development, and the functionality is limited by both the network + elements and the capabilities of the element management systems + (thus justifying the development of GMPLS-based distributed + recovery mechanisms). + +5.3. Specific Aspects of Control Plane-Based Recovery Mechanisms + +5.3.1. In-Band vs. 
Out-Of-Band Signaling + + The nodes communicate through the use of IP terminating control + channels defining the control plane (transport) topology. In this + context, two classes of transport mechanisms can be considered here: + in-fiber or out-of-fiber (through a dedicated physically diverse + control network referred to as the Data Communication Network or + DCN). The potential impact of the usage of an in-fiber (signaling) + transport mechanism is briefly considered here. + + In-fiber transport mechanisms can be further subdivided into in-band + and out-of-band. As such, the distinction between in-fiber in-band + and in-fiber out-of-band signaling reduces to the consideration of a + logically- versus physically-embedded control plane topology with + respect to the transport plane topology. In the scope of this + document, it is assumed that at least one IP control channel between + each pair of adjacent nodes is continuously available to enable the + exchange of recovery-related information and messages. Thus, in + either case (i.e., in-band or out-of-band) at least one logical or + physical control channel between each pair of nodes is always + expected to be available. + + Therefore, the key issue when using in-fiber signaling is whether one + can assume independence between the fault-tolerance capabilities of + control plane and the failures affecting the transport plane + (including the nodes). Note also that existing specifications like + the OTN provide a limited form of independence for in-fiber signaling + by dedicating a separate optical supervisory channel (OSC, see + [G.709] and [G.874]) to transport the overhead and other control + traffic. For OTNs, failure of the OSC does not result in failing the + optical channels. Similarly, loss of the control channel must not + result in failing the data channels (transport plane). + + + + + + + +Papadimitriou & Mannie Informational [Page 14] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + +5.3.2. Uni- vs. Bi-Directional Failures + + The failure detection, correlation, and notification mechanisms + (described in Section 4) can be triggered when either a uni- + directional or a bi-directional LSP/Span failure occurs (or a + combination of both). As illustrated in Figures 1 and 2, two + alternatives can be considered here: + + 1. Uni-directional failure detection: the failure is detected on the + receiver side, i.e., it is detected by only the downstream node to + the failure (or by the upstream node depending on the failure + propagation direction, respectively). + + 2. Bi-directional failure detection: the failure is detected on the + receiver side of both downstream node AND upstream node to the + failure. + + Notice that after the failure detection time, if only control-plane- + based failure management is provided, the peering node is unaware of + the failure detection status of its neighbor. 
+ + ------- ------- ------- ------- + | | | |Tx Rx| | | | + | NodeA |----...----| NodeB |xxxxxxxxx| NodeC |----...----| NodeD | + | |----...----| |---------| |----...----| | + ------- ------- ------- ------- + + t0 >>>>>>> F + + t1 x <---------------x + Notification + t2 <--------...--------x x--------...--------> + Up Notification Down Notification + + Figure 1: Uni-directional failure detection + + + + + + + + + + + + + + + + +Papadimitriou & Mannie Informational [Page 15] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + ------- ------- ------- ------- + | | | |Tx Rx| | | | + | NodeA |----...----| NodeB |xxxxxxxxx| NodeC |----...----| NodeD | + | |----...----| |xxxxxxxxx| |----...----| | + ------- ------- ------- ------- + + t0 F <<<<<<< >>>>>>> F + + t1 x <-------------> x + Notification + t2 <--------...--------x x--------...--------> + Up Notification Down Notification + + Figure 2: Bi-directional failure detection + + After failure detection, the following failure management operations + can be subsequently considered: + + - Each detecting entity sends a notification message to the + corresponding transmitting entity. For instance, in Figure 1, node + C sends a notification message to node B. In Figure 2, node C + sends a notification message to node B while node B sends a + notification message to node C. To ensure reliable failure + notification, a dedicated acknowledgement message can be returned + back to the sender node. + + - Next, within a certain (and pre-determined) time window, nodes + impacted by the failure occurrences may perform their correlation. + In case of uni-directional failure, node B only receives the + notification message from node C, and thus the time for this + operation is negligible. In case of bi-directional failure, node B + has to correlate the received notification message from node C with + the corresponding locally detected information (and node C has to + do the same with the message from node B). + + - After some (pre-determined) period of time, referred to as the + hold-off time, if the local recovery actions (see Section 5.3.4) + were not successful, the following occurs. In case of uni- + directional failure and depending on the directionality of the LSP, + node B should send an upstream notification message (see [RFC3473]) + to the ingress node A. Node C may send a downstream notification + message (see [RFC3473]) to the egress node D. However, in that + case, only node A would initiate an edge to edge recovery action. + Node A is referred to as the "master", and node D is referred to as + the "slave", per [RFC4427]. Note that the other LSP end-node (node + D in this case) may be optionally notified using a downstream + notification message (see [RFC3473]). + + + + +Papadimitriou & Mannie Informational [Page 16] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + In case of bi-directional failure, node B should send an upstream + notification message (see [RFC3473]) to the ingress node A. Node C + may send a downstream notification message (see [RFC3473]) to the + egress node D. However, due to the dependence on the LSP + directionality, only ingress node A would initiate an edge-to-edge + recovery action. Note that the other LSP end-node (node D in this + case) should also be notified of this event using a downstream + notification message (see [RFC3473]). For instance, if an LSP + directed from D to A is under failure condition, only the + notification message sent from node C to D would initiate a + recovery action. 
In this case, per [RFC4427], the deciding and + recovering node D is referred to as the "master", while node A is + referred to as the "slave" (i.e., recovering only entity). + + Note: The determination of the master and the slave may be based + either on configured information or dedicated protocol capability. + + In the above scenarios, the path followed by the upstream and + downstream notification messages does not have to be the same as the + one followed by the failed LSP (see [RFC3473] for more details on the + notification message exchange). The important point concerning this + mechanism is that either the detecting/reporting entity (i.e., nodes + B and C) is also the deciding/recovery entity or the + detecting/reporting entity is simply an intermediate node in the + subsequent recovery process. One refers to local recovery in the + former case, and to edge-to-edge recovery in the latter one (see also + Section 5.3.4). + +5.3.3. Partial vs. Full Span Recovery + + When a given span carries more than one LSP or LSP segment, an + additional aspect must be considered. In case of span failure, the + LSPs it carries can be recovered individually, as a group (aka bulk + LSP recovery), or as independent sub-groups. When correlation time + windows are used and simultaneous recovery of several LSPs can be + performed using a single request, the selection of this mechanism + would be triggered independently of the failure notification + granularity. Moreover, criteria for forming such sub-groups are + outside of the scope of this document. + + Additional complexity arises in the case of (sub-)group LSP recovery. + Between a given pair of nodes, the LSPs that a given (sub-)group + contains may have been created from different source nodes (i.e., + initiator) and directed toward different destination nodes. + Consequently the failure notification messages following a bi- + directional span failure that affects several LSPs (or the whole + group of LSPs it carries) are not necessarily directed toward the + same initiator nodes. In particular, these messages may be directed + + + +Papadimitriou & Mannie Informational [Page 17] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + to both the upstream and downstream nodes to the failure. Therefore, + such span failure may trigger recovery actions to be performed from + both sides (i.e., from both the upstream and the downstream nodes to + the failure). In order to facilitate the definition of the + corresponding recovery mechanisms (and their sequence), one assumes + here as well that, per [RFC4427], the deciding (and recovering) + entity (referred to as the "master") is the only initiator of the + recovery of the whole LSP (sub-)group. + +5.3.4. Difference between LSP, LSP Segment and Span Recovery + + The recovery definitions given in [RFC4427] are quite generic and + apply for link (or local span) and LSP recovery. The major + difference between LSP, LSP Segment and span recovery is related to + the number of intermediate nodes that the signaling messages have to + travel. Since nodes are not necessarily adjacent in the case of LSP + (or LSP Segment) recovery, signaling message exchanges from the + reporting to the deciding/recovery entity may have to cross several + intermediate nodes. In particular, this applies to the notification + messages due to the number of hops separating the location of a + failure occurrence from its destination. This results in an + additional propagation and forwarding delay. 
Note that the former + delay may in certain circumstances be non-negligible; e.g., in a + copper out-of-band network, the delay is approximately 1 ms per + 200km. + + Moreover, the recovery mechanisms applicable to end-to-end LSPs and + to the segments that may compose an end-to-end LSP (i.e., edge-to- + edge recovery) can be exactly the same. However, one expects in the + latter case, that the destination of the failure notification message + will be the ingress/egress of each of these segments. Therefore, + using the mechanisms described in Section 5.3.2, failure notification + messages can be exchanged first between terminating points of the LSP + segment, and after expiration of the hold-off time, between + terminating points of the end-to-end LSP. + + Note: Several studies provide quantitative analysis of the relative + performance of LSP/span recovery techniques. [WANG] for instance, + provides an analysis grid for these techniques showing that dynamic + LSP restoration (see Section 5.5.2) performs well under medium + network loads, but suffers performance degradations at higher loads + due to greater contention for recovery resources. LSP restoration + upon span failure, as defined in [WANG], degrades at higher loads + because paths around failed links tend to increase the hop count of + the affected LSPs and thus consume additional network resources. + Also, performance of LSP restoration can be enhanced by a failed + working LSP's source node that initiates a new recovery attempt if an + initial attempt fails. A single retry attempt is sufficient to + + + +Papadimitriou & Mannie Informational [Page 18] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + produce large increases in the restoration success rate and ability + to initiate successful LSP restoration attempts, especially at high + loads, while not adding significantly to the long-term average + recovery time. Allowing additional attempts produces only small + additional gains in performance. This suggests using additional + (intermediate) crankback signaling when using dynamic LSP restoration + (described in Section 5.5.2 - case 2). Details on crankback + signaling are outside the scope of this document. + +5.4. Difference between Recovery Type and Scheme + + [RFC4427] defines the basic LSP/span recovery types. This section + describes the recovery schemes that can be built using these recovery + types. In brief, a recovery scheme is defined as the combination of + several ingress-egress node pairs supporting a given recovery type + (from the set of the recovery types they allow). Several examples + are provided here to illustrate the difference between recovery types + such as 1:1 or M:N, and recovery schemes such as (1:1)^n or (M:N)^n + (referred to as shared-mesh recovery). + + 1. (1:1)^n with recovery resource sharing + + The exponent, n, indicates the number of times a 1:1 recovery type is + applied between at most n different ingress-egress node pairs. Here, + at most n pairs of disjoint working and recovery LSPs/spans share a + common resource at most n times. Since the working LSPs/spans are + mutually disjoint, simultaneous requests for use of the shared + (common) resource will only occur in case of simultaneous failures, + which are less likely to happen. 
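   Whether a given set of simultaneous working LSP failures can be
   handled is essentially a counting exercise over the shared recovery
   links.  The following Python sketch is a hypothetical helper (not a
   GMPLS procedure) that performs this check; it is applied here to the
   (1:1)^2 topology discussed in the example that follows.

      from collections import Counter

      def recoverable(recovery_paths, shared_capacity, failed_lsps):
          """Return True if every simultaneously failed working LSP
          can be recovered with the per-link shared recovery capacity."""
          demand = Counter()
          for lsp in failed_lsps:
              demand.update(recovery_paths[lsp])  # links this LSP needs
          return all(demand[link] <= shared_capacity.get(link, 0)
                     for link in demand)

      # Both recovery LSPs cross link D-E, where a single unit of
      # recovery capacity is reserved (shared between them).
      paths = {"A-B-C": ["A-D", "D-E", "E-C"],
               "F-G-H": ["F-D", "D-E", "E-H"]}
      capacity = {"A-D": 1, "D-E": 1, "E-C": 1, "F-D": 1, "E-H": 1}

      print(recoverable(paths, capacity, {"A-B-C"}))            # True
      print(recoverable(paths, capacity, {"A-B-C", "F-G-H"}))   # False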
+ + For instance, in the common (1:1)^2 case, if the 2 recovery LSPs in + the group overlap the same common resource, then it can handle only + single failures; any multiple working LSP failures will cause at + least one working LSP to be denied automatic recovery. Consider for + instance the following topology with the working LSPs A-B-C and F-G-H + and their respective recovery LSPs A-D-E-C and F-D-E-H that share a + common D-E link resource. + + A---------B---------C + \ / + \ / + D-------------E + / \ + / \ + F---------G---------H + + + + + +Papadimitriou & Mannie Informational [Page 19] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + 2. (M:N)^n with recovery resource sharing + + The (M:N)^n scheme is documented here for the sake of completeness + only (i.e., it is not mandated that GMPLS capabilities support this + scheme). The exponent, n, indicates the number of times an M:N + recovery type is applied between at most n different ingress-egress + node pairs. So the interpretation follows from the previous case, + except that here disjointness applies to the N working LSPs/spans and + to the M recovery LSPs/spans while sharing at most n times M common + resources. + + In both schemes, the result is a "group" of sum_{i=1..n} N_i working + LSPs (i.e., the working LSPs of the n ingress-egress node pairs taken + together, with N_i = 1 per pair in the (1:1)^n case) and a pool of + shared recovery resources, not all of which are + available to any given working LSP. In such conditions, defining a + metric that describes the amount of overlap among the recovery LSPs + would give some indication of the group's ability to handle + simultaneous failures of multiple LSPs. + + For instance, in the simple (1:1)^n case, if n recovery LSPs in a + (1:1)^n group overlap, then the group can handle only single + failures; any simultaneous failure of multiple working LSPs will + cause at least one working LSP to be denied automatic recovery. But + if one considers, for instance, a (2:2)^2 group in which there are + two pairs of overlapping recovery LSPs, then two LSPs (belonging to + the same pair) can be simultaneously recovered. The latter case can + be illustrated by the following topology with 2 pairs of working LSPs + A-B-C and F-G-H and their respective recovery LSPs A-D-E-C and + F-D-E-H that share two common D-E link resources. + + A========B========C + \\ // + \\ // + D =========== E + // \\ + // \\ + F========G========H + + Moreover, in all these schemes, (working) path disjointness can be + enforced by exchanging information related to working LSPs during the + recovery LSP signaling. Specific issues related to the combination + of shared (discrete) bandwidth and disjointness for recovery schemes + are described in Section 8.4.2. + + + + + + + + + +Papadimitriou & Mannie Informational [Page 20] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + +5.5. LSP Recovery Mechanisms + +5.5.1. Classification + + The recovery time and ratio of LSPs/spans depend on proper recovery + LSP provisioning (meaning pre-provisioning when performed before + failure occurrence) and the level of overbooking of recovery + resources (i.e., over-provisioning). A proper balance of these two + operations will result in the desired LSP/span recovery time and + ratio when single or multiple failures occur. Note also that these + operations are mostly performed during the network planning phases. + + The different options for LSP (pre-)provisioning and overbooking are + classified below to structure the analysis of the different recovery + mechanisms. + + 1. 
Pre-Provisioning + + Proper recovery LSP pre-provisioning will help to alleviate the + failure of the working LSPs (due to the failure of the resources that + carry these LSPs). As an example, one may compute and establish the + recovery LSP either end-to-end or segment-per-segment, to protect a + working LSP from multiple failure events affecting link(s), node(s) + and/or SRLG(s). The recovery LSP pre-provisioning options are + classified as follows in the figure below: + + (1) The recovery path can be either pre-computed or computed on- + demand. + + (2) When the recovery path is pre-computed, it can be either pre- + signaled (implying recovery resource reservation) or signaled + on-demand. + + (3) When the recovery resources are pre-signaled, they can be either + pre-selected or selected on-demand. + + Recovery LSP provisioning phases: + + + + + + + + + + + + + + +Papadimitriou & Mannie Informational [Page 21] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + (1) Path Computation --> On-demand + | + | + --> Pre-Computed + | + | + (2) Signaling --> On-demand + | + | + --> Pre-Signaled + | + | + (3) Resource Selection --> On-demand + | + | + --> Pre-Selected + + Note that these different options lead to different LSP/span recovery + times. The following sections will consider the above-mentioned + pre-provisioning options when analyzing the different recovery + mechanisms. + + 2. Overbooking + + There are many mechanisms available that allow the overbooking of the + recovery resources. This overbooking can be done per LSP (as in the + example mentioned above), per link (such as span protection), or even + per domain. In all these cases, the level of overbooking, as shown + in the below figure, can be classified as dedicated (such as 1+1 and + 1:1), shared (such as 1:N and M:N), or unprotected (and thus + restorable, if enough recovery resources are available). + + Overbooking levels: + + +----- Dedicated (for instance: 1+1, 1:1, etc.) + | + | + + +----- Shared (for instance: 1:N, M:N, etc.) + | + Level of | + Overbooking -----+----- Unprotected (for instance: 0:1, 0:N) + + Also, when using shared recovery, one may support preemptible extra- + traffic; the recovery mechanism is then expected to allow preemption + of this low priority traffic in case of recovery resource contention + during recovery operations. The following sections will consider the + + + + +Papadimitriou & Mannie Informational [Page 22] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + above-mentioned overbooking options when analyzing the different + recovery mechanisms. + +5.5.2. LSP Restoration + + The following times are defined to provide a quantitative estimation + about the time performance of the different LSP restoration + mechanisms (also referred to as LSP re-routing): + + - Path Computation Time: Tc + - Path Selection Time: Ts + - End-to-end LSP Resource Reservation Time: Tr (a delta for resource + selection is also considered, the corresponding total time is then + referred to as Trs) + - End-to-end LSP Resource Activation Time: Ta (a delta for + resource selection is also considered, the corresponding total + time is then referred to as Tas) + + The Path Selection Time (Ts) is considered when a pool of recovery + LSP paths between a given pair of source/destination end-points is + pre-computed, and after a failure occurrence one of these paths is + selected for the recovery of the LSP under failure condition. 
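   The restoration mechanisms compared below and in Section 5.5.3
   differ only in which of the above phases are performed before,
   rather than after, a failure.  The following Python sketch
   summarizes the resulting timing budgets; the numeric values are
   illustrative assumptions of this example, not measured figures.

      # Phase durations in milliseconds (assumed example values).
      Tc, Ts, Tr, Ta = 50.0, 5.0, 40.0, 30.0
      delta = 10.0                  # extra time for resource selection
      Trs, Tas = Tr + delta, Ta + delta

      mechanisms = {
          # name: (work done before failure, work done after failure)
          "re-provisioning (pre-computed route)":    (Tc, Ts + Trs),
          "full re-routing (on-demand route)":       (0.0, Tc + Ts + Trs),
          "pre-planned, resources not pre-selected": (Tc + Ts + Tr, Tas),
          "pre-planned, resources pre-selected":     (Tc + Ts + Trs, Ta),
      }

      for name, (pre, post) in mechanisms.items():
          print(f"{name:42s} pre-failure: {pre:5.1f} ms,"
                f" T after failure: {post:5.1f} ms")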
+ + Note: failure management operations such as failure detection, + correlation, and notification are considered (for a given failure + event) as equally time-consuming for all the mechanisms described + below: + + 1. With Route Pre-computation (or LSP re-provisioning) + + An end-to-end restoration LSP is established after the failure(s) + occur(s) based on a pre-computed path. As such, one can define this + as an "LSP re-provisioning" mechanism. Here, one or more (disjoint) + paths for the restoration LSP are computed (and optionally pre- + selected) before a failure occurs. + + No reservation or selection of resources is performed along the + restoration path before failure occurrence. As a result, there is no + guarantee that a restoration LSP is available when a failure occurs. + + The expected total restoration time T is thus equal to Ts + Trs or to + Trs when a dedicated computation is performed for each working LSP. + + 2. Without Route Pre-computation (or Full LSP re-routing) + + An end-to-end restoration LSP is dynamically established after the + failure(s) occur(s). After failure occurrence, one or more + (disjoint) paths for the restoration LSP are dynamically computed and + + + +Papadimitriou & Mannie Informational [Page 23] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + one is selected. As such, one can define this as a complete "LSP + re-routing" mechanism. + + No reservation or selection of resources is performed along the + restoration path before failure occurrence. As a result, there is no + guarantee that a restoration LSP is available when a failure occurs. + + The expected total restoration time T is thus equal to Tc (+ Ts) + + Trs. Therefore, time performance between these two approaches + differs by the time required for route computation Tc (and its + potential selection time, Ts). + +5.5.3. Pre-Planned LSP Restoration + + Pre-planned LSP restoration (also referred to as pre-planned LSP re- + routing) implies that the restoration LSP is pre-signaled. This in + turn implies the reservation of recovery resources along the + restoration path. Two cases can be defined based on whether the + recovery resources are pre-selected. + + 1. With resource reservation and without resource pre-selection + + Before failure occurrence, an end-to-end restoration path is pre- + selected from a set of pre-computed (disjoint) paths. The + restoration LSP is signaled along this pre-selected path to reserve + resources at each node, but these resources are not selected. + + In this case, the resources reserved for each restoration LSP may be + dedicated or shared between multiple restoration LSPs whose working + LSPs are not expected to fail simultaneously. Local node policies + can be applied to define the degree to which these resources can be + shared across independent failures. Also, since a restoration scheme + is considered, resource sharing should not be limited to restoration + LSPs that start and end at the same ingress and egress nodes. + Therefore, each node participating in this scheme is expected to + receive some feedback information on the sharing degree of the + recovery resource(s) that this scheme involves. + + Upon failure detection/notification message reception, signaling is + initiated along the restoration path to select the resources, and to + perform the appropriate operation at each node crossed by the + restoration LSP (e.g., cross-connections). 
If lower priority LSPs + were established using the restoration resources, they must be + preempted when the restoration LSP is activated. + + Thus, the expected total restoration time T is equal to Tas (post- + failure activation), while operations performed before failure + occurrence take Tc + Ts + Tr. + + + +Papadimitriou & Mannie Informational [Page 24] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + 2. With both resource reservation and resource pre-selection + + Before failure occurrence, an end-to-end restoration path is pre- + selected from a set of pre-computed (disjoint) paths. The + restoration LSP is signaled along this pre-selected path to reserve + AND select resources at each node, but these resources are not + committed at the data plane level. So that the selection of the + recovery resources is committed at the control plane level only, no + cross-connections are performed along the restoration path. + + In this case, the resources reserved and selected for each + restoration LSP may be dedicated or even shared between multiple + restoration LSPs whose associated working LSPs are not expected to + fail simultaneously. Local node policies can be applied to define + the degree to which these resources can be shared across independent + failures. Also, because a restoration scheme is considered, resource + sharing should not be limited to restoration LSPs that start and end + at the same ingress and egress nodes. Therefore, each node + participating in this scheme is expected to receive some feedback + information on the sharing degree of the recovery resource(s) that + this scheme involves. + + Upon failure detection/notification message reception, signaling is + initiated along the restoration path to activate the reserved and + selected resources, and to perform the appropriate operation at each + node crossed by the restoration LSP (e.g., cross-connections). If + lower priority LSPs were established using the restoration resources, + they must be preempted when the restoration LSP is activated. + + Thus, the expected total restoration time T is equal to Ta (post- + failure activation), while operations performed before failure + occurrence take Tc + Ts + Trs. Therefore, time performance between + these two approaches differs only by the time required for resource + selection during the activation of the recovery LSP (i.e., Tas - Ta). + +5.5.4. LSP Segment Restoration + + The above approaches can be applied on an edge-to-edge LSP basis + rather than end-to-end LSP basis (i.e., to reduce the global recovery + time) by allowing the recovery of the individual LSP segments + constituting the end-to-end LSP. + + Also, by using the horizontal hierarchy approach described in Section + 7.1, an end-to-end LSP can be recovered by multiple recovery + mechanisms applied on an LSP segment basis (e.g., 1:1 edge-to-edge + LSP protection in a metro network, and M:N edge-to-edge protection in + the core). These mechanisms are ideally independent and may even use + different failure localization and notification mechanisms. + + + +Papadimitriou & Mannie Informational [Page 25] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + +6. Reversion + + Reversion (a.k.a. normalization) is defined as the mechanism allowing + switching of normal traffic from the recovery LSP/span to the working + LSP/span previously under failure condition. Use of normalization is + at the discretion of the recovery domain policy. 
+   Normalization may impact the normal traffic (a second hit)
+   depending on the normalization mechanism used.
+
+   If normalization is supported, then 1) normal traffic must be
+   returned to the working LSP/span when the failure condition clears,
+   and 2) the capability to de-activate (turn off) the use of
+   reversion should be provided.  De-activation of reversion should
+   not impact the normal traffic, regardless of whether it is
+   currently using the working or the recovery LSP/span.
+
+   Note: during the failure, the reuse of any non-failed resources
+   (e.g., LSPs and/or spans) belonging to the working LSP/span is at
+   the discretion of the recovery domain policy.
+
+6.1. Wait-To-Restore (WTR)
+
+   A specific mechanism (Wait-To-Restore) is used to prevent frequent
+   recovery switching operations due to an intermittent defect (e.g.,
+   a Bit Error Rate (BER) fluctuating around the SD threshold).
+
+   First, an LSP/span under failure condition must become fault-free,
+   e.g., have a BER less than a certain recovery threshold.  After the
+   recovered LSP/span (i.e., the previously working LSP/span) meets
+   this criterion, a fixed period of time shall elapse before normal
+   traffic uses the corresponding resources again.  This duration,
+   called the Wait-To-Restore (WTR) period or timer, is generally on
+   the order of a few minutes (for instance, 5 minutes) and should be
+   capable of being set.  The WTR timer may either be a fixed period
+   or provide for incrementally longer periods before retrying.  An SF
+   or SD condition on the previously working LSP/span overrides the
+   WTR timer value (i.e., the WTR is cancelled and the WTR timer
+   restarts).
+
+6.2. Revertive Mode Operation
+
+   In revertive mode of operation, when the recovery LSP/span is no
+   longer required, i.e., the failed working LSP/span is no longer in
+   SD or SF condition, a local Wait-to-Restore (WTR) state is
+   activated before switching the normal traffic back to the recovered
+   working LSP/span.
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 26]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+   During the reversion operation, since this state becomes the
+   highest in priority, signaling must maintain the normal traffic on
+   the recovery LSP/span (switched from the previously failed working
+   LSP/span).  Moreover, during this WTR state, any null traffic or
+   extra traffic (if applicable) request is rejected.
+
+   However, deactivation (cancellation) of the wait-to-restore timer
+   may occur if there are higher priority request attempts.  That is,
+   the recovery LSP/span usage by the normal traffic may be preempted
+   if a higher priority request for this recovery LSP/span is
+   attempted.
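+
+   For illustration purposes only, the following non-normative Python
+   sketch captures the WTR behavior described in Section 6.1 and the
+   revertive mode described above.  State and method names are
+   hypothetical, and timer expiry is modeled as an explicit event
+   rather than a real clock.
+
+      import enum
+
+      class State(enum.Enum):
+          FAILED = 1  # SF/SD active; normal traffic on recovery LSP
+          WTR    = 2  # fault cleared; WTR timer running
+          NORMAL = 3  # normal traffic back on the working LSP/span
+
+      class Reversion:
+          def __init__(self, wtr_period_s=300.0):  # e.g., 5 minutes
+              self.wtr_period_s = wtr_period_s
+              self.state = State.FAILED
+
+          def fault_cleared(self):
+              # The working LSP/span meets the recovery threshold:
+              # start the WTR timer.
+              self.state = State.WTR
+
+          def sf_or_sd_raised(self):
+              # SF/SD overrides WTR: cancel the timer; it restarts
+              # after the next fault_cleared().
+              self.state = State.FAILED
+
+          def wtr_expired(self):
+              # Switch normal traffic back to the recovered working
+              # LSP/span (this switch may itself be a second hit).
+              if self.state is State.WTR:
+                  self.state = State.NORMAL
+
+      r = Reversion()
+      r.fault_cleared()    # enter WTR
+      r.sf_or_sd_raised()  # intermittent defect cancels the timer
+      r.fault_cleared()    # WTR restarts
+      r.wtr_expired()
+      assert r.state is State.NORMAL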
+6.3. Orphans
+
+   When a reversion operation is requested, normal traffic must be
+   switched from the recovery LSP/span to the recovered working
+   LSP/span.  A particular situation occurs when the previously
+   working LSP/span cannot be recovered, so that normal traffic cannot
+   be switched back.  In that case, the LSP/span under failure
+   condition (also referred to as an "orphan") must be cleared (i.e.,
+   removed) from the pool of resources allocated for normal traffic.
+   Otherwise, a potential de-synchronization between the control and
+   transport plane resource usage can appear.  Depending on the
+   signaling protocol capabilities and behavior, different mechanisms
+   are expected here.
+
+   Therefore, any resources reserved or allocated for the LSP/span
+   under failure condition must be unreserved/de-allocated.  Several
+   ways can be used for that purpose: wait for the clear-out time
+   interval to elapse, initiate a deletion from the ingress or the
+   egress node, or trigger the initiation of deletion from an entity
+   (such as an EMS or NMS) capable of reacting upon reception of an
+   appropriate notification message.
+
+7. Hierarchies
+
+   Recovery mechanisms are being made available at multiple (if not
+   all) transport layers within so-called "IP/MPLS-over-optical"
+   networks.  However, each layer has certain recovery features, and
+   one needs to determine the exact impact of the interaction between
+   the recovery mechanisms provided by these layers.
+
+   Hierarchies are used to build scalable complex systems.  By hiding
+   internal details, abstraction is used as a mechanism to build large
+   networks or as a technique for enforcing technology, topological,
+   or administrative boundaries.  The same hierarchical concept can be
+   applied to control network survivability.  Network survivability is
+   the set of capabilities that allow a network to restore affected
+   traffic in the event of a failure.  Network survivability is
+   defined further in [RFC4427].  In general, it is expected that the
+   recovery action is taken by the recoverable LSP/span closest to the
+   failure in order to avoid the multiplication of recovery actions.
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 27]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+   Moreover, recovery hierarchies can also be bound to control plane
+   logical partitions (e.g., administrative or topological
+   boundaries).  Each logical partition may apply different recovery
+   mechanisms.
+
+   In brief, it is commonly accepted that the lower layers can provide
+   coarse but faster recovery, while the higher layers can provide
+   finer but slower recovery.  Moreover, it is also desirable to avoid
+   similar layers with functional overlaps in order to optimize
+   network resource utilization and processing overhead, since
+   repeating the same capabilities at each layer does not create any
+   added value for the network as a whole.  In addition, enabling a
+   lower layer recovery mechanism does not prevent the additional
+   provision of a recovery mechanism at the upper layer.  The inverse
+   statement does not necessarily hold; that is, enabling an upper
+   layer recovery mechanism may prevent the use of a lower layer
+   recovery mechanism.  In this context, this section analyzes these
+   hierarchical aspects, including the physical (passive) layer(s).
+
+7.1. Horizontal Hierarchy (Partitioning)
+
+   A horizontal hierarchy is defined when partitioning a single-layer
+   network (and its control plane) into several recovery domains.
+   Within a domain, the recovery scope may extend over a link (or
+   span), an LSP segment, or even an end-to-end LSP.  Moreover, an
+   administrative domain may consist of a single recovery domain or
+   can be partitioned into several smaller recovery domains.  The
+   operator can partition the network into recovery domains based on
+   physical network topology, control plane capabilities, or various
+   traffic engineering constraints.
+
+   An example often addressed in the literature is the metro-core-
+   metro application (sometimes extended to metro-metro/core-core)
+   within a single transport layer (see Section 7.2).  For such a
+   case, an end-to-end LSP is defined between the ingress and egress
+   metro nodes, while LSP segments may be defined within the metro or
+   core sub-networks.
+   Each of these topological structures determines a so-called
+   "recovery domain", since each of the LSPs it carries can have its
+   own recovery type (or even scheme).  The support of multiple
+   recovery types and schemes within a sub-network is referred to as a
+   "multi-recovery capable domain" or simply "multi-recovery domain".
+
+7.2. Vertical Hierarchy (Layers)
+
+   It is very challenging to combine the different recovery
+   capabilities available across the path (i.e., switching capable)
+   and section layers to ensure that certain network survivability
+   objectives are met for the network-supported services.
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 28]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+   As a first analysis step, one can draw the following guidelines for
+   a vertical coordination of the recovery mechanisms:
+
+   - The lower the layer, the faster the notification and switching.
+
+   - The higher the layer, the finer the granularity of the
+     recoverable entity and therefore the granularity of the recovery
+     resource.
+
+   Moreover, in the context of this analysis, a vertical hierarchy
+   consists of multiple layered transport planes providing different:
+
+   - Discrete bandwidth granularities for non-packet LSPs such as OCh,
+     ODUk, STS_SPE/HOVC, and VT_SPE/LOVC LSPs, and continuous
+     bandwidth granularities for packet LSPs.
+
+   - Potential recovery capabilities with different temporal
+     granularities, ranging from milliseconds to tens of seconds.
+
+   Note: based on the bandwidth granularity, we can determine four
+   classes of vertical hierarchies: (1) packet over packet, (2) packet
+   over circuit, (3) circuit over packet, and (4) circuit over
+   circuit.  Below, we briefly expand on (4) only.  (2) is covered in
+   [RFC3386].  (1) is extensively covered by the MPLS Working Group,
+   and (3) by the PWE3 Working Group.
+
+   In SONET/SDH environments, one typically considers the VT_SPE/LOVC
+   and STS_SPE/HOVC as independent layers (for example, a VT_SPE/LOVC
+   LSP uses the underlying STS_SPE/HOVC LSPs as links).  In OTN, the
+   ODUk path layers lie on the OCh path layer, i.e., the ODUk LSPs use
+   the underlying OCh LSPs as OTUk links.  Note here that lower layer
+   LSPs may simply be provisioned and not necessarily dynamically
+   triggered or established (control driven approach).  In this
+   context, an LSP at the path layer (i.e., established using GMPLS
+   signaling), such as an optical channel LSP, appears at the OTUk
+   layer as a link, controlled by a link management protocol such as
+   LMP.
+
+   The first key issue with multi-layer recovery is whether achieving
+   individual or bulk LSP recovery can be as efficient as the
+   underlying link (local span) recovery.  In such a case, the span
+   can be either protected or unprotected, but the LSP it carries must
+   be (at least locally) recoverable.  Therefore, the span recovery
+   process can either be independent when protected (or restorable),
+   or be triggered by the upper LSP recovery process.  The former case
+   requires coordination to achieve subsequent LSP recovery.
+   Therefore, in order to achieve robustness and fast convergence,
+   multi-layer recovery requires a fine-tuned coordination mechanism.
+
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 29]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+   Moreover, in the absence of adequate recovery mechanism
+   coordination (for instance, a pre-determined coordination when
+   using a hold-off timer), a failure notification may propagate from
+   one layer to the next one within a recovery hierarchy.  This can
+   cause "collisions" and trigger simultaneous recovery actions that
+   may lead to race conditions and, in turn, reduce the optimization
+   of the resource utilization and/or generate global instabilities in
+   the network (see [MANCHESTER]).  Therefore, a consistent and
+   efficient escalation strategy is needed to coordinate recovery
+   across several layers.
+
+   One can expect the definition of the recovery mechanisms and
+   protocol(s) to be technology-independent, so that they can be
+   consistently implemented at different layers; this would in turn
+   simplify their global coordination.  Moreover, as mentioned in
+   [RFC3386], some looser form of coordination and communication
+   between (vertical) layers, such as a consistent hold-off timer
+   configuration (set up through signaling during the working LSP
+   establishment), can be considered, thereby allowing the
+   synchronization between recovery actions performed across these
+   layers.
+
+7.2.1. Recovery Granularity
+
+   In most environments, the design of the network and the vertical
+   distribution of the LSP bandwidth are such that the recovery
+   granularity is finer at higher layers.  The OTN and SONET/SDH
+   layers can recover only the whole section or the individual
+   connections they transport, whereas the IP/MPLS control plane can
+   recover individual packet LSPs or groups of packet LSPs
+   independently of their granularity.  On the other hand, recovery
+   granularity at the sub-wavelength level (i.e., SONET/SDH) can be
+   provided only when the network includes devices switching at the
+   same granularity (and thus not at the optical channel level).
+   Therefore, the network layer can deliver control-plane-driven
+   recovery mechanisms on a per-LSP basis if and only if these LSPs
+   have their corresponding switching granularity supported at the
+   transport plane level.
+
+7.3. Escalation Strategies
+
+   There are two types of escalation strategies (see [DEMEESTER]):
+   bottom-up and top-down.
+
+   The bottom-up approach assumes that lower layer recovery types and
+   schemes are more expedient and faster than upper layer ones.
+   Therefore, we can inhibit or hold off higher layer recovery.
+   However, this assumption is not entirely true.  Consider, for
+   instance, a SONET/SDH-based protection mechanism (with a protection
+   switching time of less than 50 ms) lying on top of an OTN
+   restoration mechanism (with a restoration time of less than
+   200 ms).  In this case, the assumption should be (at least)
+   clarified as follows: the lower layer recovery mechanism is
+   expected to be faster than the upper layer one, if the same type of
+   recovery mechanism is used at each layer.
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 30]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+   Consequently, taking into account the recovery actions at the
+   different layers in a bottom-up approach: if lower layer recovery
+   mechanisms are provided and sequentially activated in conjunction
+   with higher layer ones, the lower layers must have an opportunity
+   to recover normal traffic before the higher layers do.
+   However, if lower layer recovery is slower than higher layer
+   recovery, the lower layer must either communicate the failure-
+   related information to the higher layer(s) (and allow them to
+   perform recovery), or use a hold-off timer in order to temporarily
+   set the higher layer recovery action in a "standby mode".  Note
+   that the a priori information exchange between layers concerning
+   their efficiency is not within the current scope of this document.
+   Nevertheless, the coordination functionality between layers must be
+   configurable and tunable.
+
+   For example, coordination between the optical and packet layer
+   control planes enables the optical layer to perform the failure
+   management operations (in particular, failure detection and
+   notification) while giving the packet layer control plane the
+   authority to decide and perform the recovery actions.  If the
+   packet layer recovery action is unsuccessful, fallback at the
+   optical layer can be performed subsequently.
+
+   The top-down approach attempts service recovery at the higher
+   layers before invoking lower layer recovery.  Higher layer recovery
+   is service selective and permits "per-CoS" or "per-connection"
+   re-routing.  With this approach, the most important aspect is that
+   the upper layer must provide its own failure detection mechanism,
+   reliable and independent of the lower layer.
+
+   [DEMEESTER] also suggests recovery mechanisms incorporating a
+   coordinated effort shared by two adjacent layers with periodic
+   status updates.  Moreover, some of these recovery operations can be
+   pre-assigned (on a per-link basis) to a certain layer, e.g., a
+   given link will be recovered at the packet layer while another will
+   be recovered at the optical layer.
+
+7.4. Disjointness
+
+   Having link- and node-diverse working and recovery LSPs/spans does
+   not guarantee their complete disjointness.  Due to the common
+   (passive) physical layer topology, additional hierarchical
+   concepts, such as the Shared Risk Link Group (SRLG), and
+   mechanisms, such as SRLG-diverse path computation, must be
+   developed to provide complete working and recovery LSP/span
+   disjointness (see [IPO-IMP] and [RFC4202]).  Otherwise, a failure
+   affecting the working LSP/span could also affect the recovery
+   LSP/span; one refers to such an event as a "common failure".
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 31]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+7.4.1. SRLG Disjointness
+
+   A Shared Risk Link Group (SRLG) is defined as a set of links
+   sharing a common risk, i.e., a common physical resource such as a
+   fiber link or a fiber cable.  For instance, a set of links L
+   belongs to the same SRLG s if they are provisioned over the same
+   fiber link f.
+
+   The SRLG properties can be summarized as follows:
+
+   1) A link belongs to more than one SRLG if and only if it crosses
+      one of the resources covered by each of them.
+
+   2) Two links belonging to the same SRLG can belong individually to
+      (one or more) other SRLGs.
+
+   3) The SRLG set S of an LSP is defined as the union of the SRLGs of
+      the individual links composing this LSP.
+
+   SRLG disjointness is also applicable to LSPs:
+
+      The LSP SRLG disjointness concept is based on the following
+      postulate: an LSP (i.e., a sequence of links and nodes) covers
+      an SRLG if and only if it crosses one of the links or nodes
+      belonging to that SRLG.
+
+      Therefore, SRLG disjointness for LSPs can be defined as follows:
+      two LSPs are disjoint with respect to an SRLG s if and only if
+      they do not both cover this SRLG s.
+
+      SRLG disjointness for LSPs with respect to a set S of SRLGs is,
+      in turn, defined as follows: two LSPs are disjoint with respect
+      to a set of SRLGs S if and only if the set of SRLGs that are
+      common to both LSPs is disjoint from the set S.
+
+   The impact on recovery is noticeable: SRLG disjointness is a
+   necessary (but not a sufficient) condition to ensure network
+   survivability.  With respect to the physical network resources, a
+   working-recovery LSP/span pair must be SRLG-disjoint in the case of
+   a dedicated recovery type.  On the other hand, in the case of
+   shared recovery, a group of working LSPs/spans must be mutually
+   SRLG-disjoint in order to allow for a (single and common) shared
+   recovery LSP that is itself SRLG-disjoint from each of the working
+   LSPs/spans.
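+
+   For illustration purposes only, the following non-normative Python
+   sketch expresses the above definitions as set operations.  The link
+   names and SRLG identifiers are hypothetical, and link_srlgs stands
+   for a local view of the TE database.
+
+      def srlg_set(lsp_links, link_srlgs):
+          # SRLG set of an LSP: the union of the SRLGs of its links.
+          s = set()
+          for link in lsp_links:
+              s |= link_srlgs.get(link, set())
+          return s
+
+      def srlg_disjoint(lsp_a, lsp_b, srlgs_S, link_srlgs):
+          # Two LSPs are disjoint w.r.t. a set S of SRLGs iff the
+          # SRLGs common to both LSPs do not intersect S.
+          common = (srlg_set(lsp_a, link_srlgs)
+                    & srlg_set(lsp_b, link_srlgs))
+          return common.isdisjoint(srlgs_S)
+
+      link_srlgs = {"A-C": {17}, "C-D": {17, 21},
+                    "A-E": {33}, "E-D": {45}}
+      working  = ["A-C", "C-D"]   # covers SRLGs {17, 21}
+      recovery = ["A-E", "E-D"]   # covers SRLGs {33, 45}
+      other    = ["A-C", "E-D"]   # covers SRLGs {17, 45}
+      print(srlg_disjoint(working, recovery, {17}, link_srlgs))  # True
+      print(srlg_disjoint(working, other, {17}, link_srlgs))     # False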
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 32]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+8. Recovery Mechanisms Analysis
+
+   In order to provide a structured analysis of the recovery
+   mechanisms detailed in the previous sections, the following
+   dimensions can be considered:
+
+   1. Fast convergence (performance): provide a mechanism that
+      aggregates multiple failures (implying fast failure detection
+      and correlation mechanisms) and a fast recovery decision
+      independently of the number of failures occurring in the optical
+      network (also implying fast failure notification).
+
+   2. Efficiency (scalability): minimize the switching time required
+      for LSP/span recovery independently of the number of LSPs/spans
+      being recovered (this implies efficient failure correlation,
+      fast failure notification, and time-efficient recovery
+      mechanisms).
+
+   3. Robustness (availability): minimize the LSP/span downtime
+      independently of the underlying topology of the transport plane
+      (this implies a highly responsive recovery mechanism).
+
+   4. Resource optimization (optimality): minimize the resource
+      capacity, including LSPs/spans and nodes (switching capacity),
+      required for recovery purposes; this dimension can also be
+      referred to as optimizing the sharing degree of the recovery
+      resources.
+
+   5. Cost optimization: provide a cost-effective recovery
+      type/scheme.
+
+   However, these dimensions are either outside the scope of this
+   document (such as cost optimization and recovery path computational
+   aspects) or mutually conflicting.  For instance, it is obvious that
+   providing 1+1 LSP protection minimizes the LSP downtime (in case of
+   failure) while being non-scalable and consuming recovery resources
+   without enabling any extra traffic.
+
+   The following sections analyze the recovery phases and mechanisms
+   detailed in the previous sections with respect to the dimensions
+   described above in order to assess the GMPLS protocol suite
+   capabilities and applicability.  In turn, this allows the
+   evaluation of the potential need for further GMPLS signaling and
+   routing extensions.
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 33]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+8.1. Fast Convergence (Detection/Correlation and Hold-off Time)
+
+   Fast convergence is related to the failure management operations.
+   It refers to the time elapsed between the failure detection/
+   correlation and the expiry of the hold-off time, the point at which
+   the recovery switching actions are initiated.  This point has been
+   detailed in Section 4.
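+
+   For illustration purposes only, the following non-normative Python
+   sketch shows one possible (assumed) realization of such a hold-off
+   behavior: the higher layer arms a timer upon failure detection and
+   initiates its recovery switching action only if the lower layer has
+   not recovered in the meantime.  Class and method names are
+   hypothetical.
+
+      import threading
+
+      class HoldOff:
+          """Delays the higher layer recovery action so that the lower
+          layer has an opportunity to recover first."""
+
+          def __init__(self, hold_off_s, recovery_action):
+              self.hold_off_s = hold_off_s
+              self.recovery_action = recovery_action
+              self._timer = None
+
+          def failure_detected(self):
+              # Arm the hold-off timer instead of switching at once.
+              self._timer = threading.Timer(self.hold_off_s,
+                                            self.recovery_action)
+              self._timer.start()
+
+          def failure_cleared(self):
+              # The lower layer recovered in time: cancel the pending
+              # higher layer switching action.
+              if self._timer is not None:
+                  self._timer.cancel()
+
+      h = HoldOff(0.05, lambda: print("switch to recovery LSP"))
+      h.failure_detected()
+      h.failure_cleared()  # remove this line to see the action fire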
+
+8.2. Efficiency (Recovery Switching Time)
+
+   In general, the more pre-assignment/pre-planning of the recovery
+   LSP/span, the more rapid the recovery is.  Because protection
+   implies pre-assignment (and cross-connection) of the protection
+   resources, protection in general recovers faster than restoration.
+
+   Span restoration is likely to be slower than most span protection
+   types; however, this greatly depends on the efficiency of the span
+   restoration signaling.  LSP restoration with pre-signaled and pre-
+   selected recovery resources is likely to be faster than fully
+   dynamic LSP restoration, especially because of the elimination of
+   any potential crankback during the recovery LSP establishment.
+
+   If one excludes the crankback issue, the difference between dynamic
+   and pre-planned restoration depends on the restoration path
+   computation and selection time.  Since computational considerations
+   are outside the scope of this document, it is up to the vendor to
+   determine the average and maximum path computation times in
+   different scenarios, and up to the operator to decide whether or
+   not dynamic restoration is advantageous over pre-planned schemes,
+   depending on the network environment.  This difference also depends
+   on the flexibility provided by pre-planned restoration versus
+   dynamic restoration: pre-planned restoration implies a somewhat
+   limited number of failure scenarios (due, for instance, to local
+   storage capacity limitations), whereas dynamic restoration enables
+   on-demand path computation based on the information received
+   through the failure notification message and, as such, is more
+   robust with respect to the failure scenario scope.
+
+   Moreover, LSP segment restoration, in particular dynamic
+   restoration (i.e., no path pre-computation, so none of the recovery
+   resources are pre-reserved), will generally be faster than end-to-
+   end LSP restoration.  However, local LSP restoration assumes that
+   each LSP segment end-point has enough computational capacity to
+   perform this operation, while end-to-end LSP restoration requires
+   only that the LSP end-points provide this path computation
+   capability.
+
+   Recovery time objectives for SONET/SDH protection switching (not
+   including the time to detect the failure) are specified in [G.841]
+   at 50 ms, taking into account constraints on distance, number of
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 34]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+   connections involved, and, in the case of ring enhanced protection,
+   the number of nodes in the ring.  Recovery time objectives for
+   restoration mechanisms have been proposed through a separate effort
+   [RFC3386].
+
+8.3. Robustness
+
+   In general, the less pre-assignment (protection)/pre-planning
+   (restoration) of the recovery LSP/span, the more robust the
+   recovery type or scheme is to a variety of single failures,
+   provided that adequate resources are available.  Moreover, the pre-
+   selection of the recovery resources gives (in the case of multiple
+   failure scenarios) less flexibility than no recovery resource pre-
+   selection.  For instance, if failures occur that affect two LSPs
+   sharing a common link along their restoration paths, then only one
+   of these LSPs can be recovered, unless the restoration path of at
+   least one of these LSPs is re-computed or the local resource
+   assignment is modified on the fly.
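+
+   For illustration purposes only, the following non-normative Python
+   fragment makes this contention concrete under an assumed model in
+   which a single unit of spare capacity on link "E-F" is shared by
+   the pre-computed restoration paths of two LSPs; after a failure
+   affecting both working LSPs, only the first restoration attempt
+   succeeds.  All names and values are hypothetical.
+
+      # One unit of shared spare capacity on the common link.
+      spare = {"E-F": 1}
+
+      restoration_path = {"LSP-1": ["A-E", "E-F", "F-D"],
+                          "LSP-2": ["B-E", "E-F", "F-D"]}
+
+      def restore(lsp):
+          # Succeeds only if every link of the restoration path still
+          # has spare capacity; links absent from 'spare' are assumed
+          # to be unconstrained.
+          links = restoration_path[lsp]
+          if all(spare.get(link, 1) > 0 for link in links):
+              for link in links:
+                  if link in spare:
+                      spare[link] -= 1
+              return True
+          return False
+
+      for lsp in ("LSP-1", "LSP-2"):  # both working LSPs have failed
+          print(lsp, "recovered" if restore(lsp)
+                     else "blocked on the shared link")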
+
+   In addition, recovery types and schemes with pre-planned recovery
+   resources (in particular, LSPs/spans for protection and LSPs for
+   restoration purposes) will not be able to recover from failures
+   that simultaneously affect both the working and the recovery
+   LSP/span.  Thus, the recovery resources should ideally be as
+   disjoint as possible (with respect to link, node, and SRLG) from
+   the working ones, so that no single failure event affects both the
+   working and the recovery LSP/span.  In brief, working and recovery
+   resources must be fully diverse in order to guarantee that a given
+   failure will not affect the working and the recovery LSP/span
+   simultaneously.  Also, the risk of simultaneous failure of the
+   working and the recovery LSPs can be reduced, either by computing a
+   new recovery path whenever a failure occurs along one of the
+   recovery LSPs, or by computing a new recovery path and provisioning
+   the corresponding LSP whenever a failure occurs along a working
+   LSP/span.  Both methods enable the network to keep the number of
+   available recovery paths constant.
+
+   The robustness of a recovery scheme is also determined by the
+   amount of pre-reserved (i.e., signaled) recovery resources within a
+   given shared resource pool: as the sharing degree of recovery
+   resources increases, the recovery scheme becomes less robust to
+   multiple LSP/span failure occurrences.  Recovery schemes, in
+   particular restoration, with pre-signaled resource reservation
+   (with or without pre-selection) should be capable of reserving an
+   adequate amount of resources to ensure recovery from any specific
+   set of failure events, such as any single SRLG failure, any two
+   SRLG failures, etc.
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 35]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+8.4. Resource Optimization
+
+   It is commonly accepted that sharing recovery resources provides
+   network resource optimization.  Therefore, from a resource
+   utilization perspective, protection schemes are often classified
+   with respect to their degree of sharing recovery resources with the
+   working entities.  Moreover, non-permanent bridging protection
+   types allow (under normal conditions) for extra traffic over the
+   recovery resources.
+
+   From this perspective, the following statements are true:
+
+   1) 1+1 LSP/span protection is the most resource-consuming
+      protection type because it does not allow for any extra traffic.
+
+   2) 1:1 LSP/span recovery requires a dedicated recovery LSP/span,
+      but allows for extra traffic.
+
+   3) 1:N and M:N LSP/span recovery require one (and M, respectively)
+      recovery LSP/span(s), shared among the N working LSPs/spans and
+      allowing for extra traffic.
+
+   Obviously, 1+1 protection precludes, and 1:1 recovery does not
+   allow for, any recovery LSP/span sharing, whereas 1:N and M:N
+   recovery do allow the sharing of one (M, respectively) recovery
+   LSP/span(s) among N working LSPs/spans.  However, despite the fact
+   that 1:1 LSP recovery precludes the sharing of the recovery LSP,
+   the recovery schemes that can be built from it (e.g., (1:1)^n, see
+   Section 5.4) do allow sharing of their recovery resources.  In
+   addition, the flexibility in the usage of shared recovery resources
+   (in particular, shared links) may be limited by network topology
+   restrictions, e.g., the fixed ring topology for traditional
+   enhanced protection schemes.
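+
+   To make the impact of sharing concrete, the following non-normative
+   Python sketch (an assumed shared-mesh model with hypothetical
+   values) sizes the recovery capacity that must be reserved on a
+   shared link so that any single SRLG failure remains recoverable:
+   the reservation must cover the worst-case set of working LSPs that
+   one SRLG failure would re-route over that link, which is typically
+   far less than the sum of all working bandwidths.
+
+      from collections import defaultdict
+
+      # Working LSPs assumed to share the same restoration link:
+      # name -> (bandwidth, SRLGs crossed by the working path).
+      working = {"L1": (10, {1}), "L2": (10, {1}), "L3": (40, {2})}
+
+      per_srlg = defaultdict(int)
+      for bw, srlgs in working.values():
+          for srlg in srlgs:
+              per_srlg[srlg] += bw
+
+      # The worst-case single-SRLG failure sizes the reservation.
+      required = max(per_srlg.values())
+      print(required)  # 40 units, not the 60 dedicated recovery needs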
+
+   On the other hand, when using LSP restoration with pre-signaled
+   resource reservation, the amount of reserved restoration capacity
+   is determined by the local bandwidth reservation policies.  In LSP
+   restoration schemes with re-provisioning, a pool of spare resources
+   can be defined, from which all resources are selected after failure
+   occurrence for the purpose of restoration path computation.  The
+   degree to which restoration schemes allow sharing amongst multiple
+   independent failures is then directly inferred from the size of the
+   resource pool.  Moreover, in all restoration schemes, spare
+   resources can be used to carry preemptible traffic (thus over
+   preemptible LSPs/spans) when the corresponding resources have not
+   been committed for LSP/span recovery purposes.
+
+   From this, it clearly follows that fewer recovery resources (i.e.,
+   LSPs/spans and switching capacity) have to be allocated to a shared
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 36]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+   recovery resource pool if a greater sharing degree is allowed.
+   Thus, the network survivability level is determined by the policy
+   that defines the amount of shared recovery resources and by the
+   maximum sharing degree allowed for these recovery resources.
+
+8.4.1. Recovery Resource Sharing
+
+   When recovery resources are shared over several LSPs/spans, the use
+   of the Maximum Reservable Bandwidth, the Unreserved Bandwidth, and
+   the Maximum LSP Bandwidth (see [RFC4202]) provides the information
+   needed to optimize the network resources allocated for shared
+   recovery purposes.
+
+   The Maximum Reservable Bandwidth is defined as the Maximum Link
+   Bandwidth, but it may be greater in case of link over-subscription.
+
+   The Unreserved Bandwidth (at priority p) is defined as the
+   bandwidth not yet reserved on a given TE link (its initial value
+   for each priority p corresponds to the Maximum Reservable
+   Bandwidth).  Last, the Maximum LSP Bandwidth (at priority p) is
+   defined as the smaller of the Unreserved Bandwidth (at priority p)
+   and the Maximum Link Bandwidth.
+
+   Here, one generally considers a recovery resource sharing degree
+   (or ratio) to globally optimize the shared recovery resource usage.
+   The distribution of the bandwidth utilization per TE link can be
+   inferred from the per-priority bandwidth pre-allocation.  By using
+   the Maximum LSP Bandwidth and the Maximum Reservable Bandwidth, the
+   amount of (over-provisioned) resources that can be used for shared
+   recovery purposes is known from the IGP.
+
+   In order to analyze this behavior, we define the difference between
+   the Maximum Reservable Bandwidth (in the present case, this value
+   is greater than the Maximum Link Bandwidth) and the Maximum LSP
+   Bandwidth per TE link i as the Maximum Shareable Bandwidth, or
+   max_R[i].  Within this quantity, the amount of bandwidth currently
+   allocated for shared recovery per TE link i is defined as R[i].
+   Both quantities are expressed in terms of discrete bandwidth units
+   (thus, the Minimum LSP Bandwidth is one bandwidth unit).
+
+   The knowledge of this information, available per TE link, can be
+   exploited in order to optimize the usage of the resources allocated
+   per TE link for shared recovery.
+   If one refers to r[i] as the actual bandwidth per TE link i (in
+   terms of discrete bandwidth units) committed for shared recovery,
+   then the following quantity must be maximized over the potential TE
+   link candidates:
+
+      sum {i=1..N} [(R[i] - r[i])/(t[i] - b[i])]
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 37]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+   or equivalently:
+
+      sum {i=1..N} [(R[i] - r[i])/r[i]]
+
+   with R[i] >= 1 and r[i] >= 1 (in terms of per-component bandwidth
+   units).
+
+   In this formula, N is the total number of links traversed by a
+   given LSP, t[i] the Maximum Link Bandwidth per TE link i, and b[i]
+   the sum per TE link i of the bandwidth committed for working LSPs
+   and other recovery LSPs (thus excluding "shared bandwidth" LSPs).
+   The quantity (R[i] - r[i])/r[i] is defined as the Shared (Recovery)
+   Bandwidth Ratio per TE link i.  In addition, TE links for which
+   R[i] reaches max_R[i], or for which r[i] = 0, are pruned during
+   shared recovery path computation, as are TE links for which
+   max_R[i] = r[i], which simply cannot be shared.
+
+   More generally, one can draw the following mapping between the
+   available bandwidth at the transport and control plane levels:
+
+   -        ----------  Max Reservable Bandwidth
+   |          -----     ^
+   |R         -----     |
+   |          -----     |
+   -          -----     |max_R
+              -----     |
+   --------  TE link Capacity  -  ------  | -  Maximum TE Link Bandwidth
+              -----     |r        -----   v
+              -----   <------ b ------>   -  ----------  Maximum LSP
+              -----               -----                  Bandwidth
+              -----               -----
+              -----               -----
+              -----               -----
+              -----               -----  <--- Minimum LSP Bandwidth
+   --------  0                    ----------  0
+
+   Note that the above approach does not require the flooding of any
+   per-LSP information or any detailed distribution of the bandwidth
+   allocation per component link or individual port, or even any per-
+   priority shareable recovery bandwidth information (using a
+   dedicated sub-TLV).  The latter would provide the same capability
+   as the already defined per-priority Maximum LSP Bandwidth
+   information.  This approach is referred to as Partial (or
+   Aggregated) Information Routing, as described in [KODIALAM1] and
+   [KODIALAM2]; these show that the difference obtained with a Full
+   (or Complete) Information Routing approach (where, for the whole
+   set of working and recovery LSPs, the amount of bandwidth units
+   they use per link is known at each node and for each link) is
+   clearly negligible.  The Full Information Routing approach is
+   detailed in [GLI].
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 38]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+   Note also that both approaches rely on the deterministic knowledge
+   (at different degrees) of the network topology and resource usage
+   status.
+
+   Moreover, the Partial Information Routing approach can be enhanced
+   by extending the GMPLS signaling capabilities such that working-
+   LSP-related information, in particular its path (including link and
+   node identifiers), is exchanged with the recovery LSP request.
+   This enables more efficient admission control at upstream nodes of
+   shared recovery resources, in particular links (see Section 8.4.3).
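+
+   For illustration purposes only, the following non-normative Python
+   sketch applies the pruning rules and the Shared (Recovery)
+   Bandwidth Ratio defined above to a set of hypothetical TE links
+   (all numeric values are arbitrary).
+
+      # TE link -> (max_R, R, r), in discrete bandwidth units.
+      links = {
+          "TE-1": (8, 8, 4),  # R[i] = max_R[i]: pruned
+          "TE-2": (8, 4, 0),  # r[i] = 0: pruned
+          "TE-3": (8, 6, 2),  # kept: ratio (6 - 2)/2 = 2.0
+          "TE-4": (8, 6, 3),  # kept: ratio (6 - 3)/3 = 1.0
+      }
+
+      def shared_bandwidth_ratio(R, r):
+          return (R - r) / r
+
+      candidates = {
+          name: shared_bandwidth_ratio(R, r)
+          for name, (max_R, R, r) in links.items()
+          if r > 0 and R < max_R and r < max_R  # pruning rules
+      }
+
+      def path_score(path):
+          # Quantity to maximize over candidate paths: the sum of the
+          # per-link Shared (Recovery) Bandwidth Ratios.
+          return sum(candidates[link] for link in path)
+
+      print(candidates)                    # {'TE-3': 2.0, 'TE-4': 1.0}
+      print(path_score(["TE-3", "TE-4"]))  # 3.0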
+
+8.4.2. Recovery Resource Sharing and SRLG Recovery
+
+   Resource shareability can also be maximized with respect to the
+   number of times each SRLG is protected by a recovery resource (in
+   particular, a shared TE link), and methods can be considered for
+   avoiding contention of the shared recovery resources in case of a
+   single SRLG failure.  These methods enable the sharing of recovery
+   resources between two (or more) recovery LSPs if their respective
+   working LSPs are mutually disjoint with respect to links, nodes,
+   and SRLGs; then, a single failure does not simultaneously disrupt
+   several (or at least two) working LSPs.
+
+   For instance, [BOUILLET] shows that the Partial Information Routing
+   approach can be extended to cover recovery resource shareability
+   with respect to SRLG recoverability (i.e., the number of times each
+   SRLG is recoverable).  By flooding this aggregated information per
+   TE link, path computation and selection of SRLG-diverse recovery
+   LSPs can be optimized with respect to the sharing of the recovery
+   resources reserved on each TE link.  This yields a performance
+   difference of less than 5%, which is negligible, compared to the
+   corresponding Full Information Flooding approach (see [GLI]).
+
+   For this purpose, additional extensions to [RFC4202] in support of
+   path computation for shared mesh recovery have often been
+   considered in the literature.  TE link attributes would include,
+   among others, the current number of recovery LSPs sharing the
+   recovery resources reserved on the TE link, and the current number
+   of SRLGs recoverable by this amount of (shared) recovery resources
+   reserved on the TE link.  The latter is equivalent to the current
+   number of SRLGs that will be recovered by the recovery LSPs sharing
+   the recovery resources reserved on the TE link.  Then, if explicit
+   SRLG recoverability is considered, a TE link attribute would be
+   added that includes the explicit list of SRLGs (recoverable by the
+   shared recovery resources reserved on the TE link) and their
+   respective shareable recovery bandwidths.  The latter information
+   is equivalent to the shareable recovery bandwidth per SRLG (or per
+   group of SRLGs), which implies that the amount of shareable
+   bandwidth and the number of listed SRLGs will decrease over time.
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 39]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+   Compared to the case of recovery resource sharing only (regardless
+   of SRLG recoverability, as described in Section 8.4.1), these
+   additional TE link attributes would potentially deliver better path
+   computation and selection (at a distinct ingress node) for shared
+   mesh recovery purposes.  However, due to the lack of evidence of
+   better efficiency and due to the complexity that such extensions
+   would generate, they are not further considered in the scope of the
+   present analysis.  For instance, a per-SRLG-group minimum/maximum
+   shareable recovery bandwidth is restricted by the length that the
+   corresponding (sub-)TLV may take, and thus by the number of SRLGs
+   that it can include.  Therefore, the corresponding parameter should
+   not be translated into GMPLS routing (or even signaling) protocol
+   extensions in the form of a TE link sub-TLV.
+
+8.4.3. Recovery Resource Sharing, SRLG Disjointness, and Admission
+       Control
+
+   Admission control is a strict requirement to be fulfilled by nodes
+   giving access to shared links.  This can be illustrated using the
+   following network topology:
+
+          A ------ C ====== D
+          |        |        |
+          |        |        |
+          |        B        |
+          |        |        |
+          |        |        |
+          ------- E ------ F
+
+   Node A creates a working LSP to D (A-C-D); B simultaneously creates
+   a working LSP to D (B-C-D) and a recovery LSP (B-E-F-D) to the same
+   destination.
+   Then, A decides to create a recovery LSP to D (A-E-F-D), but since
+   the C-D span carries both working LSPs, node E should either assign
+   a dedicated resource for this recovery LSP or reject the request if
+   the link E-F has already reached its maximum recovery bandwidth
+   sharing ratio.  In the latter case, a C-D span failure would imply
+   that one of the working LSPs would not be recoverable.
+
+   Consequently, node E must have the required information to perform
+   admission control for the recovery LSP requests it processes
+   (implying, for instance, that the path followed by the working LSP
+   is carried with the corresponding recovery LSP request).  If node E
+   can guarantee that the working LSPs (A-C-D and B-C-D) are SRLG-
+   disjoint over the C-D span, it may securely accept the incoming
+   recovery LSP request and assign to the recovery LSPs (A-E-F-D and
+   B-E-F-D) the same resources on the link E-F.  This may occur if the
+   link E-F has not yet reached its maximum recovery bandwidth sharing
+   ratio.  In this example, one assumes that the node failure
+   probability is negligible compared to the link failure probability.
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 40]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+   To achieve this, the path followed by the working LSP is
+   transported with the recovery LSP request and examined at each
+   upstream node of potentially shareable links.  Admission control is
+   performed using the interface identifiers (included in the path) to
+   retrieve from the TE database the list of SRLG IDs associated with
+   each of the working LSP links.  If the working LSPs (A-C-D and
+   B-C-D) have one or more links or SRLG IDs in common (in this
+   example, one or more SRLG IDs in common over the span C-D), node E
+   should not assign the same resource over link E-F to the recovery
+   LSPs (A-E-F-D and B-E-F-D).  Otherwise, one of these working LSPs
+   would not be recoverable if a C-D span failure occurred.
+
+   There are some issues related to this method; the major one is the
+   number of SRLG IDs that a single link can cover (more than 100 in
+   complex environments).  Moreover, when using link bundles, this
+   approach may cause some recovery LSP requests to be rejected.  This
+   occurs when the SRLG sub-TLV corresponding to a link bundle
+   includes the union of the SRLG ID lists of all the component links
+   belonging to this bundle (see [RFC4202] and [RFC4201]).
+
+   In order to overcome this specific issue, an additional mechanism
+   may consist of querying the nodes where the information would be
+   available (in this case, node E would query C).  The main drawback
+   of this method is that (in addition to the dedicated mechanism(s)
+   it requires) it may become complex when several common nodes are
+   traversed by the working LSPs.  Therefore, when using link bundles,
+   solving this issue is closely related to the sequence of the
+   recovery operations.  Per-component flooding of SRLG identifiers
+   would severely impact the scalability of the link state routing
+   protocol; therefore, one may rely instead on the usage of an on-
+   line accessible network management system.
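+
+   For illustration purposes only, the following non-normative Python
+   sketch shows the admission control check just described, under the
+   assumption that the working path is carried with the recovery LSP
+   request and that a local TE database maps each link to its SRLG
+   IDs.  All identifiers are hypothetical.
+
+      # TE database view at node E: link -> SRLG IDs.
+      te_db = {"A-C": {101}, "B-C": {104}, "C-D": {102, 103}}
+
+      def working_srlgs(working_path):
+          return set().union(*(te_db[link] for link in working_path))
+
+      class SharedRecoveryResource:
+          """Recovery resource on a shared link (e.g., E-F) and the
+          SRLG sets of the working paths it already protects."""
+
+          def __init__(self):
+              self.protected = []
+
+          def admit(self, working_path):
+              # Share only if the new working path is SRLG-disjoint
+              # from every working path already protected here;
+              # otherwise, a dedicated resource (or a rejection) is
+              # needed.
+              srlgs = working_srlgs(working_path)
+              if all(srlgs.isdisjoint(p) for p in self.protected):
+                  self.protected.append(srlgs)
+                  return True
+              return False
+
+      ef = SharedRecoveryResource()
+      print(ef.admit(["B-C", "C-D"]))  # True: first request (B-E-F-D)
+      print(ef.admit(["A-C", "C-D"]))  # False: SRLGs shared on C-D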
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 41]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+9. Summary and Conclusions
+
+   The following table summarizes the different recovery types and
+   schemes analyzed throughout this document.
+
+   -------------------------------------------------------------------
+   |        |       Path Search (computation and selection)          |
+   -------------------------------------------------------------------
+   |        | Pre-planned (a)             | Dynamic (b)              |
+   -------------------------------------------------------------------
+   |        | faster recovery             | Does not apply           |
+   |        | less flexible               |                          |
+   |   1    | less robust                 |                          |
+   |        | most resource-consuming     |                          |
+   | Path   |                             |                          |
+   | Setup  -----------------------------------------------------------
+   |        | relatively fast recovery    | Does not apply           |
+   |        | relatively flexible         |                          |
+   |   2    | relatively robust           |                          |
+   |        | resource consumption        |                          |
+   |        |  depends on sharing degree  |                          |
+   |        -----------------------------------------------------------
+   |        | relatively fast recovery    | slower (computation)     |
+   |        | more flexible               | most flexible            |
+   |   3    | relatively robust           | most robust              |
+   |        | less resource-consuming     | least resource-consuming |
+   |        |  depends on sharing degree  |                          |
+   -------------------------------------------------------------------
+
+   1a. Recovery LSP setup (before failure occurrence) with resource
+       reservation (i.e., signaling) and selection is referred to as
+       LSP protection.
+
+   2a. Recovery LSP setup (before failure occurrence) with resource
+       reservation (i.e., signaling) and with resource pre-selection
+       is referred to as pre-planned LSP re-routing with resource
+       pre-selection.  This implies only recovery LSP activation after
+       failure occurrence.
+
+   3a. Recovery LSP setup (before failure occurrence) with resource
+       reservation (i.e., signaling) and without resource selection is
+       referred to as pre-planned LSP re-routing without resource
+       pre-selection.  This implies recovery LSP activation and
+       resource (i.e., label) selection after failure occurrence.
+
+   3b. Recovery LSP setup after failure occurrence is referred to as
+       LSP re-routing, which is full when the recovery LSP path
+       computation occurs after failure occurrence.
+
+
+
+Papadimitriou & Mannie      Informational                     [Page 42]
+
+RFC 4428              GMPLS Recovery Mechanisms              March 2006
+
+
+   Thus, the term "pre-planned" refers to recovery LSP path pre-
+   computation, signaling (reservation), and a priori resource
+   selection (optional), but not cross-connection.  Also, the shared-
+   mesh recovery scheme can be viewed as a particular case of 2a) and
+   3a), using the additional constraint described in Section 8.4.3.
+
+   The implementation of these recovery mechanisms requires only
+   considering extensions to the GMPLS signaling protocols (i.e.,
+   [RFC3471] and [RFC3473]).  These GMPLS signaling extensions should
+   mainly focus on delivering (1) recovery LSP pre-provisioning for
+   the cases 1a, 2a, and 3a, (2) LSP failure notification, (3)
+   recovery LSP switching action(s), and (4) reversion mechanisms.
+
+   Moreover, the present analysis (see Section 8) shows that no GMPLS
+   routing extensions are needed to efficiently implement any of these
+   recovery types and schemes.
+
+10. Security Considerations
+
+   This document does not introduce any additional security issue, nor
+   does it imply any security consideration beyond those of [RFC3945]
+   for the current GMPLS signaling (RSVP-TE), routing (OSPF-TE,
+   IS-IS-TE), or network management protocols.
+
+   However, the authorization of requests for resources by GMPLS-
+   capable nodes should determine whether a given party, presumably
+   already authenticated, has a right to access the requested
+   resources.
This + determination is typically a matter of local policy control, for + example, by setting limits on the total bandwidth made available to + some party in the presence of resource contention. Such policies may + become quite complex as the number of users, types of resources, and + sophistication of authorization rules increases. This is + particularly the case for recovery schemes that assume pre-planned + sharing of recovery resources, or contention for resources in case of + dynamic re-routing. + + Therefore, control elements should match the requests against the + local authorization policy. These control elements must be capable + of making decisions based on the identity of the requester, as + verified cryptographically and/or topologically. + +11. Acknowledgements + + The authors would like to thank Fabrice Poppe (Alcatel) and Bart + Rousseau (Alcatel) for their revision effort, and Richard Rabbat + (Fujitsu Labs), David Griffith (NIST), and Lyndon Ong (Ciena) for + their useful comments. + + + + +Papadimitriou & Mannie Informational [Page 43] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + Thanks also to Adrian Farrel for the thorough review of the document. + +12. References + +12.1. Normative References + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC3471] Berger, L., "Generalized Multi-Protocol Label Switching + (GMPLS) Signaling Functional Description", RFC 3471, + January 2003. + + [RFC3473] Berger, L., "Generalized Multi-Protocol Label Switching + (GMPLS) Signaling Resource ReserVation Protocol-Traffic + Engineering (RSVP-TE) Extensions", RFC 3473, January + 2003. + + [RFC3945] Mannie, E., "Generalized Multi-Protocol Label Switching + (GMPLS) Architecture", RFC 3945, October 2004. + + [RFC4201] Kompella, K., Rekhter, Y., and L. Berger, "Link Bundling + in MPLS Traffic Engineering (TE)", RFC 4201, October + 2005. + + [RFC4202] Kompella, K., Ed. and Y. Rekhter, Ed., "Routing + Extensions in Support of Generalized Multi-Protocol + Label Switching (GMPLS)", RFC 4202, October 2005. + + [RFC4204] Lang, J., Ed., "Link Management Protocol (LMP)", RFC + 4204, October 2005. + + [RFC4209] Fredette, A., Ed. and J. Lang, Ed., "Link Management + Protocol (LMP) for Dense Wavelength Division + Multiplexing (DWDM) Optical Line Systems", RFC 4209, + October 2005. + + [RFC4427] Mannie E., Ed. and D. Papadimitriou, Ed., "Recovery + (Protection and Restoration) Terminology for Generalized + Multi-Protocol Label Switching (GMPLS)", RFC 4427, March + 2006. + +12.2. Informative References + + [BOUILLET] E. Bouillet, et al., "Stochastic Approaches to Compute + Shared Meshed Restored Lightpaths in Optical Network + Architectures," IEEE Infocom 2002, New York City, June + 2002. + + + +Papadimitriou & Mannie Informational [Page 44] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + [DEMEESTER] P. Demeester, et al., "Resilience in Multilayer + Networks," IEEE Communications Magazine, Vol. 37, No. 8, + pp. 70-76, August 1998. + + [GLI] G. Li, et al., "Efficient Distributed Path Selection for + Shared Restoration Connections," IEEE Infocom 2002, New + York City, June 2002. + + [IPO-IMP] Strand, J. and A. Chiu, "Impairments and Other + Constraints on Optical Layer Routing", RFC 4054, May + 2005. + + [KODIALAM1] M. Kodialam and T.V. Lakshman, "Restorable Dynamic + Quality of Service Routing," IEEE Communications + Magazine, pp. 72-81, June 2002. + + [KODIALAM2] M. Kodialam and T.V. 
Lakshman, "Dynamic Routing of + Restorable Bandwidth-Guaranteed Tunnels using Aggregated + Network Resource Usage Information," IEEE/ ACM + Transactions on Networking, pp. 399-410, June 2003. + + [MANCHESTER] J. Manchester, P. Bonenfant and C. Newton, "The + Evolution of Transport Network Survivability," IEEE + Communications Magazine, August 1999. + + [RFC3386] Lai, W. and D. McDysan, "Network Hierarchy and + Multilayer Survivability", RFC 3386, November 2002. + + [T1.105] ANSI, "Synchronous Optical Network (SONET): Basic + Description Including Multiplex Structure, Rates, and + Formats," ANSI T1.105, January 2001. + + [WANG] J. Wang, L. Sahasrabuddhe, and B. Mukherjee, "Path vs. + Subpath vs. Link Restoration for Fault Management in + IP-over-WDM Networks: Performance Comparisons Using + GMPLS Control Signaling," IEEE Communications Magazine, + pp. 80-87, November 2002. + + For information on the availability of the following documents, + please see http://www.itu.int + + [G.707] ITU-T, "Network Node Interface for the Synchronous + Digital Hierarchy (SDH)," Recommendation G.707, October + 2000. + + [G.709] ITU-T, "Network Node Interface for the Optical Transport + Network (OTN)," Recommendation G.709, February 2001 (and + Amendment no.1, October 2001). + + + +Papadimitriou & Mannie Informational [Page 45] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + + [G.783] ITU-T, "Characteristics of Synchronous Digital Hierarchy + (SDH) Equipment Functional Blocks," Recommendation + G.783, October 2000. + + [G.798] ITU-T, "Characteristics of optical transport network + hierarchy equipment functional block," Recommendation + G.798, June 2004. + + [G.806] ITU-T, "Characteristics of Transport Equipment - + Description Methodology and Generic Functionality", + Recommendation G.806, October 2000. + + [G.841] ITU-T, "Types and Characteristics of SDH Network + Protection Architectures," Recommendation G.841, October + 1998. + + [G.842] ITU-T, "Interworking of SDH network protection + architectures," Recommendation G.842, October 1998. + + [G.874] ITU-T, "Management aspects of the optical transport + network element," Recommendation G.874, November 2001. + +Editors' Addresses + + Dimitri Papadimitriou + Alcatel + Francis Wellesplein, 1 + B-2018 Antwerpen, Belgium + + Phone: +32 3 240-8491 + EMail: dimitri.papadimitriou@alcatel.be + + + Eric Mannie + Perceval + Rue Tenbosch, 9 + 1000 Brussels + Belgium + + Phone: +32-2-6409194 + EMail: eric.mannie@perceval.net + + + + + + + + + + +Papadimitriou & Mannie Informational [Page 46] + +RFC 4428 GMPLS Recovery Mechanisms March 2006 + + +Full Copyright Statement + + Copyright (C) The Internet Society (2006). + + This document is subject to the rights, licenses and restrictions + contained in BCP 78, and except as set forth therein, the authors + retain all their rights. + + This document and the information contained herein are provided on an + "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS + OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET + ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, + INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE + INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED + WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
+ +Intellectual Property + + The IETF takes no position regarding the validity or scope of any + Intellectual Property Rights or other rights that might be claimed to + pertain to the implementation or use of the technology described in + this document or the extent to which any license under such rights + might or might not be available; nor does it represent that it has + made any independent effort to identify any such rights. Information + on the procedures with respect to rights in RFC documents can be + found in BCP 78 and BCP 79. + + Copies of IPR disclosures made to the IETF Secretariat and any + assurances of licenses to be made available, or the result of an + attempt made to obtain a general license or permission for the use of + such proprietary rights by implementers or users of this + specification can be obtained from the IETF on-line IPR repository at + http://www.ietf.org/ipr. + + The IETF invites any interested party to bring to its attention any + copyrights, patents or patent applications, or other proprietary + rights that may cover technology that may be required to implement + this standard. Please address the information to the IETF at + ietf-ipr@ietf.org. + +Acknowledgement + + Funding for the RFC Editor function is provided by the IETF + Administrative Support Activity (IASA). + + + + + + + +Papadimitriou & Mannie Informational [Page 47] + |