diff options
author | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
---|---|---|
committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
commit | 4bfd864f10b68b71482b35c818559068ef8d5797 (patch) | |
tree | e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc9197.txt | |
parent | ea76e11061bda059ae9f9ad130a9895cc85607db (diff) |
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc9197.txt')
-rw-r--r-- | doc/rfc/rfc9197.txt | 2184 |
1 files changed, 2184 insertions, 0 deletions
diff --git a/doc/rfc/rfc9197.txt b/doc/rfc/rfc9197.txt new file mode 100644 index 0000000..cc813a1 --- /dev/null +++ b/doc/rfc/rfc9197.txt @@ -0,0 +1,2184 @@ + + + + +Internet Engineering Task Force (IETF) F. Brockners, Ed. +Request for Comments: 9197 Cisco +Category: Standards Track S. Bhandari, Ed. +ISSN: 2070-1721 Thoughtspot + T. Mizrahi, Ed. + Huawei + May 2022 + + + Data Fields for In Situ Operations, Administration, and Maintenance + (IOAM) + +Abstract + + In situ Operations, Administration, and Maintenance (IOAM) collects + operational and telemetry information in the packet while the packet + traverses a path between two points in the network. This document + discusses the data fields and associated data types for IOAM. IOAM- + Data-Fields can be encapsulated into a variety of protocols, such as + Network Service Header (NSH), Segment Routing, Generic Network + Virtualization Encapsulation (Geneve), or IPv6. IOAM can be used to + complement OAM mechanisms based on, e.g., ICMP or other types of + probe packets. + +Status of This Memo + + This is an Internet Standards Track document. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Further information on + Internet Standards is available in Section 2 of RFC 7841. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + https://www.rfc-editor.org/info/rfc9197. + +Copyright Notice + + Copyright (c) 2022 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (https://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. Code Components extracted from this document must + include Revised BSD License text as described in Section 4.e of the + Trust Legal Provisions and are provided without warranty as described + in the Revised BSD License. + +Table of Contents + + 1. Introduction + 2. Conventions + 3. Scope, Applicability, and Assumptions + 4. IOAM Data-Fields, Types, and Nodes + 4.1. IOAM Data-Fields and Option-Types + 4.2. IOAM-Domains and Types of IOAM Nodes + 4.3. IOAM-Namespaces + 4.4. IOAM Trace Option-Types + 4.4.1. Pre-allocated and Incremental Trace Option-Types + 4.4.2. IOAM Node Data Fields and Associated Formats + 4.4.2.1. Hop_Lim and node_id Short + 4.4.2.2. ingress_if_id and egress_if_id Short + 4.4.2.3. Timestamp Seconds + 4.4.2.4. Timestamp Fraction + 4.4.2.5. Transit Delay + 4.4.2.6. Namespace-Specific Data + 4.4.2.7. Queue Depth + 4.4.2.8. Checksum Complement + 4.4.2.9. Hop_Lim and node_id Wide + 4.4.2.10. ingress_if_id and egress_if_id Wide + 4.4.2.11. Namespace-Specific Data Wide + 4.4.2.12. Buffer Occupancy + 4.4.2.13. Opaque State Snapshot + 4.4.3. Examples of IOAM Node Data + 4.5. IOAM Proof of Transit Option-Type + 4.5.1. IOAM Proof of Transit Type 0 + 4.6. IOAM Edge-to-Edge Option-Type + 5. Timestamp Formats + 5.1. PTP Truncated Timestamp Format + 5.2. NTP 64-Bit Timestamp Format + 5.3. POSIX-Based Timestamp Format + 6. IOAM Data Export + 7. IANA Considerations + 7.1. IOAM Option-Type Registry + 7.2. IOAM Trace-Type Registry + 7.3. IOAM Trace-Flags Registry + 7.4. IOAM POT-Type Registry + 7.5. IOAM POT-Flags Registry + 7.6. IOAM E2E-Type Registry + 7.7. IOAM Namespace-ID Registry + 8. Management and Deployment Considerations + 9. Security Considerations + 10. References + 10.1. Normative References + 10.2. Informative References + Acknowledgements + Contributors + Authors' Addresses + +1. Introduction + + This document defines data fields for In situ Operations, + Administration, and Maintenance (IOAM). IOAM records OAM information + within the packet while the packet traverses a particular network + domain. The term "in situ" refers to the fact that the OAM data is + added to the data packets rather than being sent within packets + specifically dedicated to OAM. IOAM is used to complement + mechanisms, such as Ping or Traceroute. In terms of "active" or + "passive" OAM, IOAM can be considered a hybrid OAM type. "In situ" + mechanisms do not require extra packets to be sent. IOAM adds + information to the already available data packets and therefore + cannot be considered passive. In terms of the classification given + in [RFC7799], IOAM could be portrayed as Hybrid Type I. IOAM + mechanisms can be leveraged where mechanisms using, e.g., ICMP do not + apply or do not offer the desired results, such as proving that a + certain traffic flow takes a predefined path, Service Level Agreement + (SLA) verification for the data traffic, detailed statistics on + traffic distribution paths in networks that distribute traffic across + multiple paths, or scenarios in which probe traffic is potentially + handled differently from regular data traffic by the network devices. + + The term "in situ OAM" was originally motivated by the use of OAM- + related mechanisms that add information into a packet. This document + uses IOAM as a term defining the IOAM technology. IOAM includes "in + situ" mechanisms but also mechanisms that could trigger the creation + of additional packets dedicated to OAM. + +2. Conventions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in BCP + 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + + Abbreviations and definitions used in this document: + + E2E: Edge to Edge + + Geneve: Generic Network Virtualization Encapsulation [RFC8926] + + IOAM: In situ Operations, Administration, and Maintenance + + MTU: Maximum Transmission Unit + + NSH: Network Service Header [RFC8300] + + OAM: Operations, Administration, and Maintenance + + PMTU: Path MTU + + POT: Proof of Transit + + Short format: refers to an IOAM-Data-Field that comprises 4 octets + + SID: Segment Identifier + + SR: Segment Routing + + VXLAN-GPE: Virtual eXtensible Local Area Network, Generic + Protocol Extension [NVO3-VXLAN-GPE] + + Wide format: refers to an IOAM-Data-Field that comprises 8 octets + +3. Scope, Applicability, and Assumptions + + IOAM assumes a set of constraints as well as guiding principles and + concepts that go hand in hand with the definition of the IOAM-Data- + Fields. These constraints, guiding principles, and concepts are + described in this section. A discussion of how IOAM-Data-Fields and + the associated concepts are applied to an IOAM deployment are out of + scope for this document. Please refer to [IPPM-IOAM-DEPLOYMENT] for + IOAM deployment considerations. + + Scope: + This document defines the data fields and associated data types + for IOAM. The IOAM-Data-Fields can be encapsulated in a variety + of protocols, including NSH, Segment Routing, Geneve, and IPv6. + Specification details for these different protocols are outside + the scope of this document. It is expected that each such + encapsulation would be specified by an RFC and jointly designed by + the working group that develops or maintains the encapsulation + protocol and the IETF IP Performance Measurement (IPPM) Working + Group. + + Domain (or scope) of in situ OAM deployment: + IOAM is focused on "limited domains", as defined in [RFC8799]. + For IOAM, a limited domain could, for example, be an enterprise + campus using physical connections between devices or an overlay + network using virtual connections/tunnels for connectivity between + said devices. A limited domain that uses IOAM may constitute one + or multiple "IOAM-Domains", each disambiguated through separate + namespace identifiers. An IOAM-Domain is bounded by its perimeter + or edge. IOAM-Domains may overlap inside the limited domain. + Designers of protocol encapsulations for IOAM specify mechanisms + to ensure that IOAM data stays within an IOAM-Domain. In + addition, the operator of such a domain is expected to put + provisions in place to ensure that IOAM data does not leak beyond + the edge of an IOAM-Domain using, for example, packet filtering + methods. The operator SHOULD consider the potential operational + impact of IOAM to mechanisms, such as ECMP processing (e.g., load- + balancing schemes based on packet length could be impacted by the + increased packet size due to IOAM), PMTU (i.e., ensure that the + MTU of all links within a domain is sufficiently large to support + the increased packet size due to IOAM), and ICMP message handling + (i.e., in case of IPv6, IOAM support for ICMPv6 echo request/reply + is desired, which would translate into ICMPv6 extensions to enable + IOAM-Data-Fields to be copied from an echo request message to an + echo reply message). + + IOAM control points: + IOAM-Data-Fields are added to or removed from the user traffic by + the devices that form the edge of a domain. Devices that form an + IOAM-Domain can add, update, or remove IOAM-Data-Fields. Edge + devices of an IOAM-Domain can be hosts or network devices. + + Traffic sets that IOAM is applied to: + IOAM can be deployed on all or only on subsets of the user + traffic. Using IOAM on a selected set of traffic (e.g., per + interface, based on an access control list or flow specification + defining a specific set of traffic, etc.) could be useful in + deployments where the cost of processing IOAM-Data-Fields by + encapsulating, transit, or decapsulating nodes might be a concern + from a performance or operational perspective. Thus, limiting the + amount of traffic IOAM is applied to could be beneficial in some + deployments. + + Encapsulation independence: + The definition of IOAM-Data-Fields is independent from the + protocols the IOAM-Data-Fields are encapsulated into. IOAM-Data- + Fields can be encapsulated into several encapsulating protocols. + + Layering: + If several encapsulation protocols (e.g., in case of tunneling) + are stacked on top of each other, IOAM-Data-Fields could be + present at multiple layers. The behavior follows the "ships-in- + the-night" model, i.e., IOAM-Data-Fields in one layer are + independent from IOAM-Data-Fields in another layer. Layering + allows operators to instrument the protocol layer they want to + measure. The different layers could, but do not have to, share + the same IOAM encapsulation mechanisms. + + IOAM implementation: + The definition of the IOAM-Data-Fields takes the specifics of + devices with hardware data planes and software data planes into + account. + +4. IOAM Data-Fields, Types, and Nodes + + This section details IOAM-related nomenclature and describes data + types, such as IOAM-Data-Fields, IOAM-Types, IOAM-Namespaces, as well + as the different types of IOAM nodes. + +4.1. IOAM Data-Fields and Option-Types + + An IOAM-Data-Field is a set of bits with a defined format and + meaning, which can be stored at a certain place in a packet for the + purpose of IOAM. + + To accommodate the different uses of IOAM, IOAM-Data-Fields fall into + different categories. In IOAM, these categories are referred to as + "IOAM-Option-Types". A common registry is maintained for IOAM- + Option-Types (see Section 7.1 for details). Corresponding to these + IOAM-Option-Types, different IOAM-Data-Fields are defined. + + This document defines four IOAM-Option-Types: + + * Pre-allocated Trace Option-Type + + * Incremental Trace Option-Type + + * POT Option-Type + + * E2E Option-Type + + Future IOAM-Option-Types can be allocated by IANA, as described in + Section 7.1. + +4.2. IOAM-Domains and Types of IOAM Nodes + + Section 3 already mentioned that IOAM is expected to be deployed in a + limited domain [RFC8799]. One or more IOAM-Option-Types are added to + a packet upon entering an IOAM-Domain and are removed from the packet + when exiting the domain. Within the IOAM-Domain, the IOAM-Data- + Fields MAY be updated by network nodes that the packet traverses. An + IOAM-Domain consists of "IOAM encapsulating nodes", "IOAM + decapsulating nodes", and "IOAM transit nodes". The role of a node + (i.e., encapsulating, transit, and decapsulating) is defined within + an IOAM-Namespace (see below). A node can have different roles in + different IOAM-Namespaces. + + A device that adds at least one IOAM-Option-Type to the packet is + called an "IOAM encapsulating node", whereas a device that removes an + IOAM-Option-Type is referred to as an "IOAM decapsulating node". + Nodes within the domain that are aware of IOAM data and read, write, + and/or process IOAM data are called "IOAM transit nodes". IOAM nodes + that add or remove the IOAM-Data-Fields can also update the IOAM- + Data-Fields at the same time. Or, in other words, IOAM encapsulating + or decapsulating nodes can also serve as IOAM transit nodes at the + same time. Note that not every node in an IOAM-Domain needs to be an + IOAM transit node. For example, a deployment might require that + packets traverse a set of firewalls that support IOAM. In that case, + only the set of firewall nodes would be IOAM transit nodes, rather + than all nodes. + + An IOAM encapsulating node incorporates one or more IOAM-Option-Types + (from the list of IOAM-Types, see Section 7.1) into packets that IOAM + is enabled for. If IOAM is enabled for a selected subset of the + traffic, the IOAM encapsulating node is responsible for applying the + IOAM functionality to the selected subset. + + An IOAM transit node reads, writes, and/or processes one or more of + the IOAM-Data-Fields. If both the Pre-allocated and the Incremental + Trace Option-Types are present in the packet, each IOAM transit node, + based on configuration and available implementation of IOAM, might + populate IOAM trace data in either a Pre-allocated or Incremental + Trace Option-Type but not both. Note that not populating any of the + Trace Option-Types is also valid behavior for an IOAM transit node. + A transit node MUST ignore IOAM-Option-Types that it does not + understand. A transit node MUST NOT add new IOAM-Option-Types to a + packet, MUST NOT remove IOAM-Option-Types from a packet, and MUST NOT + change the IOAM-Data-Fields of an IOAM Edge-to-Edge Option-Type. + + An IOAM decapsulating node removes IOAM-Option-Type(s) from packets. + + The role of an IOAM encapsulating, IOAM transit, or IOAM + decapsulating node is always performed within a specific IOAM- + Namespace. This means that an IOAM node that is, e.g., an IOAM + decapsulating node for IOAM-Namespace "A" but not for IOAM-Namespace + "B" will only remove the IOAM-Option-Types for IOAM-Namespace "A" + from the packet. Note that this applies even for IOAM-Option-Types + that the node does not understand, for example, an IOAM-Option-Type + other than the four described above, which is added in a future + revision. + + IOAM-Namespaces allow for a namespace-specific definition and + interpretation of IOAM-Data-Fields. An interface identifier could, + for example, point to a physical interface (e.g., to understand which + physical interface of an aggregated link is used when receiving or + transmitting a packet), whereas, in another case, it could refer to a + logical interface (e.g., in case of tunnels). Please refer to + Section 4.3 for details on IOAM-Namespaces. + +4.3. IOAM-Namespaces + + IOAM-Namespaces add further context to IOAM-Option-Types and + associated IOAM-Data-Fields. The IOAM-Option-Types and associated + IOAM-Data-Fields are interpreted as defined in this document, + regardless of the value of the IOAM-Namespace. However, IOAM- + Namespaces provide a way to group nodes to support different + deployment approaches of IOAM (see a few example use cases below). + IOAM-Namespaces also help to resolve potential issues that can occur + due to IOAM-Data-Fields not being globally unique (e.g., IOAM node + identifiers do not have to be globally unique). The significance of + IOAM-Data-Fields is always within a particular IOAM-Namespace. Given + that IOAM-Data-Fields are always interpreted as the context of a + specific namespace, the Namespace-ID field always needs to be carried + along with the IOAM data-fields themselves. + + An IOAM-Namespace is identified by a 16-bit namespace identifier + (Namespace-ID). The IOAM-Namespace field is included in all the + IOAM-Option-Types defined in this document and MUST be included in + all future IOAM-Option-Types. The Namespace-ID value is divided into + two subranges: + + * an operator-assigned range from 0x0001 to 0x7FFF and + + * an IANA-assigned range from 0x8000 to 0xFFFF. + + The IANA-assigned range is intended to allow future extensions to + have new and interoperable IOAM functionality, while the operator- + assigned range is intended to be domain specific and managed by the + network operator. The Namespace-ID value of 0x0000 is the "Default- + Namespace-ID". The Default-Namespace-ID indicates that no specific + namespace is associated with the IOAM-Data-Fields in the packet. The + Default-Namespace-ID MUST be supported by all nodes implementing + IOAM. A use case for the Default-Namespace-ID are deployments that + do not leverage specific namespaces for some or all of their packets + that carry IOAM-Data-Fields. + + Namespace identifiers allow devices that are IOAM capable to + determine: + + * whether one or more IOAM-Option-Types need to be processed by a + device. If the Namespace-ID contained in a packet does not match + any Namespace-ID the node is configured to operate on, then the + node MUST NOT change the contents of the IOAM-Data-Fields. + + * which IOAM-Option-Type needs to be processed/updated in case there + are multiple IOAM-Option-Types present in the packet. Multiple + IOAM-Option-Types can be present in a packet in case of + overlapping IOAM-Domains or in case of a layered IOAM deployment. + + * whether one or more IOAM-Option-Types have to be removed from the + packet, e.g., at a domain edge or domain boundary. + + IOAM-Namespaces support several different uses: + + * IOAM-Namespaces can be used by an operator to distinguish + different IOAM-Domains. Devices at edges of an IOAM-Domain can + filter on Namespace-IDs to provide for proper IOAM-Domain + isolation. + + * IOAM-Namespaces provide additional context for IOAM-Data-Fields + and, thus, can be used to ensure that IOAM-Data-Fields are unique + and are interpreted properly by management stations or network + controllers. The node identifier field (node_id, see below) does + not need to be unique in a deployment. This could be the case if + an operator wishes to use different node identifiers for different + IOAM layers, even within the same device, or node identifiers + might not be unique for other organizational reasons, such as + after a merger of two formerly separated organizations. The + Namespace-ID can be used as a context identifier, such that the + combination of node_id and Namespace-ID will always be unique. + + * Similarly, IOAM-Namespaces can be used to define how certain IOAM- + Data-Fields are interpreted; IOAM offers three different timestamp + format options. The Namespace-ID can be used to determine the + timestamp format. IOAM-Data-Fields (e.g., buffer occupancy) that + do not have a unit associated are to be interpreted within the + context of an IOAM-Namespace. + + * IOAM-Namespaces can be used to identify different sets of devices + (e.g., different types of devices) in a deployment; if an operator + wants to insert different IOAM-Data-Fields based on the device, + the devices could be grouped into multiple IOAM-Namespaces. This + could be due to the fact that the IOAM feature set differs between + different sets of devices, or it could be for reasons of optimized + space usage in the packet header. It could also stem from + hardware or operational limitations on the size of the trace data + that can be added and processed, preventing collection of a full + trace for a flow. + + * By assigning different IOAM Namespace-IDs to different sets of + nodes or network partitions and using a separate instance of an + IOAM-Option-Type for each Namespace-ID, a full trace for a flow + could be collected and constructed via partial traces from each + IOAM-Option-Type in each of the packets in the flow. For example, + an operator could choose to group the devices of a domain into two + IOAM-Namespaces in a way that each IOAM-Namespace is represented + by one of two IOAM-Option-Types in the packet. Each node would + record data only for the IOAM-Namespace that it belongs to, + ignoring the other IOAM-Option-Type with an IOAM-Namespace to + which it doesn't belong. To retrieve a full view of the + deployment, the captured IOAM-Data-Fields of the two IOAM- + Namespaces need to be correlated. + +4.4. IOAM Trace Option-Types + + In a typical deployment, all nodes in an IOAM-Domain would + participate in IOAM; thus, they would be IOAM transit nodes, IOAM + encapsulating nodes, or IOAM decapsulating nodes. If not all nodes + within a domain support IOAM functionality as defined in this + document, IOAM tracing information (i.e., node data, see below) can + only be collected on those nodes that support IOAM functionality as + defined in this document. Nodes that do not support IOAM + functionality as defined in this document will forward the packet + without any changes to the IOAM-Data-Fields. The maximum number of + hops and the minimum PMTU of the IOAM-Domain is assumed to be known. + An overflow indicator (O-bit) is defined as one of the ways to deal + with situations where the PMTU was underestimated, i.e., where the + number of hops that are IOAM capable exceeds the available space in + the packet. + + To optimize hardware and software implementations, IOAM tracing is + defined as two separate options. A deployment can choose to + configure and support one or both of the following options. + + Pre-allocated Trace-Option: + This trace option is defined as a container of node data fields + (see below) with pre-allocated space for each node to populate its + information. This option is useful for implementations where it + is efficient to allocate the space once and index into the array + to populate the data during transit (e.g., software forwarders + often fall into this class). The IOAM encapsulating node + allocates space for the Pre-allocated Trace Option-Type in the + packet and sets corresponding fields in this IOAM-Option-Type. + The IOAM encapsulating node allocates an array that is used to + store operational data retrieved from every node while the packet + traverses the domain. IOAM transit nodes update the content of + the array and possibly update the checksums of outer headers. A + pointer that is part of the IOAM trace data points to the next + empty slot in the array. An IOAM transit node that updates the + content of the Pre-allocated Trace-Option also updates the value + of the pointer, which specifies where the next IOAM transit node + fills in its data. The "node data list" array (see below) in the + packet is populated iteratively as the packet traverses the + network, starting with the last entry of the array, i.e., "node + data list [n]" is the first entry to be populated, "node data list + [n-1]" is the second one, etc. + + Incremental Trace-Option: + This trace option is defined as a container of node data fields, + where each node allocates and pushes its node data immediately + following the option header. This type of trace recording is + useful for some of the hardware implementations, as it eliminates + the need for the transit network elements to read the full array + in the option and allows for as arbitrarily long packets as the + MTU allows. The IOAM encapsulating node allocates space for the + Incremental Trace Option-Type. Based on the operational state and + configuration, the IOAM encapsulating node sets the fields in the + Option-Type that control what IOAM-Data-Fields have to be + collected and how large the node data list can grow. IOAM transit + nodes push their node data to the node data list subject to any + protocol constraints of the encapsulating layer. They then + decrease the remaining length available to subsequent nodes and + adjust the lengths and possibly checksums in outer headers. + + IOAM encapsulating nodes and IOAM decapsulating nodes that support + tracing MUST support both Trace Option-Types. For IOAM transit + nodes, it is sufficient to support one of the Trace Option-Types. In + the event that both options are utilized in a deployment at the same + time, the Incremental Trace-Option MUST be placed before the Pre- + allocated Trace-Option. Deployments that mix devices with either the + Incremental Trace-Option or the Pre-allocated Trace-Option could + result in both Option-Types being present in a packet. Given that + the operator knows which equipment is deployed in a particular IOAM- + Domain, the operator will decide by means of configuration which + type(s) of trace options will be used for a particular domain. + + Every node data entry holds information for a particular IOAM transit + node that is traversed by a packet. The IOAM decapsulating node + removes the IOAM-Option-Types and processes and/or exports the + associated data. Like all IOAM-Data-Fields, the IOAM-Data-Fields of + the IOAM Trace Option-Types are defined in the context of an IOAM- + Namespace. + + IOAM tracing can collect the following types of information: + + * Identification of the IOAM node. An IOAM node identifier can + match to a device identifier or a particular control point or + subsystem within a device. + + * Identification of the interface that a packet was received on, + i.e., ingress interface. + + * Identification of the interface that a packet was sent out on, + i.e., egress interface. + + * Time of day when the packet was processed by the node, as well as + the transit delay. Different definitions of processing time are + feasible and expected, though it is important that all devices of + an IOAM-Domain follow the same definition. + + * Generic data, i.e., format-free information where syntax and + semantics of the information is defined by the operator in a + specific deployment. For a specific IOAM-Namespace, all IOAM + nodes have to interpret the generic data the same way. Examples + for generic IOAM data include geolocation information (location of + the node at the time the packet was processed), buffer queue fill + level or cache fill level at the time the packet was processed, or + even a battery-charge level. + + * Information to detect whether IOAM trace data was added at every + hop or whether certain hops in the domain weren't IOAM transit + nodes. + + It should be noted that the semantics of some of the node data fields + that are defined below, such as the queue depth and buffer occupancy, + are implementation specific. This approach is intended to allow IOAM + nodes with various different architectures. + +4.4.1. Pre-allocated and Incremental Trace Option-Types + + The IOAM Pre-allocated Trace-Option and the IOAM Incremental Trace- + Option have similar formats. Except where noted below, the internal + formats and fields of the two trace options are identical. Both + trace options consist of a fixed-size "trace option header" and a + variable data space to store gathered data, i.e., the "node data + list". An IOAM transit node (that is, not an IOAM encapsulating node + or IOAM decapsulating node) MUST NOT modify any of the fields in the + fixed-size "trace option header", other than Flags" and + "RemainingLen", i.e., an IOAM transit node MUST NOT modify the + Namespace-ID, NodeLen, IOAM Trace-Type, or Reserved fields. + + The Pre-allocated and Incremental Trace-Option headers: + + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Namespace-ID |NodeLen | Flags | RemainingLen| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | IOAM Trace-Type | Reserved | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + The trace option data MUST be aligned by 4 octets: + + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ + | | | + | node data list [0] | | + | | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ D + | | a + | node data list [1] | t + | | a + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + ~ ... ~ S + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ p + | | a + | node data list [n-1] | c + | | e + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | + | | | + | node data list [n] | | + | | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ + + Namespace-ID: + 16-bit identifier of an IOAM-Namespace. The Namespace-ID value of + 0x0000 is defined as the "Default-Namespace-ID" (see Section 4.3) + and MUST be known to all the nodes implementing IOAM. For any + other Namespace-ID value that does not match any Namespace-ID the + node is configured to operate on, the node MUST NOT change the + contents of the IOAM-Data-Fields. + + NodeLen: + 5-bit unsigned integer. This field specifies the length of data + added by each node in multiples of 4 octets, excluding the length + of the "Opaque State Snapshot" field. + + If IOAM Trace-Type Bit 22 is not set, then NodeLen specifies the + actual length added by each node. If IOAM Trace-Type Bit 22 is + set, then the actual length added by a node would be (NodeLen + + length of the "Opaque State Snapshot" field) in 4-octet units. + + For example, if 3 IOAM Trace-Type bits are set and none of them + are in wide format, then NodeLen would be 3. If 3 IOAM Trace-Type + bits are set and 2 of them are wide, then NodeLen would be 5. + + An IOAM encapsulating node MUST set NodeLen. + + A node receiving an IOAM Pre-allocated or Incremental Trace-Option + relies on the NodeLen value. + + Flags: + 4-bit field. Flags are allocated by IANA, as specified in + Section 7.3. This document allocates a single flag as follows: + + Bit 0: + "Overflow" (O-bit) (most significant bit). In case a network + element is supposed to add node data to a packet but detects + that there are not enough octets left to record the node data, + the network element MUST NOT add any fields and MUST set the + overflow "O-bit" to "1" in the IOAM Trace-Option header. This + is useful for transit nodes to ignore further processing of the + option. + + RemainingLen: + 7-bit unsigned integer. This field specifies the data space in + multiples of 4 octets remaining for recording the node data before + the node data list is considered to have overflowed. The sender + MUST assign the initial value of the RemainingLen field. The + sender MAY calculate the value of the RemainingLen field by + computing the number of node data bytes allowed before exceeding + the PMTU, given that the PMTU is known to the sender. Subsequent + nodes can carry out a simple comparison between RemainingLen and + NodeLen, along with the length of the "Opaque State Snapshot", if + applicable, to determine whether or not data can be added by this + node. When node data is added, the node MUST decrease + RemainingLen by the amount of data added. In the Pre-allocated + Trace-Option, RemainingLen is used to derive the offset in data + space to record the node data element. Specifically, the + recording of the node data element would start from RemainingLen - + NodeLen - size of (opaque snapshot) in 4-octet units. If + RemainingLen in a Pre-allocated Trace-Option exceeds the length of + the option, as specified in the lower-layer header (which is not + within the scope of this document), then the node MUST NOT add any + fields. + + IOAM Trace-Type: + 24-bit identifier that specifies which data types are used in this + node data list. + + The IOAM Trace-Type value is a bit field. The following bits are + defined in this document, with details on each bit described in + Section 4.4.2. The order of packing the data fields in each node + data element follows the bit order of the IOAM Trace-Type field as + follows: + + Bit 0 Most significant bit. When set, indicates the presence + of Hop_Lim and node_id (short format) in the node data. + + Bit 1 When set, indicates the presence of ingress_if_id and + egress_if_id (short format) in the node data. + + Bit 2 When set, indicates the presence of timestamp seconds in + the node data. + + Bit 3 When set, indicates the presence of timestamp fraction + in the node data. + + Bit 4 When set, indicates the presence of transit delay in the + node data. + + Bit 5 When set, indicates the presence of IOAM-Namespace- + specific data in short format in the node data. + + Bit 6 When set, indicates the presence of queue depth in the + node data. + + Bit 7 When set, indicates the presence of the Checksum + Complement node data. + + Bit 8 When set, indicates the presence of Hop_Lim and node_id + in wide format in the node data. + + Bit 9 When set, indicates the presence of ingress_if_id and + egress_if_id in wide format in the node data. + + Bit 10 When set, indicates the presence of IOAM-Namespace- + specific data in wide format in the node data. + + Bit 11 When set, indicates the presence of buffer occupancy in + the node data. + + Bits 12-21 Undefined. These values are available for future + assignment in the IOAM Trace-Type Registry + (Section 7.2). Every future node data field + corresponding to one of these bits MUST be 4 octets + long. An IOAM encapsulating node MUST set the value of + each undefined bit to 0. If an IOAM transit node + receives a packet with one or more of these bits set to + 1, it MUST either: + + 1. add corresponding node data filled with the reserved + value 0xFFFFFFFF after the node data fields for the + IOAM Trace-Type bits defined above, such that the + total node data added by this node in units of 4 + octets is equal to NodeLen or + + 2. not add any node data fields to the packet, even for + the IOAM Trace-Type bits defined above. + + Bit 22 When set, indicates the presence of the variable-length + Opaque State Snapshot field. + + Bit 23 Reserved; MUST be set to zero upon transmission and be + ignored upon receipt. This bit is reserved to allow for + future extensions of the IOAM Trace-Type bit field. + + Section 4.4.2 describes the IOAM-Data-Types and their formats. + Within an IOAM-Domain, possible combinations of these bits making + the IOAM Trace-Type can be restricted by configuration knobs. + + Reserved: + 8 bits. An IOAM encapsulating node MUST set the value to zero + upon transmission. IOAM transit nodes MUST ignore the received + value. + + Node data List [n]: + Variable-length field. This is a list of node data elements where + the content of each node data element is determined by the IOAM + Trace-Type. The order of packing the data fields in each node + data element follows the bit order of the IOAM Trace-Type field. + Each node MUST prepend its node data element in front of the node + data elements that it received, such that the transmitted node + data list begins with this node's data element as the first + populated element in the list. The last node data element in this + list is the node data of the first IOAM-capable node in the path. + Populating the node data list in this way ensures that the order + of the node data list is the same for Incremental and Pre- + allocated Trace-Options. In the Pre-allocated Trace-Option, the + index contained in RemainingLen identifies the offset for current + active node data to be populated. + +4.4.2. IOAM Node Data Fields and Associated Formats + + All the IOAM-Data-Fields MUST be aligned by 4 octets. If a node that + is supposed to update an IOAM-Data-Field is not capable of populating + the value of a field set in the IOAM Trace-Type, the field value MUST + be set to 0xFFFFFFFF for 4-octet fields or 0xFFFFFFFFFFFFFFFF for + 8-octet fields, indicating that the value is not populated, except + when explicitly specified in the field description below. + + Some IOAM-Data-Fields defined below, such as interface identifiers or + IOAM-Namespace-specific data, are defined in both "short format" and + "wide format". The use of "short format" or "wide format" is not + mutually exclusive. A deployment could choose to leverage both. For + example, ingress_if_id_(short format) could be an identifier for the + physical interface, whereas ingress_if_id_(wide format) could be an + identifier for a logical sub-interface of that physical interface. + + Data fields and associated data types for each of the IOAM-Data- + Fields are specified in the following sections. The definition of + IOAM-Data-Fields focuses on the syntax of the data fields and avoids + specifying the semantics where feasible. This is why no units are + defined for data fields, e.g., like "buffer occupancy" or "queue + depth". With this approach, nodes can supply the information in + their original format and are not required to perform unit or format + conversions. Systems that further process IOAM information, e.g., + like a network management system, are assumed to also handle unit + conversions as part of their IOAM-Data-Fields processing. The + combination of a particular data field and the Namespace-ID provides + for the context to interpret the provided data appropriately. + +4.4.2.1. Hop_Lim and node_id Short + + The "Hop_Lim and node_id short" field is a 4-octet field that is + defined as follows: + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hop_Lim | node_id | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Hop_Lim: + 1-octet unsigned integer. It is set to the Hop Limit value in the + packet at egress from the node that records this data. Hop Limit + information is used to identify the location of the node in the + communication path. This is copied from the lower layer, e.g., + TTL value in IPv4 header or Hop Limit field from IPv6 header of + the packet when the packet is ready for transmission. The + semantics of the Hop_Lim field depend on the lower-layer protocol + that IOAM is encapsulated into; therefore, its specific semantics + are outside the scope of this memo. The value of this field MUST + be set to 0xff when the lower level does not have a field + equivalent to TTL / Hop Limit. + + node_id: + 3-octet unsigned integer. A node identifier field to uniquely + identify a node within the IOAM-Namespace and associated IOAM- + Domain. The procedure to allocate, manage, and map the node_ids + is beyond the scope of this document. See [IPPM-IOAM-DEPLOYMENT] + for a discussion of deployment-related aspects of the node_id. + +4.4.2.2. ingress_if_id and egress_if_id Short + + The "ingress_if_id and egress_if_id" field is a 4-octet field that is + defined as follows: + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | ingress_if_id | egress_if_id | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + ingress_if_id: + 2-octet unsigned integer. An interface identifier to record the + ingress interface the packet was received on. + + egress_if_id: + 2-octet unsigned integer. An interface identifier to record the + egress interface the packet is forwarded out of. + + Note that due to the fact that IOAM uses its own IOAM-Namespaces for + IOAM-Data-Fields, data fields, like interface identifiers, can be + used in a flexible way to represent system resources that are + associated with ingressing or egressing packets, i.e., ingress_if_id + could represent a physical interface, a virtual or logical interface, + or even a queue. + +4.4.2.3. Timestamp Seconds + + The "timestamp seconds" field is a 4-octet unsigned integer field. + It contains the absolute timestamp in seconds that specifies the time + at which the packet was received by the node. This field has three + possible formats, based on either the Precision Time Protocol (PTP) + (see e.g., [RFC8877]), NTP [RFC5905], or POSIX [POSIX]. The three + timestamp formats are specified in Section 5. In all three cases, + the timestamp seconds field contains the 32 most significant bits of + the timestamp format that is specified in Section 5. If a node is + not capable of populating this field, it assigns the value + 0xFFFFFFFF. Note that this is a legitimate value that is valid for 1 + second in approximately 136 years; the analyzer has to correlate + several packets or compare the timestamp value to its own time of day + in order to detect the error indication. + +4.4.2.4. Timestamp Fraction + + The "timestamp fraction" field is a 4-octet unsigned integer field. + Fraction specifies the fractional portion of the number of seconds + since the NTP epoch [RFC8877]. The field specifies the time at which + the packet was received by the node. This field has three possible + formats, based on either PTP (see e.g., [RFC8877]), NTP [RFC5905], or + POSIX [POSIX]. The three timestamp formats are specified in + Section 5. In all three cases, the timestamp fraction field contains + the 32 least significant bits of the timestamp format that is + specified in Section 5. If a node is not capable of populating this + field, it assigns the value 0xFFFFFFFF. Note that this is a + legitimate value in the NTP format, valid for approximately 233 + picoseconds in every second. If the NTP format is used, the analyzer + has to correlate several packets in order to detect the error + indication. + +4.4.2.5. Transit Delay + + The "transit delay" field is a 4-octet unsigned integer in the range + 0 to 2^31-1. It is the time in nanoseconds the packet spent in the + transit node. This can serve as an indication of the queuing delay + at the node. If the transit delay exceeds 2^31-1 nanoseconds, then + the top bit 'O' is set to indicate overflow and value set to + 0x80000000. When this field is part of the data field but a node + populating the field is not able to fill it, the field position in + the field MUST be filled with value 0xFFFFFFFF to mean not populated. + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |O| transit delay | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +4.4.2.6. Namespace-Specific Data + + The "namespace-specific data" field is a 4-octet field that can be + used by the node to add IOAM-Namespace-specific data. This + represents a "free-format" 4-octet bit field with its semantics + defined in the context of a specific IOAM-Namespace. + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | namespace-specific data | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +4.4.2.7. Queue Depth + + The "queue depth" field is a 4-octet unsigned integer field. This + field indicates the current length of the egress interface queue of + the interface from where the packet is forwarded out. The queue + depth is expressed as the current amount of memory buffers used by + the queue (a packet could consume one or more memory buffers, + depending on its size). + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | queue depth | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +4.4.2.8. Checksum Complement + + The "Checksum Complement" field is a 4-octet node data that contains + the Checksum Complement value. The Checksum Complement is useful + when IOAM is transported over encapsulations that make use of a UDP + transport, such as VXLAN-GPE or Geneve. Without the Checksum + Complement, nodes adding IOAM node data update the UDP Checksum field + following the recommendation of the encapsulation protocols. When + the Checksum Complement is present, an IOAM encapsulating node or + IOAM transit node adding node data MUST carry out one of the + following two alternatives in order to maintain the correctness of + the UDP Checksum value: + + 1. recompute the UDP Checksum field or + + 2. use the Checksum Complement to make a checksum-neutral update in + the UDP payload; the Checksum Complement is assigned a value that + complements the rest of the node data fields that were added by + the current node, causing the existing UDP Checksum field to + remain correct. + + IOAM decapsulating nodes MUST recompute the UDP Checksum field, since + they do not know whether previous hops modified the UDP Checksum + field or the Checksum Complement field. + + Checksum Complement fields are used in a similar manner in [RFC7820] + and [RFC7821]. + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Checksum Complement | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +4.4.2.9. Hop_Lim and node_id Wide + + The "Hop_Lim and node_id wide" field is an 8-octet field defined as + follows: + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hop_Lim | node_id ~ + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + ~ node_id (contd) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Hop_Lim: + 1-octet unsigned integer. See Section 4.4.2.1 for the definition + of the field. + + node_id: + 7-octet unsigned integer. It is a node identifier field to + uniquely identify a node within the IOAM-Namespace and associated + IOAM-Domain. The procedure to allocate, manage, and map the + node_ids is beyond the scope of this document. + +4.4.2.10. ingress_if_id and egress_if_id Wide + + The "ingress_if_id and egress_if_id wide" field is an 8-octet field, + which is defined as follows: + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | ingress_if_id | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | egress_if_id | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + ingress_if_id: + 4-octet unsigned integer. It is an interface identifier to record + the ingress interface the packet was received on. + + egress_if_id: + 4-octet unsigned integer. It is an interface identifier to record + the egress interface the packet is forwarded out of. + +4.4.2.11. Namespace-Specific Data Wide + + The "namespace-specific data wide" field is an 8-octet field that can + be used by the node to add IOAM-Namespace-specific data. This + represents a "free-format" 8-octet bit field with its semantics + defined in the context of a specific IOAM-Namespace. + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | namespace-specific data ~ + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + ~ namespace-specific data (contd) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +4.4.2.12. Buffer Occupancy + + The "buffer occupancy" field is a 4-octet unsigned integer field. + This field indicates the current status of the occupancy of the + common buffer pool used by a set of queues. The units of this field + are implementation specific. Hence, the units are interpreted within + the context of an IOAM-Namespace and/or node identifier if used. The + authors acknowledge that, in some operational cases, there is a need + for the units to be consistent across a packet path through the + network; hence, it is recommended for implementations to use standard + units, such as bytes. + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | buffer occupancy | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +4.4.2.13. Opaque State Snapshot + + The "Opaque State Snapshot" field is a variable-length field and + follows the fixed-length IOAM-Data-Fields defined above. It allows + the network element to store an arbitrary state in the node data + field without a predefined schema. The schema is to be defined + within the context of an IOAM-Namespace. The schema needs to be made + known to the analyzer by some out-of-band mechanism. The + specification of this mechanism is beyond the scope of this document. + A 24-bit "Schema ID" field, interpreted within the context of an + IOAM-Namespace, indicates which particular schema is used and has to + be configured on the network element by the operator. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Length | Schema ID | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | | + | | + | Opaque data | + ~ ~ + . . + . . + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Length: + 1-octet unsigned integer. It is the length in multiples of 4 + octets of the Opaque data field that follows Schema ID. + + Schema ID: + 3-octet unsigned integer identifying the schema of Opaque data. + + Opaque data: + Variable-length field. This field is interpreted as specified by + the schema identified by the Schema ID. + + When this field is part of the data field, but a node populating the + field has no opaque state data to report, the Length MUST be set to 0 + and the Schema ID MUST be set to 0xFFFFFF to mean no schema. + +4.4.3. Examples of IOAM Node Data + + The format used for the entries in a packet's "node data list" array + can vary from packet to packet and deployment to deployment. Some + deployments might only be interested in recording the node + identifiers, whereas others might be interested in recording node + identifiers and timestamps. This section provides example entries of + the "node data list" array. + + 0xD40000: If the IOAM Trace-Type is 0xD40000 + (0b110101000000000000000000), then the format of node data is: + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hop_Lim | node_id | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | ingress_if_id | egress_if_id | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | timestamp fraction | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | namespace-specific data | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + 0xC00000: If the IOAM Trace-Type is 0xC00000 + (0b110000000000000000000000), then the format is: + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hop_Lim | node_id | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | ingress_if_id | egress_if_id | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + 0x900000: If the IOAM Trace-Type is 0x900000 + (0b100100000000000000000000), then the format is: + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hop_Lim | node_id | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | timestamp fraction | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + 0x840000: If the IOAM Trace-Type is 0x840000 + (0b100001000000000000000000), then the format is: + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hop_Lim | node_id | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | namespace-specific data | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + 0x940000: If the IOAM Trace-Type is 0x940000 + (0b100101000000000000000000), then the format is: + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hop_Lim | node_id | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | timestamp fraction | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | namespace-specific data | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + 0x308002: If the IOAM Trace-Type is 0x308002 + (0b001100001000000000000010), then the format is: + + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | timestamp seconds | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | timestamp fraction | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hop_Lim | node_id | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | node_id(contd) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Length | Schema ID | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | | + | | + | Opaque data | + ~ ~ + . . + . . + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +4.5. IOAM Proof of Transit Option-Type + + The IOAM Proof of Transit Option-Type is used to support path or + service function chain [RFC7665] verification use cases, i.e., prove + that traffic transited a defined path. While the details on how the + IOAM data for the Proof of Transit Option-Type is processed at IOAM + encapsulating, decapsulating, and transit nodes are outside the scope + of the document, Proof of Transit approaches share the need to + uniquely identify a packet, as well as iteratively operate on a set + of information that is handed from node to node. Correspondingly, + two pieces of information are added as IOAM-Data-Fields to the + packet: + + PktID: + unique identifier for the packet + + Cumulative: + information that is handed from node to node and updated by every + node according to a verification algorithm + + The IOAM Proof of Transit Option-Type consist of a fixed-size "IOAM + Proof of Transit Option header" and "IOAM Proof of Transit Option + data fields": + + IOAM Proof of Transit Option header: + + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Namespace-ID |IOAM POT-Type | IOAM POT flags| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + IOAM Proof of Transit Option-Type IOAM-Data-Fields MUST be aligned by + 4 octets: + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | POT Option data field determined by IOAM POT-Type | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Namespace-ID: + 16-bit identifier of an IOAM-Namespace. The Namespace-ID value of + 0x0000 is defined as the "Default-Namespace-ID" (see Section 4.3) + and MUST be known to all the nodes implementing IOAM. For any + other Namespace-ID value that does not match any Namespace-ID the + node is configured to operate on, the node MUST NOT change the + contents of the IOAM-Data-Fields. + + IOAM POT-Type: + 8-bit identifier of a particular POT variant that specifies the + POT data that is included. This document defines IOAM POT-Type 0: + + 0: POT data is a 16-octet field to carry data associated to POT + procedures. + + If a node receives an IOAM POT-Type value that it does not + understand, the node MUST NOT change, add to, or remove the + contents of the IOAM-Data-Fields. + + IOAM POT flags: + 8 bits. This document does not define any flags. Bits 0-7 are + available for assignment (see Section 7.5). Bits that have not + been assigned MUST be set to zero upon transmission and be ignored + upon receipt. + + POT Option data: + Variable-length field. The type of which is determined by the + IOAM POT-Type. + +4.5.1. IOAM Proof of Transit Type 0 + + IOAM Proof of Transit Option of IOAM POT-Type 0: + + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Namespace-ID |IOAM POT-Type=0|R R R R R R R R| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ + | PktID | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ P + | PktID (contd) | O + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ T + | Cumulative | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | + | Cumulative (contd) | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+<-+ + + Namespace-ID: + 16-bit identifier of an IOAM-Namespace (see Section 4.3 above). + + IOAM POT-Type: + 8-bit identifier of a particular POT variant that specifies the + POT data that is included (see Section 4.5 above). For this case + here, IOAM POT-Type is set to the value 0. + + Bit 0-7: + Undefined (see Section 4.5 above). + + PktID: + 64-bit packet identifier. + + Cumulative: + 64-bit Cumulative that is updated at specific nodes by processing + per packet PktID field and configured parameters. + + | Note: Larger or smaller sizes of "PktID" and "Cumulative" data + | are feasible and could be required for certain deployments, + | e.g., in case of space constraints in the encapsulation + | protocols used. Future documents could introduce different + | sizes of data for "Proof of Transit". + +4.6. IOAM Edge-to-Edge Option-Type + + The IOAM Edge-to-Edge Option-Type carries data that is added by the + IOAM encapsulating node and interpreted by the IOAM decapsulating + node. The IOAM transit nodes MAY process the data but MUST NOT + modify it. + + The IOAM Edge-to-Edge Option-Type consist of a fixed-size "IOAM Edge- + to-Edge Option-Type header" and "IOAM Edge-to-Edge Option-Type data + fields": + + IOAM Edge-to-Edge Option-Type header: + + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Namespace-ID | IOAM E2E-Type | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + The IOAM Edge-to-Edge Option-Type IOAM-Data-Fields MUST be aligned by + 4 octets: + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | E2E Option data field determined by IOAM-E2E-Type | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Namespace-ID: + 16-bit identifier of an IOAM-Namespace. The Namespace-ID value of + 0x0000 is defined as the "Default-Namespace-ID" (see Section 4.3) + and MUST be known to all the nodes implementing IOAM. For any + other Namespace-ID value that does not match any Namespace-ID the + node is configured to operate on, the node MUST NOT change the + contents of the IOAM-Data-Fields. + + IOAM-E2E-Type: + 16-bit identifier that specifies which data types are used in the + E2E Option data. The IOAM-E2E-Type value is a bit field. The + order of packing the E2E Option data field elements follows the + bit order of the IOAM E2E-Type field as follows: + + Bit 0 Most significant bit. When set, it indicates the + presence of a 64-bit sequence number added to a specific + "packet group" that is used to detect packet loss, packet + reordering, or packet duplication within the group. The + "packet group" is deployment dependent and defined at the + IOAM encapsulating node, e.g., by n-tuple-based + classification of packets. When this bit is set, "Bit 1" + (for a 32-bit sequence number, see below) MUST be zero. + + Bit 1 When set, it indicates the presence of a 32-bit sequence + number added to a specific "packet group" that is used to + detect packet loss, packet reordering, or packet + duplication within that group. The "packet group" is + deployment dependent and defined at the IOAM + encapsulating node, e.g., by n-tuple-based classification + of packets. When this bit is set, "Bit 0" (for a 64-bit + sequence number, see above) MUST be zero. + + Bit 2 When set, it indicates the presence of timestamp seconds, + representing the time at which the packet entered the + IOAM-Domain. Within the IOAM encapsulating node, the + time that the timestamp is retrieved can depend on the + implementation. Some possibilities are 1) the time at + which the packet was received by the node, 2) the time at + which the packet was transmitted by the node, or 3) when + a tunnel encapsulation is used, the point at which the + packet is encapsulated into the tunnel. Each + implementation has to document when the E2E timestamp + that is going to be put in the packet is retrieved. This + 4-octet field has three possible formats, based on either + PTP (see e.g., [RFC8877]), NTP [RFC5905], or POSIX + [POSIX]. The three timestamp formats are specified in + Section 5. In all three cases, the timestamp seconds + field contains the 32 most significant bits of the + timestamp format that is specified in Section 5. If a + node is not capable of populating this field, it assigns + the value 0xFFFFFFFF. Note that this is a legitimate + value that is valid for 1 second in approximately 136 + years; the analyzer has to correlate several packets or + compare the timestamp value to its own time of day in + order to detect the error indication. + + Bit 3 When set, it indicates the presence of timestamp + fraction, representing the time at which the packet + entered the IOAM-Domain. This 4-octet field has three + possible formats, based on either PTP (see e.g., + [RFC8877]), NTP [RFC5905], or POSIX [POSIX]. The three + timestamp formats are specified in Section 5. In all + three cases, the timestamp fraction field contains the 32 + least significant bits of the timestamp format that is + specified in Section 5. If a node is not capable of + populating this field, it assigns the value 0xFFFFFFFF. + Note that this is a legitimate value in the NTP format, + valid for approximately 233 picoseconds in every second. + If the NTP format is used, the analyzer has to correlate + several packets in order to detect the error indication. + + Bit 4-15 Undefined. An IOAM encapsulating node MUST set the + value of these bits to zero upon transmission and ignore + them upon receipt. + + E2E Option data: + Variable-length field. The type of which is determined by the + IOAM E2E-Type. + +5. Timestamp Formats + + The IOAM-Data-Fields include a timestamp field that is represented in + one of three possible timestamp formats. It is assumed that the + management plane is responsible for determining which timestamp + format is used. + +5.1. PTP Truncated Timestamp Format + + The Precision Time Protocol (PTP) uses an 80-bit timestamp format. + The truncated timestamp format is a 64-bit field, which is the 64 + least significant bits of the 80-bit PTP timestamp. The PTP + truncated format is specified in Section 4.3 of [RFC8877], and the + details are presented below for the sake of completeness. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Seconds | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Nanoseconds | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Timestamp field format: + Seconds: Specifies the integer portion of the number of seconds + since the PTP epoch + + Size: 32 bits + + Units: seconds + + Nanoseconds: Specifies the fractional portion of the number of + seconds since the PTP epoch + + Size: 32 bits + + Units: nanoseconds. The value of this field is in the range 0 + to (10^9)-1. + + Epoch: + PTP epoch. For details, see e.g., [RFC8877]. + + Resolution: + The resolution is 1 nanosecond. + + Wraparound: + This time format wraps around every 2^32 seconds, which is roughly + 136 years. The next wraparound will occur in the year 2106. + + Synchronization Aspects: + It is assumed that the nodes that run this protocol are + synchronized among themselves. Nodes MAY be synchronized to a + global reference time. Note that if PTP is used for + synchronization, the timestamp MAY be derived from the PTP- + synchronized clock, allowing the timestamp to be measured with + respect to the clock of a PTP Grandmaster clock. + +5.2. NTP 64-Bit Timestamp Format + + The Network Time Protocol (NTP) [RFC5905] timestamp format is 64 bits + long. This specification uses the NTP timestamp format that is + specified in Section 4.2.1 of [RFC8877], and the details are + presented below for the sake of completeness. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Seconds | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Fraction | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Timestamp field format: + Seconds: specifies the integer portion of the number of seconds + since the NTP epoch + + Size: 32 bits + + Units: seconds + + Fraction: specifies the fractional portion of the number of + seconds since the NTP epoch + + Size: 32 bits + + Units: the unit is 2^(-32) seconds, which is roughly equal to + 233 picoseconds. + + Epoch: + NTP epoch. For details, see [RFC5905]. + + Resolution: + The resolution is 2^(-32) seconds. + + Wraparound: + This time format wraps around every 2^32 seconds, which is roughly + 136 years. The next wraparound will occur in the year 2036. + + Synchronization Aspects: + Nodes that use this timestamp format will typically be + synchronized to UTC using NTP [RFC5905]. Thus, the timestamp MAY + be derived from the NTP-synchronized clock, allowing the timestamp + to be measured with respect to the clock of an NTP server. + +5.3. POSIX-Based Timestamp Format + + This timestamp format is based on the POSIX time format [POSIX]. The + detailed specification of the timestamp format used in this document + is presented below. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Seconds | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Microseconds | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Timestamp field format: + Seconds: specifies the integer portion of the number of seconds + since the POSIX epoch + + Size: 32 bits + + Units: seconds + + Microseconds: specifies the fractional portion of the number of + seconds since the POSIX epoch + + Size: 32 bits + + Units: the unit is microseconds. The value of this field is + in the range 0 to (10^6)-1. + + Epoch: + POSIX epoch. For details, see [POSIX], Appendix A.4.16. + + Resolution: + The resolution is 1 microsecond. + + Wraparound: + This time format wraps around every 2^32 seconds, which is roughly + 136 years. The next wraparound will occur in the year 2106. + + Synchronization Aspects: + It is assumed that nodes that use this timestamp format run the + Linux operating system and hence use the POSIX time. In some + cases, nodes MAY be synchronized to UTC using a synchronization + mechanism that is outside the scope of this document, such as NTP + [RFC5905]. Thus, the timestamp MAY be derived from the NTP- + synchronized clock, allowing the timestamp to be measured with + respect to the clock of an NTP server. + +6. IOAM Data Export + + IOAM nodes collect information for packets traversing a domain that + supports IOAM. IOAM decapsulating nodes, as well as IOAM transit + nodes, can choose to retrieve IOAM information from the packet, + process the information further, and export the information using + e.g., IP Flow Information Export (IPFIX). The mechanisms and + associated data formats for exporting IOAM data are outside the scope + of this document. + + A way to perform raw data export of IOAM data using IPFIX is + discussed in [IPPM-IOAM-RAWEXPORT]. + +7. IANA Considerations + + IANA has defined a registry group named "In Situ OAM (IOAM)". + + This group includes the following registries: + + IOAM Option-Type + + IOAM Trace-Type + + IOAM Trace-Flags + + IOAM POT-Type + + IOAM POT-Flags + + IOAM E2E-Type + + IOAM Namespace-ID + + The subsequent subsections detail the registries therein contained. + +7.1. IOAM Option-Type Registry + + This registry defines 128 code points for the IOAM Option-Type field + for identifying IOAM-Option-Types, as explained in Section 4. The + following code points are defined in this document: + + 0: IOAM Pre-allocated Trace Option-Type + + 1: IOAM Incremental Trace Option-Type + + 2: IOAM POT Option-Type + + 3: IOAM E2E Option-Type + + Code points 4-127 are available for assignment via the "IETF Review" + process, as per [RFC8126]. + + New registration requests MUST use the following template: + + Name: name of the newly registered Option-Type + + Code point: desired value of the requested code point + + Description: brief description of the newly registered Option-Type + + Reference: reference to the document that defines the new Option- + Type + + The evaluation of a new registration request MUST also include + checking whether the new IOAM-Option-Type includes an IOAM-Namespace + field and that the IOAM-Namespace field is the first field in the + newly defined header of the new Option-Type. + +7.2. IOAM Trace-Type Registry + + This registry defines code points for each bit in the 24-bit IOAM + Trace-Type field for the Pre-allocated Trace Option-Type and + Incremental Trace Option-Type defined in Section 4.4. Bits 0-11 are + defined in this document in Paragraph 5 of Section 4.4.1: + + Bit 0: hop_Lim and node_id in short format + + Bit 1: ingress_if_id and egress_if_id in short format + + Bit 2: timestamp seconds + + Bit 3: timestamp fraction + + Bit 4: transit delay + + Bit 5: namespace-specific data in short format + + Bit 6: queue depth + + Bit 7: checksum complement + + Bit 8: hop_Lim and node_id in wide format + + Bit 9: ingress_if_id and egress_if_id in wide format + + Bit 10: namespace-specific data in wide format + + Bit 11: buffer occupancy + + Bit 22: variable-length Opaque State Snapshot + + Bit 23: reserved + + Bits 12-21 are available for assignment via the "IETF Review" + process, as per [RFC8126]. + + New registration requests MUST use the following template: + + Bit: desired bit to be allocated in the 24-bit IOAM Trace Option- + Type field for the Pre-allocated Trace Option-Type and Incremental + Trace Option-Type + + Description: brief description of the newly registered bit + + Reference: reference to the document that defines the new bit + +7.3. IOAM Trace-Flags Registry + + This registry defines code points for each bit in the 4-bit flags for + the Pre-allocated Trace-Option and Incremental Trace-Option defined + in Section 4.4. The meaning of Bit 0 (the most significant bit) for + trace flags is defined in this document in Paragraph 3 of + Section 4.4.1: + + Bit 0: "Overflow" (O-bit) + + Bits 1-3 are available for assignment via the "IETF Review" process, + as per [RFC8126]. + + New registration requests MUST use the following template: + + Bit: desired bit to be allocated in the 8-bit flags field of the + Pre-allocated Trace Option-Type and Incremental Trace Option-Type + + Description: brief description of the newly registered bit + + Reference: reference to the document that defines the new bit + +7.4. IOAM POT-Type Registry + + This registry defines 256 code points to define the IOAM POT-Type for + the IOAM Proof of Transit Option (Section 4.5). The code point value + 0 is defined in this document: + + 0: 16-Octet POT data + + Code points 1-255 are available for assignment via the "IETF Review" + process, as per [RFC8126]. + + New registration requests MUST use the following template: + + Name: name of the newly registered POT-Type + + Code point: desired value of the requested code point + + Description: brief description of the newly registered POT-Type + + Reference: reference to the document that defines the new POT-Type + +7.5. IOAM POT-Flags Registry + + This registry defines code points for each bit in the 8-bit flags for + the IOAM POT Option-Type defined in Section 4.5. + + Bits 0-7 are available for assignment via the "IETF Review" process, + as per [RFC8126]. + + New registration requests MUST use the following template: + + Bit: desired bit to be allocated in the 8-bit flags field of the + IOAM POT Option-Type + + Description: brief description of the newly registered bit + + Reference: reference to the document that defines the new bit + +7.6. IOAM E2E-Type Registry + + This registry defines code points for each bit in the 16-bit IOAM + E2E-Type field for the IOAM E2E Option (Section 4.6). Bits 0-3 are + defined in this document: + + Bit 0: 64-bit sequence number + + Bit 1: 32-bit sequence number + + Bit 2: timestamp seconds + + Bit 3: timestamp fraction + + Bits 4-15 are available for assignment via the "IETF Review" process, + as per [RFC8126]. + + New registration requests MUST use the following template: + + Bit: desired bit to be allocated in the 16-bit IOAM E2E-Type field + + Description: brief description of the newly registered bit + + Reference: reference to the document that defines the new bit + +7.7. IOAM Namespace-ID Registry + + IANA has set up the "IOAM Namespace-ID" registry that contains 16-bit + values and follows the template for requests shown below. The + meaning of 0x0000 is defined in this document. IANA has reserved the + values 0x0001 to 0x7FFF for private use (managed by operators), as + specified in Section 4.3 of this document. Registry entries for the + values 0x8000 to 0xFFFF are to be assigned via the "Expert Review" + policy, as per [RFC8126]. + + Upon receiving a new allocation request, a designated expert will + perform the following: + + * Review whether the request is complete, i.e., the registration + template has been filled in. The expert will send incomplete + requests back to the requester. + + * Check whether the request is neither a duplicate of nor + conflicting with either an already existing allocation or a + pending allocation. In case of duplicates or conflicts, the + expert will ask the requester to update the allocation request + accordingly. + + * Solicit feedback from relevant working groups and communities to + ensure that the new allocation request has been properly reviewed + and that rough consensus on the request exists. At a minimum, the + expert will solicit feedback from the IPPM Working Group by + posting the request to the ippm@ietf.org mailing list. The expert + will allow for a 3-week review period on the mailing lists. If + the feedback received from the relevant working groups and + communities within the review period indicates rough consensus on + the request, the expert will approve the request and ask IANA to + allocate the new Namespace-ID. In case the expert senses a lack + of consensus from the feedback received, the expert will ask the + requester to engage with the corresponding working groups and + communities to further review and refine the request. + + It is intended that any allocation will be accompanied by a published + RFC. In order to allow for the allocation of code points prior to + the RFC being approved for publication, the designated expert can + approve allocations once it seems clear that an RFC will be + published. + + 0x0000: default namespace (known to all IOAM nodes) + + 0x0001 - 0x7FFF: reserved for private use + + 0x8000 - 0xFFFF: unassigned + + New registration requests MUST use the following template: + + Name: name of the newly registered Namespace-ID + + Code point: desired value of the requested Namespace-ID + + Description: brief description of the newly registered Namespace-ID + + Reference: reference to the document that defines the new Namespace- + ID + + Status of the registration: Status can be either "permanent" or + "provisional". Namespace-ID registrations following a successful + expert review will have the status "provisional". Once the RFC + that defines the new Namespace-ID is published, the status is + changed to "permanent". + +8. Management and Deployment Considerations + + This document defines the structure and use of IOAM-Data-Fields. + This document does not define the encapsulation of IOAM-Data-Fields + into different protocols. Management and deployment aspects for IOAM + have to be considered within the context of the protocol IOAM-Data- + Fields are encapsulated into and, as such, are out of scope for this + document. For a discussion of IOAM deployment, please also refer to + [IPPM-IOAM-DEPLOYMENT], which outlines a framework for IOAM + deployment and provides best current practices. + +9. Security Considerations + + As discussed in [RFC7276], a successful attack on an OAM protocol in + general, and specifically on IOAM, can prevent the detection of + failures or anomalies or create a false illusion of nonexistent ones. + In particular, these threats are applicable by compromising the + integrity of IOAM data, either by maliciously modifying IOAM options + in transit or by injecting packets with maliciously generated IOAM + options. All nodes in the path of an IOAM-carrying packet can + perform such an attack. + + The Proof of Transit Option-Type (see Section 4.5) is used for + verifying the path of data packets, i.e., proving that packets + transited through a defined set of nodes. + + In case an attacker gains access to several nodes in a network and + would be able to change the system software of these nodes, IOAM- + Data-Fields could be misused and repurposed for a use different from + what is specified in this document. One type of misuse is the + implementation of a covert channel between network nodes. + + From a confidentiality perspective, although IOAM options are not + expected to contain user data, they can be used for network + reconnaissance, allowing attackers to collect information about + network paths, performance, queue states, buffer occupancy, etc. + Moreover, if IOAM data leaks from the IOAM-Domain, it could enable + reconnaissance beyond the scope of the IOAM-Domain. One possible + application of such reconnaissance is to gauge the effectiveness of + an ongoing attack, e.g., if buffers and queues are overflowing. + + IOAM can be used as a means for implementing Denial-of-Service (DoS) + attacks or for amplifying them. For example, a malicious attacker + can add an IOAM header to packets in order to consume the resources + of network devices that take part in IOAM or entities that receive, + collect, or analyze the IOAM data. Another example is a packet + length attack in which an attacker pushes headers associated with + IOAM-Option-Types into data packets, causing these packets to be + increased beyond the MTU size, resulting in fragmentation or in + packet drops. In case POT is used, an attacker could corrupt the POT + data fields in the packet, resulting in a verification failure of the + POT data, even if the packet followed the correct path. + + Since IOAM options can include timestamps, if network devices use + synchronization protocols, then any attack on the time protocol + [RFC7384] can compromise the integrity of the timestamp-related data + fields. + + At the management plane, attacks can be set up by misconfiguring or + by maliciously configuring IOAM-enabled nodes in a way that enables + other attacks. IOAM configuration should only be managed by + authorized processes or users. + + IETF protocols require features to ensure their security. While + IOAM-Data-Fields don't represent a protocol by themselves, the IOAM- + Data-Fields add to the protocol that the IOAM-Data-Fields are + encapsulated into. Any specification that defines how IOAM-Data- + Fields carried in an encapsulating protocol MUST provide for a + mechanism for cryptographic integrity protection of the IOAM-Data- + Fields. Cryptographic integrity protection could be achieved through + a mechanism of the encapsulating protocol, or it could incorporate + the mechanisms specified in [IPPM-IOAM-DATA-INTEGRITY]. + + The current document does not define a specific IOAM encapsulation. + It has to be noted that some IOAM encapsulation types can introduce + specific security considerations. A specification that defines an + IOAM encapsulation is expected to address the respective + encapsulation-specific security considerations. + + Notably, IOAM is expected to be deployed in limited domains, thus + confining the potential attack vectors to within the limited domain. + A limited administrative domain provides the operator with the means + to select, monitor, and control the access of all the network + devices, making these devices trusted by the operator. Indeed, in + order to limit the scope of threats mentioned above to within the + current limited domain, the network operator is expected to enforce + policies that prevent IOAM traffic from leaking outside of the IOAM- + Domain and prevent IOAM data from outside the domain to be processed + and used within the domain. + + This document does not define the data contents of custom fields, + like "Opaque State Snapshot" and "namespace-specific data" IOAM-Data- + Fields. These custom data fields will have security considerations + corresponding to their defined data contents that need to be + described where those formats are defined. + + IOAM deployments that leverage both IOAM Trace Option-Types, i.e., + the Pre-allocated Trace Option-Type and Incremental Trace Option- + Type, can suffer from incomplete visibility if the information + gathered via the two Trace Option-Types is not correlated and + aggregated appropriately. If IOAM transit nodes leverage the IOAM- + Data-Fields in the packet for further actions or insights, then IOAM + transit nodes that only support one IOAM Trace Option-Type in an IOAM + deployment that leverages both Trace Option-Types have limited + visibility and thus can draw inappropriate conclusions or take wrong + actions. + + The security considerations of a system that deploys IOAM, much like + any system, has to be reviewed on a per-deployment-scenario basis + based on a systems-specific threat analysis, which can lead to + specific security solutions that are beyond the scope of the current + document. Specifically, in an IOAM deployment that is not confined + to a single LAN but spans multiple inter-connected sites (for + example, using an overlay network), the inter-site links can be + secured (e.g., by IPsec) in order to avoid external threats. + + IOAM deployment considerations, including approaches to mitigate the + above discussed threads and potential attacks, are outside the scope + of this document. IOAM deployment considerations are discussed in + [IPPM-IOAM-DEPLOYMENT]. + +10. References + +10.1. Normative References + + [POSIX] IEEE, "IEEE/Open Group 1003.1-2017 - IEEE Standard for + Information Technology--Portable Operating System + Interface (POSIX(TM)) Base Specifications, Issue 7", IEEE + Std 1003.1-2017, January 2018, + <https://standards.ieee.org/ieee/1003.1/7101/>. + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + <https://www.rfc-editor.org/info/rfc2119>. + + [RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch, + "Network Time Protocol Version 4: Protocol and Algorithms + Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010, + <https://www.rfc-editor.org/info/rfc5905>. + + [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for + Writing an IANA Considerations Section in RFCs", BCP 26, + RFC 8126, DOI 10.17487/RFC8126, June 2017, + <https://www.rfc-editor.org/info/rfc8126>. + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, <https://www.rfc-editor.org/info/rfc8174>. + +10.2. Informative References + + [IPPM-IOAM-DATA-INTEGRITY] + Brockners, F., Bhandari, S., Mizrahi, T., and J. Iurman, + "Integrity of In-situ OAM Data Fields", Work in Progress, + Internet-Draft, draft-ietf-ippm-ioam-data-integrity-01, 2 + March 2022, <https://datatracker.ietf.org/doc/html/draft- + ietf-ippm-ioam-data-integrity-01>. + + [IPPM-IOAM-DEPLOYMENT] + Brockners, F., Bhandari, S., Bernier, D., and T. Mizrahi, + "In-situ OAM Deployment", Work in Progress, Internet- + Draft, draft-ietf-ippm-ioam-deployment-01, 11 April 2022, + <https://datatracker.ietf.org/doc/html/draft-ietf-ippm- + ioam-deployment-01>. + + [IPPM-IOAM-RAWEXPORT] + Spiegel, M., Brockners, F., Bhandari, S., and R. + Sivakolundu, "In-situ OAM raw data export with IPFIX", + Work in Progress, Internet-Draft, draft-spiegel-ippm-ioam- + rawexport-06, 21 February 2022, + <https://datatracker.ietf.org/doc/html/draft-spiegel-ippm- + ioam-rawexport-06>. + + [IPV6-RECORD-ROUTE] + Kitamura, H., "Record Route for IPv6 (RR6) Hop-by-Hop + Option Extension", Work in Progress, Internet-Draft, + draft-kitamura-ipv6-record-route-00, 17 November 2000, + <https://datatracker.ietf.org/doc/html/draft-kitamura- + ipv6-record-route-00>. + + [NVO3-VXLAN-GPE] + Maino, F., Ed., Kreeger, L., Ed., and U. Elzur, Ed., + "Generic Protocol Extension for VXLAN (VXLAN-GPE)", Work + in Progress, Internet-Draft, draft-ietf-nvo3-vxlan-gpe-12, + 22 September 2021, <https://datatracker.ietf.org/doc/html/ + draft-ietf-nvo3-vxlan-gpe-12>. + + [RFC7276] Mizrahi, T., Sprecher, N., Bellagamba, E., and Y. + Weingarten, "An Overview of Operations, Administration, + and Maintenance (OAM) Tools", RFC 7276, + DOI 10.17487/RFC7276, June 2014, + <https://www.rfc-editor.org/info/rfc7276>. + + [RFC7384] Mizrahi, T., "Security Requirements of Time Protocols in + Packet Switched Networks", RFC 7384, DOI 10.17487/RFC7384, + October 2014, <https://www.rfc-editor.org/info/rfc7384>. + + [RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function + Chaining (SFC) Architecture", RFC 7665, + DOI 10.17487/RFC7665, October 2015, + <https://www.rfc-editor.org/info/rfc7665>. + + [RFC7799] Morton, A., "Active and Passive Metrics and Methods (with + Hybrid Types In-Between)", RFC 7799, DOI 10.17487/RFC7799, + May 2016, <https://www.rfc-editor.org/info/rfc7799>. + + [RFC7820] Mizrahi, T., "UDP Checksum Complement in the One-Way + Active Measurement Protocol (OWAMP) and Two-Way Active + Measurement Protocol (TWAMP)", RFC 7820, + DOI 10.17487/RFC7820, March 2016, + <https://www.rfc-editor.org/info/rfc7820>. + + [RFC7821] Mizrahi, T., "UDP Checksum Complement in the Network Time + Protocol (NTP)", RFC 7821, DOI 10.17487/RFC7821, March + 2016, <https://www.rfc-editor.org/info/rfc7821>. + + [RFC8300] Quinn, P., Ed., Elzur, U., Ed., and C. Pignataro, Ed., + "Network Service Header (NSH)", RFC 8300, + DOI 10.17487/RFC8300, January 2018, + <https://www.rfc-editor.org/info/rfc8300>. + + [RFC8799] Carpenter, B. and B. Liu, "Limited Domains and Internet + Protocols", RFC 8799, DOI 10.17487/RFC8799, July 2020, + <https://www.rfc-editor.org/info/rfc8799>. + + [RFC8877] Mizrahi, T., Fabini, J., and A. Morton, "Guidelines for + Defining Packet Timestamps", RFC 8877, + DOI 10.17487/RFC8877, September 2020, + <https://www.rfc-editor.org/info/rfc8877>. + + [RFC8926] Gross, J., Ed., Ganga, I., Ed., and T. Sridhar, Ed., + "Geneve: Generic Network Virtualization Encapsulation", + RFC 8926, DOI 10.17487/RFC8926, November 2020, + <https://www.rfc-editor.org/info/rfc8926>. + +Acknowledgements + + The authors would like to thank Éric Vyncke, Nalini Elkins, Srihari + Raghavan, Ranganathan T S, Karthik Babu Harichandra Babu, Akshaya + Nadahalli, LJ Wobker, Erik Nordmark, Vengada Prasad Govindan, Andrew + Yourtchenko, Aviv Kfir, Tianran Zhou, Zhenbin (Robin), and Greg + Mirsky for the comments and advice. + + This document leverages and builds on top of several concepts + described in [IPV6-RECORD-ROUTE]. The authors would like to + acknowledge the work done by the author Hiroshi Kitamura and people + involved in writing it. + + The authors would like to gracefully acknowledge useful review and + insightful comments received from Joe Clarke, Al Morton, Tom Herbert, + Carlos J. Bernardos, Haoyu Song, Mickey Spiegel, Roman Danyliw, + Benjamin Kaduk, Murray S. Kucherawy, Ian Swett, Martin Duke, + Francesca Palombini, Lars Eggert, Alvaro Retana, Erik Kline, Robert + Wilton, Zaheduzzaman Sarker, Dan Romascanu, and Barak Gafni. + +Contributors + + This document was the collective effort of several authors. The text + and content were contributed by the editors and the coauthors listed + below. + + Carlos Pignataro + Cisco Systems, Inc. + Research Triangle Park + 7200-11 Kit Creek Road + NC 27709 + United States of America + Email: cpignata@cisco.com + + + Mickey Spiegel + Barefoot Networks, an Intel company + 101 Innovation Drive + San Jose, CA 95134-1941 + United States of America + Email: mickey.spiegel@intel.com + + + Barak Gafni + Nvidia + Suite 100 + 350 Oakmead Parkway + Sunnyvale, CA 94085 + United States of America + Email: gbarak@nvidia.com + + + Jennifer Lemon + Broadcom + 270 Innovation Drive + San Jose, CA 95134 + United States of America + Email: jennifer.lemon@broadcom.com + + + Hannes Gredler + RtBrick Inc. + Email: hannes@rtbrick.com + + + John Leddy + United States of America + Email: john@leddy.net + + + Stephen Youell + JP Morgan Chase + 25 Bank Street + London + E14 5JP + United Kingdom + Email: stephen.youell@jpmorgan.com + + + David Mozes + Email: mosesster@gmail.com + + + Petr Lapukhov + Facebook + 1 Hacker Way + Menlo Park, CA 94025 + United States of America + Email: petr@fb.com + + + Remy Chang + Barefoot Networks, an Intel company + 101 Innovation Drive + San Jose, CA 95134-1941 + United States of America + Email: remy.chang@intel.com + + + Daniel Bernier + Bell Canada + Canada + Email: daniel.bernier@bell.ca + + +Authors' Addresses + + Frank Brockners (editor) + Cisco Systems, Inc. + 3rd Floor + Nordhein-Westfalen + Hansaallee 249 + 40549 Duesseldorf + Germany + Email: fbrockne@cisco.com + + + Shwetha Bhandari (editor) + Thoughtspot + 3rd Floor + Indiqube Orion + Garden Layout + HSR Layout + 24th Main Rd + Bangalore 560 102 + Karnataka + India + Email: shwetha.bhandari@thoughtspot.com + + + Tal Mizrahi (editor) + Huawei + 8-2 Matam + Haifa 3190501 + Israel + Email: tal.mizrahi.phd@gmail.com |