Diffstat (limited to 'doc/rfc/rfc9222.txt')
-rw-r--r-- | doc/rfc/rfc9222.txt | 1275 |
1 files changed, 1275 insertions, 0 deletions
diff --git a/doc/rfc/rfc9222.txt b/doc/rfc/rfc9222.txt new file mode 100644 index 0000000..eded0cc --- /dev/null +++ b/doc/rfc/rfc9222.txt @@ -0,0 +1,1275 @@ + + + + +Internet Engineering Task Force (IETF) B. Carpenter +Request for Comments: 9222 Univ. of Auckland +Category: Informational L. Ciavaglia +ISSN: 2070-1721 Rakuten Mobile + S. Jiang + Huawei Technologies Co., Ltd + P. Peloso + Nokia + March 2022 + + + Guidelines for Autonomic Service Agents + +Abstract + + This document proposes guidelines for the design of Autonomic Service + Agents for autonomic networks. Autonomic Service Agents, together + with the Autonomic Network Infrastructure, the Autonomic Control + Plane, and the GeneRic Autonomic Signaling Protocol, constitute base + elements of an autonomic networking ecosystem. + +Status of This Memo + + This document is not an Internet Standards Track specification; it is + published for informational purposes. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Not all documents + approved by the IESG are candidates for any level of Internet + Standard; see Section 2 of RFC 7841. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + https://www.rfc-editor.org/info/rfc9222. + +Copyright Notice + + Copyright (c) 2022 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (https://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. 
Code Components extracted from this document must + include Revised BSD License text as described in Section 4.e of the + Trust Legal Provisions and are provided without warranty as described + in the Revised BSD License. + +Table of Contents + + 1. Introduction + 2. Terminology + 3. Logical Structure of an Autonomic Service Agent + 4. Interaction with the Autonomic Networking Infrastructure + 4.1. Interaction with the Security Mechanisms + 4.2. Interaction with the Autonomic Control Plane + 4.3. Interaction with GRASP and its API + 4.4. Interaction with Policy Mechanisms + 5. Interaction with Non-autonomic Components and Systems + 6. Design of GRASP Objectives + 7. Life Cycle + 7.1. Installation Phase + 7.1.1. Installation Phase Inputs and Outputs + 7.2. Instantiation Phase + 7.2.1. Operator's Goal + 7.2.2. Instantiation Phase Inputs and Outputs + 7.2.3. Instantiation Phase Requirements + 7.3. Operation Phase + 7.4. Removal Phase + 8. Coordination and Data Models + 8.1. Coordination between Autonomic Functions + 8.2. Coordination with Traditional Management Functions + 8.3. Data Models + 9. Robustness + 10. Security Considerations + 11. IANA Considerations + 12. References + 12.1. Normative References + 12.2. Informative References + Appendix A. Example Logic Flows + Acknowledgements + Authors' Addresses + +1. Introduction + + This document proposes guidelines for the design of Autonomic Service + Agents (ASAs) in the context of an Autonomic Network (AN) based on + the Autonomic Network Infrastructure (ANI) outlined in the autonomic + networking reference model [RFC8993]. This infrastructure makes use + of the Autonomic Control Plane (ACP) [RFC8994] and the GeneRic + Autonomic Signaling Protocol (GRASP) [RFC8990]. A general + introduction to this environment may be found at [IPJ], which also + includes explanatory diagrams, and a summary of terminology is in + Section 2. 
+ + This document is a contribution to the description of an autonomic + networking ecosystem, recognizing that a deployable autonomic network + needs more than just ACP and GRASP implementations. Such an + autonomic network must achieve management tasks that a Network + Operations Center (NOC) cannot readily achieve manually, such as + continuous resource optimization or automated fault detection and + repair. These tasks, and other management automation goals, are + described at length in [RFC7575]. The net result should be + significant operational improvement. To achieve this, the autonomic + networking ecosystem must include at least a library of ASAs and + corresponding GRASP technical objective definitions. A GRASP + objective [RFC8990] is a data structure whose main contents are a + name and a value. The value consists of a single configurable + parameter or a set of parameters of some kind. + + There must also be tools to deploy and oversee ASAs, and integration + with existing operational mechanisms [RFC8368]. However, this + document focuses on the design of ASAs, with some reference to + implementation and operational aspects. + + There is considerable literature about autonomic agents with a + variety of proposals about how they should be characterized. Some + examples are [DEMOLA06], [HUEBSCHER08], [MOVAHEDI12], and [GANA13]. + However, for the present document, the basic definitions and goals + for autonomic networking given in [RFC7575] apply. According to RFC + 7575, an Autonomic Service Agent is "An agent implemented on an + autonomic node that implements an autonomic function, either in part + (in the case of a distributed function) or whole." + + ASAs must be distinguished from other forms of software components. + They are components of network or service management; they do not in + themselves provide services to end users. They do, however, provide + management services to network operators and administrators. 
For + example, the services envisaged for network function virtualization + (NFV) [NFV] or for service function chaining (SFC) [RFC7665] might be + managed by an ASA rather than by traditional configuration tools. + + Another example is that an existing script running within a router to + locally monitor or configure functions or services could be upgraded + to an ASA that could communicate with peer scripts on neighboring or + remote routers. A high-level API will allow such upgraded scripts to + take full advantage of the secure ACP and the discovery, negotiation, + and synchronization features of GRASP. Familiar tasks such as + configuring an Interior Gateway Protocol (IGP) on neighboring routers + or even exchanging IGP security keys could be performed securely in + this way. This document mainly addresses issues affecting quite + complex ASAs, but initially, the most useful ASAs may in fact be + rather simple evolutions of existing scripts. + + The reference model [RFC8993] for autonomic networks explains further + the functionality of ASAs by adding the following: + + | [An ASA is] a process that makes use of the features provided by + | the ANI to achieve its own goals, usually including interaction + | with other ASAs via GRASP [RFC8990] or otherwise. Of course, it + | also interacts with the specific targets of its function, using + | any suitable mechanism. Unless its function is very simple, the + | ASA will need to handle overlapping asynchronous operations. It + | may therefore be a quite complex piece of software in its own + | right, forming part of the application layer above the ANI. + + As mentioned, there will certainly be simple ASAs that manage a + single objective in a straightforward way and do not need + asynchronous operations. 
In nodes where computing power and memory + space are limited, ASAs should run at a much lower frequency than the + primary workload, so CPU load should not be a big issue, but memory + footprint in a constrained node is certainly a concern. ASAs + installed in constrained devices will have limited functionality. In + such cases, many aspects of the current document do not apply. + However, in the general case, an ASA may be a relatively complex + software component that will in many cases control and monitor + simpler entities in the same or remote host(s). For example, a + device controller that manages tens or hundreds of simple devices + might contain a single ASA. + + The remainder of this document offers guidance on the design of + complex ASAs. Some of the material may be familiar to those + experienced in distributed fault-tolerant and real-time control + systems. Robustness and security are of particular importance in + autonomic networks and are discussed in Sections 9 and 10. + +2. Terminology + + This section summarizes various acronyms and terminology used in the + document. Where no other reference is given, please consult + [RFC8993] or [RFC7575]. 
+ + Autonomic: self-managing (self-configuring, self-protecting, self- + healing, self-optimizing), but allowing high-level guidance by a + central entity such as a NOC + + Autonomic Function: a function that adapts on its own to a changing + environment + + Autonomic Node: a node that employs autonomic functions + + ACP: Autonomic Control Plane [RFC8994] + + AN: Autonomic Network; a network of autonomic nodes, which interact + directly with each other + + ANI: Autonomic Network Infrastructure + + ASA: Autonomic Service Agent; an agent installed on an autonomic + node that implements an autonomic function, either partially (in + the case of a distributed function) or completely + + BRSKI: Bootstrapping Remote Secure Key Infrastructure [RFC8995] + + CBOR: Concise Binary Object Representation [RFC8949] + + GRASP: GeneRic Autonomic Signaling Protocol [RFC8990] + + GRASP API: GRASP Application Programming Interface [RFC8991] + + NOC: Network Operations Center [RFC8368] + + Objective: A GRASP technical objective is a data structure whose + main contents are a name and a value. The value consists of a + single configurable parameter or a set of parameters of some kind + [RFC8990]. + +3. Logical Structure of an Autonomic Service Agent + + As mentioned above, all but the simplest ASAs will need to support + asynchronous operations. Different programming environments support + asynchronicity in different ways. In this document, we use an + explicit multi-threading model to describe operations. This is + illustrative, and alternatives to multi-threading are discussed in + detail in connection with the GRASP API (see Section 4.3). 
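As a rough illustration of the explicit multi-threading model mentioned above, the following Python sketch shows a main thread handing work to a worker thread that blocks in a wait state until work arrives and exits when its job is done. It is not part of the RFC text. The names `negotiation_worker` and `run_asa` are hypothetical, and no real GRASP API calls are made; the "work" is only echoed back.

```python
import queue
import threading


def negotiation_worker(requests, results):
    """Hypothetical worker thread: waits for work, exits on a None sentinel."""
    while True:
        item = requests.get()          # blocks until work arrives (wait state)
        if item is None:               # sentinel: job is done, let the thread exit
            break
        results.put(("handled", item))  # placeholder for real negotiation logic


def run_asa(work_items):
    """Hypothetical main thread: launch one worker, feed it, then shut it down."""
    requests, results = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=negotiation_worker, args=(requests, results))
    worker.start()
    for item in work_items:            # stands in for the ASA's main loop
        requests.put(item)
    requests.put(None)                 # ask the worker to exit
    worker.join()                      # after join, no more results can arrive
    return [results.get() for _ in range(results.qsize())]
```

A single worker keeps the ordering of results simple; an ASA that expects long negotiations might start several such workers sharing the same request queue.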
+ + A typical ASA will have a main thread that performs various initial + housekeeping actions such as: + + * obtain authorization credentials, if needed + + * register the ASA with GRASP + + * acquire relevant policy parameters + + * declare data structures for relevant GRASP objectives + + * register with GRASP those objectives that it will actively manage + + * launch a self-monitoring thread + + * enter its main loop + + The logic of the main loop will depend on the details of the + autonomic function concerned. Whenever asynchronous operations are + required, extra threads may be launched. Examples of such threads + include: + + * repeatedly flood an objective to the AN so that any ASA can + receive the objective's latest value + + * accept incoming synchronization requests for an objective managed + by this ASA + + * accept incoming negotiation requests for an objective managed by + this ASA, and then conduct the resulting negotiation with the + counterpart ASA + + * manage subsidiary non-autonomic devices directly + + These threads should all either exit after their job is done or enter + a wait state for new work to avoid wasting system resources. + + According to the degree of parallelism needed by the application, + some of these threads might be launched in multiple instances. In + particular, if negotiation sessions with other ASAs are expected to + be long or to involve wait states, the ASA designer might allow for + multiple simultaneous negotiating threads, with appropriate use of + queues and synchronization primitives to maintain consistency. + + The main loop itself could act as the initiator of synchronization + requests or negotiation requests when the ASA needs data or resources + from other ASAs. In particular, the main loop should watch for + changes in policy parameters that affect its operation and, if + appropriate, occasionally refresh authorization credentials. 
It + should also do whatever is required to avoid unnecessary resource + consumption, for example, by limiting its frequency of execution. + + The self-monitoring thread is of considerable importance. Failure of + autonomic service agents is highly undesirable. To a large extent, + this depends on careful coding and testing, with no unhandled error + returns or exceptions, but if there is nevertheless some sort of + failure, the self-monitoring thread should detect it, fix it if + possible, and, in the worst case, restart the entire ASA. + + Appendix A presents some example logic flows in informal pseudocode. + +4. Interaction with the Autonomic Networking Infrastructure + +4.1. Interaction with the Security Mechanisms + + An ASA by definition runs in an autonomic node. Before any normal + ASAs are started, such nodes must be bootstrapped into the autonomic + network's secure key infrastructure, typically in accordance with + [RFC8995]. This key infrastructure will be used to secure the ACP + (next section) and may be used by ASAs to set up additional secure + interactions with their peers, if needed. + + Note that the secure bootstrap process itself incorporates simple + special-purpose ASAs that use a restricted mode of GRASP (Section 4 + of [RFC8995]). + +4.2. Interaction with the Autonomic Control Plane + + In a normal autonomic network, ASAs will run as clients of the ACP, + which will provide a fully secured network environment for all + communication with other ASAs, in most cases mediated by GRASP (next + section). + + Note that the ACP formation process itself incorporates simple + special-purpose ASAs that use a restricted mode of GRASP (Section 6.4 + of [RFC8994]). + +4.3. Interaction with GRASP and its API + + In a node where a significant number of ASAs are installed, GRASP + [RFC8990] is likely to run as a separate process with its API + [RFC8991] available in user space. 
Thus, ASAs may operate without + special privilege, unless they need it for other reasons. The ASA's + view of GRASP is built around GRASP objectives (Section 6), defined + as data structures containing administrative information such as the + objective's unique name, and its current value. The format and size + of the value is not restricted by the protocol, except that it must + be possible to serialize it for transmission in Concise Binary Object + Representation (CBOR) [RFC8949], subject only to GRASP's maximum + message size as discussed in Section 6. + + As discussed in Section 3, GRASP is an asynchronous protocol, and + this document uses a multi-threading model to describe operations. + In many programming environments, an "event loop" model is used + instead, in which case each thread would be implemented as an event + handler called in turn by the main loop. For this case, the GRASP + API must provide non-blocking calls and possibly support callbacks. + This topic is discussed in more detail in [RFC8991], and other + asynchronicity models are also possible. Whenever necessary, the + GRASP session identifier will be used to distinguish simultaneous + operations. + + The GRASP API should offer the following features: + + * Registration functions, so that an ASA can register itself and the + objectives that it manages. + + * A discovery function, by which an ASA can discover other ASAs + supporting a given objective. + + * A negotiation request function, by which an ASA can start + negotiation of an objective with a counterpart ASA. With this, + there is a corresponding listening function for an ASA that wishes + to respond to negotiation requests and a set of functions to + support negotiating steps. Once a negotiation starts, it is a + symmetric process with both sides sending successive objective + values to each other until agreement is reached (or the + negotiation fails). 
+ + * A synchronization function, by which an ASA can request the + current value of an objective from a counterpart ASA. With this, + there is a corresponding listening function for an ASA that wishes + to respond to synchronization requests. Unlike negotiation, + synchronization is an asymmetric process in which the listener + sends a single objective value to the requester. + + * A flood function, by which an ASA can cause the current value of + an objective to be flooded throughout the AN so that any ASA can + receive it. + + For further details and some additional housekeeping functions, see + [RFC8991]. + + The GRASP API is intended to support the various interactions + expected between most ASAs, such as the interactions outlined in + Section 3. However, if ASAs require additional communication between + themselves, they can do so directly over the ACP to benefit from its + security. One option is to use GRASP discovery and synchronization + as a rendezvous mechanism between two ASAs, passing communication + parameters such as a TCP port number via GRASP. The use of TLS over + the ACP for such communications is advisable, as described in + Section 6.9.2 of [RFC8994]. + +4.4. Interaction with Policy Mechanisms + + At the time of writing, the policy mechanisms for the ANI are + undefined. In particular, the use of declarative policies (aka + Intents) for the definition and management of an ASA's behaviors + remains a research topic [IBN-CONCEPTS]. + + In the cases where ASAs are defined as closed control loops, the + specifications defined in [ZSM009-1] regarding imperative and + declarative goal statements may be applicable. + + In the ANI, policy dissemination is expected to operate by an + information distribution mechanism (e.g., via GRASP [RFC8990]) that + can reach all autonomic nodes and therefore every ASA. 
However, each + ASA must be capable of operating "out of the box" in the absence of + locally defined policy, so every ASA implementation must include + carefully chosen default values and settings for all policy + parameters. + +5. Interaction with Non-autonomic Components and Systems + + To have any external effects, an ASA must also interact with non- + autonomic components of the node where it is installed. For example, + an ASA whose purpose is to manage a resource must interact with that + resource. An ASA managing an entity that is also managed by local + software must interact with that software. For example, if such + management is performed by NETCONF [RFC6241], the ASA must interact + with the NETCONF server as an independent NETCONF client in the same + node to avoid any inconsistency between configuration changes + delivered via NETCONF and configuration changes made by the ASA. + + In an environment where systems are virtualized and specialized using + techniques such as network function virtualization or network + slicing, there will be a design choice whether ASAs are deployed once + per physical node or once per virtual context. A related issue is + whether the ANI as a whole is deployed once on a physical network or + whether several virtual ANIs are deployed. This aspect needs to be + considered by the ASA designer. + +6. Design of GRASP Objectives + + The design of an ASA will often require the design of a new GRASP + objective. The general rules for the format of GRASP objectives, + their names, and IANA registration are given in [RFC8990]. + Additionally, that document discusses various general considerations + for the design of objectives, which are not repeated here. However, + note that GRASP, like HTTP, does not provide transactional integrity. + In particular, steps in a GRASP negotiation are not idempotent. The + design of a GRASP objective and the logic flow of the ASA should take + this into account. 
One approach, which should be used when possible, + is to design objectives with idempotent semantics. If this is not + possible, typically if an ASA is allocating part of a shared resource + to other ASAs, it needs to ensure that the same part of the resource + is not allocated twice. The easiest way is to run only one + negotiation at a time. If an ASA is capable of overlapping several + negotiations, it must avoid interference between these negotiations. + + Negotiations will always end, normally because one end or the other + declares success or failure. If this does not happen, either a + timeout or exhaustion of the loop count will occur. The definition + of a GRASP objective should describe a specific negotiation policy if + it is not self-evident. + + GRASP allows a "dry run" mode of negotiation, where a negotiation + session follows its normal course but is not committed at either end + until a subsequent live negotiation session. If dry run mode is + defined for the objective, its specification, and every + implementation, must consider what state needs to be saved following + a dry run negotiation, such that a subsequent live negotiation can be + expected to succeed. It must be clear how long this state is kept + and what happens if the live negotiation occurs after this state is + deleted. An ASA that requests a dry run negotiation must take + account of the possibility that a successful dry run is followed by a + failed live negotiation. Because of these complexities, the dry run + mechanism should only be supported by objectives and ASAs where there + is a significant benefit from it. + + The actual value field of an objective is limited by the GRASP + protocol definition to any data structure that can be expressed in + Concise Binary Object Representation (CBOR) [RFC8949]. For some + objectives, a single data item will suffice, for example, an integer, + a floating point number, a UTF-8 string, or an arbitrary byte string. 
+ For more complex cases, a simple tuple structure such as [item1, + item2, item3] could be used. Since CBOR is closely linked to JSON, + it is also rather easy to define an objective whose value is a JSON + structure. The formats acceptable by the GRASP API will limit the + options in practice. A generic solution is for the API to accept and + deliver the value field in raw CBOR, with the ASA itself encoding and + decoding it via a CBOR library (Section 2.3.2.4 of [RFC8991]). + + The maximum size of the value field of an objective is limited by the + GRASP maximum message size. If the default maximum size specified as + GRASP_DEF_MAX_SIZE by [RFC8990] is not enough, the specification of + the objective must indicate the required maximum message size for + both unicast and multicast messages. + + A mapping from YANG to CBOR is defined by [CBOR-YANG]. Subject to + the size limit defined for GRASP messages, nothing prevents + objectives transporting YANG in this way. + + The flexibility of CBOR implies that the value field of many + objectives can be extended in service, to add additional information + or alternative content, especially if JSON-like structures are used. + This has consequences for the robustness of ASAs, as discussed in + Section 9. + +7. Life Cycle + + The ASA life cycle is discussed in [AUTONOMIC-FUNCTION], from which + the following text was derived. It does not cover all details, and + some of the terms used would require precise definitions in a given + implementation. + + In simple cases, autonomic functions could be permanent, in the sense + that ASAs are shipped as part of a product and persist throughout the + product's life. However, in complex cases, a more likely situation + is that ASAs need to be installed or updated dynamically because of + new requirements or bugs. This section describes one approach to the + resulting life cycle of individual ASAs. It does not consider wider + issues such as updates of shared libraries. 
+ + Because continuity of service is fundamental to autonomic networking, + the process of seamlessly replacing a running instance of an ASA with + a new version needs to be part of the ASA's design. The implication + of service continuity on the design of ASAs can be illustrated along + the three main phases of the ASA life cycle, namely installation, + instantiation, and operation. + + +--------------+ + Undeployed ------>| |------> Undeployed + | Installed | + +-->| |---+ + Mandate | +--------------+ | Receives a + is revoked | +--------------+ | Mandate + +---| |<--+ + | Instantiated | + +-->| |---+ + set | +--------------+ | set + down | +--------------+ | up + +---| |<--+ + | Operational | + | | + +--------------+ + + Figure 1: Life Cycle of an Autonomic Service Agent + +7.1. Installation Phase + + We define "installation" to mean that a piece of software is loaded + into a device, along with any necessary libraries, but is not yet + activated. + + Before being able to instantiate and run ASAs, the operator will + first provision the infrastructure with the sets of ASA software + corresponding to its needs and objectives. Such software must be + checked for integrity and authenticity before installation. The + provisioning of the infrastructure is realized in the installation + phase and consists of installing (or checking the availability of) + the pieces of software of the different ASAs in a set of Installation + Hosts within the autonomic network. + + There are three properties applicable to the installation of ASAs: + + * The dynamic installation property allows installing an ASA on + demand, on any hosts compatible with the ASA. + + * The decoupling property allows an ASA on one machine to control + resources in another machine (known as "decoupled mode"). + + * The multiplicity property allows controlling multiple sets of + resources from a single ASA. 
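The life cycle shown in Figure 1 can be read as a small state machine. The Python sketch below encodes the four states and the transitions drawn in the figure; it is an illustration only and not part of the RFC text. The event names (`install`, `uninstall`, `receive_mandate`, `revoke_mandate`, `set_up`, `set_down`) are assumptions chosen to match the figure's labels, and `next_state` is a hypothetical helper.

```python
from enum import Enum


class State(Enum):
    UNDEPLOYED = "Undeployed"
    INSTALLED = "Installed"
    INSTANTIATED = "Instantiated"
    OPERATIONAL = "Operational"


# Allowed transitions, following the arrows in Figure 1.
# Event names are hypothetical labels, not defined by the RFC.
TRANSITIONS = {
    (State.UNDEPLOYED, "install"): State.INSTALLED,
    (State.INSTALLED, "uninstall"): State.UNDEPLOYED,
    (State.INSTALLED, "receive_mandate"): State.INSTANTIATED,
    (State.INSTANTIATED, "revoke_mandate"): State.INSTALLED,
    (State.INSTANTIATED, "set_up"): State.OPERATIONAL,
    (State.OPERATIONAL, "set_down"): State.INSTANTIATED,
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not allowed in state {state.value}"
        ) from None
```

Keeping the transitions in one table makes it easy to reject an out-of-order request, such as setting up an ASA that has not yet received a mandate.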
+ + These three properties are very important in the context of the + installation phase as their variations condition how the ASA could be + installed on the infrastructure. + +7.1.1. Installation Phase Inputs and Outputs + + Inputs are: + + * [ASA_type]: specifies which ASA to install. + + * [Installation_target_infrastructure]: specifies the candidate + installation Hosts. + + * [ASA_placement_function]: specifies how the installation phase + will meet the operator's needs and objectives for the provision of + the infrastructure. This function is only useful in the decoupled + mode. It can be as simple as an explicit list of hosts on which + the ASAs are to be installed, or it could consist of operator- + defined criteria and constraints. + + The main output of the installation phase is a [List_of_ASAs] + installed on [List_of_hosts]. This output is also useful for the + coordination function where it acts as a static interaction map (see + Section 8.1). + + The condition to validate in order to pass to next phase is to ensure + that [List_of_ASAs] are correctly installed on [List_of_hosts]. A + minimum set of primitives to support the installation of ASAs could + be the following: install (List_of_ASAs, + Installation_target_infrastructure, ASA_placement_function) and + uninstall (List_of_ASAs). + +7.2. Instantiation Phase + + We define "instantiation" as the operation of creating a single ASA + instance from the corresponding piece of installed software. + + Once the ASAs are installed on the appropriate hosts in the network, + these ASAs may start to operate. From the operator viewpoint, an + operating ASA means the ASA manages the network resources as per the + objectives given. At the ASA local level, operating means executing + their control loop algorithm. + + There are two aspects to take into consideration. 
First, having a + piece of code installed and available to run on a host is not the + same as having an agent based on this piece of code running inside + the host. Second, in a coupled case, determining which resources are + controlled by an ASA is straightforward (the ASA runs on the same + autonomic node as the resources it is controlling). In a decoupled + mode, determining this is a bit more complex: a starting agent will + have to either discover the set of resources it ought to control, or + such information has to be communicated to the ASA. + + The instantiation phase of an ASA covers both these aspects: starting + the agent code (when this does not start automatically) and + determining which resources have to be controlled (when this is not + straightforward). + +7.2.1. Operator's Goal + + Through this phase, the operator wants to control its autonomic + network regarding at least two aspects: + + 1. determine the scope of autonomic functions by instructing which + network resources have to be managed by which autonomic function + (and more precisely by which release of the ASA software code, + e.g., version number or provider). + + 2. determine how the autonomic functions are organized by + instantiating a set of ASAs across one or more autonomic nodes + and instructing them accordingly about the other ASAs in the set + as necessary. + + In this phase, the operator may also want to set goals for autonomic + functions, e.g., by configuring GRASP objectives. + + The operator's goal can be summarized in an instruction to the + autonomic ecosystem matching the following format, explained in + detail in the next sub-section: + + [Instances_of_ASA_type] ready to control + [Instantiation_target_infrastructure] with + [Instantiation_target_parameters] + +7.2.2. 
Instantiation Phase Inputs and Outputs + + Inputs are: + + * [Instances_of_ASA_type]: specifies which ASAs to instantiate + + * [Instantiation_target_infrastructure]: specifies which resources + are to be managed by the autonomic function; this can be the whole + network or a subset of it like a domain, a physical segment, or + even a specific list of resources. + + * [Instantiation_target_parameters]: specifies which GRASP + objectives are to be sent to ASAs (e.g., an optimization target) + + Outputs are: + + * [Set_of_ASA_resources_relations]: describes which resources are + managed by which ASA instances; this is not a formal message but a + resulting configuration log for a set of ASAs. + +7.2.3. Instantiation Phase Requirements + + The instructions described in Section 7.2 could be either of the + following: + + * Sent to a targeted ASA. In this case, the receiving Agent will + have to manage the specified list of + [Instantiation_target_infrastructure], with the + [Instantiation_target_parameters]. + + * Broadcast to all ASAs. In this case, the ASAs would determine + from the list which ASAs would handle which + [Instantiation_target_infrastructure], with the + [Instantiation_target_parameters]. + + These instructions may be grouped as a specific data structure + referred to as an ASA Instance Mandate. The specification of such an + ASA Instance Mandate is beyond the scope of this document. + + The conclusion of this instantiation phase is a set of ASA instances + ready to operate. These ASA instances are characterized by the + resources they manage, the metrics being monitored, and the actions + that can be executed (like modifying certain parameter values). The + description of the ASA instance may be defined in an ASA Instance + Manifest data structure. The specification of such an ASA Instance + Manifest is beyond the scope of this document. 
+ + The ASA Instance Manifest does not only serve informational purposes + such as acknowledgement of successful instantiation to the operator + but is also necessary for further autonomic operations with: + + * coordinated entities (see Section 8.1) + + * collaborative entities with purposes such as to establish + knowledge exchange (some ASAs may produce knowledge or monitor + metrics that would be useful for other ASAs) + +7.3. Operation Phase + + During the operation phase, the operator can: + + * activate/deactivate ASAs: enable/disable their autonomic loops + + * modify ASA targets: set different technical objectives + + * modify ASAs managed resources: update the Instance Mandate to + specify a different set of resources to manage (only applicable to + decoupled ASAs) + + During the operation phase, running ASAs can interact with other + ASAs: + + * in order to exchange knowledge (e.g., an ASA providing traffic + predictions to a load balancing ASA) + + * in order to collaboratively reach an objective (e.g., ASAs + pertaining to the same autonomic function will collaborate, e.g., + in the case of a load balancing function, by modifying link + metrics according to neighboring resource loads) + + During the operation phase, running ASAs are expected to apply + coordination schemes as per Section 8.1. + +7.4. Removal Phase + + When an ASA is removed from service and uninstalled, the above steps + are reversed. It is important that its data, especially any security + key material, is purged. + +8. Coordination and Data Models + +8.1. Coordination between Autonomic Functions + + Some autonomic functions will be completely independent of each + other. However, others are at risk of interfering with each other; + for example, two different optimization functions might both attempt + to modify the same underlying parameter in different ways. 
In a + complete system, a method is needed for identifying ASAs that might + interfere with each other and coordinating their actions when + necessary. + +8.2. Coordination with Traditional Management Functions + + Some ASAs will have functions that overlap with existing + configuration tools and network management mechanisms such as + command-line interfaces, DHCP, DHCPv6, SNMP, NETCONF, and RESTCONF. + This is, of course, an existing problem whenever multiple + configuration tools are in use by the NOC. Each ASA designer will + need to consider this issue and how to avoid clashes and + inconsistencies in various deployment scenarios. Some specific + considerations for interaction with OAM tools are given in [RFC8368]. + As another example, [RFC8992] describes how autonomic management of + IPv6 prefixes can interact with prefix delegation via DHCPv6. The + description of a GRASP objective and of an ASA using it should + include a discussion of any such interactions. + +8.3. Data Models + + Management functions often include a shared data model, quite likely + to be expressed in a formal notation such as YANG. This aspect + should not be an afterthought in the design of an ASA. To the + contrary, the design of the ASA and of its GRASP objectives should + match the data model; as noted in Section 6, YANG serialized as CBOR + may be used directly as the value of a GRASP objective. + +9. Robustness + + It is of great importance that all components of an autonomic system + are highly robust. Although ASA designers should aim for their + component to never fail, it is more important to design the ASA to + assume that failures will happen and to gracefully recover from those + failures when they occur. Hence, this section lists various aspects + of robustness that ASA designers should consider: + + 1. If despite all precautions, an ASA does encounter a fatal error, + it should in any case restart automatically and try again. 
To + mitigate a loop in case of persistent failure, a suitable pause + should be inserted before such a restart. The length of the + pause depends on the use case; randomization and exponential + backoff should be considered. + + 2. If a newly received or calculated value for a parameter falls + out of bounds, the corresponding parameter should be either left + unchanged or restored to a value known to be safe in all + configurations. + + 3. If a GRASP synchronization or negotiation session fails for any + reason, it may be repeated after a suitable pause. The length + of the pause depends on the use case; randomization and + exponential backoff should be considered. + + 4. If a session fails repeatedly, the ASA should consider that its + peer has failed, and it should cause GRASP to flush its + discovery cache and repeat peer discovery. + + 5. In any case, it may be prudent to repeat discovery periodically, + depending on the use case. + + 6. Any received GRASP message should be checked. If it is wrongly + formatted, it should be ignored. Within a unicast session, an + Invalid message (M_INVALID) may be sent. This function may be + provided by the GRASP implementation itself. + + 7. Any received GRASP objective should be checked. Basic + formatting errors like invalid CBOR will likely be detected by + GRASP itself, but the ASA is responsible for checking the + precise syntax and semantics of a received objective. If it is + wrongly formatted, it should be ignored. Within a negotiation + session, a Negotiation End message (M_END) with a Decline option + (O_DECLINE) should be sent. An ASA may log such events for + diagnostic purposes. + + 8. On the other hand, the definitions of GRASP objectives are very + likely to be extended, using the flexibility of CBOR or JSON. + Therefore, ASAs should be able to deal gracefully with unknown + components within the values of objectives. 
The specification + of an objective should describe how unknown components are to be + handled (ignored, logged and ignored, or rejected as an error). + + 9. If an ASA receives either an Invalid message (M_INVALID) or a + Negotiation End message (M_END) with a Decline option + (O_DECLINE), one possible reason is that the peer ASA does not + support a new feature of either GRASP or the objective in + question. In such a case, the ASA may choose to repeat the + operation concerned without using that new feature. + + 10. All other possible exceptions should be handled in an orderly + way. There should be no such thing as an unhandled exception + (but see point 1 above). + + At a slightly more general level, ASAs are not services in + themselves, but they automate services. This has a fundamental + impact on how to design robust ASAs. In general, when an ASA + observes a particular state (1) of operations of the services/ + resources it controls, it typically aims to improve this state to a + better state, say (2). Ideally, the ASA is built so that it can + ensure that any error encountered can still lead to returning to (1) + instead of a state (3), which is worse than (1). One example + instance of this principle is "make-before-break" used in + reconfiguration of routing protocols in manual operations. This + principle of operations can accordingly be coded into the operation + of an ASA. The GRASP dry run option mentioned in Section 6 is + another tool helpful for this ASA design goal of "test-before-make". + +10. Security Considerations + + ASAs are intended to run in an environment that is protected by the + Autonomic Control Plane [RFC8994], admission to which depends on an + initial secure bootstrap process such as BRSKI [RFC8995]. Those + documents describe security considerations relating to the use of and + properties provided by the ACP and BRSKI, respectively. 
Such an ACP + can provide keying material for mutual authentication between ASAs as + well as confidential communication channels for messages between + ASAs. In some deployments, a secure partition of the link layer + might be used instead. GRASP itself has significant security + considerations [RFC8990]. However, this does not relieve ASAs of + responsibility for security. When ASAs configure or manage network + elements outside the ACP, potentially in a different physical node, + they must interact with other non-autonomic software components to + perform their management functions. The details are specific to each + case, but this has an important security implication. An ASA might + act as a loophole by which the managed entity could penetrate the + security boundary of the ANI. Thus, ASAs must be designed to avoid + loopholes such as passing on executable code or proxying unverified + commands and should, if possible, operate in an unprivileged mode. + In particular, they must use secure coding practices, e.g., carefully + validate all incoming information and avoid unnecessary elevation of + privilege. This will apply in particular when an ASA interacts with + a management component such as a NETCONF server. + + A similar situation will arise if an ASA acts as a gateway between + two separate autonomic networks, i.e., it has access to two separate + ACPs. Such an ASA must also be designed to avoid loopholes and to + validate incoming information from both sides. + + As a reminder, GRASP does not intrinsically provide transactional + integrity (Section 6). + + As appropriate to their specific functions, ASAs should take account + of relevant privacy considerations [RFC6973]. + + The initial version of the autonomic infrastructure assumes that all + autonomic nodes are trusted by virtue of their admission to the ACP. 
+ ASAs are therefore trusted to manipulate any GRASP objective simply + because they are installed on a node that has successfully joined the + ACP. In the general case, a node may have multiple roles, and a role + may use multiple ASAs, each using multiple GRASP objectives. + Additional mechanisms for the fine-grained authorization of nodes and + ASAs to manipulate specific GRASP objectives could be designed. + Meanwhile, we repeat that ASAs should run without special privilege + if possible. Independently of this, interfaces between ASAs and the + router configuration and monitoring services of the node can be + subject to authentication that provides more fine-grained + authorization for specific services. These additional authentication + parameters could be passed to an ASA during its instantiation phase. + +11. IANA Considerations + + This document has no IANA actions. + +12. References + +12.1. Normative References + + [RFC8949] Bormann, C. and P. Hoffman, "Concise Binary Object + Representation (CBOR)", STD 94, RFC 8949, + DOI 10.17487/RFC8949, December 2020, + <https://www.rfc-editor.org/info/rfc8949>. + + [RFC8990] Bormann, C., Carpenter, B., Ed., and B. Liu, Ed., "GeneRic + Autonomic Signaling Protocol (GRASP)", RFC 8990, + DOI 10.17487/RFC8990, May 2021, + <https://www.rfc-editor.org/info/rfc8990>. + + [RFC8994] Eckert, T., Ed., Behringer, M., Ed., and S. Bjarnason, "An + Autonomic Control Plane (ACP)", RFC 8994, + DOI 10.17487/RFC8994, May 2021, + <https://www.rfc-editor.org/info/rfc8994>. + + [RFC8995] Pritikin, M., Richardson, M., Eckert, T., Behringer, M., + and K. Watsen, "Bootstrapping Remote Secure Key + Infrastructure (BRSKI)", RFC 8995, DOI 10.17487/RFC8995, + May 2021, <https://www.rfc-editor.org/info/rfc8995>. + +12.2. Informative References + + [AUTONOMIC-FUNCTION] + Pierre, P. and L. 
Ciavaglia, "A Day in the Life of an + Autonomic Function", Work in Progress, Internet-Draft, + draft-peloso-anima-autonomic-function-01, 21 March 2016, + <https://datatracker.ietf.org/doc/html/draft-peloso-anima- + autonomic-function-01>. + + [CBOR-YANG] + Veillette, M., Ed., Petrov, I., Ed., Pelov, A., Bormann, + C., and M. Richardson, "CBOR Encoding of Data Modeled with + YANG", Work in Progress, Internet-Draft, draft-ietf-core- + yang-cbor-18, December 2021, + <https://datatracker.ietf.org/doc/html/draft-ietf-core- + yang-cbor-18>. + + [DEMOLA06] De Mola, F. and R. Quitadamo, "Towards an Agent Model for + Future Autonomic Communications", Proceedings of the 7th + WOA 2006 Workshop From Objects to Agents 51-59, September + 2006. + + [GANA13] ETSI, "Autonomic network engineering for the self-managing + Future Internet (AFI); Generic Autonomic Network + Architecture (An Architectural Reference Model for + Autonomic Networking, Cognitive Networking and Self- + Management)", GS AFI 002, V1.1.1, April 2013, + <https://www.etsi.org/deliver/etsi_gs/ + AFI/001_099/002/01.01.01_60/gs_afi002v010101p.pdf>. + + [HUEBSCHER08] + Huebscher, M. C. and J. A. McCann, "A survey of autonomic + computing - degrees, models, and applications", ACM + Computing Surveys (CSUR), Volume 40, Issue 3, + DOI 10.1145/1380584.1380585, August 2008, + <https://doi.org/10.1145/1380584.1380585>. + + [IBN-CONCEPTS] + Clemm, A., Ciavaglia, L., Granville, L. Z., and J. + Tantsura, "Intent-Based Networking - Concepts and + Definitions", Work in Progress, Internet-Draft, draft- + irtf-nmrg-ibn-concepts-definitions-09, 24 March 2022, + <https://datatracker.ietf.org/doc/html/draft-irtf-nmrg- + ibn-concepts-definitions-09>. + + [IPJ] Behringer, M., Bormann, C., Carpenter, B. E., Eckert, T., + Campos Nobre, J., Jiang, S., Li, Y., and M. C. 
Richardson, + "Autonomic Networking Gets Serious", The Internet Protocol + Journal, Volume 24, Issue 3, Page(s) 2 - 18, ISSN + 1944-1134, October 2021, <https://ipj.dreamhosters.com/wp- + content/uploads/2021/10/243-ipj.pdf>. + + [MOVAHEDI12] + Movahedi, Z., Ayari, M., Langar, R., and G. Pujolle, "A + Survey of Autonomic Network Architectures and Evaluation + Criteria", IEEE Communications Surveys & Tutorials, Volume + 14, Issue 2, Pages 464 - 490, + DOI 10.1109/SURV.2011.042711.00078, 2012, + <https://doi.org/10.1109/SURV.2011.042711.00078>. + + [NFV] ETSI, "Network Functions Virtualisation", SDN and OpenFlow + World Congress, October 2012, + <https://portal.etsi.org/NFV/NFV_White_Paper.pdf>. + + [RFC6241] Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed., + and A. Bierman, Ed., "Network Configuration Protocol + (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011, + <https://www.rfc-editor.org/info/rfc6241>. + + [RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J., + Morris, J., Hansen, M., and R. Smith, "Privacy + Considerations for Internet Protocols", RFC 6973, + DOI 10.17487/RFC6973, July 2013, + <https://www.rfc-editor.org/info/rfc6973>. + + [RFC7575] Behringer, M., Pritikin, M., Bjarnason, S., Clemm, A., + Carpenter, B., Jiang, S., and L. Ciavaglia, "Autonomic + Networking: Definitions and Design Goals", RFC 7575, + DOI 10.17487/RFC7575, June 2015, + <https://www.rfc-editor.org/info/rfc7575>. + + [RFC7665] Halpern, J., Ed. and C. Pignataro, Ed., "Service Function + Chaining (SFC) Architecture", RFC 7665, + DOI 10.17487/RFC7665, October 2015, + <https://www.rfc-editor.org/info/rfc7665>. + + [RFC8368] Eckert, T., Ed. and M. Behringer, "Using an Autonomic + Control Plane for Stable Connectivity of Network + Operations, Administration, and Maintenance (OAM)", + RFC 8368, DOI 10.17487/RFC8368, May 2018, + <https://www.rfc-editor.org/info/rfc8368>. + + [RFC8991] Carpenter, B., Liu, B., Ed., Wang, W., and X. 
Gong, + "GeneRic Autonomic Signaling Protocol Application Program + Interface (GRASP API)", RFC 8991, DOI 10.17487/RFC8991, + May 2021, <https://www.rfc-editor.org/info/rfc8991>. + + [RFC8992] Jiang, S., Ed., Du, Z., Carpenter, B., and Q. Sun, + "Autonomic IPv6 Edge Prefix Management in Large-Scale + Networks", RFC 8992, DOI 10.17487/RFC8992, May 2021, + <https://www.rfc-editor.org/info/rfc8992>. + + [RFC8993] Behringer, M., Ed., Carpenter, B., Eckert, T., Ciavaglia, + L., and J. Nobre, "A Reference Model for Autonomic + Networking", RFC 8993, DOI 10.17487/RFC8993, May 2021, + <https://www.rfc-editor.org/info/rfc8993>. + + [ZSM009-1] ETSI, "Zero-touch network and Service Management (ZSM); + Closed-Loop Automation; Part 1: Enablers", GS ZSM 009-1, + Version 1.1.1, June 2021, + <https://www.etsi.org/deliver/etsi_gs/ + ZSM/001_099/00901/01.01.01_60/gs_ZSM00901v010101p.pdf>. + +Appendix A. Example Logic Flows + + This appendix describes generic logic flows that combine to act as an + Autonomic Service Agent (ASA) for resource management. Note that + these are illustrative examples and are in no sense requirements. As + long as the rules of GRASP are followed, a real implementation could + be different. The reader is assumed to be familiar with GRASP + [RFC8990] and its conceptual API [RFC8991]. + + A complete autonomic function for a distributed resource will consist + of a number of instances of the ASA placed at relevant points in a + network. Specific details will, of course, depend on the resource + concerned. One example is IP address prefix management, as specified + in [RFC8992]. In this case, an instance of the ASA will exist in + each delegating router. + + An underlying assumption is that there is an initial source of the + resource in question, referred to here as an origin ASA. 
The other + ASAs, known as delegators, obtain supplies of the resource from the + origin, delegate quantities of the resource to consumers that request + it, and recover it when no longer needed. + + Another assumption is there is a set of network-wide policy + parameters, which the origin will provide to the delegators. These + parameters will control how the delegators decide how much resource + to provide to consumers. Thus, the ASA logic has two operating + modes: origin and delegator. When running as an origin, it starts by + obtaining a quantity of the resource from the NOC, and it acts as a + source of policy parameters, via both GRASP flooding and GRASP + synchronization. (In some scenarios, flooding or synchronization + alone might be sufficient, but this example includes both.) + + When running as a delegator, it starts with an empty resource pool, + acquires the policy parameters by GRASP synchronization, and + delegates quantities of the resource to consumers that request it. + Both as an origin and as a delegator, when its pool is low, it seeks + quantities of the resource by requesting GRASP negotiation with peer + ASAs. When its pool is sufficient, it hands out resource to peer + ASAs in response to negotiation requests. Thus, over time, the + initial resource pool held by the origin will be shared among all the + delegators according to demand. + + In theory, a network could include any number of origins and any + number of delegators, with the only condition being that each + origin's initial resource pool is unique. A realistic scenario is to + have exactly one origin and as many delegators as you like. A + scenario with no origin is useless. + + An implementation requirement is that resource pools are kept in + stable storage. Otherwise, if a delegator exits for any reason, all + the resources it has obtained or delegated are lost. If an origin + exits, its entire spare pool is lost. 
The logic for using stable + storage and for crash recovery is not included in the pseudocode + below, which focuses on communication between ASAs. Since GRASP + operations are not intrinsically idempotent, data integrity during + failure scenarios is the responsibility of the ASA designer. This is + a complex topic in its own right that is not discussed in the present + document. + + The description below does not implement GRASP's dry run function. + That would require temporarily marking any resource handed out in a + dry run negotiation as reserved, until either the peer obtains it in + a live run, or a suitable timeout occurs. + + The main data structures used in each instance of the ASA are: + + * resource_pool: an ordered list of available resources, for + example. Depending on the nature of the resource, units of + resource are split when appropriate, and a background garbage + collector recombines split resources if they are returned to the + pool. + + * delegated_list: where a delegator stores the resources it has + given to subsidiary devices. + + Possible main logic flows are below, using a threaded implementation + model. As noted above, alternative approaches to asynchronous + operations are possible. The transformation to an event loop model + should be apparent; each thread would correspond to one event in the + event loop. + + The GRASP objectives are as follows: + + * ["EX1.Resource", flags, loop_count, value], where the value + depends on the resource concerned but will typically include its + size and identification. + + * ["EX1.Params", flags, loop_count, value], where the value will be, + for example, a JSON object defining the applicable parameters. + + In the outline logic flows below, these objectives are represented + simply by their names. 
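Before the flows themselves, the resource_pool structure described above can be sketched as follows. This is an illustration under stated assumptions, not part of the example logic: the resource is assumed to be a contiguous integer range (as with an address block), and the names ResourcePool, take(), merge_adjacent(), and request_with_fallback() are hypothetical. Stable storage and crash recovery are omitted, as in the pseudocode.

```python
import threading


class ResourcePool:
    """Ordered list of (start, size) blocks guarded by a lock (assumed model)."""

    def __init__(self):
        self._blocks = []
        self._lock = threading.Lock()

    def add(self, start, size):
        """Return a block to the pool, keeping the list ordered by start."""
        with self._lock:
            self._blocks.append((start, size))
            self._blocks.sort()

    def take(self, amount):
        """First-fit allocation; splits a larger block. None if nothing fits."""
        with self._lock:
            for i, (start, size) in enumerate(self._blocks):
                if size >= amount:
                    if size == amount:
                        del self._blocks[i]
                    else:
                        self._blocks[i] = (start + amount, size - amount)
                    return (start, amount)
            return None

    def merge_adjacent(self):
        """Garbage collection step: recombine blocks that touch each other."""
        with self._lock:
            merged = []
            for start, size in sorted(self._blocks):
                if merged and merged[-1][0] + merged[-1][1] == start:
                    merged[-1] = (merged[-1][0], merged[-1][1] + size)
                else:
                    merged.append((start, size))
            self._blocks = merged

    def snapshot(self):
        with self._lock:
            return list(self._blocks)


def request_with_fallback(pool, amount, minimum):
    """Try amount, then smaller amounts down to minimum (cf. NEGOTIATOR)."""
    while amount >= minimum:
        block = pool.take(amount)
        if block is not None:
            return block
        amount -= 1
    return None


# Usage: split a block twice, return both pieces, then recombine them.
pool = ResourcePool()
pool.add(0, 16)
first = pool.take(4)
second = pool.take(4)
after_split = pool.snapshot()
pool.add(*first)
pool.add(*second)
pool.merge_adjacent()
after_merge = pool.snapshot()
```

A single lock around every operation keeps the sketch simple; a real ASA would also need to decide how the pool is persisted, as noted above.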
+
+   MAIN PROGRAM:
+
+   Create empty resource_pool (and an associated lock)
+   Create empty delegated_list
+   Determine whether to act as origin
+   if origin:
+       Obtain initial resource_pool contents from NOC
+       Obtain value of EX1.Params from NOC
+   Register ASA with GRASP
+   Register GRASP objectives EX1.Resource and EX1.Params
+   if origin:
+       Start FLOODER thread to flood EX1.Params
+       Start SYNCHRONIZER listener for EX1.Params
+   Start MAIN_NEGOTIATOR thread for EX1.Resource
+   if not origin:
+       Obtain value of EX1.Params from GRASP flood or synchronization
+       Start DELEGATOR thread
+   Start GARBAGE_COLLECTOR thread
+   good_peer = none
+   do forever:
+       if resource_pool is low:
+           Calculate amount A of resource needed
+           Discover peers using GRASP M_DISCOVER / M_RESPONSE
+           if good_peer in peers:
+               peer = good_peer
+           else:
+               peer = #any choice among peers
+           grasp.request_negotiate("EX1.Resource", peer)
+           #i.e., send negotiation request
+           Wait for response (M_NEGOTIATE, M_END or M_WAIT)
+           if OK:
+               if offered amount of resource sufficient:
+                   Send M_END + O_ACCEPT #negotiation succeeded
+                   Add resource to pool
+                   good_peer = peer #remember this choice
+               else:
+                   Send M_END + O_DECLINE #negotiation failed
+                   good_peer = none #forget this choice
+       sleep() #periodic timer suitable for application scenario
+
+   MAIN_NEGOTIATOR thread:
+
+   do forever:
+       grasp.listen_negotiate("EX1.Resource")
+       #i.e., wait for negotiation request
+       Start a separate new NEGOTIATOR thread for requested amount A
+
+   NEGOTIATOR thread:
+
+   Request resource amount A from resource_pool
+   if not OK:
+       while not OK and A > Amin:
+           A = A-1
+           Request resource amount A from resource_pool
+   if OK:
+       Offer resource amount A to peer by GRASP M_NEGOTIATE
+       if received M_END + O_ACCEPT:
+           #negotiation succeeded
+       elif received M_END + O_DECLINE or other error:
+           #negotiation failed
+           Return resource to resource_pool
+   else:
+       Send M_END + O_DECLINE #negotiation failed
+   #thread exits
+
+   DELEGATOR thread:
+
+   do forever:
+       Wait for request or release for resource amount A
+       if request:
+           Get resource amount A from resource_pool
+           if OK:
+               Delegate resource to consumer #atomic
+               Record in delegated_list      #operation
+           else:
+               Signal failure to consumer
+               Signal main thread that resource_pool is low
+       else:
+           Delete resource from delegated_list
+           Return resource amount A to resource_pool
+
+   SYNCHRONIZER thread:
+
+   do forever:
+       Wait for M_REQ_SYN message for EX1.Params
+       Reply with M_SYNCH message for EX1.Params
+
+   FLOODER thread:
+
+   do forever:
+       Send M_FLOOD message for EX1.Params
+       sleep() #periodic timer suitable for application scenario
+
+   GARBAGE_COLLECTOR thread:
+
+   do forever:
+       Search resource_pool for adjacent resources
+       Merge adjacent resources
+       sleep() #periodic timer suitable for application scenario
+
+Acknowledgements
+
+   Valuable comments were received from Michael Behringer, Menachem
+   Dodge, Martin Dürst, Toerless Eckert, Thomas Fossati, Alex Galis,
+   Bing Liu, Benno Overeinder, Michael Richardson, Rob Wilton, and other
+   IESG members.
+
+Authors' Addresses
+
+   Brian Carpenter
+   School of Computer Science
+   University of Auckland
+   PB 92019
+   Auckland 1142
+   New Zealand
+   Email: brian.e.carpenter@gmail.com
+
+
+   Laurent Ciavaglia
+   Rakuten Mobile
+   Paris
+   France
+   Email: laurent.ciavaglia@rakuten.com
+
+
+   Sheng Jiang
+   Huawei Technologies Co., Ltd
+   Q14 Huawei Campus
+   156 Beiqing Road
+   Hai-Dian District
+   Beijing
+   100095
+   China
+   Email: jiangsheng@huawei.com
+
+
+   Pierre Peloso
+   Nokia
+   Villarceaux
+   91460 Nozay
+   France
+   Email: pierre.peloso@nokia.com