diff options
Diffstat (limited to 'doc/rfc/rfc7019.txt')
-rw-r--r-- | doc/rfc/rfc7019.txt | 2299 |
1 files changed, 2299 insertions, 0 deletions
diff --git a/doc/rfc/rfc7019.txt b/doc/rfc/rfc7019.txt new file mode 100644 index 0000000..2939fce --- /dev/null +++ b/doc/rfc/rfc7019.txt @@ -0,0 +1,2299 @@ + + + + + + +Internet Research Task Force (IRTF) J. Buford +Request for Comments: 7019 Avaya Labs Research +Category: Experimental M. Kolberg, Ed. +ISSN: 2070-1721 University of Stirling + September 2013 + + + Application-Layer Multicast Extensions + to REsource LOcation And Discovery (RELOAD) + +Abstract + + We define a REsource LOcation And Discovery (RELOAD) Usage for + Application-Layer Multicast (ALM) as well as a mapping to the RELOAD + experimental message type to support ALM. The ALM Usage is intended + to support a variety of ALM control algorithms in an overlay- + independent way. Two example algorithms are defined, based on Scribe + and P2PCast. + + This document is a product of the Scalable Adaptive Multicast + Research Group (SAM RG). + +Status of This Memo + + This document is not an Internet Standards Track specification; it is + published for examination, experimental implementation, and + evaluation. + + This document defines an Experimental Protocol for the Internet + community. This document is a product of the Internet Research Task + Force (IRTF). The IRTF publishes the results of Internet-related + research and development activities. These results might not be + suitable for deployment. This RFC represents the consensus of the + Scalable Adaptive Multicast Research Group of the Internet Research + Task Force (IRTF). Documents approved for publication by the IRSG + are not a candidate for any level of Internet Standard; see Section 2 + of RFC 5741. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + http://www.rfc-editor.org/info/rfc7019. + + + + + + + + + + +Buford & Kolberg Experimental [Page 1] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +Copyright Notice + + Copyright (c) 2013 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (http://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. + +Table of Contents + + 1. Introduction ....................................................4 + 1.1. Requirements Language ......................................5 + 2. Definitions .....................................................5 + 2.1. Overlay Network ............................................5 + 2.2. Overlay Multicast ..........................................5 + 2.3. Source-Specific Multicast (SSM) ............................6 + 2.4. Any-Source Multicast (ASM) .................................6 + 2.5. Peer .......................................................6 + 3. Assumptions .....................................................6 + 3.1. Overlay ....................................................6 + 3.2. Overlay Multicast ..........................................7 + 3.3. RELOAD .....................................................7 + 3.4. NAT ........................................................7 + 3.5. Tree Topology ..............................................7 + 4. Architecture Extensions to RELOAD ...............................7 + 5. RELOAD ALM Usage ................................................9 + 6. ALM Tree Control Signaling ......................................9 + 7. ALM Messages Mapped to RELOAD ..................................11 + 7.1. Introduction ..............................................11 + 7.2. Tree Lifecycle Messages ...................................12 + 7.2.1. CreateALMTree ......................................12 + 7.2.2. CreateALMTreeResponse ..............................13 + 7.2.3. Join ...............................................13 + 7.2.4. JoinAccept (Join Response) .........................14 + 7.2.5. JoinReject (Join Response) .........................15 + 7.2.6. JoinConfirm ........................................15 + 7.2.7. JoinConfirmResponse ................................16 + 7.2.8. JoinDecline ........................................16 + 7.2.9. JoinDeclineResponse ................................16 + 7.2.10. Leave .............................................17 + 7.2.11. LeaveResponse .....................................17 + 7.2.12. Reform or Optimize Tree ...........................17 + 7.2.13. ReformResponse ....................................18 + 7.2.14. Heartbeat .........................................18 + + + +Buford & Kolberg Experimental [Page 2] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + 7.2.15. Heartbeat Response ................................18 + 7.2.16. NodeQuery .........................................19 + 7.2.17. NodeQueryResponse .................................19 + 7.2.18. Push ..............................................21 + 7.2.19. PushResponse ......................................22 + 8. Scribe Algorithm ...............................................22 + 8.1. Overview ..................................................22 + 8.2. Create ....................................................23 + 8.3. Join ......................................................24 + 8.4. Leave .....................................................24 + 8.5. JoinConfirm ...............................................24 + 8.6. JoinDecline ...............................................24 + 8.7. Multicast .................................................24 + 9. P2PCast Algorithm ..............................................25 + 9.1. Overview ..................................................25 + 9.2. Message Mapping ...........................................25 + 9.3. Create ....................................................26 + 9.4. Join ......................................................26 + 9.5. Leave .....................................................28 + 9.6. JoinConfirm ...............................................28 + 9.7. Multicast .................................................28 + 10. Message Format ................................................28 + 10.1. ALMHeader Definition .....................................30 + 10.2. ALMMessageContents Definition ............................31 + 10.3. Response Codes ...........................................31 + 11. Examples ......................................................32 + 11.1. Create Tree ..............................................32 + 11.2. Join Tree ................................................33 + 11.3. Leave Tree ...............................................35 + 11.4. Push Data ................................................35 + 12. Kind Definitions ..............................................36 + 12.1. ALMTree Kind Definition ..................................36 + 13. RELOAD Configuration File Extensions ..........................37 + 14. IANA Considerations ...........................................37 + 14.1. ALM Algorithm Types ......................................37 + 14.2. Message Code Registration ................................38 + 14.3. Error Code Registration ..................................38 + 15. Security Considerations .......................................39 + 16. Acknowledgements ..............................................40 + 17. References ....................................................40 + 17.1. Normative Reference ......................................40 + 17.2. Informative References ...................................40 + + + + + + + + + +Buford & Kolberg Experimental [Page 3] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +1. Introduction + + The concept of scalable adaptive multicast includes both scaling + properties and adaptability properties. Scalability is intended to + cover: + + o large group size + + o large numbers of small groups + + o rate of group membership change + + o admission control for QoS + + o use with network-layer QoS mechanisms + + o varying degrees of reliability + + o trees connecting nodes over the global Internet + + Adaptability includes + + o use of different control mechanisms for different multicast trees + depending on initial application parameters or application classes + + o changing multicast tree structure depending on changes in + application requirements, network conditions, and membership + + Application-Layer Multicast (ALM) has been demonstrated to be a + viable multicast technology where native multicast isn't available. + Many ALM designs have been proposed. This ALM Usage focuses on: + + o ALM implemented in RELOAD-based overlays + + o Support for a variety of ALM control algorithms + + o Providing a basis for defining a separate hybrid ALM RELOAD Usage + + RELOAD [RELOAD] has an application extension mechanism in which a new + type of application defines a Usage. A RELOAD Usage defines a set of + data types and rules for their use. In addition, this document + describes additional message types and a new ALM algorithm plugin + architectural component. + + This document represents the consensus of the SAM RG. It was + repeatedly discussed within the research group, as well as with other + Application-Layer Multicast experts. + + + + +Buford & Kolberg Experimental [Page 4] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +1.1. Requirements Language + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in RFC 2119 [RFC2119]. + +2. Definitions + + We adopt the terminology defined in Section 3 of [RELOAD], + specifically the distinction between "node", "peer", and "client". + +2.1. Overlay Network + + Overlay network: An application-layer virtual or logical network with + addressable end points that provides connectivity, routing, and + messaging between end points. Overlay networks are frequently used + as a substrate for deploying new network services or for providing a + routing topology not available from the underlying physical network. + Many peer-to-peer systems are overlay networks that run on top of the + Internet. In Figure 1, "P" indicates overlay peers, and peers are + connected in a logical address space. The links shown in the figure + represent predecessor/successor links. Depending on the overlay + routing model, additional or different links may be present. + + P P P P P + ..+....+....+...+.....+... + . +P + P+ . + . +P + ..+....+....+...+.....+... + P P P P P + + Figure 1: Overlay Network Example + +2.2. Overlay Multicast + + Overlay Multicast (OM): Hosts participating in a multicast session + form an overlay network and utilize unicast connections among pairs + of hosts for data dissemination [BUFORD2009] [KOLBERG2010] + [BUFORD2008]. The hosts in overlay multicast exclusively handle + group management, routing, and tree construction, without any support + from Internet routers. This is also commonly known as Application- + Layer Multicast (ALM) or End-System Multicast (ESM). We call systems + that use proxies connected in an overlay multicast backbone "proxied + overlay multicast" or POM. + + + + + + +Buford & Kolberg Experimental [Page 5] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +2.3. Source-Specific Multicast (SSM) + + SSM tree: The creator of the tree is the source. It sends data + messages to the tree root that are forwarded down the tree. + +2.4. Any-Source Multicast (ASM) + + ASM tree: A node sending a data message sends the message to its + parent and its children. Each node receiving a data message from one + edge forwards it to the remaining tree edges to which it is + connected. + +2.5. Peer + + Peer: An autonomous end system that is connected to the physical + network and participates in and contributes resources to overlay + construction, routing, and maintenance. Some peers may also perform + additional roles such as connection relays, super nodes, NAT + traversal assistance, and data storage. + +3. Assumptions + +3.1. Overlay + + Peers connect in a large-scale overlay, which may be used for a + variety of peer-to-peer applications in addition to multicast + sessions. Peers may assume additional roles in the overlay beyond + participation in the overlay and in multicast trees. We assume a + single-structured overlay routing algorithm is used. Any of a + variety of multi-hop, one-hop, or variable-hop overlay algorithms + could be used. + + Castro, et al. [CASTRO2003] compared multi-hop overlays and found + that tree-based construction in a single overlay outperformed using + separate overlays for each multicast session. We use a single + overlay rather than separate overlays per multicast session. + + An overlay multicast algorithm may leverage the overlay's mechanism + for maintaining overlay state in the face of churn. For example, a + peer may store a number of DHT (Distributed Hash Table) entries. + When the peer gracefully leaves the overlay, it transfers those + entries to the nearest peer. When another peer joins that is closer + to some of the entries than the current peer that holds those + entries, than those entries are migrated. Overlay churn affects + multicast trees as well; remedies include automatic migration of the + tree state and automatic rejoin operations for dislocated child + nodes. + + + + +Buford & Kolberg Experimental [Page 6] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +3.2. Overlay Multicast + + The overlay supports concurrent multiple multicast trees. The limit + on the number of concurrent trees depends on peer and network + resources and is not an intrinsic property of the overlay. + +3.3. RELOAD + + We use RELOAD [RELOAD] as the peer-to-peer (P2P) overlay for data + storage and the mechanism by which the peers interconnect and route + messages. RELOAD is a generic P2P overlay, and application support + is defined by profiles called Usages. + +3.4. NAT + + Some nodes in the overlay may be in a private address space and + behind firewalls. We use the RELOAD mechanisms for NAT traversal. + We permit clients to be leaf nodes in an ALM tree. + +3.5. Tree Topology + + All tree control messages are routed in the overlay. Two types of + data or media topologies are envisioned: 1) tree edges are paths in + the overlay, and 2) tree edges are direct connections between a + parent and child peer in the tree, formed using the RELOAD AppAttach + method. + +4. Architecture Extensions to RELOAD + + There are two changes as depicted in Figure 2. New ALM messages are + mapped to RELOAD Message Transport using the RELOAD experimental + message type. A plugin for ALM algorithms handles the ALM state and + control. The ALM algorithm is under control of the application via + the Group API [COMMON-API]. + + + + + + + + + + + + + + + + + +Buford & Kolberg Experimental [Page 7] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + +---------+ + |Group API| + +---------+ + | + ------------------- Application ------------------------ + +-------+ | + | ALM | | + | Usage | | + +-------+ | + -------------- Messaging Service Boundary -------------- + | + +--------+ +-----------+---------+ +---------+ + | Storage|<---> | RELOAD | ALM |<-->| ALM Alg | + +--------+ | Message | Messages| +---------+ + ^ | Transport | | + | +-----------+---------+ + v | | + +-------------+ | + | Topology | | + | Plugin | | + +-------------+ | + ^ | + v v + +-------------------+ + | Forwarding & | + | Link Management | + +-------------------+ + + ---------- Overlay Link Service Boundary -------------- + + Figure 2: RELOAD Architecture Extensions + + The ALM components interact with RELOAD as follows: + + o ALM uses the RELOAD data storage functionality to store an ALMTree + instance when a new ALM tree is created in the overlay and to + retrieve ALMTree instance(s) for existing ALM trees. + + o ALM applications and management tools may use the RELOAD data + storage functionality to store diagnostic information about the + operation of trees, including average number of trees, delay from + source to leaf nodes, bandwidth use, and packet loss rate. In + addition, diagnostic information may include statistics specific + to the tree root or to any node in the tree. + + + + + + + +Buford & Kolberg Experimental [Page 8] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +5. RELOAD ALM Usage + + Applications of RELOAD are restricted in the data types that can be + stored in the DHT. The profile of accepted data types for an + application is referred to as a Usage. RELOAD is designed so that + new applications can easily define new Usages. New RELOAD Usages are + needed for multicast applications since the data types in base RELOAD + and existing Usages are not sufficient. + + We define an ALM Usage in RELOAD. This ALM Usage is sufficient for + applications that require ALM functionality in the overlay. Figure 2 + shows the internal structure of the ALM Usage. This contains the + Group API ([COMMON-API]), an ALM algorithm plugin (e.g., Scribe), and + the ALM messages that are then sent out to the RELOAD network. + + A RELOAD Usage is required [RELOAD] to define the following: + + o Kind-ID and code points + + o data structures for each Kind + + o access control rules for each Kind + + o the Resource Name used to hash to the Resource ID that determines + where the Kind is stored + + o address restoration after recovery from a network partition (to + form a single coherent network) + + o the types of connections that can be initiated using AppConnect + + An ALM group_id is a RELOAD node_id. The owner of an ALM group + creates a RELOAD node_id as specified in [RELOAD]. This means that a + group_id is used as a RELOAD Destination for overlay routing + purposes. + +6. ALM Tree Control Signaling + + Peers use the overlay to support ALM operations such as: + + o CreateALMTree + + o Join + + o Leave + + o Reform or optimize tree + + + + +Buford & Kolberg Experimental [Page 9] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + There are a variety of algorithms for peers to form multicast trees + in the overlay. The approach presented here permits multiple such + algorithms to be supported in the overlay since different algorithms + may be more suitable for certain application requirements; the + approach also supports experimentation. Therefore, overlay messaging + corresponding to the set of overlay multicast operations MUST carry + algorithm identification information. + + For example, for small groups, the join point might be directly + assigned by the rendezvous point, while for large trees the Join + request might be propagated down the tree with candidate parents + forwarding their position directly to the new node. + + Here is a simplistic notation for forming a multicast tree in the + overlay. Its main advantage is the use of the overlay for routing + both control and data messages. The group creator does not have to + be the root of the tree or even in the tree. It does not consider + per-node load, admission control, or alternative paths. After the + creation of a tree, the group_id is expected to be advertised or + distributed out of band, perhaps by publishing in the DHT. + Similarly, joining peers will discover the group_id out of band, + perhaps by a lookup in the tree. + + As stated earlier, multiple algorithms will coexist in the overlay. + + 1. Peer that initiates multicast group: + + group_id = create(); // Allocate a unique group_id. + // The root is the nearest + // peer in the overlay. + + 2. Any joining peer: + + joinTree(group_id); // sends "join group_id" message + + The overlay routes the Join request using the overlay routing + mechanism toward the peer with the nearest ID to the group_id. + This peer is the root. Peers on the path to the root join the + tree as forwarding points. + + 3. Leave Tree: + + leaveTree(group_id); // removes this node from the tree + + + + + + + + +Buford & Kolberg Experimental [Page 10] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + Propagates a Leave request to each child node and to the parent + node. If the parent node is a forwarding node and this is its + last child, then it propagates a Leave request to its parent. A + child node receiving a Leave request from a parent sends a Join + request to the group_id. + + 4. Message forwarding: + + multicastMsg(group_id, msg); + + For message forwarding, both Any-Source Multicast (ASM) and + Source-Specific Multicast (SSM) approaches may be used. + +7. ALM Messages Mapped to RELOAD + +7.1. Introduction + + In this document, we define messages for overlay multicast tree + creation, using an existing protocol (RELOAD) in the P2P-SIP WG + [RELOAD] for a universal structured peer-to-peer overlay protocol. + RELOAD provides the mechanism to support a number of overlay + topologies. Hence, the overlay multicast framework defined in this + document can be used with P2P-SIP and makes the Scalable Adaptive + Multicast (SAM) framework overlay agnostic. + + As discussed in the SAM requirements document [SAM-GENERIC], there + are a variety of ALM tree formation and tree maintenance algorithms. + The intent of this specification is to be algorithm agnostic, similar + to how RELOAD is overlay algorithm agnostic. We assume that all + control messages are propagated using overlay routed messages. + + The message types needed for ALM behavior are divided into the + following categories: + + o Tree lifecycle (Create, Join, Leave, Reform, Heartbeat) + + o Peer region and multicast properties + + The message codes are defined in Section 14.2 of this document. + Messages are mapped to the RELOAD experimental message type. + + In the following sections, the protocol messages as mapped to RELOAD + are discussed. Detailed example message flows are provided in + Section 11. + + + + + + + +Buford & Kolberg Experimental [Page 11] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + In the following descriptions, we use the datatype Dictionary, which + is a set of opaque values indexed by an opaque key with one value for + each key. A single dictionary entry is represented by a + DictionaryEntry as defined in Section 7.2.3 of the RELOAD document + [RELOAD]. The Dictionary datatype is defined as follows: + + struct { + DictionaryEntry elements<0..2^16-1>; + } Dictionary; + +7.2. Tree Lifecycle Messages + + Peers use the overlay to transmit ALM operations defined in this + section. + +7.2.1. CreateALMTree + + A new ALM tree is created in the overlay with the identity specified + by group_id. The common interpretation in a DHT-based overlay of + group_id is that the peer with a peer_id closest to and less than the + group_id is the root of the tree. However, other overlay types are + supported. The tree has no children at the time it is created. + + The group_id is generated from a well-known session key to be used by + other peers to address the multicast tree in the overlay. The + generation of the group_id from the session_key MUST be done using + the overlay's ID-generation mechanism. + + struct { + node_id peer_id; + opaque session_key<0..2^32-1>; + node_id group_id; + Dictionary options; + } ALMTree; + + peer_id: overlay address of the peer that creates the multicast tree. + + session_key: a well-known string that when hashed using the overlay's + ID-generation algorithm produces the group_id. + + group_id: overlay address of the root of the tree. + + options: name-value list of properties to be associated with the + tree, such as the maximum size of the tree, restrictions on peers + joining the tree, latency constraints, preference for distributed or + centralized tree formation and maintenance, and Heartbeat interval. + + + + + +Buford & Kolberg Experimental [Page 12] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + Tree creation is subject to access control since it involves a Store + operation. The NODE-MATCH access policy defined in Section 7.3.2 of + [RELOAD] is used. + + A successful CreateALMTree causes an ALMTree structure to be stored + in the overlay at the node G responsible for the group_id. This node + G performs the RELOAD-defined StoreReq operation as a side effect of + performing the CreateALMTree. If the StoreReq fails, the + CreateALMTree fails too. + + After a successful CreateALMTree, peers can use the RELOAD Fetch + method to retrieve the ALMTree struct at address group_id. The + ALMTree Kind is defined in Section 12.1. + +7.2.2. CreateALMTreeResponse + + After receiving a CreateALMTree message from node S, the peer sends a + CreateALMTreeResponse to node S. + + struct { + Dictionary options; + } CreateALMTreeResponse; + + options: A node may provide algorithm-dependent parameters about the + created tree to the requesting node. + +7.2.3. Join + + Join causes the distributed algorithm for peer join of a specific ALM + group to be invoked. The definition of the Join request is shown + below. If successful, the joining peer is notified of one or more + candidate parent peers in one or more JoinAccept messages. The + particular ALM join algorithm is not specified in this protocol. + + struct { + node_id peer_id; + node_id group_id; + Dictionary options; + } Join; + + peer_id: overlay address of joining/leaving peer + + group_id: overlay address of the root of the tree + + options: name-value list of options proposed by joining peer + + + + + + +Buford & Kolberg Experimental [Page 13] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + RELOAD is a request-response protocol. Consequently, the messages + JoinAccept and JoinReject (defined below) are matching responses for + Join. If JoinReject is received, then no further action on this + request is carried out. If JoinAccept is received, then either a + JoinConfirm or a JoinDecline message (see below) is sent. The + matching response for JoinConfirm is JoinConfirmResponse. The + matching response for JoinDecline is JoinDeclineResponse. + + The following list shows the matching request-responses according to + the request-response mechanism defined in [RELOAD]. + + Join -- JoinAccept: Node C sends a Join request to node P. If + node P accepts, it responds with JoinAccept. + + Join -- JoinReject: Node C sends a Join request to node P. If + node P does not accept the Join request, it responds with + JoinReject. + + JoinConfirm -- JoinConfirmResponse: If node P sent node C a + JoinAccept and node C confirms with a JoinConfirm request, then + node P responds with a JoinConfirmResponse. + + JoinDecline -- JoinDeclineResponse: If node P sent node C a + JoinAccept and node C declines with a JoinDecline request, then + node P responds with a JoinDeclineResponse. + + Thus, Join, JoinConfirm, and JoinDecline are treated as requests as + defined in RELOAD, are mapped to the RELOAD exp_a_req message, and + are therefore retransmitted until either a retry limit is reached or + a matching response received. JoinAccept, JoinReject, + JoinConfirmResponse, and JoinDeclineResponse are treated as message + responses as defined above and are mapped to the RELOAD exp_a_ans + message. + + The Join behavior can be described as follows: + + if(checkAccept(msg)) { + recvJoins.add(msg.source, msg.group_id) + SEND(JoinAccept(node_id, msg.source, msg.group_id)) + } + +7.2.4. JoinAccept (Join Response) + + JoinAccept tells the requesting joining peer that the indicated peer + is available to act as its parent in the ALM tree specified by + group_id, with the corresponding options specified. A peer MAY + receive more than one JoinAccept from different candidate parent + peers in the group_id tree. The peer accepts a peer as parent using + + + +Buford & Kolberg Experimental [Page 14] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + a JoinConfirm message. A JoinAccept that receives neither a + JoinConfirm nor JoinDecline message MUST expire. RELOAD + implementations are able to read a local configuration file for + settings. It is assumed that this file contains the timeout value to + be used. + + struct { + node_id parent_peer_id; + node_id child_peer_id; + node_id group_id; + Dictionary options; + } JoinAccept; + + parent_peer_id: overlay address of a peer that accepts the joining + peer + + child_peer_id: overlay address of joining peer + + group_id: overlay address of the root of the tree + + options: name-value list of options accepted by parent peer + +7.2.5. JoinReject (Join Response) + + A peer receiving a Join request responds with a JoinReject response + to indicate the request is rejected. + +7.2.6. JoinConfirm + + A peer receiving a JoinAccept message that it wishes to accept MUST + explicitly accept it using a JoinConfirm message before the + expiration of a timer for the JoinAccept message. The joining peer + MUST include only those options from the JoinAccept that it also + accepts, completing the negotiation of options between the two peers. + + struct { + node_id child_peer_id; + node_id parent_peer_id; + node_id group_id; + Dictionary options; + } JoinConfirm; + + child_peer_id: overlay address of joining peer that is a child of the + parent peer + + parent_peer_id: overlay address of the peer that is the parent of the + joining peer + + + + +Buford & Kolberg Experimental [Page 15] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + group_id: overlay address of the root of the tree + + options: name-value list of options accepted by both peers + + The JoinConfirm message behavior is described below: + + if(recvJoins.contains(msg.source,msg.group_id)){ + if !(groups.contains(msg.group_id)) { + groups.add(msg.group_id) + SEND(msg,msg.group_id) + } + groups[msg.group_id].children.add(msg.source) + recvJoins.del(msg.source, msg.group_id) + } + +7.2.7. JoinConfirmResponse + + A peer receiving a JoinConfirm message responds with a + JoinConfirmResponse message. + +7.2.8. JoinDecline + + A peer receiving a JoinAccept message that it does not wish to accept + MAY explicitly decline it using a JoinDecline message. + + struct { + node_id peer_id; + node_id parent_peer_id; + node_id group_id; + } JoinDecline; + + peer_id: overlay address of joining peer that declines the JoinAccept + + parent_peer_id: overlay address of the peer that issued a JoinAccept + to this peer + + group_id: overlay address of the root of the tree + + The behavior of the JoinDecline message is described as follows: + + if(recvJoins.contains(msg.source,msg.group_id)) + recvJoins.del(msg.source, msg.group_id) + +7.2.9. JoinDeclineResponse + + A peer receiving a JoinConfirm message responds with a + JoinDeclineResponse message. + + + + +Buford & Kolberg Experimental [Page 16] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +7.2.10. Leave + + A peer that is part of an ALM tree identified by group_id that + intends to detach from either a child or parent peer SHOULD send a + Leave request to the peer from which it wishes to detach. A peer + receiving a Leave request from a peer that is neither in its parent + nor child lists SHOULD ignore the message. + + struct { + node_id peer_id; + node_id group_id; + Dictionary options; + } Leave; + + peer_id: overlay address of leaving peer + + group_id: overlay address of the root of the tree + + options: name-value list of options + + The behavior of the Leave request can be described as: + + groups[msg.group_id].children.remove(msg.source) + if (groups[msg.group].children = 0) + SEND(msg,groups[msg.group_id].parent) + +7.2.11. LeaveResponse + + A peer receiving a Leave request responds with a LeaveResponse + message. + +7.2.12. Reform or Optimize Tree + + This triggers a reorganization of either the entire tree or only a + subtree. It MAY include hints to specific peers of recommended + parent or child peers to which to reconnect. A peer receiving this + message MAY ignore it, MAY propagate it to other peers in its + subtree, and MAY invoke local algorithms for selecting preferred + parent and/or child peers. + + struct { + node_id group_id; + node_id peer_id; + Dictionary options; + } Reform; + + group_id: overlay address of the root of the tree + + + + +Buford & Kolberg Experimental [Page 17] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + peer_id: if omitted, then the tree is reorganized starting from the + root; otherwise, it is reorganized only at the subtree identified by + peer_id. + + options: name-value list of options + +7.2.13. ReformResponse + + A peer receiving a Reform message responds with a ReformResponse. + + struct { + Dictionary options; + } ReformResponse; + + options: algorithm-dependent information about the results of the + Reform operation + +7.2.14. Heartbeat + + A child node signals to its adjacent parent nodes in the tree that it + is alive. If a parent node does not receive a Heartbeat message + within N Heartbeat time intervals, it MUST treat this as an explicit + Leave request from the unresponsive peer. N is configurable. RELOAD + implementations are able to read a local configuration file for + settings. It is assumed that this file contains the value for N to + be used. + + struct { + node_id peer_id_src; + node_id peer_id_dst; + node_id group_id; + Dictionary options; + } Heartbeat; + + peer_id_src: source of Heartbeat + + peer_id_dst: destination of Heartbeat + + group_id: overlay address of the root of the tree + + options: an algorithm may use the Heartbeat message to provide state + information to adjacent nodes in the tree + + + + + + + + + +Buford & Kolberg Experimental [Page 18] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +7.2.15. Heartbeat Response + + A parent node responds with a HeartbeatResponse to a Heartbeat from a + child node indicating that it has received the Heartbeat message. + +7.2.16. NodeQuery + + The NodeQuery message is used to obtain information about the state + and performance of the tree on a per-node basis. A set of nodes + could be queried to construct a centralized view of the multicast + trees, similar to a web crawler. + + struct { + node_id peer_id_src; + node_id peer_id_dst; + } NodeQuery; + + peer_id_src: source of query + + peer_id_dst: destination of query + +7.2.17. NodeQueryResponse + + The response to a NodeQuery message contains a NodeStatistics + instance for this node. + + public struct { + uint32 node_lifetime; + uint32 total_number_trees; + uint16 number_algorithms_supported; + uint8 algorithms_supported[32]; + TreeData max_tree_data; + uint16 number_active_trees; + TreeData tree_data<0..2^8-1>; + ImplementationInfo impl_info; + } NodeStatistics; + + node_lifetime: time the node has been alive in seconds since last + restart + + total_number_trees: total number of trees this node has been part + of during the node lifetime + + number_algorithms_supported: value between 0..2^16-1 corresponding + to the number of algorithms supported + + algorithms_supported: list of algorithms, each byte encoded using + the corresponding algorithm code + + + +Buford & Kolberg Experimental [Page 19] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + max_tree_data: data about tree with largest number of nodes that + this node was part of. NodeQuery can be used to crawl all the + nodes in an ALM tree to fill this field. This is intended to + support monitoring, algorithm design, and general experimentation + with ALM in RELOAD. + + number_active_trees: current number of trees that the node is part + of + + tree_data: details of each active tree; the number of such is + specified by number_active_trees + + impl_info: information about the implementation of this Usage + + public struct { + uint32 tree_id; + uint8 algorithm; + node_id tree_root; + uint8 number_parents; + node_id parent<0..2^8-1>; + uint16 number_child_nodes; + node_id children<0..2^16-1>; + uint32 path_length_to_root; + uint32 path_delay_to_root; + uint32 path_delay_to_child; + } TreeData; + + tree_id: the ID of the tree + + algorithm: code identifying the multicast algorithm used by this + tree + + tree_root: node_id of tree root, or 0 if unknown + + number_parents: 0 .. 2^8-1 indicates number of parent nodes for + this node + + parent: the RELOAD node_id of each parent node + + number_child_nodes: 0..2^16-1 indicates number of children + + children: the RELOAD node_id of each child node + + path_length_to_root: number of overlay hops to the root of the + tree + + path_delay_to_root: RTT in milliseconds to root node + + + + +Buford & Kolberg Experimental [Page 20] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + path_delay_to_child: last measured RTT in milliseconds to child + node with largest RTT + + public struct { + uint32 join_confirm_timeout; + uint32 heartbeat_interval; + uint32 heartbeat_response_timeout; + uint16 info_length; + uint8 info<0..2^16-1>; + } ImplementationInfo; + + join_confirm_timeout: The default time for + JoinConfirm/JoinDecline, intended to provide sufficient time for a + Join request to receive all responses and confirm the best choice. + Default value is 5000 msec. An implementation can change this + value. + + heartbeat_interval: The default Heartbeat interval is 2000 msec. + Different interoperating implementations could use different + intervals. + + heartbeat_response_timeout: The default Heartbeat timeout is 5000 + msec and is the max time between Heartbeat reports from an + adjacent node in the tree at which point the Heartbeat is missed. + + info_length: length of the info field + + info: implementation-specific information, such as name of + implementation, build version, and implementation-specific + features + + + + + + + + + + + + + + + + + + + + + +Buford & Kolberg Experimental [Page 21] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +7.2.18. Push + + A peer sends arbitrary multicast data to other peers in the tree. + Nodes in the tree forward this message to adjacent nodes in the tree + in an algorithm-dependent way. + + struct { + node_id group_id; + uint8 priority; + uint32 length; + uint8 data<0..2^32-1>; + } Push; + + group_id: overlay address of root of the ALM tree + + priority: the relative priority of the message; highest priority is + 255. A node may ignore this field. + + length: length of the data field in bytes + + data: the data + + In pseudocode, the behavior of Push can be described as: + + foreach(groups[msg.group_id].children as node_id) + SEND(msg,node_id) + if memberOf(msg.group_id) + invokeMessageHandler(msg.group_id, msg) + +7.2.19. PushResponse + + After receiving a Push message from node S, the receiving peer sends + a PushResponse to node S. + + struct { + Dictionary options; + } PushResponse; + + options: A node may provide feedback to the sender about previous + Push messages in some window, for example, the last N Push messages. + The feedback could include, for each Push message received, the + number of adjacent nodes that were forwarded the Push message and the + number of adjacent nodes from which a PushResponse was received. + + + + + + + + +Buford & Kolberg Experimental [Page 22] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +8. Scribe Algorithm + +8.1. Overview + + Figure 3 shows a mapping between RELOAD ALM messages (as defined in + Section 5 of this document) and Scribe messages as defined in + [CASTRO2002]. + + +---------+-------------------+-----------------+ + | Section |RELOAD ALM Message | Scribe Message | + +---------+-------------------+-----------------+ + | 7.2.1 | CreateALMTree | Create | + +---------+-------------------+-----------------+ + | 7.2.3 | Join | Join | + +---------+-------------------+-----------------+ + | 7.2.4 | JoinAccept | | + +---------+-------------------+-----------------+ + | 7.2.6 | JoinConfirm | | + +---------+-------------------+-----------------+ + | 7.2.8 | JoinDecline | | + +---------+-------------------+-----------------+ + | 7.2.10 | Leave | Leave | + +---------+-------------------+-----------------+ + | 7.2.12 | Reform | | + +---------+-------------------+-----------------+ + | 7.2.14 | Heartbeat | | + +---------+-------------------+-----------------+ + | 7.2.16 | NodeQuery | | + +---------+-------------------+-----------------+ + | 7.2.18 | Push | Multicast | + +---------+-------------------+-----------------+ + | | Note 1 | deliver | + +---------+-------------------+-----------------+ + | | Note 1 | forward | + +---------+-------------------+-----------------+ + | | Note 1 | route | + +---------+-------------------+-----------------+ + | | Note 1 | send | + +---------+-------------------+-----------------+ + + Figure 3: Mapping to Scribe Messages + + Note 1: These Scribe messages are handled by RELOAD messages. + + The following sections describe the Scribe algorithm in more detail. + + + + + + +Buford & Kolberg Experimental [Page 23] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +8.2. Create + + This message will create a group with group_id. This message MUST be + delivered to the node whose node_id is closest to the group_id. This + node becomes the rendezvous point and root for the new multicast + tree. Groups MAY have multiple sources of multicast messages. + +8.3. Join + + To join a multicast tree, a node SHOULD send a Join request with the + group_id as the key. This message gets routed by the overlay to the + rendezvous point of the tree. If an intermediate node is already a + forwarder for this tree, it SHOULD add the joining node as a child. + Otherwise, the node SHOULD create a child table for the group and add + the joining node. It SHOULD then send the Join request towards the + rendezvous point terminating the Join request from the child. + + To adapt the Scribe algorithm to the ALM Usage proposed here, after a + Join request is accepted, a JoinAccept message MUST be returned to + the joining node. + +8.4. Leave + + When leaving a multicast group, a node SHOULD change its local state + to indicate that it left the group. If the node has no children in + its table, it MUST send a Leave request to its parent, from where it + SHOULD travel up the multicast tree and stop at a node that still has + children remaining after removing the leaving node. + +8.5. JoinConfirm + + This message is not part of the Scribe protocol but is required by + the basic protocol proposed in this document. Thus, the Usage MUST + send this message to confirm a joining node accepting its parent + node. + +8.6. JoinDecline + + Like JoinConfirm, this message is not part of the Scribe protocol. + Thus, the Usage MUST send this message if a peer receiving a + JoinAccept message wishes to decline it. + +8.7. Multicast + + A message to be multicast to a group MUST be sent to the rendezvous + node from where it is forwarded down the tree. If a node is a member + of the tree rather than just a forwarder, it SHOULD pass the + multicast data up to the application. + + + +Buford & Kolberg Experimental [Page 24] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +9. P2PCast Algorithm + +9.1. Overview + + P2PCast [P2PCAST] creates a forest of related trees to increase load + balancing. P2PCast is independent of the underlying P2P substrate. + Its goals and approach are similar to SplitStream [SPLITSTREAM] + (which assumes Pastry as the P2P overlay). In P2PCast, the content + provider splits the stream of data into f stripes. Each tree in the + forest of multicast trees is an (almost) full tree of arity f. These + trees are conceptually separate: every node of the system appears + once in each tree, with the content provider being the source in all + of them. To ensure that each peer contributes as much bandwidth as + it receives, every node is a leaf in all the trees except for one, in + which the node will serve as an internal node (proper tree of this + node). To reduce the complexity of the discussion that follows, the + remainder of this section will assume that f = 2. However, the + algorithm scales for any number f. + + P2PCast distinguishes the following types of nodes: + + o Incomplete Node: A node with less than f children in its proper + stripe + + o Only-Child Node: A node whose parent (in any multicast tree) is an + incomplete node + + o Complete Node: A node with exactly f children in its proper stripe + + o Special Node: A single node that is a leaf in all multicast trees + of the forest + + + + + + + + + + + + + + + + + + + + +Buford & Kolberg Experimental [Page 25] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +9.2. Message Mapping + + Figure 4 shows a mapping between RELOAD ALM messages (as defined in + Section 5 of this document) and P2PCast messages as defined in + [P2PCAST]. + + +---------+-------------------+-----------------+ + | Section |RELOAD ALM Message | P2PCast Message | + +---------+-------------------+-----------------+ + | 7.2.1 | CreateALMTree | Create | + +---------+-------------------+-----------------+ + | 7.2.3 | Join | Join | + +---------+-------------------+-----------------+ + | 7.2.4 | JoinAccept | | + +---------+-------------------+-----------------+ + | 7.2.6 | JoinConfirm | | + +---------+-------------------+-----------------+ + | 7.2.8 | JoinDecline | | + +---------+-------------------+-----------------+ + | 7.2.10 | Leave | Leave | + +---------+-------------------+-----------------+ + | 7.2.12 | Reform | Takeon | + | | | Substitute | + | | | Search | + | | | Replace | + | | | Direct | + | | | Update | + +---------+-------------------+-----------------+ + | 7.2.14 | Heartbeat | | + +---------+-------------------+-----------------+ + | 7.2.16 | NodeQuery | | + +---------+-------------------+-----------------+ + | 7.2.18 | Push | Multicast | + +---------+-------------------+-----------------+ + + Figure 4: Mapping to P2PCast Messages + + The following sections describe the mapping of the P2PCast messages + in more detail. + +9.3. Create + + This message will create a group with group_id. This message MUST be + delivered to the node whose node_id is closest to the group_id. This + node becomes the rendezvous point and root for the new multicast + tree. The rendezvous point will maintain f subtrees. + + + + + +Buford & Kolberg Experimental [Page 26] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +9.4. Join + + To join a multicast tree, a joining node N MUST send a Join request + to a random node A already part of the tree. Depending on the type + of A, the joining algorithm continues as follows: + + o Incomplete Node: Node A will arbitrarily select for which tree it + wants to serve as an internal node and adopt N in that tree. In + the other tree, node N will adopt node A as a child (taking node + A's place in the tree), thus becoming an internal node in the + stripe that node A didn't choose. + + o Only-Child Node: As this node has a parent that is an incomplete + node, the joining node will be redirected to the parent node and + will handle the request as detailed above. + + o Complete Node: The contacted node A must be a leaf in the other + tree. If node A is a leaf node in Stripe 1, node N will become an + internal node in Stripe 1, taking the place of node A and adopting + it at the same time. To find a place for itself in the other + stripe, node N starts a random walk down the subtree rooted at the + sibling of node A (if node A is the root and thus does not have + siblings, node N is sent directly to a leaf in that tree), which + ends as soon as node N finds an incomplete node or a leaf. In + this case, node N is adopted by the incomplete node. + + o Special Node: as this node is a leaf in all subtrees, the joining + node MAY adopt the node in one tree and become a child in the + other. + + P2PCast uses defined messages for communication between nodes during + reorganization. To use P2PCast in this context, these messages are + encapsulated by the message type Reform. In doing so, the P2PCast + message is to be included in the options parameter of Reform. The + following reorganization messages are defined by P2PCast: + + Takeon: To take another peer as a child + + Substitute: To take the place of a child of some peer + + Search: To obtain the child of a node in a particular stripe + + Replace: Different from Substitute in that the calling node that + makes a node its child sheds off a random child + + Direct: To direct a node to its would-be parent + + Update: A node sends its updated state to its children + + + +Buford & Kolberg Experimental [Page 27] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + To adapt the P2PCast algorithm to the ALM Usage proposed here, after + a Join request is accepted, a JoinAccept message MUST be returned to + the joining node (one for every subtree). + +9.5. Leave + + When leaving a multicast group, a node will change its local state to + indicate that it left the group. Disregarding the case where the + leaving node is the root of the tree, the leaving node must be + complete or incomplete in its proper tree. In the other trees, the + node is a leaf and can just disappear by notifying its parent. For + the proper tree, if the node is incomplete, it is replaced by its + child. However, if the node is complete, a gap is created that is + filled by a random child. If this child is incomplete, it can simply + fill the gap. However, if it is complete, it needs to shed a random + child. This child is directed to its sibling, which sheds a random + child. This process ripples down the tree until the next-to-last + level is reached. The shed node is then taken as a child by the + parent of the deleted node in the other stripe. + + Again, for the reorganization of the tree, the Reform message type is + used as defined in the previous section. + +9.6. JoinConfirm + + This message is not part of the P2PCast protocol but is required by + the basic protocol defined in this document. Thus, the Usage MUST + send this message to confirm a joining node accepting its parent + node. As with Join and JoinAccept, this MUST be carried out for + every subtree. + +9.7. Multicast + + A message to be multicast to a group MUST be sent to the rendezvous + node from where it is forwarded down the tree by being split into k + stripes. Each stripe is then sent via a subtree. If a receiving + node is a member of the tree rather than just a forwarder, it MAY + pass the multicast data up to the application. + +10. Message Format + + All messages are mapped to the RELOAD experimental message type. The + mapping is shown in Figure 5. The message codes are listed in + Section 14.2. The format of the body of a message is provided in + [RELOAD]. + + + + + + +Buford & Kolberg Experimental [Page 28] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + +-------------------------+------------------+ + | Message |RELOAD Code Point | + +-------------------------+------------------+ + | CreateALMTree | exp_a_req | + +-------------------------+------------------+ + | CreateALMTreeResponse | exp_a_ans | + +-------------------------+------------------+ + | Join | exp_a_req | + +-------------------------+------------------+ + | JoinAccept | exp_a_ans | + +-------------------------+------------------+ + | JoinReject | exp_a_ans | + +-------------------------+------------------+ + | JoinConfirm | exp_a_req | + +-------------------------+------------------+ + | JoinConfirmResponse | exp_a_ans | + +-------------------------+------------------+ + | JoinDecline | exp_a_req | + +-------------------------+------------------+ + | JoinDeclineResponse | exp_a_ans | + +-------------------------+------------------+ + | Leave | exp_a_req | + +-------------------------+------------------+ + | LeaveResponse | exp_a_ans | + +-------------------------+------------------+ + | Reform | exp_a_req | + +-------------------------+------------------+ + | ReformResponse | exp_a_ans | + +-------------------------+------------------+ + | Heartbeat | exp_a_req | + +-------------------------+------------------+ + | HeartbeatResponse | exp_a_ans | + +-------------------------+------------------+ + | NodeQuery | exp_a_req | + +-------------------------+------------------+ + | NodeQueryResponse | exp_a_ans | + +-------------------------+------------------+ + | Push | exp_a_req | + +-------------------------+------------------+ + | PushResponse | exp_a_ans | + +-------------------------+------------------+ + + Figure 5: RELOAD Message Code Mapping + + + + + + + + +Buford & Kolberg Experimental [Page 29] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + For Data Kind-IDs, the RELOAD specification [RELOAD] states: "Code + points in the range 0xF0000001 to 0xFFFFFFFE are reserved for private + use". ALM Usage Kind-IDs are defined in the private use range. + + All ALM Usage messages map to the RELOAD Message Extension mechanism. + + Code points for the Kinds defined in this document MUST NOT conflict + with any defined code points for RELOAD. RELOAD defines exp_a_req + and exp_a_ans for experimental purposes. This specification uses + only these message types for all ALM messages. RELOAD defines the + MessageContents data structure. The ALM mapping uses the fields as + follows: + + o message_code: exp_a_req for requests and exp_a_ans for responses + + o message_body: contains one instance of ALMHeader followed by one + instance of ALMMessageContents + + o extensions: unused + +10.1. ALMHeader Definition + + struct { + uint32 sam_token; + uint16 alm_algorithm_id; + uint8 version; + } ALMHeader; + + The fields in ALMHeader are used as follows: + + sam_token: The first four bytes identify this message as an ALM + message. This field MUST contain the value 0xD3414D42 (the string + "SAMB" with the high bit of the first byte set). + + alm_algorithm_id: The ALM Algorithm ID of the ALM algorithm being + used. Each multicast tree uses only one algorithm. Trees with + different ALM algorithms can coexist and can share the same nodes. + ALM Algorithm ID codes are defined in Section 14.1. + + version: The version of the ALM protocol being used. This is a + fixed-point integer between 0.1 and 25.4. This document describes + version 1.0 with a value of 0xA. + + + + + + + + + +Buford & Kolberg Experimental [Page 30] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +10.2. ALMMessageContents Definition + + struct { + uint16 alm_message_code; + opaque alm_message_body; + } ALMMessageContents; + + The fields in ALMMessageContents are used as follows: + + alm_message_code: This indicates the message being sent. The + message codes are listed in Section 14.2. + + alm_message_body: The message body itself, represented as a + variable-length string of bytes. The bytes themselves are + dependent on the code value. See Sections 8 and 9, which describe + the various ALM methods for the definitions of the payload + contents. + +10.3. Response Codes + + Response codes are defined in Section 6.3.3.1 of [RELOAD]. This + specification maps to RELOAD ErrorResponse as follows: + + ErrorResponse.error_code = Error_Exp_A; + + Error_info contains an ALMErrorResponse instance. + + public struct { + uint16 alm_error_code; + opaque alm_error_info<0..2^16-1>; + } ALMErrorResponse; + + alm_error_code: The following error code values are defined. Numeric + values for these are defined in Section 14.3. + + Error_Unknown_Algorithm: The multicast algorithm is not known or + not supported. + + Error_Child_Limit_Reached: The maximum number of child nodes has + been reached for this node. + + Error_Node_Bandwidth_Reached: The overall data bandwidth limit + through this node has been reached. + + Error_Node_Conn_Limit_Reached: The total number of connections to + this node has been reached. + + + + + +Buford & Kolberg Experimental [Page 31] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + Error_Link_Cap_Limit_Reached: The capacity of a link has been + reached. + + Error_Node_Mem_Limit_Reached: An internal memory capacity of the + node has been reached. + + Error_Node_CPU_Cap_Limit_Reached: An internal processing capacity + of the node has been reached. + + Error_Path_Limit_Reached: The maximum path length in hop count + over the multicast tree has been reached. + + Error_Path_Delay_Limit_Reached: The maximum path length in message + delay over the multicast tree has been reached. + + Error_Tree_Fanout_Limit_Reached: The maximum fanout of a multicast + tree has been reached. + + Error_Tree_Depth_Limit_Reached: The maximum height of a multicast + tree has been reached. + + Error_Other: A human-readable description is placed in the + alm_error_info field. + +11. Examples + + All peers in the examples are assumed to have completed + bootstrapping. "Pn" refers to peer N. "group_id" refers to a peer + responsible for storing the ALMTree instance with group_id. + + + + + + + + + + + + + + + + + + + + + + +Buford & Kolberg Experimental [Page 32] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +11.1. Create Tree + + A node with "NODE-MATCH" rights sends a CreateALMTree request to the + group_id node, which also has NODE-MATCH rights for its own address. + The group_id node determines whether to create the new tree and, if + so, performs a local StoreReq. If the CreateALMTree succeeds, the + ALMTree instance can be retrieved using Fetch. An example message + flow for creating a tree is depicted in Figure 6. + + P1 P2 P3 P4 group_id + | | | | | + | | | | | + | | | | | + | CreateALMTree | | | + |------------------------------->| + | | | | | + | | | | | StoreReq + | | | | |--+ + | | | | | | + | | | | | | + | | | | |<-+ + | | | | | StoreResponse + | | | | |--+ + | | | | | | + | | | | | | + | | | | |<-+ + | | | | | + | | | | | + | | CreateALMTreeResponse | + |<-------------------------------| + | | | | | + | | | | | + | Fetch | | | + |------------------------------->| + | | | | | + | | | | | + | | FetchResponse | + |<-------------------------------| + | | | | | + + Figure 6: Message Flow Example for CreateALMTree + +11.2. Join Tree + + P1 joins node group_id as child node. P2 joins the tree as a child + of P1. P4 joins the tree as a child of P1. The corresponding + message flow is shown in Figure 7. + + + + +Buford & Kolberg Experimental [Page 33] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + P1 P2 P3 P4 group_id + | | | | | + | | | | | + | Join | + |------------------------------->| + | | | | | + | JoinAccept | + |<-------------------------------| + | | | | | + | | | | | + | |Join | + | |----------------------->| + | | | | | + | Join| + |<-------------------------------| + | | | | | + |JoinAccept | | | + |------>| | | | + | | | | | + |JoinConfirm | | | + |<------| | | | + | | | | | + | | | |Join | + | | | |------>| + | | | | Join | + |<-------------------------------| + | | | | | + | Join | | | | + |------>| | | | + | | | | | + | JoinAccept | | | + |----------------------->| | + | | | | | + | | JoinAccept | | + | |--------------->| | + | | | | | + | | | | | + | | JoinConfirm | | + |<-----------------------| | + | | | | | + | | JoinDecline | | + | |<---------------| | + | | | | | + | | | | | + + Figure 7: Message Flow Example for Tree Join + + + + + +Buford & Kolberg Experimental [Page 34] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +11.3. Leave Tree + + P1 P2 P3 P4 group_id + | | | | | + | | | | | + | | | Leave | | + |<-----------------------| | + | | | | | + | LeaveResponse | | | + |----------------------->| | + | | | | | + | | | | | + + Figure 8: Message Flow Example for Leave Tree + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Buford & Kolberg Experimental [Page 35] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +11.4. Push Data + + The multicast data is pushed recursively P1 => group_id => P1 => P2, + P4 following the tree topology created in the Join example above. An + example message flow is shown in Figure 9. + + P1 P2 P3 P4 group_id + | | | | | + | Push | | | | + |------------------------------->| + | | | | | + | | | PushResponse| + |<-------------------------------| + | | | | | + | | | | Push| + |<-------------------------------| + | | | | | + | PushResponse | | | + |------------------------------->| + | | | | | + |Push | | | | + |------>| | | | + | | | | | + |PushResponse | | | + |<------| | | | + | | | | | + | Push | | | | + |----------------------->| | + | | | | | + | | PushResponse | | + |<-----------------------| | + | | | | | + | | | | | + | | | | | + + Figure 9: Message Flow Example for Pushing Data + +12. Kind Definitions + +12.1. ALMTree Kind Definition + + This section defines the ALMTree Kind per Section 7.4.5 of [RELOAD]. + An instance of the ALMTree Kind is stored in the overlay for each ALM + tree instance. It is stored at the address group_id. + + Kind-ID: 0xF0000001. (This is a private-use code point per + Section 14.6 of [RELOAD].) The Resource Name for the ALMTree Kind-ID + is the session_key used to identify the ALM tree. + + + +Buford & Kolberg Experimental [Page 36] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + Data Model: The data model is the ALMTree structure. + + Access Control: NODE-MATCH. The node performing the store operation + is required to have NODE-MATCH access. + + Meaning: The meaning of the fields is given in Section 7.2.1. + + struct { + node_id peer_id; + opaque session_key<0..2^32-1>; + node_id group_id; + Dictionary options; + } ALMTree; + +13. RELOAD Configuration File Extensions + + There are no ALM parameters defined for the RELOAD configuration + file. + +14. IANA Considerations + + This section contains the new code points registered by this + document. + +14.1. ALM Algorithm Types + + IANA has created the "SAM ALM Algorithm IDs" registry. Entries in + this registry are 16-bit integers denoting Application-Layer + Multicast algorithms as described in Section 10.1 of this document. + Code points in the range 0x0003 to 0x7FFF SHALL be registered via + RFC 5226 [RFC5226] Expert Review. Code points in the range 0x8000 to + 0xFFFF are reserved for private use. The initial contents of this + registry are: + + +----------------+-------------------+-----------+ + | Algorithm Name | ALM Algorithm ID | RFC | + +----------------+-------------------+-----------+ + | INVALID-ALG | 0x0000 | RFC 7019 | + | SCRIBE-SAM | 0x0001 | RFC 7019 | + | P2PCAST-SAM | 0x0002 | RFC 7019 | + | Reserved | 0x8000-0xFFFF | RFC 7019 | + +----------------+-------------------+-----------+ + + Figure 10: "SAM ALM Algorithm IDs" Registry Allocations + + These values have been made available for the purposes of + experimentation. These values are not meant for vendor-specific use + of any sort and MUST NOT be used for operational deployments. + + + +Buford & Kolberg Experimental [Page 37] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +14.2. Message Code Registration + + IANA has created the "SAM ALM Message Codes" registry. Entries in + this registry are 16-bit integers denoting message codes as described + in Section 10.2 of this document. Code points in the range 0x0014 to + 0x7FFF SHALL be registered via RFC 5226 [RFC5226] Expert Review. + Code points in the range 0x8000 to 0xFFFF are reserved for private + use. The initial contents of this registry are: + + +-------------------------+----------------------+-----------+ + | Message Code Name | Message Code Value | RFC | + +-------------------------+----------------------+-----------+ + | InvalidMessageCode | 0x0000 | RFC 7019 | + | CreateALMTree | 0x0001 | RFC 7019 | + | CreateALMTreeResponse | 0x0002 | RFC 7019 | + | Join | 0x0003 | RFC 7019 | + | JoinAccept | 0x0004 | RFC 7019 | + | JoinReject | 0x0005 | RFC 7019 | + | JoinConfirm | 0x0006 | RFC 7019 | + | JoinConfirmResponse | 0x0007 | RFC 7019 | + | JoinDecline | 0x0008 | RFC 7019 | + | JoinDeclineResponse | 0x0009 | RFC 7019 | + | Leave | 0x000A | RFC 7019 | + | LeaveResponse | 0x000B | RFC 7019 | + | Reform | 0x000C | RFC 7019 | + | ReformResponse | 0x000D | RFC 7019 | + | Heartbeat | 0x000E | RFC 7019 | + | HeartbeatResponse | 0x000F | RFC 7019 | + | NodeQuery | 0x0010 | RFC 7019 | + | NodeQueryResponse | 0x0011 | RFC 7019 | + | Push | 0x0012 | RFC 7019 | + | PushResponse | 0x0013 | RFC 7019 | + | Reserved | 0x8000-0xFFFF | RFC 7019 | + +-------------------------+----------------------+-----------+ + + Figure 11: "SAM ALM Message Codes" Registry Allocations + + These values have been made available for the purposes of + experimentation. These values are not meant for vendor-specific use + of any sort and MUST NOT be used for operational deployments. + +14.3. Error Code Registration + + IANA has created the "SAM ALM Error Codes" registry. Entries in this + registry are 16-bit integers denoting error codes as described in + Section 10.3 of this document. Code points in the range 0x000D to + + + + + +Buford & Kolberg Experimental [Page 38] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + 0x7FFF SHALL be registered via RFC 5226 [RFC5226] Expert Review. + Code points in the range 0x8000 to 0xFFFF are reserved for private + use. The initial contents of this registry are: + + +----------------------------------+---------------+-----------+ + | Error Code Name | Code Value | RFC | + +----------------------------------+---------------+-----------+ + | InvalidErrorCode | 0x0000 | RFC 7019 | + | Error_Unknown_Algorithm | 0x0001 | RFC 7019 | + | Error_Child_Limit_Reached | 0x0002 | RFC 7019 | + | Error_Node_Bandwidth_Reached | 0x0003 | RFC 7019 | + | Error_Node_Conn_Limit_Reached | 0x0004 | RFC 7019 | + | Error_Link_Cap_Limit_Reached | 0x0005 | RFC 7019 | + | Error_Node_Mem_Limit_Reached | 0x0006 | RFC 7019 | + | Error_Node_CPU_Cap_Limit_Reached | 0x0007 | RFC 7019 | + | Error_Path_Limit_Reached | 0x0008 | RFC 7019 | + | Error_Path_Delay_Limit_Reached | 0x0009 | RFC 7019 | + | Error_Tree_Fanout_Limit_Reached | 0x000A | RFC 7019 | + | Error_Tree_Depth_Limit_Reached | 0x000B | RFC 7019 | + | Error_Other | 0x000C | RFC 7019 | + | Reserved | 0x8000-0xFFFF | RFC 7019 | + +----------------------------------+---------------+-----------+ + + Figure 12: "SAM ALM Error Codes" Registry Allocations + + These values have been made available for the purposes of + experimentation. These values are not meant for vendor-specific use + of any sort and MUST NOT be used for operational deployments. + +15. Security Considerations + + Overlays are vulnerable to DoS and collusion attacks. We are not + solving overlay security issues. We assume that the node + authentication model as defined in [RELOAD] will be used. + + Security issues specific to ALM Usage include the following: + + o The right to create group_id at some node_id + + o The right to store Tree info at some location in the DHT + + o A limit on number of messages per second and bandwidth use + + o The right to join an ALM tree + + + + + + + +Buford & Kolberg Experimental [Page 39] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + +16. Acknowledgements + + Marc Petit-Huguenin, Michael Welzl, Joerg Ott, and Lars Eggert + provided important comments on earlier versions of this document. + +17. References + +17.1. Normative Reference + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + +17.2. Informative References + + [BUFORD2008] Buford, J. and H. Yu, "P2P: Overlay Multicast", + Encyclopedia of Wireless and Mobile Communications, + 2008, <http://www.tandfonline.com/doi/abs/10.1081/ + E-EWMC-120043583>. + + [BUFORD2009] Buford, J., Yu, H., and E. Lua, "P2P Networking and + Applications (Chapter 9)", Morgan Kaufman, 2009, + <http://www.sciencedirect.com/science/book/ + 9780123742148>. + + [CASTRO2002] Castro, M., Druschel, P., Kermarrec, A., and A. + Rowstron, "SCRIBE: A large-scale and decentralized + application-level multicast infrastructure", IEEE + Journal on Selected Areas in Communications, Vol. 20, + No. 8, October 2002, <http://ieeexplore.ieee.org/xpl/ + login.jsp?tp=&arnumber=1038579>. + + [CASTRO2003] Castro, M., Jones, M., Kermarrec, A., Rowstron, A., + Theimer, M., Wang, H., and A. Wolman, "An Evaluation of + Scalable Application-level Multicast Built Using Peer- + to-peer Overlays", Proceedings of IEEE INFOCOM 2003, + April 2003, <http://ieeexplore.ieee.org/xpl/ + login.jsp?tp=&arnumber=1208986>. + + [COMMON-API] Waehlisch, M., Schmidt, T., and S. Venaas, "A Common + API for Transparent Hybrid Multicast", Work in + Progress, April 2013. + + [KOLBERG2010] Kolberg, M., "Employing Multicast in P2P Overlay + Networks", Handbook of Peer-to-Peer Networking, 2010, + <http://link.springer.com/content/pdf/ + 10.1007%2F978-0-387-09751-0_30.pdf>. + + + + + +Buford & Kolberg Experimental [Page 40] + +RFC 7019 ALM Extensions to RELOAD September 2013 + + + [P2PCAST] Nicolosi, A. and S. Annapureddy, "P2PCast: A Peer-to- + Peer Multicast Scheme for Streaming Data", Stanford + Secure Computer Systems Group Report, May 2003, + <http://www.scs.stanford.edu/~reddy/research/p2pcast/ + report.pdf>. + + [RELOAD] Jennings, C., Lowekamp, B., Ed., Rescorla, E., Baset, + S., and H. Schulzrinne, "REsource LOcation And + Discovery (RELOAD) Base Protocol", Work in Progress, + February 2013. + + [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing + an IANA Considerations Section in RFCs", BCP 26, RFC + 5226, May 2008. + + [SAM-GENERIC] Muramoto, E., Imai, Y., and N. Kawaguchi, "Requirements + for Scalable Adaptive Multicast Framework in Non-GIG + Networks", Work in Progress, November 2006. + + [SPLITSTREAM] Castro, M., Druschel, P., Nandi, A., Kermarrec, A., + Rowstron, A., and A. Singh, "SplitStream: High- + Bandwidth Multicast in a Cooperative Environment", SOSP + '03, Lake Bolton, New York, October 2003, + <http://dl.acm.org/citation.cfm?id=945474>. + +Authors' Addresses + + John Buford + Avaya Labs Research + 211 Mt. Airy Rd. + Basking Ridge, New Jersey 07920 + USA + + Phone: +1 908 848 5675 + EMail: buford@avaya.com + + + Mario Kolberg (editor) + University of Stirling + Dept. of Computing Science and Mathematics + Stirling FK9 4LA + UK + + Phone: +44 1786 46 7440 + EMail: mkolberg@ieee.org + URI: http://www.cs.stir.ac.uk/~mko + + + + + +Buford & Kolberg Experimental [Page 41] + |