From 4bfd864f10b68b71482b35c818559068ef8d5797 Mon Sep 17 00:00:00 2001 From: Thomas Voss Date: Wed, 27 Nov 2024 20:54:24 +0100 Subject: doc: Add RFC documents --- doc/rfc/rfc9420.txt | 7040 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 7040 insertions(+) create mode 100644 doc/rfc/rfc9420.txt (limited to 'doc/rfc/rfc9420.txt') diff --git a/doc/rfc/rfc9420.txt b/doc/rfc/rfc9420.txt new file mode 100644 index 0000000..6c4fdf6 --- /dev/null +++ b/doc/rfc/rfc9420.txt @@ -0,0 +1,7040 @@ + + + + +Internet Engineering Task Force (IETF) R. Barnes +Request for Comments: 9420 Cisco +Category: Standards Track B. Beurdouche +ISSN: 2070-1721 Inria & Mozilla + R. Robert + Phoenix R&D + J. Millican + Meta Platforms + E. Omara + + K. Cohn-Gordon + University of Oxford + July 2023 + + + The Messaging Layer Security (MLS) Protocol + +Abstract + + Messaging applications are increasingly making use of end-to-end + security mechanisms to ensure that messages are only accessible to + the communicating endpoints, and not to any servers involved in + delivering messages. Establishing keys to provide such protections + is challenging for group chat settings, in which more than two + clients need to agree on a key but may not be online at the same + time. In this document, we specify a key establishment protocol that + provides efficient asynchronous group key establishment with forward + secrecy (FS) and post-compromise security (PCS) for groups in size + ranging from two to thousands. + +Status of This Memo + + This is an Internet Standards Track document. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Further information on + Internet Standards is available in Section 2 of RFC 7841. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + https://www.rfc-editor.org/info/rfc9420. + +Copyright Notice + + Copyright (c) 2023 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (https://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. Code Components extracted from this document must + include Revised BSD License text as described in Section 4.e of the + Trust Legal Provisions and are provided without warranty as described + in the Revised BSD License. + +Table of Contents + + 1. Introduction + 2. Terminology + 2.1. Presentation Language + 2.1.1. Optional Value + 2.1.2. Variable-Size Vector Length Headers + 3. Protocol Overview + 3.1. Cryptographic State and Evolution + 3.2. Example Protocol Execution + 3.3. External Joins + 3.4. Relationships between Epochs + 4. Ratchet Tree Concepts + 4.1. Ratchet Tree Terminology + 4.1.1. Ratchet Tree Nodes + 4.1.2. Paths through a Ratchet Tree + 4.2. Views of a Ratchet Tree + 5. Cryptographic Objects + 5.1. Cipher Suites + 5.1.1. Public Keys + 5.1.2. Signing + 5.1.3. Public Key Encryption + 5.2. Hash-Based Identifiers + 5.3. Credentials + 5.3.1. Credential Validation + 5.3.2. Credential Expiry and Revocation + 5.3.3. Uniquely Identifying Clients + 6. Message Framing + 6.1. Content Authentication + 6.2. Encoding and Decoding a Public Message + 6.3. Encoding and Decoding a Private Message + 6.3.1. Content Encryption + 6.3.2. Sender Data Encryption + 7. Ratchet Tree Operations + 7.1. Parent Node Contents + 7.2. Leaf Node Contents + 7.3. Leaf Node Validation + 7.4. Ratchet Tree Evolution + 7.5. Synchronizing Views of the Tree + 7.6. Update Paths + 7.7. Adding and Removing Leaves + 7.8. Tree Hashes + 7.9. Parent Hashes + 7.9.1. Using Parent Hashes + 7.9.2. Verifying Parent Hashes + 8. Key Schedule + 8.1. Group Context + 8.2. Transcript Hashes + 8.3. External Initialization + 8.4. Pre-Shared Keys + 8.5. Exporters + 8.6. Resumption PSK + 8.7. Epoch Authenticators + 9. Secret Tree + 9.1. Encryption Keys + 9.2. Deletion Schedule + 10. Key Packages + 10.1. KeyPackage Validation + 11. Group Creation + 11.1. Required Capabilities + 11.2. Reinitialization + 11.3. Subgroup Branching + 12. Group Evolution + 12.1. Proposals + 12.1.1. Add + 12.1.2. Update + 12.1.3. Remove + 12.1.4. PreSharedKey + 12.1.5. ReInit + 12.1.6. ExternalInit + 12.1.7. GroupContextExtensions + 12.1.8. External Proposals + 12.2. Proposal List Validation + 12.3. Applying a Proposal List + 12.4. Commit + 12.4.1. Creating a Commit + 12.4.2. Processing a Commit + 12.4.3. Adding Members to the Group + 13. Extensibility + 13.1. Additional Cipher Suites + 13.2. Proposals + 13.3. Credential Extensibility + 13.4. Extensions + 13.5. GREASE + 14. Sequencing of State Changes + 15. Application Messages + 15.1. Padding + 15.2. Restrictions + 15.3. Delayed and Reordered Application Messages + 16. Security Considerations + 16.1. Transport Security + 16.2. Confidentiality of Group Secrets + 16.3. Confidentiality of Sender Data + 16.4. Confidentiality of Group Metadata + 16.4.1. GroupID, Epoch, and Message Frequency + 16.4.2. Group Extensions + 16.4.3. Group Membership + 16.5. Authentication + 16.6. Forward Secrecy and Post-Compromise Security + 16.7. Uniqueness of Ratchet Tree Key Pairs + 16.8. KeyPackage Reuse + 16.9. Delivery Service Compromise + 16.10. Authentication Service Compromise + 16.11. Additional Policy Enforcement + 16.12. Group Fragmentation by Malicious Insiders + 17. IANA Considerations + 17.1. MLS Cipher Suites + 17.2. MLS Wire Formats + 17.3. MLS Extension Types + 17.4. MLS Proposal Types + 17.5. MLS Credential Types + 17.6. MLS Signature Labels + 17.7. MLS Public Key Encryption Labels + 17.8. MLS Exporter Labels + 17.9. MLS Designated Expert Pool + 17.10. The "message/mls" Media Type + 18. References + 18.1. Normative References + 18.2. Informative References + Appendix A. Protocol Origins of Example Trees + Appendix B. Evolution of Parent Hashes + Appendix C. Array-Based Trees + Appendix D. Link-Based Trees + Contributors + Authors' Addresses + +1. Introduction + + A group of users who want to send each other encrypted messages needs + a way to derive shared symmetric encryption keys. For two parties, + this problem has been studied thoroughly, with the Double Ratchet + emerging as a common solution [DoubleRatchet] [Signal]. Channels + implementing the Double Ratchet enjoy fine-grained forward secrecy as + well as post-compromise security, but are nonetheless efficient + enough for heavy use over low-bandwidth networks. + + For a group of size greater than two, a common strategy is to + distribute symmetric "sender keys" over existing 1:1 secure channels, + and then for each member to send messages to the group encrypted with + their own sender key. On the one hand, using sender keys improves + efficiency relative to pairwise transmission of individual messages, + and it provides forward secrecy (with the addition of a hash + ratchet). On the other hand, it is difficult to achieve post- + compromise security with sender keys, requiring a number of key + update messages that scales as the square of the group size. An + adversary who learns a sender key can often indefinitely and + passively eavesdrop on that member's messages. Generating and + distributing a new sender key provides a form of post-compromise + security with regard to that sender. However, it requires + computation and communications resources that scale linearly with the + size of the group. + + In this document, we describe a protocol based on tree structures + that enables asynchronous group keying with forward secrecy and post- + compromise security. Based on earlier work on "asynchronous + ratcheting trees" [ART], the protocol presented here uses an + asynchronous key-encapsulation mechanism for tree structures. This + mechanism allows the members of the group to derive and update shared + keys with costs that scale as the log of the group size. + +2. Terminology + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in BCP + 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + + Client: An agent that uses this protocol to establish shared + cryptographic state with other clients. A client is defined by + the cryptographic keys it holds. + + Group: A group represents a logical collection of clients that share + a common secret value at any given time. Its state is represented + as a linear sequence of epochs in which each epoch depends on its + predecessor. + + Epoch: A state of a group in which a specific set of authenticated + clients hold shared cryptographic state. + + Member: A client that is included in the shared state of a group and + hence has access to the group's secrets. + + Key Package: A signed object describing a client's identity and + capabilities, including a hybrid public key encryption (HPKE) + [RFC9180] public key that can be used to encrypt to that client. + Other clients can use a client's KeyPackage to introduce the + client to a new group. + + Group Context: An object that summarizes the shared, public state of + the group. The group context is typically distributed in a signed + GroupInfo message, which is provided to new members to help them + join a group. + + Signature Key: A signing key pair used to authenticate the sender of + a message. + + Proposal: A message that proposes a change to the group, e.g., + adding or removing a member. + + Commit: A message that implements the changes to the group proposed + in a set of Proposals. + + PublicMessage: An MLS protocol message that is signed by its sender + and authenticated as coming from a member of the group in a + particular epoch, but not encrypted. + + PrivateMessage: An MLS protocol message that is signed by its + sender, authenticated as coming from a member of the group in a + particular epoch, and encrypted so that it is confidential to the + members of the group in that epoch. + + Handshake Message: A PublicMessage or PrivateMessage carrying an MLS + Proposal or Commit object, as opposed to application data. + + Application Message: A PrivateMessage carrying application data. + + Terminology specific to tree computations is described in + Section 4.1. + + In general, symmetric values are referred to as "keys" or "secrets" + interchangeably. Either term denotes a value that MUST be kept + confidential to a client. When labeling individual values, we + typically use "secret" to refer to a value that is used to derive + further secret values and "key" to refer to a value that is used with + an algorithm such as Hashed Message Authentication Code (HMAC) or an + Authenticated Encryption with Associated Data (AEAD) algorithm. + + The PublicMessage and PrivateMessage formats are defined in + Section 6. Security notions such as forward secrecy and post- + compromise security are defined in Section 16. + + As detailed in Section 13.5, MLS uses the "Generate Random Extensions + And Sustain Extensibility" (GREASE) approach to maintaining + extensibility, where senders insert random values into fields in + which receivers are required to ignore unknown values. Specific + "GREASE values" for this purpose are registered in the appropriate + IANA registries. + +2.1. Presentation Language + + We use the TLS presentation language [RFC8446] to describe the + structure of protocol messages. In addition to the base syntax, we + add two additional features: the ability for fields to be optional + and the ability for vectors to have variable-size length headers. + +2.1.1. Optional Value + + An optional value is encoded with a presence-signaling octet, + followed by the value itself if present. When decoding, a presence + octet with a value other than 0 or 1 MUST be rejected as malformed. + + struct { + uint8 present; + select (present) { + case 0: struct{}; + case 1: T value; + }; + } optional; + +2.1.2. Variable-Size Vector Length Headers + + In the TLS presentation language, vectors are encoded as a sequence + of encoded elements prefixed with a length. The length field has a + fixed size set by specifying the minimum and maximum lengths of the + encoded sequence of elements. + + In MLS, there are several vectors whose sizes vary over significant + ranges. So instead of using a fixed-size length field, we use a + variable-size length using a variable-length integer encoding based + on the one described in Section 16 of [RFC9000]. They differ only in + that the one here requires a minimum-size encoding. Instead of + presenting min and max values, the vector description simply includes + a V. For example: + + struct { + uint32 fixed<0..255>; + opaque variable; + } StructWithVectors; + + Such a vector can represent values with length from 0 bytes to 2^30 + bytes. The variable-length integer encoding reserves the two most + significant bits of the first byte to encode the base 2 logarithm of + the integer encoding length in bytes. The integer value is encoded + on the remaining bits, so that the overall value is in network byte + order. The encoded value MUST use the smallest number of bits + required to represent the value. When decoding, values using more + bits than necessary MUST be treated as malformed. + + This means that integers are encoded in 1, 2, or 4 bytes and can + encode 6-, 14-, or 30-bit values, respectively. + + +========+=========+=============+=======+============+ + | Prefix | Length | Usable Bits | Min | Max | + +========+=========+=============+=======+============+ + | 00 | 1 | 6 | 0 | 63 | + +--------+---------+-------------+-------+------------+ + | 01 | 2 | 14 | 64 | 16383 | + +--------+---------+-------------+-------+------------+ + | 10 | 4 | 30 | 16384 | 1073741823 | + +--------+---------+-------------+-------+------------+ + | 11 | invalid | - | - | - | + +--------+---------+-------------+-------+------------+ + + Table 1: Summary of Integer Encodings + + Vectors that start with the prefix "11" are invalid and MUST be + rejected. + + For example: + + * The four-byte length value 0x9d7f3e7d decodes to 494878333. + + * The two-byte length value 0x7bbd decodes to 15293. + + * The single-byte length value 0x25 decodes to 37. + + The following figure adapts the pseudocode provided in [RFC9000] to + add a check for minimum-length encoding: + + ReadVarint(data): + // The length of variable-length integers is encoded in the + // first two bits of the first byte. + v = data.next_byte() + prefix = v >> 6 + if prefix == 3: + raise Exception('invalid variable length integer prefix') + + length = 1 << prefix + + // Once the length is known, remove these bits and read any + // remaining bytes. + v = v & 0x3f + repeat length-1 times: + v = (v << 8) + data.next_byte() + + // Check if the value would fit in half the provided length. + if prefix >= 1 && v < (1 << (8*(length/2) - 2)): + raise Exception('minimum encoding was not used') + + return v + + The use of variable-size integers for vector lengths allows vectors + to grow very large, up to 2^30 bytes. Implementations should take + care not to allow vectors to overflow available storage. To + facilitate debugging of potential interoperability problems, + implementations SHOULD provide a clear error when such an overflow + condition occurs. + +3. Protocol Overview + + MLS is designed to operate in the context described in [MLS-ARCH]. + In particular, we assume that the following services are provided: + + * An Authentication Service (AS) that enables group members to + authenticate the credentials presented by other group members. + + * A Delivery Service (DS) that routes MLS messages among the + participants in the protocol. + + MLS assumes a trusted AS but a largely untrusted DS. Section 16.10 + describes the impact of compromise or misbehavior of an AS. MLS is + designed to protect the confidentiality and integrity of the group + data even in the face of a compromised DS; in general, the DS is only + expected to reliably deliver messages. Section 16.9 describes the + impact of compromise or misbehavior of a DS. + + The core functionality of MLS is continuous group authenticated key + exchange (AKE). As with other authenticated key exchange protocols + (such as TLS), the participants in the protocol agree on a common + secret value, and each participant can verify the identity of the + other participants. That secret can then be used to protect messages + sent from one participant in the group to the other participants + using the MLS framing layer or can be exported for use with other + protocols. MLS provides group AKE in the sense that there can be + more than two participants in the protocol, and continuous group AKE + in the sense that the set of participants in the protocol can change + over time. + + The core organizing principles of MLS are _groups_ and _epochs_. A + group represents a logical collection of clients that share a common + secret value at any given time. The history of a group is divided + into a linear sequence of epochs. In each epoch, a set of + authenticated _members_ agree on an _epoch secret_ that is known only + to the members of the group in that epoch. The set of members + involved in the group can change from one epoch to the next, and MLS + ensures that only the members in the current epoch have access to the + epoch secret. From the epoch secret, members derive further shared + secrets for message encryption, group membership authentication, and + so on. + + The creator of an MLS group creates the group's first epoch + unilaterally, with no protocol interactions. Thereafter, the members + of the group advance their shared cryptographic state from one epoch + to another by exchanging MLS messages. + + * A _KeyPackage_ object describes a client's capabilities and + provides keys that can be used to add the client to a group. + + * A _Proposal_ message proposes a change to be made in the next + epoch, such as adding or removing a member. + + * A _Commit_ message initiates a new epoch by instructing members of + the group to implement a collection of proposals. + + * A _Welcome_ message provides a new member to the group with the + information to initialize their state for the epoch in which they + were added or in which they want to add themselves to the group. + + KeyPackage and Welcome messages are used to initiate a group or + introduce new members, so they are exchanged between group members + and clients not yet in the group. A client publishes a KeyPackage + via the DS, thus enabling other clients to add it to groups. When a + group member wants to add a new member to a group, it uses the new + member's KeyPackage to add them and constructs a Welcome message with + which the new member can initialize their local state. + + Proposal and Commit messages are sent from one member of a group to + the others. MLS provides a common framing layer for sending messages + within a group: A _PublicMessage_ provides sender authentication for + unencrypted Proposal and Commit messages. A _PrivateMessage_ + provides encryption and authentication for both Proposal/Commit + messages as well as any application data. + +3.1. Cryptographic State and Evolution + + The cryptographic state at the core of MLS is divided into three + areas of responsibility: + + .- ... -. + | | + | | | + | | | Key Schedule + | V | + | epoch_secret | +. | | | . +|\ Ratchet | | | Secret /| +| \ Tree | | | Tree / | +| \ | | | / | +| \ | V | / | +| +--> commit_secret --> epoch_secret --> encryption_secret -->+ | +| / | | | \ | +| / | | | \ | +| / | | | \ | +|/ | | | \| +' | V | ' + | epoch_secret | + | | | + | | | + | V | + | | + '- ... -' + + Figure 1: Overview of MLS Group Evolution + + * A _ratchet tree_ that represents the membership of the group, + providing group members a way to authenticate each other and + efficiently encrypt messages to subsets of the group. Each epoch + has a distinct ratchet tree. It seeds the _key schedule_. + + * A _key schedule_ that describes the chain of key derivations used + to progress from epoch to epoch (mainly using the _init_secret_ + and _epoch_secret_), as well as the derivation of a variety of + other secrets (see Table 4). For example: + + - An _encryption secret_ that is used to initialize the secret + tree for the epoch. + + - An _exporter secret_ that allows other protocols to leverage + MLS as a generic authenticated group key exchange. + + - A _resumption secret_ that members can use to prove their + membership in the group, e.g., when creating a subgroup or a + successor group. + + * A _secret tree_ derived from the key schedule that represents + shared secrets used by the members of the group for encrypting and + authenticating messages. Each epoch has a distinct secret tree. + + Each member of the group maintains a partial view of these components + of the group's state. MLS messages are used to initialize these + views and keep them in sync as the group transitions between epochs. + + Each new epoch is initiated with a Commit message. The Commit + instructs existing members of the group to update their views of the + ratchet tree by applying a set of Proposals, and uses the updated + ratchet tree to distribute fresh entropy to the group. This fresh + entropy is provided only to members in the new epoch and not to + members who have been removed. Commits thus maintain the property + that the epoch secret is confidential to the members in the current + epoch. + + For each Commit that adds one or more members to the group, there are + one or more corresponding Welcome messages. Each Welcome message + provides new members with the information they need to initialize + their views of the key schedule and ratchet tree, so that these views + align with the views held by other members of the group in this + epoch. + +3.2. Example Protocol Execution + + There are three major operations in the life of a group: + + * Adding a member, initiated by a current member; + + * Updating the keys that represent a member in the tree; and + + * Removing a member. + + Each of these operations is "proposed" by sending a message of the + corresponding type (Add / Update / Remove). The state of the group + is not changed, however, until a Commit message is sent to provide + the group with fresh entropy. In this section, we show each proposal + being committed immediately, but in more advanced deployment cases, + an application might gather several proposals before committing them + all at once. In the illustrations below, we show the Proposal and + Commit messages directly, while in reality they would be sent + encapsulated in PublicMessage or PrivateMessage objects. + + Before the initialization of a group, clients publish KeyPackages to + a directory provided by the DS (see Figure 2). + + Delivery Service + | + .--------' '--------. + | | + Group + A B C Directory Channel + | | | | | + | KeyPackageA | | | | + +------------------------------------------------->| | + | | | | | + | | KeyPackageB | | | + | +-------------------------------->| | + | | | | | + | | | KeyPackageC | | + | | +--------------->| | + | | | | | + + Figure 2: Clients A, B, and C publish KeyPackages to the directory + + Figure 3 shows how these pre-published KeyPackages are used to create + a group. When client A wants to establish a group with clients B and + C, it first initializes a group state containing only itself and + downloads KeyPackages for B and C. For each member, A generates an + Add proposal and a Commit message to add that member and then + broadcasts the two messages to the group. Client A also generates a + Welcome message and sends it directly to the new member (there's no + need to send it to the group). Only after A has received its Commit + message back from the Delivery Service does it update its state to + reflect the new member's addition. + + Once A has updated its state, the new member has processed the + Welcome, and any other group members have processed the Commit, they + will all have consistent representations of the group state, + including a group secret that is known only to the members the group. + The new member will be able to read and send new messages to the + group, but messages sent before they were added to the group will not + be accessible. + + Group + A B C Directory Channel + | | | | | + | KeyPackageB, KeyPackageC | | + |<-------------------------------------------+ | + | | | | | + | | | | Add(A->AB) | + | | | | Commit(Add) | + +--------------------------------------------------------------->| + | | | | | + | Welcome(B) | | | | + +------------->| | | | + | | | | | + | | | | Add(A->AB) | + | | | | Commit(Add) | + |<---------------------------------------------------------------+ + | | | | | + | | | | | + | | | | Add(AB->ABC) | + | | | | Commit(Add) | + +--------------------------------------------------------------->| + | | | | | + | | Welcome(C) | | | + +---------------------------->| | | + | | | | | + | | | | Add(AB->ABC) | + | | | | Commit(Add) | + |<---------------------------------------------------------------+ + | |<------------------------------------------------+ + | | | | | + + Figure 3: Client A creates a group with clients B and C + + Subsequent additions of group members proceed in the same way. Any + member of the group can download a KeyPackage for a new client, + broadcast Add and Commit messages that the current group will use to + update their state, and send a Welcome message that the new client + can use to initialize its state and join the group. + + To enforce the forward secrecy and post-compromise security of + messages, each member periodically updates the keys that represent + them to the group. A member does this by sending a Commit (possibly + with no proposals) or by sending an Update message that is committed + by another member (see Figure 4). Once the other members of the + group have processed these messages, the group's secrets will be + unknown to an attacker that had compromised the secrets corresponding + to the sender's leaf in the tree. At the end of the scenario shown + in Figure 4, the group has post-compromise security with respect to + both A and B. + + Update messages SHOULD be sent at regular intervals of time as long + as the group is active, and members that don't update SHOULD + eventually be removed from the group. It's left to the application + to determine an appropriate amount of time between Updates. Since + the purpose of sending an Update is to proactively constrain a + compromise window, the right frequency is usually on the order of + hours or days, not milliseconds. For example, an application might + send an Update each time a member sends an application message after + receiving any message from another member, or daily if no application + messages are sent. + + The MLS architecture recommends that MLS be operated over a secure + transport (see Section 7.1 of [MLS-ARCH]). Such transport protocols + will typically provide functions such as congestion control that + manage the impact of an MLS-using application on other applications + sharing the same network. Applications should take care that they do + not send MLS messages at a rate that will cause problems such as + network congestion, especially if they are not following the above + recommendation (e.g., sending MLS directly over UDP instead). + + Group + A B ... Z Directory Channel + | | | | | + | | Update(B) | | | + | +------------------------------------------->| + | | | | Update(B) | + |<----------------------------------------------------------+ + | |<-------------------------------------------+ + | | |<----------------------------+ + | | | | | + | Commit(Upd) | | | | + +---------------------------------------------------------->| + | | | | Commit(Upd) | + |<----------------------------------------------------------+ + | |<-------------------------------------------+ + | | |<----------------------------+ + | | | | | + + Figure 4: Client B proposes to update its key, and client A + commits the proposal + + Members are removed from the group in a similar way, as shown in + Figure 5. Any member of the group can send a Remove proposal + followed by a Commit message. The Commit message provides new + entropy to all members of the group except the removed member. This + new entropy is added to the epoch secret for the new epoch so that it + is not known to the removed member. Note that this does not + necessarily imply that any member is actually allowed to evict other + members; groups can enforce access control policies on top of these + basic mechanisms. + + Group + A B ... Z Directory Channel + | | | | | + | | | Remove(B) | | + | | | Commit(Rem) | | + | | +---------------------------->| + | | | | | + | | | | Remove(B) | + | | | | Commit(Rem) | + |<----------------------------------------------------------+ + | |<-------------------------------------------+ + | | |<----------------------------+ + | | | | | + + Figure 5: Client Z removes client B from the group + + Note that the flows in this section are examples; applications can + arrange message flows in other ways. For example: + + * Welcome messages don't necessarily need to be sent directly to new + joiners. Since they are encrypted to new joiners, they could be + distributed more broadly, say if the application only had access + to a broadcast channel for the group. + + * Proposal messages don't need to be immediately sent to all group + members. They need to be available to the committer before + generating a Commit, and to other members before processing the + Commit. + + * The sender of a Commit doesn't necessarily have to wait to receive + its own Commit back before advancing its state. It only needs to + know that its Commit will be the next one applied by the group, + say based on a promise from an orchestration server. + +3.3. External Joins + + In addition to the Welcome-based flow for adding a new member to the + group, it is also possible for a new member to join by means of an + "external Commit". This mechanism can be used when the existing + members don't have a KeyPackage for the new member, for example, in + the case of an "open" group that can be joined by new members without + asking permission from existing members. + + Figure 6 shows a typical message flow for an external join. To + enable a new member to join the group in this way, a member of the + group (A, B) publishes a GroupInfo object that includes the + GroupContext for the group as well as a public key that can be used + to encrypt a secret to the existing members of the group. When the + new member Z wishes to join, they download the GroupInfo object and + use it to form a Commit of a special form that adds Z to the group + (as detailed in Section 12.4.3.2). The existing members of the group + process this external Commit in a similar way to a normal Commit, + advancing to a new epoch in which Z is now a member of the group. + + Group + A B Z Directory Channel + | | | | | + | GroupInfo | | | | + +------------------------------------------->| | + | | | GroupInfo | | + | | |<-------------+ | + | | | | | + | | | Commit(ExtZ) | | + | | +---------------------------->| + | | | | Commit(ExtZ) | + |<----------------------------------------------------------+ + | |<-------------------------------------------+ + | | |<----------------------------+ + | | | | | + + Figure 6: Client A publishes a GroupInfo object, and Client Z + uses it to join the group + +3.4. Relationships between Epochs + + A group has a single linear sequence of epochs. Groups and epochs + are generally independent of one another. However, it can sometimes + be useful to link epochs cryptographically, either within a group or + across groups. MLS derives a resumption pre-shared key (PSK) from + each epoch to allow entropy extracted from one epoch to be injected + into a future epoch. A group member that wishes to inject a PSK + issues a PreSharedKey proposal (Section 12.1.4) describing the PSK to + be injected. When this proposal is committed, the corresponding PSK + will be incorporated into the key schedule as described in + Section 8.4. + + Linking epochs in this way guarantees that members entering the new + epoch agree on a key if and only if they were members of the group + during the epoch from which the resumption key was extracted. + + MLS supports two ways to tie a new group to an existing group, which + are illustrated in Figures 7 and 8. Reinitialization closes one + group and creates a new group comprising the same members with + different parameters. Branching starts a new group with a subset of + the original group's participants (with no effect on the original + group). In both cases, the new group is linked to the old group via + a resumption PSK. + + epoch_A_[n-1] + | + | + |<-- ReInit + | + V + epoch_A_[n] epoch_B_[0] + . | + . PSK(usage=reinit) | + .....................>| + | + V + epoch_B_[1] + + Figure 7: Reinitializing a Group + + epoch_A_[n] epoch_B_[0] + | | + | PSK(usage=branch) | + |....................>| + | | + V V + epoch_A_[n+1] epoch_B_[1] + + Figure 8: Branching a Group + + Applications may also choose to use resumption PSKs to link epochs in + other ways. For example, Figure 9 shows a case where a resumption + PSK from epoch n is injected into epoch n+k. This demonstrates that + the members of the group at epoch n+k were also members at epoch n, + irrespective of any changes to these members' keys due to Updates or + Commits. + + epoch_A_[n] + | + | PSK(usage=application) + |..................... + | . + | . + ... ... + | . + | . + V . + epoch_A_[n+k-1] . + | . + | . + |<.................... + | + V + epoch_A_[n+k] + + Figure 9: Reinjecting Entropy from an Earlier Epoch + +4. Ratchet Tree Concepts + + The protocol uses "ratchet trees" for deriving shared secrets among a + group of clients. A ratchet tree is an arrangement of secrets and + key pairs among the members of a group in a way that allows for + secrets to be efficiently updated to reflect changes in the group. + + Ratchet trees allow a group to efficiently remove any member by + encrypting new entropy to a subset of the group. A ratchet tree + assigns shared keys to subgroups of the overall group, so that, for + example, encrypting to all but one member of the group requires only + log(N) encryptions to subtrees, instead of the N-1 encryptions that + would be needed to encrypt to each participant individually (where N + is the number of members in the group). + + This remove operation allows MLS to efficiently achieve post- + compromise security. In an Update proposal or a full Commit message, + an old (possibly compromised) representation of a member is + efficiently removed from the group and replaced with a freshly + generated instance. + +4.1. Ratchet Tree Terminology + + Trees consist of _nodes_. A node is a _leaf_ if it has no children; + otherwise, it is a _parent_. All parents in our trees have precisely + two children, a _left_ child and a _right_ child. A node is the + _root_ of a tree if it has no parent, and _intermediate_ if it has + both children and a parent. The _descendants_ of a node are that + node's children, and the descendants of its children. We say a tree + _contains_ a node if that node is a descendant of the root of the + tree, or if the node itself is the root of the tree. Nodes are + _siblings_ if they share the same parent. + + A _subtree_ of a tree is the tree given by any node (the _head_ of + the subtree) and its descendants. The _size_ of a tree or subtree is + the number of leaf nodes it contains. For a given parent node, its + _left subtree_ is the subtree with its left child as head and its + _right subtree_ is the subtree with its right child as head. + + Every tree used in this protocol is a perfect binary tree, that is, a + complete balanced binary tree with 2^d leaves all at the same depth + d. This structure is unique for a given depth d. + + There are multiple ways that an implementation might represent a + ratchet tree in memory. A convenient property of left-balanced + binary trees (including the complete trees used here) is that they + can be represented as an array of nodes, with node relationships + computed based on the nodes' indices in the array. A more + traditional representation based on linked node objects may also be + used. Appendices C and D provide some details on how to implement + the tree operations required for MLS in these representations. MLS + places no requirements on implementations' internal representations + of ratchet trees. An implementation may use any tree representation + and associated algorithms, as long as they produce correct protocol + messages. + +4.1.1. Ratchet Tree Nodes + + Each leaf node in a ratchet tree is given an _index_ (or _leaf + index_), starting at 0 from the left to 2^d - 1 at the right (for a + tree with 2^d leaves). A tree with 2^d leaves has 2^(d+1) - 1 nodes, + including parent nodes. + + Each node in a ratchet tree is either _blank_ (containing no value) + or it holds an HPKE public key with some associated data: + + * A public key (for the HPKE scheme in use; see Section 5.1) + + * A credential (only for leaf nodes; see Section 5.3) + + * An ordered list of "unmerged" leaves (see Section 4.2) + + * A hash of certain information about the node's parent, as of the + last time the node was changed (see Section 7.9). + + As described in Section 4.2, different members know different subsets + of the set of private keys corresponding to the public keys in nodes + in the tree. The private key corresponding to a parent node is known + only to members at leaf nodes that are descendants of that node. The + private key corresponding to a leaf node is known only to the member + at that leaf node. A leaf node is _unmerged_ relative to one of its + ancestor nodes if the member at the leaf node does not know the + private key corresponding to the ancestor node. + + Every node, regardless of whether the node is blank or populated, has + a corresponding _hash_ that summarizes the contents of the subtree + below that node. The rules for computing these hashes are described + in Section 7.8. + + The _resolution_ of a node is an ordered list of non-blank nodes that + collectively cover all non-blank descendants of the node. The + resolution of the root contains the set of keys that are collectively + necessary to encrypt to every node in the group. The resolution of a + node is effectively a depth-first, left-first enumeration of the + nearest non-blank nodes below the node: + + * The resolution of a non-blank node comprises the node itself, + followed by its list of unmerged leaves, if any. + + * The resolution of a blank leaf node is the empty list. + + * The resolution of a blank intermediate node is the result of + concatenating the resolution of its left child with the resolution + of its right child, in that order. + + For example, consider the following subtree, where the _ character + represents a blank node and unmerged leaves are indicated in square + brackets: + + ... + / + _ + ______|______ + / \ + X[B] _ + __|__ __|__ + / \ / \ + _ _ Y _ + / \ / \ / \ / \ + A B _ D E F _ H + + 0 1 2 3 4 5 6 7 + + Figure 10: A Tree with Blanks and Unmerged Leaves + + In this tree, we can see all of the above rules in play: + + * The resolution of node X is the list [X, B]. + + * The resolution of leaf 2 or leaf 6 is the empty list []. + + * The resolution of top node is the list [X, B, Y, H]. + +4.1.2. Paths through a Ratchet Tree + + The _direct path_ of a root is the empty list. The direct path of + any other node is the concatenation of that node's parent along with + the parent's direct path. + + The _copath_ of a node is the node's sibling concatenated with the + list of siblings of all the nodes in its direct path, excluding the + root. + + The _filtered direct path_ of a leaf node L is the node's direct + path, with any node removed whose child on the copath of L has an + empty resolution (keeping in mind that any unmerged leaves of the + copath child count toward its resolution). The removed nodes do not + need their own key pairs because encrypting to the node's key pair + would be equivalent to encrypting to its non-copath child. + + For example, consider the following tree (where blank nodes are + indicated with _, but also assigned a label for reference): + + W = root + | + .-----+-----. + / \ + _=U Y + | | + .-+-. .-+-. + / \ / \ + T _=V X _=Z + / \ / \ / \ / \ + A B _ _ E F G _=H + + 0 1 2 3 4 5 6 7 + + Figure 11: A Complete Tree with Five Members, with Labels for + Blank Parent Nodes + + In this tree, the direct paths, copaths, and filtered direct paths + for the leaf nodes are as follows: + + +======+=============+=========+======================+ + | Node | Direct path | Copath | Filtered Direct Path | + +======+=============+=========+======================+ + | A | T, U, W | B, V, Y | T, W | + +------+-------------+---------+----------------------+ + | B | T, U, W | A, V, Y | T, W | + +------+-------------+---------+----------------------+ + | E | X, Y, W | F, Z, U | X, Y, W | + +------+-------------+---------+----------------------+ + | F | X, Y, W | E, Z, U | X, Y, W | + +------+-------------+---------+----------------------+ + | G | Z, Y, W | H, X, U | Y, W | + +------+-------------+---------+----------------------+ + + Table 2 + +4.2. Views of a Ratchet Tree + + We generally assume that each participant maintains a complete and + up-to-date view of the public state of the group's ratchet tree, + including the public keys for all nodes and the credentials + associated with the leaf nodes. + + No participant in an MLS group knows the private key associated with + every node in the tree. Instead, each member is assigned to a leaf + of the tree, which determines the subset of private keys it knows. + The credential stored at that leaf is one provided by the member. + + In particular, MLS maintains the members' views of the tree in such a + way as to maintain the _tree invariant_: + + | The private key for a node in the tree is known to a member of the + | group only if the node's subtree contains that member's leaf. + + In other words, if a node is not blank, then it holds a public key. + The corresponding private key is known only to members occupying + leaves below that node. + + The reverse implication is not true: A member may not know the + private key of an intermediate node above them. Such a member has an + _unmerged_ leaf at the intermediate node. Encrypting to an + intermediate node requires encrypting to the node's public key, as + well as the public keys of all the unmerged leaves below it. A leaf + is unmerged with regard to all of its ancestors when it is first + added, because the process of adding the leaf does not give it access + to the private keys for all of the nodes above it in the tree. + Leaves are "merged" as they receive the private keys for nodes, as + described in Section 7.4. + + For example, consider a four-member group (A, B, C, D) where the node + above the right two members is blank. (This is what it would look + like if A created a group with B, C, and D.) Then the public state + of the tree and the views of the private keys of the tree held by + each participant would be as follows, where _ represents a blank + node, ? represents an unknown private key, and pk(X) represents the + public key corresponding to the private key X: + + Public Tree + ============================ + pk(ABCD) + / \ + pk(AB) _ + / \ / \ + pk(A) pk(B) pk(C) pk(D) + + + Private @ A Private @ B Private @ C Private @ D + ============= ============= ============= ============= + ABCD ABCD ABCD ABCD + / \ / \ / \ / \ + AB _ AB _ ? _ ? _ + / \ / \ / \ / \ / \ / \ / \ / \ + A ? ? ? ? B ? ? ? ? C ? ? ? ? D + + Note how the tree invariant applies: Each member knows only their own + leaf, the private key AB is known only to A and B, and the private + key ABCD is known to all four members. This also illustrates another + important point: it is possible for there to be "holes" on the path + from a member's leaf to the root in which the member knows the key + both above and below a given node, but not for that node, as in the + case with D. + +5. Cryptographic Objects + +5.1. Cipher Suites + + Each MLS session uses a single cipher suite that specifies the + following primitives to be used in group key computations: + + * HPKE parameters: + + - A Key Encapsulation Mechanism (KEM) + + - A Key Derivation Function (KDF) + + - An Authenticated Encryption with Associated Data (AEAD) + encryption algorithm + + * A hash algorithm + + * A Message Authentication Code (MAC) algorithm + + * A signature algorithm + + MLS uses HPKE for public key encryption [RFC9180]. The DeriveKeyPair + function associated to the KEM for the cipher suite maps octet + strings to HPKE key pairs. As in HPKE, MLS assumes that an AEAD + algorithm produces a single ciphertext output from AEAD encryption + (aligning with [RFC5116]), as opposed to a separate ciphertext and + tag. + + Cipher suites are represented with the CipherSuite type. The cipher + suites are defined in Section 17.1. + +5.1.1. Public Keys + + HPKE public keys are opaque values in a format defined by the + underlying protocol (see Section 4 of [RFC9180] for more + information). + + opaque HPKEPublicKey; + + Signature public keys are likewise represented as opaque values in a + format defined by the cipher suite's signature scheme. + + opaque SignaturePublicKey; + + For cipher suites using the Edwards-curve Digital Signature Algorithm + (EdDSA) signature schemes (Ed25519 or Ed448), the public key is in + the format specified in [RFC8032]. + + For cipher suites using the Elliptic Curve Digital Signature + Algorithm (ECDSA) with the NIST curves (P-256, P-384, or P-521), the + public key is represented as an encoded + UncompressedPointRepresentation struct, as defined in [RFC8446]. + +5.1.2. Signing + + The signature algorithm specified in a group's cipher suite is the + mandatory algorithm to be used for signing messages within the group. + It MUST be the same as the signature algorithm specified in the + credentials in the leaves of the tree (including the leaf node + information in KeyPackages used to add new members). + + The signatures used in this document are encoded as specified in + [RFC8446]. In particular, ECDSA signatures are DER encoded, and + EdDSA signatures are defined as the concatenation of R and S, as + specified in [RFC8032]. + + To disambiguate different signatures used in MLS, each signed value + is prefixed by a label as shown below: + + SignWithLabel(SignatureKey, Label, Content) = + Signature.Sign(SignatureKey, SignContent) + + VerifyWithLabel(VerificationKey, Label, Content, SignatureValue) = + Signature.Verify(VerificationKey, SignContent, SignatureValue) + + Where SignContent is specified as: + + struct { + opaque label; + opaque content; + } SignContent; + + And its fields are set to: + + label = "MLS 1.0 " + Label; + content = Content; + + The functions Signature.Sign and Signature.Verify are defined by the + signature algorithm. If MLS extensions require signatures by group + members, they should reuse the SignWithLabel construction, using a + distinct label. To avoid collisions in these labels, an IANA + registry is defined in Section 17.6. + +5.1.3. Public Key Encryption + + As with signing, MLS includes a label and context in encryption + operations to avoid confusion between ciphertexts produced for + different purposes. Encryption and decryption including this label + and context are done as follows: + + EncryptWithLabel(PublicKey, Label, Context, Plaintext) = + SealBase(PublicKey, EncryptContext, "", Plaintext) + + DecryptWithLabel(PrivateKey, Label, Context, KEMOutput, Ciphertext) = + OpenBase(KEMOutput, PrivateKey, EncryptContext, "", Ciphertext) + + Where EncryptContext is specified as: + + struct { + opaque label; + opaque context; + } EncryptContext; + + And its fields are set to: + + label = "MLS 1.0 " + Label; + context = Context; + + The functions SealBase and OpenBase are defined in Section 6.1 of + [RFC9180] (with "Base" as the MODE), using the HPKE algorithms + specified by the group's cipher suite. If MLS extensions require + HPKE encryption operations, they should reuse the EncryptWithLabel + construction, using a distinct label. To avoid collisions in these + labels, an IANA registry is defined in Section 17.7. + +5.2. Hash-Based Identifiers + + Some MLS messages refer to other MLS objects by hash. For example, + Welcome messages refer to KeyPackages for the members being welcomed, + and Commits refer to Proposals they cover. These identifiers are + computed as follows: + + opaque HashReference; + + HashReference KeyPackageRef; + HashReference ProposalRef; + + MakeKeyPackageRef(value) + = RefHash("MLS 1.0 KeyPackage Reference", value) + + MakeProposalRef(value) + = RefHash("MLS 1.0 Proposal Reference", value) + + RefHash(label, value) = Hash(RefHashInput) + + Where RefHashInput is defined as: + + struct { + opaque label; + opaque value; + } RefHashInput; + + And its fields are set to: + + label = label; + value = value; + + For a KeyPackageRef, the value input is the encoded KeyPackage, and + the cipher suite specified in the KeyPackage determines the KDF used. + For a ProposalRef, the value input is the AuthenticatedContent + carrying the Proposal. In the latter two cases, the KDF is + determined by the group's cipher suite. + +5.3. Credentials + + Each member of a group presents a credential that provides one or + more identities for the member and associates them with the member's + signing key. The identities and signing key are verified by the + Authentication Service in use for a group. + + It is up to the application to decide which identifiers to use at the + application level. For example, a certificate in an X509Credential + may attest to several domain names or email addresses in its + subjectAltName extension. An application may decide to present all + of these to a user, or if it knows a "desired" domain name or email + address, it can check that the desired identifier is among those + attested. Using the terminology from [RFC6125], a credential + provides "presented identifiers", and it is up to the application to + supply a "reference identifier" for the authenticated client, if any. + + // See the "MLS Credential Types" IANA registry for values + uint16 CredentialType; + + struct { + opaque cert_data; + } Certificate; + + struct { + CredentialType credential_type; + select (Credential.credential_type) { + case basic: + opaque identity; + + case x509: + Certificate certificates; + }; + } Credential; + + A "basic" credential is a bare assertion of an identity, without any + additional information. The format of the encoded identity is + defined by the application. + + For an X.509 credential, each entry in the certificates field + represents a single DER-encoded X.509 certificate. The chain is + ordered such that the first entry (certificates[0]) is the end-entity + certificate. The public key encoded in the subjectPublicKeyInfo of + the end-entity certificate MUST be identical to the signature_key in + the LeafNode containing this credential. A chain MAY omit any non- + leaf certificates that supported peers are known to already possess. + +5.3.1. Credential Validation + + The application using MLS is responsible for specifying which + identifiers it finds acceptable for each member in a group. In other + words, following the model that [RFC6125] describes for TLS, the + application maintains a list of "reference identifiers" for the + members of a group, and the credentials provide "presented + identifiers". A member of a group is authenticated by first + validating that the member's credential legitimately represents some + presented identifiers, and then ensuring that the reference + identifiers for the member are authenticated by those presented + identifiers. + + The parts of the system that perform these functions are collectively + referred to as the Authentication Service (AS) [MLS-ARCH]. A + member's credential is said to be _validated with the AS_ when the AS + verifies that the credential's presented identifiers are correctly + associated with the signature_key field in the member's LeafNode, and + that those identifiers match the reference identifiers for the + member. + + Whenever a new credential is introduced in the group, it MUST be + validated with the AS. In particular, at the following events in the + protocol: + + * When a member receives a KeyPackage that it will use in an Add + proposal to add a new member to the group + + * When a member receives a GroupInfo object that it will use to join + a group, either via a Welcome or via an external Commit + + * When a member receives an Add proposal adding a member to the + group + + * When a member receives an Update proposal whose LeafNode has a new + credential for the member + + * When a member receives a Commit with an UpdatePath whose LeafNode + has a new credential for the committer + + * When an external_senders extension is added to the group + + * When an existing external_senders extension is updated + + In cases where a member's credential is being replaced, such as the + Update and Commit cases above, the AS MUST also verify that the set + of presented identifiers in the new credential is valid as a + successor to the set of presented identifiers in the old credential, + according to the application's policy. + +5.3.2. Credential Expiry and Revocation + + In some credential schemes, a valid credential can "expire" or become + invalid after a certain point in time. For example, each X.509 + certificate has a notAfter field, expressing a time after which the + certificate is not valid. + + Expired credentials can cause operational problems in light of the + validation requirements of Section 5.3.1. Applications can apply + some operational practices and adaptations to Authentication Service + policies to moderate these impacts. + + In general, to avoid operational problems such as new joiners + rejecting expired credentials in a group, applications that use such + credentials should ensure to the extent practical that all of the + credentials in use in a group are valid at all times. + + If a member finds that its credential has expired (or will soon), it + should issue an Update or Commit that replaces it with a valid + credential. For this reason, members SHOULD accept Update proposals + and Commits issued by members with expired credentials, if the + credential in the Update or Commit is valid. + + Similarly, when a client is processing messages sent some time in the + past (e.g., syncing up with a group after being offline), the client + SHOULD accept signatures from members with expired credentials, since + the credential may have been valid at the time the message was sent. + + If a member finds that another member's credential has expired, they + may issue a Remove that removes that member. For example, an + application could require a member preparing to issue a Commit to + check the tree for expired credentials and include Remove proposals + for those members in its Commit. In situations where the group tree + is known to the DS, the DS could also monitor the tree for expired + credentials and issue external Remove proposals. + + Some credential schemes also allow credentials to be revoked. + Revocation is similar to expiry in that a previously valid credential + becomes invalid. As such, most of the considerations above also + apply to revoked credentials. However, applications may want to + treat revoked credentials differently, e.g., by removing members with + revoked credentials while allowing members with expired credentials + time to update. + +5.3.3. Uniquely Identifying Clients + + MLS implementations will presumably provide applications with a way + to request protocol operations with regard to other clients (e.g., + removing clients). Such functions will need to refer to the other + clients using some identifier. MLS clients have a few types of + identifiers, with different operational properties. + + Internally to the protocol, group members are uniquely identified by + their leaf index. However, a leaf index is only valid for referring + to members in a given epoch. The same leaf index may represent a + different member, or no member at all, in a subsequent epoch. + + The Credentials presented by the clients in a group authenticate + application-level identifiers for the clients. However, these + identifiers may not uniquely identify clients. For example, if a + user has multiple devices that are all present in an MLS group, then + those devices' clients could all present the user's application-layer + identifiers. + + If needed, applications may add application-specific identifiers to + the extensions field of a LeafNode object with the application_id + extension. + + opaque application_id; + + However, applications MUST NOT rely on the data in an application_id + extension as if it were authenticated by the Authentication Service, + and SHOULD gracefully handle cases where the identifier presented is + not unique. + +6. Message Framing + + Handshake and application messages use a common framing structure. + This framing provides encryption to ensure confidentiality within the + group, as well as signing to authenticate the sender. + + In most of the protocol, messages are handled in the form of + AuthenticatedContent objects. These structures contain the content + of the message itself as well as information to authenticate the + sender (see Section 6.1). The additional protections required to + transmit these messages over an untrusted channel (group membership + authentication or AEAD encryption) are added by encoding the + AuthenticatedContent as a PublicMessage or PrivateMessage message, + which can then be sent as an MLSMessage. Likewise, these protections + are enforced (via membership verification or AEAD decryption) when + decoding a PublicMessage or PrivateMessage into an + AuthenticatedContent object. + + PrivateMessage represents a signed and encrypted message, with + protections for both the content of the message and related metadata. + PublicMessage represents a message that is only signed, and not + encrypted. Applications MUST use PrivateMessage to encrypt + application messages and SHOULD use PrivateMessage to encode + handshake messages, but they MAY transmit handshake messages encoded + as PublicMessage objects in cases where it is necessary for the + Delivery Service to examine such messages. + + enum { + reserved(0), + mls10(1), + (65535) + } ProtocolVersion; + + enum { + reserved(0), + application(1), + proposal(2), + commit(3), + (255) + } ContentType; + + enum { + reserved(0), + member(1), + external(2), + new_member_proposal(3), + new_member_commit(4), + (255) + } SenderType; + + struct { + SenderType sender_type; + select (Sender.sender_type) { + case member: + uint32 leaf_index; + case external: + uint32 sender_index; + case new_member_commit: + case new_member_proposal: + struct{}; + }; + } Sender; + + // See the "MLS Wire Formats" IANA registry for values + uint16 WireFormat; + + struct { + opaque group_id; + uint64 epoch; + Sender sender; + opaque authenticated_data; + + ContentType content_type; + select (FramedContent.content_type) { + case application: + opaque application_data; + case proposal: + Proposal proposal; + case commit: + Commit commit; + }; + } FramedContent; + + struct { + ProtocolVersion version = mls10; + WireFormat wire_format; + select (MLSMessage.wire_format) { + case mls_public_message: + PublicMessage public_message; + case mls_private_message: + PrivateMessage private_message; + case mls_welcome: + Welcome welcome; + case mls_group_info: + GroupInfo group_info; + case mls_key_package: + KeyPackage key_package; + }; + } MLSMessage; + + Messages from senders that aren't in the group are sent as + PublicMessage. See Sections 12.1.8 and 12.4.3.2 for more details. + + The following structure is used to fully describe the data + transmitted in plaintexts or ciphertexts. + + struct { + WireFormat wire_format; + FramedContent content; + FramedContentAuthData auth; + } AuthenticatedContent; + + The following figure illustrates how the various structures described + in this section relate to each other, and the high-level operations + used to produce and consume them: + + Proposal Commit Application Data + | | | + +--------------+--------------+ + | + V + FramedContent + | | -. + +--------+ | | + | | | + V | +-- Asymmetric + FramedContentAuthData | | Sign / Verify + | | | + +--------+ | | + | | | + V V -' + AuthenticatedContent + | -. + +--------+--------+ | + | | +-- Symmetric + V V | Protect / Unprotect + PublicMessage PrivateMessage -' + | | + | | Welcome KeyPackage GroupInfo + | | | | | + +-----------------+-----+----------+----------+ + | + V + MLSMessage + + Figure 12: Relationships among MLS Objects + +6.1. Content Authentication + + FramedContent is authenticated using the FramedContentAuthData + structure. + + struct { + ProtocolVersion version = mls10; + WireFormat wire_format; + FramedContent content; + select (FramedContentTBS.content.sender.sender_type) { + case member: + case new_member_commit: + GroupContext context; + case external: + case new_member_proposal: + struct{}; + }; + } FramedContentTBS; + + opaque MAC; + + struct { + /* SignWithLabel(., "FramedContentTBS", FramedContentTBS) */ + opaque signature; + select (FramedContent.content_type) { + case commit: + /* + MAC(confirmation_key, + GroupContext.confirmed_transcript_hash) + */ + MAC confirmation_tag; + case application: + case proposal: + struct{}; + }; + } FramedContentAuthData; + + The signature is computed using SignWithLabel with label + "FramedContentTBS" and with a content that covers the message content + and the wire format that will be used for this message. If the + sender's sender_type is member, the content also covers the + GroupContext for the current epoch so that signatures are specific to + a given group and epoch. + + The sender MUST use the private key corresponding to the following + signature key depending on the sender's sender_type: + + * member: The signature key contained in the LeafNode at the index + indicated by leaf_index in the ratchet tree. + + * external: The signature key at the index indicated by sender_index + in the external_senders group context extension (see + Section 12.1.8.1). The content_type of the message MUST be + proposal and the proposal_type MUST be a value that is allowed for + external senders. + + * new_member_commit: The signature key in the LeafNode in the + Commit's path (see Section 12.4.3.2). The content_type of the + message MUST be commit. + + * new_member_proposal: The signature key in the LeafNode in the + KeyPackage embedded in an external Add proposal. The content_type + of the message MUST be proposal and the proposal_type of the + Proposal MUST be add. + + Recipients of an MLSMessage MUST verify the signature with the key + depending on the sender_type of the sender as described above. + + The confirmation tag value confirms that the members of the group + have arrived at the same state of the group. A FramedContentAuthData + is said to be valid when both the signature and confirmation_tag + fields are valid. + +6.2. Encoding and Decoding a Public Message + + Messages that are authenticated but not encrypted are encoded using + the PublicMessage structure. + + struct { + FramedContent content; + FramedContentAuthData auth; + select (PublicMessage.content.sender.sender_type) { + case member: + MAC membership_tag; + case external: + case new_member_commit: + case new_member_proposal: + struct{}; + }; + } PublicMessage; + + The membership_tag field in the PublicMessage object authenticates + the sender's membership in the group. For messages sent by members, + it MUST be set to the following value: + + struct { + FramedContentTBS content_tbs; + FramedContentAuthData auth; + } AuthenticatedContentTBM; + + membership_tag = MAC(membership_key, AuthenticatedContentTBM) + + When decoding a PublicMessage into an AuthenticatedContent, the + application MUST check membership_tag and MUST check that the + FramedContentAuthData is valid. + +6.3. Encoding and Decoding a Private Message + + Authenticated and encrypted messages are encoded using the + PrivateMessage structure. + + struct { + opaque group_id; + uint64 epoch; + ContentType content_type; + opaque authenticated_data; + opaque encrypted_sender_data; + opaque ciphertext; + } PrivateMessage; + + encrypted_sender_data and ciphertext are encrypted using the AEAD + function specified by the cipher suite in use, using the SenderData + and PrivateMessageContent structures as input. + +6.3.1. Content Encryption + + Content to be encrypted is encoded in a PrivateMessageContent + structure. + + struct { + select (PrivateMessage.content_type) { + case application: + opaque application_data; + + case proposal: + Proposal proposal; + + case commit: + Commit commit; + }; + + FramedContentAuthData auth; + opaque padding[length_of_padding]; + } PrivateMessageContent; + + The padding field is set by the sender, by first encoding the content + (via the select) and the auth field, and then appending the chosen + number of zero bytes. A receiver identifies the padding field in a + plaintext decoded from PrivateMessage.ciphertext by first decoding + the content and the auth field; then the padding field comprises any + remaining octets of plaintext. The padding field MUST be filled with + all zero bytes. A receiver MUST verify that there are no non-zero + bytes in the padding field, and if this check fails, the enclosing + PrivateMessage MUST be rejected as malformed. This check ensures + that the padding process is deterministic, so that, for example, + padding cannot be used as a covert channel. + + In the MLS key schedule, the sender creates two distinct key ratchets + for handshake and application messages for each member of the group. + When encrypting a message, the sender looks at the ratchets it + derived for its own member and chooses an unused generation from + either the handshake ratchet or the application ratchet, depending on + the content type of the message. This generation of the ratchet is + used to derive a provisional nonce and key. + + Before use in the encryption operation, the nonce is XORed with a + fresh random value to guard against reuse. Because the key schedule + generates nonces deterministically, a client MUST keep persistent + state as to where in the key schedule it is; if this persistent state + is lost or corrupted, a client might reuse a generation that has + already been used, causing reuse of a key/nonce pair. + + To avoid this situation, the sender of a message MUST generate a + fresh random four-byte "reuse guard" value and XOR it with the first + four bytes of the nonce from the key schedule before using the nonce + for encryption. The sender MUST include the reuse guard in the + reuse_guard field of the sender data object, so that the recipient of + the message can use it to compute the nonce to be used for + decryption. + + +-+-+-+-+---------...---+ + | Key Schedule Nonce | + +-+-+-+-+---------...---+ + XOR + +-+-+-+-+---------...---+ + | Guard | 0 | + +-+-+-+-+---------...---+ + === + +-+-+-+-+---------...---+ + | Encrypt/Decrypt Nonce | + +-+-+-+-+---------...---+ + + The Additional Authenticated Data (AAD) input to the encryption + contains an object of the following form, with the values used to + identify the key and nonce: + + struct { + opaque group_id; + uint64 epoch; + ContentType content_type; + opaque authenticated_data; + } PrivateContentAAD; + + When decoding a PrivateMessageContent, the application MUST check + that the FramedContentAuthData is valid. + + It is up to the application to decide what authenticated_data to + provide and how much padding to add to a given message (if any). The + overall size of the AAD and ciphertext MUST fit within the limits + established for the group's AEAD algorithm in [CFRG-AEAD-LIMITS]. + +6.3.2. Sender Data Encryption + + The "sender data" used to look up the key for content encryption is + encrypted with the cipher suite's AEAD with a key and nonce derived + from both the sender_data_secret and a sample of the encrypted + content. Before being encrypted, the sender data is encoded as an + object of the following form: + + struct { + uint32 leaf_index; + uint32 generation; + opaque reuse_guard[4]; + } SenderData; + + When constructing a SenderData object from a Sender object, the + sender MUST verify Sender.sender_type is member and use + Sender.leaf_index for SenderData.leaf_index. + + The reuse_guard field contains a fresh random value used to avoid + nonce reuse in the case of state loss or corruption, as described in + Section 6.3.1. + + The key and nonce provided to the AEAD are computed as the KDF of the + first KDF.Nh bytes of the ciphertext generated in the previous + section. If the length of the ciphertext is less than KDF.Nh, the + whole ciphertext is used. In pseudocode, the key and nonce are + derived as: + + ciphertext_sample = ciphertext[0..KDF.Nh-1] + + sender_data_key = ExpandWithLabel(sender_data_secret, "key", + ciphertext_sample, AEAD.Nk) + sender_data_nonce = ExpandWithLabel(sender_data_secret, "nonce", + ciphertext_sample, AEAD.Nn) + + The AAD for the SenderData ciphertext is the first three fields of + PrivateMessage: + + struct { + opaque group_id; + uint64 epoch; + ContentType content_type; + } SenderDataAAD; + + When parsing a SenderData struct as part of message decryption, the + recipient MUST verify that the leaf index indicated in the leaf_index + field identifies a non-blank node. + +7. Ratchet Tree Operations + + The ratchet tree for an epoch describes the membership of a group in + that epoch, providing public key encryption (HPKE) keys that can be + used to encrypt to subsets of the group as well as information to + authenticate the members. In order to reflect changes to the + membership of the group from one epoch to the next, corresponding + changes are made to the ratchet tree. In this section, we describe + the content of the tree and the required operations. + +7.1. Parent Node Contents + + As discussed in Section 4.1.1, the nodes of a ratchet tree contain + several types of data describing individual members (for leaf nodes) + or subgroups of the group (for parent nodes). Parent nodes are + simpler: + + struct { + HPKEPublicKey encryption_key; + opaque parent_hash; + uint32 unmerged_leaves; + } ParentNode; + + The encryption_key field contains an HPKE public key whose private + key is held only by the members at the leaves among its descendants. + The parent_hash field contains a hash of this node's parent node, as + described in Section 7.9. The unmerged_leaves field lists the leaves + under this parent node that are unmerged, according to their indices + among all the leaves in the tree. The entries in the unmerged_leaves + vector MUST be sorted in increasing order. + +7.2. Leaf Node Contents + + A leaf node in the tree describes all the details of an individual + client's appearance in the group, signed by that client. It is also + used in client KeyPackage objects to store the information that will + be needed to add a client to a group. + + enum { + reserved(0), + key_package(1), + update(2), + commit(3), + (255) + } LeafNodeSource; + + struct { + ProtocolVersion versions; + CipherSuite cipher_suites; + ExtensionType extensions; + ProposalType proposals; + CredentialType credentials; + } Capabilities; + + struct { + uint64 not_before; + uint64 not_after; + } Lifetime; + + // See the "MLS Extension Types" IANA registry for values + uint16 ExtensionType; + + struct { + ExtensionType extension_type; + opaque extension_data; + } Extension; + + struct { + HPKEPublicKey encryption_key; + SignaturePublicKey signature_key; + Credential credential; + Capabilities capabilities; + + LeafNodeSource leaf_node_source; + select (LeafNode.leaf_node_source) { + case key_package: + Lifetime lifetime; + + case update: + struct{}; + + case commit: + opaque parent_hash; + }; + + Extension extensions; + /* SignWithLabel(., "LeafNodeTBS", LeafNodeTBS) */ + opaque signature; + } LeafNode; + + struct { + HPKEPublicKey encryption_key; + SignaturePublicKey signature_key; + Credential credential; + Capabilities capabilities; + + LeafNodeSource leaf_node_source; + select (LeafNodeTBS.leaf_node_source) { + case key_package: + Lifetime lifetime; + + case update: + struct{}; + + case commit: + opaque parent_hash; + }; + + Extension extensions; + + select (LeafNodeTBS.leaf_node_source) { + case key_package: + struct{}; + + case update: + opaque group_id; + uint32 leaf_index; + + case commit: + opaque group_id; + uint32 leaf_index; + }; + } LeafNodeTBS; + + The encryption_key field contains an HPKE public key whose private + key is held only by the member occupying this leaf (or in the case of + a LeafNode in a KeyPackage object, the issuer of the KeyPackage). + The signature_key field contains the member's public signing key. + The credential field contains information authenticating both the + member's identity and the provided signing key, as described in + Section 5.3. + + The capabilities field indicates the protocol features that the + client supports, including protocol versions, cipher suites, + credential types, non-default proposal types, and non-default + extension types. The following proposal and extension types are + considered "default" and MUST NOT be listed: + + * Proposal types: + + - 0x0001 - add + + - 0x0002 - update + + - 0x0003 - remove + + - 0x0004 - psk + + - 0x0005 - reinit + + - 0x0006 - external_init + + - 0x0007 - group_context_extensions + + * Extension types: + + - 0x0001 - application_id + + - 0x0002 - ratchet_tree + + - 0x0003 - required_capabilities + + - 0x0004 - external_pub + + - 0x0005 - external_senders + + There are no default values for the other fields of a capabilities + object. The client MUST list all values for the respective + parameters that it supports. + + The types of any non-default extensions that appear in the extensions + field of a LeafNode MUST be included in the extensions field of the + capabilities field, and the credential type used in the LeafNode MUST + be included in the credentials field of the capabilities field. + + As discussed in Section 13, unknown values in capabilities MUST be + ignored, and the creator of a capabilities field SHOULD include some + random GREASE values to help ensure that other clients correctly + ignore unknown values. + + The leaf_node_source field indicates how this LeafNode came to be + added to the tree. This signal tells other members of the group + whether the leaf node is required to have a lifetime or parent_hash, + and whether the group_id is added as context to the signature. These + fields are included selectively because the client creating a + LeafNode is not always able to compute all of them. For example, a + KeyPackage is created before the client knows which group it will be + used with, so its signature can't bind to a group_id. + + In the case where the leaf was added to the tree based on a pre- + published KeyPackage, the lifetime field represents the times between + which clients will consider a LeafNode valid. These times are + represented as absolute times, measured in seconds since the Unix + epoch (1970-01-01T00:00:00Z). Applications MUST define a maximum + total lifetime that is acceptable for a LeafNode, and reject any + LeafNode where the total lifetime is longer than this duration. In + order to avoid disagreements about whether a LeafNode has a valid + lifetime, the clients in a group SHOULD maintain time synchronization + (e.g., using the Network Time Protocol [RFC5905]). + + In the case where the leaf node was inserted into the tree via a + Commit message, the parent_hash field contains the parent hash for + this leaf node (see Section 7.9). + + The LeafNodeTBS structure covers the fields above the signature in + the LeafNode. In addition, when the leaf node was created in the + context of a group (the update and commit cases), the group ID of the + group is added as context to the signature. + + LeafNode objects stored in the group's ratchet tree are updated + according to the evolution of the tree. Each modification of + LeafNode content MUST be reflected by a change in its signature. + This allows other members to verify the validity of the LeafNode at + any time, particularly in the case of a newcomer joining the group. + +7.3. Leaf Node Validation + + The validity of a LeafNode needs to be verified at the following + stages: + + * When a LeafNode is downloaded in a KeyPackage, before it is used + to add the client to the group + + * When a LeafNode is received by a group member in an Add, Update, + or Commit message + + * When a client validates a ratchet tree, e.g., when joining a group + or after processing a Commit + + The client verifies the validity of a LeafNode using the following + steps: + + * Verify that the credential in the LeafNode is valid, as described + in Section 5.3.1. + + * Verify that the signature on the LeafNode is valid using + signature_key. + + * Verify that the LeafNode is compatible with the group's + parameters. If the GroupContext has a required_capabilities + extension, then the required extensions, proposals, and credential + types MUST be listed in the LeafNode's capabilities field. + + * Verify that the credential type is supported by all members of the + group, as specified by the capabilities field of each member's + LeafNode, and that the capabilities field of this LeafNode + indicates support for all the credential types currently in use by + other members. + + * Verify the lifetime field: + + - If the LeafNode appears in a message being sent by the client, + e.g., a Proposal or a Commit, then the client MUST verify that + the current time is within the range of the lifetime field. + + - If instead the LeafNode appears in a message being received by + the client, e.g., a Proposal, a Commit, or a ratchet tree of + the group the client is joining, it is RECOMMENDED that the + client verifies that the current time is within the range of + the lifetime field. (This check is not mandatory because the + LeafNode might have expired in the time between when the + message was sent and when it was received.) + + * Verify that the extensions in the LeafNode are supported by + checking that the ID for each extension in the extensions field is + listed in the capabilities.extensions field of the LeafNode. + + * Verify the leaf_node_source field: + + - If the LeafNode appears in a KeyPackage, verify that + leaf_node_source is set to key_package. + + - If the LeafNode appears in an Update proposal, verify that + leaf_node_source is set to update and that encryption_key + represents a different public key than the encryption_key in + the leaf node being replaced by the Update proposal. + + - If the LeafNode appears in the leaf_node value of the + UpdatePath in a Commit, verify that leaf_node_source is set to + commit. + + * Verify that the following fields are unique among the members of + the group: + + - signature_key + + - encryption_key + +7.4. Ratchet Tree Evolution + + Whenever a member initiates an epoch change (i.e., commits; see + Section 12.4), they may need to refresh the key pairs of their leaf + and of the nodes on their leaf's direct path in order to maintain + forward secrecy and post-compromise security. + + The member initiating the epoch change generates the fresh key pairs + using the following procedure. The procedure is designed in a way + that allows group members to efficiently communicate the fresh secret + keys to other group members, as described in Section 7.6. + + A member updates the nodes along its direct path as follows: + + * Blank all the nodes on the direct path from the leaf to the root. + + * Generate a fresh HPKE key pair for the leaf. + + * Generate a sequence of path secrets, one for each node on the + leaf's filtered direct path, as follows. In this setting, + path_secret[0] refers to the first parent node in the filtered + direct path, path_secret[1] to the second parent node, and so on. + + path_secret[0] is sampled at random + path_secret[n] = DeriveSecret(path_secret[n-1], "path") + + * Compute the sequence of HPKE key pairs (node_priv,node_pub), one + for each node on the leaf's direct path, as follows. + + node_secret[n] = DeriveSecret(path_secret[n], "node") + node_priv[n], node_pub[n] = KEM.DeriveKeyPair(node_secret[n]) + + The node secret is derived as a temporary intermediate secret so that + each secret is only used with one algorithm: The path secret is used + as an input to DeriveSecret, and the node secret is used as an input + to DeriveKeyPair. + + For example, suppose there is a group with four members, with C an + unmerged leaf at Z: + + Y + | + .-+-. + / \ + X Z[C] + / \ / \ + A B C D + + 0 1 2 3 + + Figure 13: A Full Tree with One Unmerged Leaf + + If member B subsequently generates an UpdatePath based on a secret + "leaf_secret", then it would generate the following sequence of path + secrets: + + path_secret[1] ---> node_secret[1] -------> node_priv[1], node_pub[1] + + ^ + | + | + path_secret[0] ---> node_secret[0] -------> node_priv[0], node_pub[0] + + ^ + | + | + leaf_secret ------> leaf_node_secret --+--> leaf_priv, leaf_pub + | | + '-------. .-------' + | + leaf_node + + Figure 14: Derivation of Ratchet Tree Keys along a Direct Path + + After applying the UpdatePath, the tree will have the following + structure: + + node_priv[1] --------> Y' + | + .-+-. + / \ + node_priv[0] ----> X' Z[C] + / \ / \ + A B C D + ^ + leaf_priv -----------+ + 0 1 2 3 + + Figure 15: Placement of Keys in a Ratchet Tree + +7.5. Synchronizing Views of the Tree + + After generating fresh key material and applying it to update their + local tree state as described in Section 7.4, the generator + broadcasts this update to other members of the group in a Commit + message, who apply it to keep their local views of the tree in sync + with the sender's. More specifically, when a member commits a change + to the tree (e.g., to add or remove a member), it transmits an + UpdatePath containing a set of public keys and encrypted path secrets + for intermediate nodes in the filtered direct path of its leaf. The + other members of the group use these values to update their view of + the tree, aligning their copy of the tree to the sender's. + + An UpdatePath contains the following information for each node in the + filtered direct path of the sender's leaf, including the root: + + * The public key for the node + + * One or more encrypted copies of the path secret corresponding to + the node + + The path secret value for a given node is encrypted to the subtree + rooted at the parent's non-updated child, i.e., the child on the + copath of the sender's leaf node. There is one encryption of the + path secret to each public key in the resolution of the non-updated + child. + + A member of the group _updates their direct path_ by computing new + values for their leaf node and the nodes along their filtered direct + path as follows: + + 1. Blank all nodes along the direct path of the sender's leaf. + + 2. Compute updated path secrets and public keys for the nodes on the + sender's filtered direct path. + + * Generate a sequence of path secrets of the same length as the + filtered direct path, as defined in Section 7.4. + + * For each node in the filtered direct path, replace the node's + public key with the node_pub[n] value derived from the + corresponding path secret path_secret[n]. + + 3. Compute the new parent hashes for the nodes along the filtered + direct path and the sender's leaf node. + + 4. Update the leaf node for the sender. + + * Set the leaf_node_source to commit. + + * Set the encryption_key to the public key of a freshly sampled + key pair. + + * Set the parent hash to the parent hash for the leaf. + + * Re-sign the leaf node with its new contents. + + Since the new leaf node effectively updates an existing leaf node in + the group, it MUST adhere to the same restrictions as LeafNodes used + in Update proposals (aside from leaf_node_source). The application + MAY specify other changes to the leaf node, e.g., providing a new + signature key, updated capabilities, or different extensions. + + The member then _encrypts path secrets to the group_. For each node + in the member's filtered direct path, the member takes the following + steps: + + 1. Compute the resolution of the node's child that is on the copath + of the sender (the child that is not in the direct path of the + sender). Any new member (from an Add proposal) added in the same + Commit MUST be excluded from this resolution. + + 2. For each node in the resolution, encrypt the path secret for the + direct path node using the public key of the resolution node, as + defined in Section 7.6. + + The recipient of an UpdatePath performs the corresponding steps. + First, the recipient _merges UpdatePath into the tree_: + + 1. Blank all nodes on the direct path of the sender's leaf. + + 2. For all nodes on the filtered direct path of the sender's leaf, + + * Set the public key to the public key in the UpdatePath. + + * Set the list of unmerged leaves to the empty list. + + 3. Compute parent hashes for the nodes in the sender's filtered + direct path, and verify that the parent_hash field of the leaf + node matches the parent hash for the first node in its filtered + direct path. + + * Note that these hashes are computed from root to leaf, so that + each hash incorporates all the non-blank nodes above it. The + root node always has a zero-length hash for its parent hash. + + Second, the recipient _decrypts the path secrets_: + + 1. Identify a node in the filtered direct path for which the + recipient is in the subtree of the non-updated child. + + 2. Identify a node in the resolution of the copath node for which + the recipient has a private key. + + 3. Decrypt the path secret for the parent of the copath node using + the private key from the resolution node. + + 4. Derive path secrets for ancestors of that node in the sender's + filtered direct path using the algorithm described above. + + 5. Derive the node secrets and node key pairs from the path secrets. + + 6. Verify that the derived public keys are the same as the + corresponding public keys sent in the UpdatePath. + + 7. Store the derived private keys in the corresponding ratchet tree + nodes. + + For example, in order to communicate the example update described in + Section 7.4, the member at node B would transmit the following + values: + + +=============+====================================================+ + | Public Key | Ciphertext(s) | + +=============+====================================================+ + | node_pub[1] | E(pk(Z), path_secret[1]), E(pk(C), path_secret[1]) | + +-------------+----------------------------------------------------+ + | node_pub[0] | E(pk(A), path_secret[0]) | + +-------------+----------------------------------------------------+ + + Table 3 + + In this table, the value node_pub[i] represents the public key + derived from node_secret[i], pk(X) represents the current public key + of node X, and E(K, S) represents the public key encryption of the + path secret S to the public key K (using HPKE). + + A recipient at node A would decrypt E(pk(A), path_secret\[0\]) to + obtain path_secret\[0\], then use it to derive path_secret[1] and the + resulting node secrets and key pairs. Thus, A would have the private + keys to nodes X' and Y', in accordance with the tree invariant. + + Similarly, a recipient at node D would decrypt E(pk(Z), + path_secret[1]) to obtain path_secret[1], then use it to derive the + node secret and key pair for the node Y'. As required to maintain + the tree invariant, node D does not receive the private key for the + node X', since X' is not an ancestor of D. + + After processing the update, each recipient MUST delete outdated key + material, specifically: + + * The path secrets and node secrets used to derive each updated node + key pair. + + * Each outdated node key pair that was replaced by the update. + +7.6. Update Paths + + As described in Section 12.4, each Commit message may optionally + contain an UpdatePath, with a new LeafNode and set of parent nodes + for the sender's filtered direct path. For each parent node, the + UpdatePath contains a new public key and encrypted path secret. The + parent nodes are kept in the same order as the filtered direct path. + + struct { + opaque kem_output; + opaque ciphertext; + } HPKECiphertext; + + struct { + HPKEPublicKey encryption_key; + HPKECiphertext encrypted_path_secret; + } UpdatePathNode; + + struct { + LeafNode leaf_node; + UpdatePathNode nodes; + } UpdatePath; + + For each UpdatePathNode, the resolution of the corresponding copath + node MUST exclude all new leaf nodes added as part of the current + Commit. The length of the encrypted_path_secret vector MUST be equal + to the length of the resolution of the copath node (excluding new + leaf nodes), with each ciphertext being the encryption to the + respective resolution node. + + The HPKECiphertext values are encrypted and decrypted as follows: + + (kem_output, ciphertext) = + EncryptWithLabel(node_public_key, "UpdatePathNode", + group_context, path_secret) + + path_secret = + DecryptWithLabel(node_private_key, "UpdatePathNode", + group_context, kem_output, ciphertext) + + Here node_public_key is the public key of the node for which the path + secret is encrypted, group_context is the provisional GroupContext + object for the group, and the EncryptWithLabel function is as defined + in Section 5.1.3. + +7.7. Adding and Removing Leaves + + In addition to the path-based updates to the tree described above, it + is also necessary to add and remove leaves of the tree in order to + reflect changes to the membership of the group (see Sections 12.1.1 + and 12.1.3). Since the tree is always full, adding or removing + leaves corresponds to increasing or decreasing the depth of the tree, + resulting in the number of leaves being doubled or halved. These + operations are also known as _extending_ and _truncating_ the tree. + + Leaves are always added and removed at the right edge of the tree. + When the size of the tree needs to be increased, a new blank root + node is added, whose left subtree is the existing tree and right + subtree is a new all-blank subtree. This operation is typically done + when adding a member to the group. + + _ <-- new blank root _ + __|__ __|__ + / \ / \ + X ===> X _ <-- new blank subtree ===> X _ + / \ / \ / \ / \ / \ + A B A B _ _ A B C _ + ^ + | + new member --+ + + Figure 16: Extending the Tree to Make Room for a Third Member + + When the right subtree of the tree no longer has any non-blank nodes, + it can be safely removed. The root of the tree and the right subtree + are discarded (whether or not the root node is blank). The left + child of the root becomes the new root node, and the left subtree + becomes the new tree. This operation is typically done after + removing a member from the group. + + Y Y + __|__ __|__ + / \ / \ + X _ ===> X _ ==> X <-- new root + / \ / \ / \ / \ / \ + A B C _ A B _ _ A B + ^ + | + removed member --+ + + Figure 17: Cleaning Up after Removing Member C + + Concrete algorithms for these operations on array-based and link- + based trees are provided in Appendices C and D. The concrete + algorithms are non-normative. An implementation may use any + algorithm that produces the correct tree in its internal + representation. + +7.8. Tree Hashes + + MLS hashes the contents of the tree in two ways to authenticate + different properties of the tree. _Tree hashes_ are defined in this + section, and _parent hashes_ are defined in Section 7.9. + + Each node in a ratchet tree has a tree hash that summarizes the + subtree below that node. The tree hash of the root is used in the + GroupContext to confirm that the group agrees on the whole tree. + Tree hashes are computed recursively from the leaves up to the root. + + P --> th(P) + ^ ^ + / \ + / \ + th(L) th(R) + + Figure 18: Composition of the Tree Hash + + The tree hash of an individual node is the hash of the node's + TreeHashInput object, which may contain either a LeafNodeHashInput or + a ParentNodeHashInput depending on the type of node. + LeafNodeHashInput objects contain the leaf_index and the LeafNode (if + any). ParentNodeHashInput objects contain the ParentNode (if any) + and the tree hash of the node's left and right children. For both + parent and leaf nodes, the optional node value MUST be absent if the + node is blank and present if the node contains a value. + + enum { + reserved(0), + leaf(1), + parent(2), + (255) + } NodeType; + + struct { + NodeType node_type; + select (TreeHashInput.node_type) { + case leaf: LeafNodeHashInput leaf_node; + case parent: ParentNodeHashInput parent_node; + }; + } TreeHashInput; + + struct { + uint32 leaf_index; + optional leaf_node; + } LeafNodeHashInput; + + struct { + optional parent_node; + opaque left_hash; + opaque right_hash; + } ParentNodeHashInput; + + The tree hash of an entire tree corresponds to the tree hash of the + root node, which is computed recursively by starting at the leaf + nodes and building up. + +7.9. Parent Hashes + + While tree hashes summarize the state of a tree at point in time, + parent hashes capture information about how keys in the tree were + populated. + + When a client sends a Commit to change a group, it can include an + UpdatePath to assign new keys to the nodes along its filtered direct + path. When a client computes an UpdatePath (as defined in + Section 7.5), it computes and signs a parent hash that summarizes the + state of the tree after the UpdatePath has been applied. These + summaries are constructed in a chain from the root to the member's + leaf so that the part of the chain closer to the root can be + overwritten as nodes set in one UpdatePath are reset by a later + UpdatePath. + + ph(Q) + / + / + V + P.public_key --> ph(P) + / ^ + / \ + V \ + N.parent_hash th(S) + + Figure 19: Inputs to a Parent Hash + + As a result, the signature over the parent hash in each member's leaf + effectively signs the subtree of the tree that hasn't been changed + since that leaf was last changed in an UpdatePath. A new member + joining the group uses these parent hashes to verify that the parent + nodes in the tree were set by members of the group, not chosen by an + external attacker. For an example of how this works, see Appendix B. + + Consider a ratchet tree with a non-blank parent node P and children D + and S (for "parent", "direct path", and "sibling"), with D and P in + the direct path of a leaf node L (for "leaf"): + + ... + / + P + __|__ + / \ + D S + / \ / \ + ... ... ... ... + / + L + + Figure 20: Nodes Involved in a Parent Hash Computation + + The parent hash of P changes whenever an UpdatePath object is applied + to the ratchet tree along a path from a leaf L traversing node D (and + hence also P). The new "Parent hash of P (with copath child S)" is + obtained by hashing P's ParentHashInput struct. + + struct { + HPKEPublicKey encryption_key; + opaque parent_hash; + opaque original_sibling_tree_hash; + } ParentHashInput; + + The field encryption_key contains the HPKE public key of P. If P is + the root, then the parent_hash field is set to a zero-length octet + string. Otherwise, parent_hash is the parent hash of the next node + after P on the filtered direct path of the leaf L. This way, P's + parent hash fixes the new HPKE public key of each non-blank node on + the path from P to the root. Note that the path from P to the root + may contain some blank nodes that are not fixed by P's parent hash. + However, for each node that has an HPKE key, this key is fixed by P's + parent hash. + + Finally, original_sibling_tree_hash is the tree hash of S in the + ratchet tree modified as follows: For each leaf L in + P.unmerged_leaves, blank L and remove it from the unmerged_leaves + sets of all parent nodes. + + Observe that original_sibling_tree_hash does not change between + updates of P. This property is crucial for the correctness of the + protocol. + + Note that original_sibling_tree_hash is the tree hash of S, not the + parent hash. The parent_hash field in ParentHashInput captures + information about the nodes above P. the original_sibling_tree_hash + captures information about the subtree under S that is not being + updated (and thus the subtree to which a path secret for P would be + encrypted according to Section 7.5). + + For example, in the following tree: + + W [F] + ______|_____ + / \ + U Y [F] + __|__ __|__ + / \ / \ + T _ _ _ + / \ / \ / \ / \ + A B C D E F G _ + + Figure 21: A Tree Illustrating Parent Hash Computations + + With P = W and S = Y, original_sibling_tree_hash is the tree hash of + the following tree: + + Y + __|__ + / \ + _ _ + / \ / \ + E _ G _ + + Because W.unmerged_leaves includes F, F is blanked and removed from + Y.unmerged_leaves. + + Note that no recomputation is needed if the tree hash of S is + unchanged since the last time P was updated. This is the case for + computing or processing a Commit whose UpdatePath traverses P, since + the Commit itself resets P. (In other words, it is only necessary to + recompute the original sibling tree hash when validating a group's + tree on joining.) More generally, if none of the entries in + P.unmerged_leaves are in the subtree under S (and thus no leaves were + blanked), then the original tree hash at S is the tree hash of S in + the current tree. + + If it is necessary to recompute the original tree hash of a node, the + efficiency of recomputation can be improved by caching intermediate + tree hashes, to avoid recomputing over the subtree when the subtree + is included in multiple parent hashes. A subtree hash can be reused + as long as the intersection of the parent's unmerged leaves with the + subtree is the same as in the earlier computation. + +7.9.1. Using Parent Hashes + + In ParentNode objects and LeafNode objects with leaf_node_source set + to commit, the value of the parent_hash field is the parent hash of + the next non-blank parent node above the node in question (the next + node in the filtered direct path). Using the node labels in + Figure 20, the parent_hash field of D is equal to the parent hash of + P with copath child S. This is the case even when the node D is a + leaf node. + + The parent_hash field of a LeafNode is signed by the member. The + signature of such a LeafNode thus attests to which keys the group + member introduced into the ratchet tree and to whom the corresponding + secret keys were sent, in addition to the other contents of the + LeafNode. This prevents malicious insiders from constructing + artificial ratchet trees with a node D whose HPKE secret key is known + to the insider, yet where the insider isn't assigned a leaf in the + subtree rooted at D. Indeed, such a ratchet tree would violate the + tree invariant. + +7.9.2. Verifying Parent Hashes + + Parent hashes are verified at two points in the protocol: When + joining a group and when processing a Commit. + + The parent hash in a node D is valid with respect to a parent node P + if the following criteria hold. Here C and S are the children of P + (for "child" and "sibling"), with C being the child that is on the + direct path of D (possibly D itself) and S being the other child: + + * D is a descendant of P in the tree. + + * The parent_hash field of D is equal to the parent hash of P with + copath child S. + + * D is in the resolution of C, and the intersection of P's + unmerged_leaves with the subtree under C is equal to the + resolution of C with D removed. + + These checks verify that D and P were updated at the same time (in + the same UpdatePath), and that they were neighbors in the UpdatePath + because the nodes in between them would have omitted from the + filtered direct path. + + A parent node P is "parent-hash valid" if it can be chained back to a + leaf node in this way. That is, if there is leaf node L and a + sequence of parent nodes P_1, ..., P_N such that P_N = P and each + step in the chain is authenticated by a parent hash, then L's parent + hash is valid with respect to P_1, P_1's parent hash is valid with + respect to P_2, and so on. + + When joining a group, the new member MUST authenticate that each non- + blank parent node P is parent-hash valid. This can be done "bottom + up" by building chains up from leaves and verifying that all non- + blank parent nodes are covered by exactly one such chain, or "top + down" by verifying that there is exactly one descendant of each non- + blank parent node for which the parent node is parent-hash valid. + + When processing a Commit message that includes an UpdatePath, clients + MUST recompute the expected value of parent_hash for the committer's + new leaf and verify that it matches the parent_hash value in the + supplied leaf_node. After being merged into the tree, the nodes in + the UpdatePath form a parent-hash chain from the committer's leaf to + the root. + +8. Key Schedule + + Group keys are derived using the Extract and Expand functions from + the KDF for the group's cipher suite, as well as the functions + defined below: + + ExpandWithLabel(Secret, Label, Context, Length) = + KDF.Expand(Secret, KDFLabel, Length) + + DeriveSecret(Secret, Label) = + ExpandWithLabel(Secret, Label, "", KDF.Nh) + + Where KDFLabel is specified as: + + struct { + uint16 length; + opaque label; + opaque context; + } KDFLabel; + + And its fields are set to: + + length = Length; + label = "MLS 1.0 " + Label; + context = Context; + + The value KDF.Nh is the size of an output from KDF.Extract, in bytes. + In the below diagram: + + * KDF.Extract takes its salt argument from the top and its Input + Keying Material (IKM) argument from the left. + + * DeriveSecret takes its Secret argument from the incoming arrow. + + * 0 represents an all-zero byte string of length KDF.Nh. + + When processing a handshake message, a client combines the following + information to derive new epoch secrets: + + * The init secret from the previous epoch + + * The commit secret for the current epoch + + * The GroupContext object for current epoch + + Given these inputs, the derivation of secrets for an epoch proceeds + as shown in the following diagram: + + init_secret_[n-1] + | + | + V + commit_secret --> KDF.Extract + | + | + V + ExpandWithLabel(., "joiner", GroupContext_[n], KDF.Nh) + | + | + V + joiner_secret + | + | + V +psk_secret (or 0) --> KDF.Extract + | + | + +--> DeriveSecret(., "welcome") + | = welcome_secret + | + V + ExpandWithLabel(., "epoch", GroupContext_[n], KDF.Nh) + | + | + V + epoch_secret + | + | + +--> DeriveSecret(.,