diff options
Diffstat (limited to 'doc/rfc/rfc9415.txt')
-rw-r--r-- | doc/rfc/rfc9415.txt | 2049 |
1 files changed, 2049 insertions, 0 deletions
diff --git a/doc/rfc/rfc9415.txt b/doc/rfc/rfc9415.txt new file mode 100644 index 0000000..b21fd63 --- /dev/null +++ b/doc/rfc/rfc9415.txt @@ -0,0 +1,2049 @@ + + + + +Internet Research Task Force (IRTF) F. Gont +Request for Comments: 9415 SI6 Networks +Category: Informational I. Arce +ISSN: 2070-1721 Quarkslab + July 2023 + + + On the Generation of Transient Numeric Identifiers + +Abstract + + This document performs an analysis of the security and privacy + implications of different types of "transient numeric identifiers" + used in IETF protocols and tries to categorize them based on their + interoperability requirements and their associated failure severity + when such requirements are not met. Subsequently, it provides advice + on possible algorithms that could be employed to satisfy the + interoperability requirements of each identifier category while + minimizing the negative security and privacy implications, thus + providing guidance to protocol designers and protocol implementers. + Finally, it describes a number of algorithms that have been employed + in real implementations to generate transient numeric identifiers and + analyzes their security and privacy properties. This document is a + product of the Privacy Enhancements and Assessments Research Group + (PEARG) in the IRTF. + +Status of This Memo + + This document is not an Internet Standards Track specification; it is + published for informational purposes. + + This document is a product of the Internet Research Task Force + (IRTF). The IRTF publishes the results of Internet-related research + and development activities. These results might not be suitable for + deployment. This RFC represents the consensus of the Privacy + Enhancements and Assessments Research Group of the Internet Research + Task Force (IRTF). Documents approved for publication by the IRSG + are not candidates for any level of Internet Standard; see Section 2 + of RFC 7841. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + https://www.rfc-editor.org/info/rfc9415. + +Copyright Notice + + Copyright (c) 2023 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (https://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. + +Table of Contents + + 1. Introduction + 2. Terminology + 3. Threat Model + 4. Issues with the Specification of Transient Numeric Identifiers + 5. Protocol Failure Severity + 6. Categorizing Transient Numeric Identifiers + 7. Common Algorithms for Transient Numeric Identifier Generation + 7.1. Category #1: Uniqueness (Soft Failure) + 7.2. Category #2: Uniqueness (Hard Failure) + 7.3. Category #3: Uniqueness, Stable within Context (Soft + Failure) + 7.4. Category #4: Uniqueness, Monotonically Increasing within + Context (Hard Failure) + 8. Common Vulnerabilities Associated with Transient Numeric + Identifiers + 8.1. Network Activity Correlation + 8.2. Information Leakage + 8.3. Fingerprinting + 8.4. Exploitation of the Semantics of Transient Numeric + Identifiers + 8.5. Exploitation of Collisions of Transient Numeric Identifiers + 8.6. Exploitation of Predictable Transient Numeric Identifiers + for Injection Attacks + 8.7. Cryptanalysis + 9. Vulnerability Assessment of Transient Numeric Identifiers + 9.1. Category #1: Uniqueness (Soft Failure) + 9.2. Category #2: Uniqueness (Hard Failure) + 9.3. Category #3: Uniqueness, Stable within Context (Soft + Failure) + 9.4. Category #4: Uniqueness, Monotonically Increasing within + Context (Hard Failure) + 10. IANA Considerations + 11. Security Considerations + 12. References + 12.1. Normative References + 12.2. Informative References + Appendix A. Algorithms and Techniques with Known Issues + A.1. Predictable Linear Identifiers Algorithm + A.2. Random-Increments Algorithm + A.3. Reusing Identifiers Across Different Contexts + Acknowledgements + Authors' Addresses + +1. Introduction + + Networking protocols employ a variety of transient numeric + identifiers for different protocol objects, such as IPv4 and IPv6 + Identification values [RFC0791] [RFC8200], IPv6 Interface Identifiers + (IIDs) [RFC4291], transport-protocol ephemeral port numbers + [RFC6056], TCP Initial Sequence Numbers (ISNs) [RFC9293], NTP + Reference IDs (REFIDs) [RFC5905], and DNS IDs [RFC1035]. These + identifiers typically have specific requirements (e.g., uniqueness + during a specified period of time) that must be satisfied such that + they do not result in negative interoperability implications and an + associated failure severity when such requirements are not met. + + | NOTE: Some documents refer to the DNS ID as the DNS "Query ID" + | or "TxID". + + For more than 30 years, a large number of implementations of IETF + protocols have been subject to a variety of attacks, with effects + ranging from Denial of Service (DoS) or data injection to information + leakages that could be exploited for pervasive monitoring [RFC7258]. + The root cause of these issues has been, in many cases, the poor + selection of transient numeric identifiers in such protocols, usually + as a result of insufficient or misleading specifications. While it + is generally trivial to identify an algorithm that can satisfy the + interoperability requirements of a given transient numeric + identifier, empirical evidence exists that doing so without + negatively affecting the security and/or privacy properties of the + aforementioned protocols is prone to error [RFC9414]. + + For example, implementations have been subject to security and/or + privacy issues resulting from: + + * predictable IPv4 or IPv6 Identification values (e.g., see + [Sanfilippo1998a], [RFC6274], and [RFC7739]), + + * predictable IPv6 IIDs (e.g., see [RFC7217], [RFC7707], and + [RFC7721]), + + * predictable transport-protocol ephemeral port numbers (e.g., see + [RFC6056] and [Silbersack2005]), + + * predictable TCP Initial Sequence Numbers (ISNs) (e.g., see + [Morris1985], [Bellovin1989], and [RFC6528]), + + * predictable initial timestamps in TCP timestamps options (e.g., + see [TCPT-uptime] and [RFC7323]), and + + * predictable DNS IDs (see, e.g., [Schuba1993] and [Klein2007]). + + Recent history indicates that, when new protocols are standardized or + new protocol implementations are produced, the security and privacy + properties of the associated transient numeric identifiers tend to be + overlooked, and inappropriate algorithms to generate such identifiers + are either suggested in the specifications or selected by + implementers. As a result, advice in this area is warranted. + + We note that the use of cryptographic techniques may readily mitigate + some of the issues arising from predictable transient numeric + identifiers. For example, cryptographic authentication can readily + mitigate data injection attacks even in the presence of predictable + transient numeric identifiers (such as "sequence numbers"). However, + use of flawed algorithms (such as global counters) for generating + transient numeric identifiers could still result in information + leakages even when cryptographic techniques are employed. + + This document contains a non-exhaustive survey of transient numeric + identifiers employed in various IETF protocols and aims to categorize + such identifiers based on their interoperability requirements and the + associated failure severity when such requirements are not met. + Subsequently, it provides advice on possible algorithms that could be + employed to satisfy the interoperability requirements of each + category while minimizing negative security and privacy implications. + Finally, it analyzes several algorithms that have been employed in + real implementations to meet such requirements and analyzes their + security and privacy properties. + + This document represents the consensus of the Privacy Enhancements + and Assessments Research Group (PEARG). + +2. Terminology + + Transient Numeric Identifier: + A data object in a protocol specification that can be used to + definitely distinguish a protocol object (a datagram, network + interface, transport-protocol endpoint, session, etc.) from all + other objects of the same type, in a given context. Transient + numeric identifiers are usually defined as a series of bits and + represented using integer values. These identifiers are typically + dynamically selected, as opposed to statically assigned numeric + identifiers (see, e.g., [IANA-PROT]). We note that different + transient numeric identifiers may have additional requirements or + properties depending on their specific use in a protocol. We use + the term "transient numeric identifier" (or simply "numeric + identifier" or "identifier" as short forms) as a generic term to + refer to any data object in a protocol specification that + satisfies the identification property stated above. + + Failure Severity: + The interoperability consequences of a failure to comply with the + interoperability requirements of a given identifier. Severity + considers the worst potential consequence of a failure, determined + by the system damage and/or time lost to repair the failure. In + this document, we define two types of failure severity: "soft + failure" and "hard failure". + + Soft Failure: + A recoverable condition in which a protocol does not operate in + the prescribed manner but normal operation can be resumed + automatically in a short period of time. For example, a simple + packet-loss event that is subsequently recovered with a packet + retransmission can be considered a soft failure. + + Hard Failure: + A non-recoverable condition in which a protocol does not operate + in the prescribed manner or it operates with excessive degradation + of service. For example, an established TCP connection that is + aborted due to an error condition constitutes, from the point of + view of the transport protocol, a hard failure, since it enters a + state from which normal operation cannot be resumed. + +3. Threat Model + + Throughout this document, we do not consider on-path attacks. That + is, we assume the attacker does not have physical or logical access + to the system(s) being attacked and that the attacker can only + observe traffic explicitly directed to the attacker. Similarly, an + attacker cannot observe traffic transferred between the sender and + the receiver(s) of a target protocol but may be able to interact with + any of these entities, including by, e.g., sending any traffic to + them to sample transient numeric identifiers employed by the target + hosts when communicating with the attacker. + + For example, when analyzing vulnerabilities associated with TCP + Initial Sequence Numbers (ISNs), we consider the attacker is unable + to capture network traffic corresponding to a TCP connection between + two other hosts. However, we consider the attacker is able to + communicate with any of these hosts (e.g., establish a TCP connection + with any of them) to, e.g., sample the TCP ISNs employed by these + hosts when communicating with the attacker. + + Similarly, when considering host-tracking attacks based on IPv6 + Interface Identifiers, we consider an attacker may learn the IPv6 + address employed by a victim host if, e.g., the address becomes + exposed as a result of the victim host communicating with an + attacker-operated server. Subsequently, an attacker may perform + host-tracking by probing a set of target addresses composed by a set + of target prefixes and the IPv6 Interface Identifier originally + learned by the attacker. Alternatively, an attacker may perform + host-tracking if, e.g., the victim host communicates with an + attacker-operated server as it moves from one location to another, + thereby exposing its configured addresses. We note that none of + these scenarios require the attacker observe traffic not explicitly + directed to the attacker. + +4. Issues with the Specification of Transient Numeric Identifiers + + While assessing IETF protocol specifications regarding the use of + transient numeric identifiers, we have found that most of the issues + discussed in this document arise as a result of one of the following + conditions: + + * protocol specifications that under specify their transient numeric + identifiers + + * protocol specifications that over specify their transient numeric + identifiers + + * protocol implementations that simply fail to comply with the + specified requirements + + A number of IETF protocol specifications under specified their + transient numeric identifiers, thus leading to implementations that + were vulnerable to numerous off-path attacks. Examples of them are + the specification of TCP local ports in [RFC0793] or the + specification of the DNS ID in [RFC1035]. + + | NOTE: The TCP local port in an active OPEN request is commonly + | known as the "ephemeral port" of the corresponding TCP + | connection [RFC6056]. + + On the other hand, there are a number of IETF protocol specifications + that over specify some of their associated transient numeric + identifiers. For example, [RFC4291] essentially overloads the + semantics of IPv6 Interface Identifiers (IIDs) by embedding link- + layer addresses in the IPv6 IIDs when the interoperability + requirement of uniqueness could be achieved in other ways that do not + result in negative security and privacy implications [RFC7721]. + Similarly, [RFC2460] suggests the use of a global counter for the + generation of Identification values when the interoperability + requirement of uniqueness per {IPv6 Source Address, IPv6 Destination + Address} could be achieved with other algorithms that do not result + in negative security and privacy implications [RFC7739]. + + Finally, there are protocol implementations that simply fail to + comply with existing protocol specifications. For example, some + popular operating systems still fail to implement transport-protocol + ephemeral port randomization, as recommended in [RFC6056], or TCP + Initial Sequence Number randomization, as recommended in [RFC9293]. + +5. Protocol Failure Severity + + Section 2 defines the concept of "failure severity", along with two + types of failure severities that we employ throughout this document: + soft and hard. + + Our analysis of the severity of a failure is performed from the point + of view of the protocol in question. However, the corresponding + severity on the upper protocol (or application) might not be the same + as that of the protocol in question. For example, a TCP connection + that is aborted might or might not result in a hard failure of the + upper application, i.e., if the upper application can establish a new + TCP connection without any impact on the application, a hard failure + at the TCP protocol may have no severity at the application layer. + On the other hand, if a hard failure of a TCP connection results in + excessive degradation of service at the application layer, it will + also result in a hard failure at the application. + +6. Categorizing Transient Numeric Identifiers + + This section includes a non-exhaustive survey of transient numeric + identifiers, which are representative of all the possible + combinations of interoperability requirements and failure severities + found in popular protocols of different layers. Additionally, it + proposes a number of categories that can accommodate these + identifiers based on their interoperability requirements and their + associated failure severity (soft or hard). + + | NOTE: All other transient numeric identifiers that were + | analyzed as part of this effort could be accommodated into one + | of the existing categories from Table 1. + + +===============+===============================+==================+ + | Identifier | Interoperability Requirements | Failure Severity | + +===============+===============================+==================+ + | IPv6 ID | Uniqueness (for IPv6 address | Soft/Hard (1) | + | | pair) | | + +---------------+-------------------------------+------------------+ + | IPv6 IID | Uniqueness (and stable within | Soft (3) | + | | IPv6 prefix) (2) | | + +---------------+-------------------------------+------------------+ + | TCP ISN | Monotonically increasing (4) | Hard (4) | + +---------------+-------------------------------+------------------+ + | TCP initial | Monotonically increasing (5) | Hard (5) | + | timestamp | | | + +---------------+-------------------------------+------------------+ + | TCP ephemeral | Uniqueness (for connection | Hard | + | port | ID) | | + +---------------+-------------------------------+------------------+ + | IPv6 Flow | Uniqueness | None (6) | + | Label | | | + +---------------+-------------------------------+------------------+ + | DNS ID | Uniqueness | None (7) | + +---------------+-------------------------------+------------------+ + + Table 1: Survey of Transient Numeric Identifiers + + NOTE: + + (1) While a single collision of IPv6 Identification (ID) values + would simply lead to a single packet drop (and hence, a "soft" + failure), repeated collisions at high data rates might result in + self-propagating collisions of IPv6 IDs, thus possibly leading + to a hard failure [RFC4963]. + + (2) While the interoperability requirements are simply that the + Interface Identifier results in a unique IPv6 address, for + operational reasons, it is typically desirable that the + resulting IPv6 address (and hence, the corresponding Interface + Identifier) be stable within each network [RFC7217] [RFC8064]. + + (3) While IPv6 Interface Identifiers must result in unique IPv6 + addresses, IPv6 Duplicate Address Detection (DAD) [RFC4862] + allows for the detection of duplicate addresses, and hence, such + Interface Identifier collisions can be recovered. + + (4) In theory, there are no interoperability requirements for TCP + Initial Sequence Numbers (ISNs), since the TIME-WAIT state and + TCP's "quiet time" concept take care of old segments from + previous incarnations of a connection. However, a widespread + optimization allows for a new incarnation of a previous + connection to be created if the ISN of the incoming SYN is + larger than the last sequence number seen in that direction for + the previous incarnation of the connection. Thus, monotonically + increasing TCP ISNs allow for such optimization to work as + expected [RFC6528] and can help avoid connection-establishment + failures. + + (5) Strictly speaking, there are no interoperability requirements + for the *initial* TCP timestamp employed by a TCP instance + (i.e., the TS Value (TSval) in a segment with the SYN bit set). + However, some TCP implementations allow a new incarnation of a + previous connection to be created if the TSval of the incoming + SYN is larger than the last TSval seen in that direction for the + previous incarnation of the connection (please see [RFC6191]). + Thus, monotonically increasing TCP initial timestamps (across + connections to the same endpoint) allow for such optimization to + work as expected [RFC6191] and can help avoid connection- + establishment failures. + + (6) The IPv6 Flow Label [RFC6437], along with the IPv6 Source + Address and the IPv6 Destination Address, is typically employed + for load sharing [RFC7098]. Reuse of a Flow Label value for the + same set {Source Address, Destination Address} would typically + cause both flows to be multiplexed onto the same link. However, + as long as this does not occur deterministically, it will not + result in any negative implications. + + (7) DNS IDs are employed, together with the IP Source Address, the + IP Destination Address, the transport-protocol Source Port, and + the transport-protocol Destination Port, to match DNS requests + and responses. However, since an implementation knows which DNS + requests were sent for that set of {IP Source Address, IP + Destination Address, transport-protocol Source Port, transport- + protocol Destination Port, DNS ID}, a collision of DNS IDs would + result, if anything, in a small performance penalty (the + response would nevertheless be discarded when it is found that + it does not answer the query sent in the corresponding DNS + query). + + Based on the survey above, we can categorize identifiers as follows: + + +=======+======================================+====================+ + | Cat # | Category | Sample Numeric | + | | | IDs | + +=======+======================================+====================+ + | 1 | Uniqueness (soft failure) | IPv6 Flow L., DNS | + | | | ID | + +-------+--------------------------------------+--------------------+ + | 2 | Uniqueness (hard failure) | IPv6 ID, TCP | + | | | ephemeral port | + +-------+--------------------------------------+--------------------+ + | 3 | Uniqueness, stable within context | IPv6 IID | + | | (soft failure) | | + +-------+--------------------------------------+--------------------+ + | 4 | Uniqueness, monotonically increasing | TCP ISN, TCP | + | | within context (hard failure) | initial timestamp | + +-------+--------------------------------------+--------------------+ + + Table 2: Identifier Categories + + We note that Category #4 could be considered a generalized case of + Category #3, in which a monotonically increasing element is added to + a stable (within context) element, such that the resulting + identifiers are monotonically increasing within a specified context. + That is, the same algorithm could be employed for both #3 and #4, + given appropriate parameters. + +7. Common Algorithms for Transient Numeric Identifier Generation + + The following subsections describe some sample algorithms that can be + employed for generating transient numeric identifiers for each of the + categories above while mitigating the vulnerabilities analyzed in + Section 8 of this document. + + All of the variables employed in the algorithms of the following + subsections are of "unsigned integer" type, except for the "retry" + variable, which is of (signed) "integer" type. + +7.1. Category #1: Uniqueness (Soft Failure) + + The requirement of uniqueness with a soft failure severity can be + complied with a Pseudorandom Number Generator (PRNG). + + | NOTE: Please see [RFC4086] regarding randomness requirements + | for security. + + While most systems provide access to a PRNG, many of such PRNG + implementations are not cryptographically secure and therefore might + be statistically biased or subject to adversarial influence. For + example, ISO C [C11] rand(3) implementations are not + cryptographically secure. + + | NOTE: Section 7.1 ("Uniform Deviates") of [Press1992] discusses + | the underlying issues affecting ISO C [C11] rand(3) + | implementations. + + On the other hand, a number of systems provide an interface to a + Cryptographically Secure PRNG (CSPRNG) [RFC4086] [RFC8937], which + guarantees high entropy, unpredictability, and good statistical + distribution of the random values generated. For example, GNU/ + Linux's CSPRNG implementation is available via the getentropy(3) + interface [GETENTROPY], while OpenBSD's CSPRNG implementation is + available via the arc4random(3) and arc4random_uniform(3) interfaces + [ARC4RANDOM]. Where available, these CSPRNGs should be preferred + over, e.g., POSIX [POSIX] random(3) or ISO C [C11] rand(3) + implementations. + + In scenarios where a CSPRNG is not readily available to select + transient numeric identifiers of Category #1, a security and privacy + assessment of employing a regular PRNG should be performed, + supporting the implementation decision. + + | NOTE: [Aumasson2018], [Press1992], and [Knuth1983] discuss + | theoretical and practical aspects of pseudorandom number + | generation and provide guidance on how to evaluate PRNGs. + + We note that, since the premise is that collisions of transient + numeric identifiers of this category only lead to soft failures, in + many cases, the algorithm might not need to check the suitability of + a selected identifier (i.e., the suitable_id() function, described + below, could always return "true"). + + In scenarios where, e.g., simultaneous use of a given numeric + identifier is undesirable and an implementation detects such + condition, the implementation may opt to select the next available + identifier in the same sequence or select another random number. + Section 7.1.1 is an implementation of the former strategy, while + Section 7.1.2 is an implementation of the latter. Typically, the + algorithm in Section 7.1.2 results in a more uniform distribution of + the generated transient numeric identifiers. However, for transient + numeric identifiers where an implementation typically keeps local + state about unsuitable/used identifiers, the algorithm in + Section 7.1.2 may require many more iterations than the algorithm in + Section 7.1.1 to generate a suitable transient numeric identifier. + This will usually be affected by the current usage ratio of transient + numeric identifiers (i.e., the number of numeric identifiers + considered suitable / total number of numeric identifiers) and other + parameters. Therefore, in such cases, many implementations tend to + prefer the algorithm in Section 7.1.1 over the algorithm in + Section 7.1.2. + +7.1.1. Simple Randomization Algorithm + + /* Transient Numeric ID selection function */ + + id_range = max_id - min_id + 1; + next_id = min_id + (random() % id_range); + retry = id_range; + + do { + if (suitable_id(next_id)) { + return next_id; + } + + if (next_id == max_id) { + next_id = min_id; + } else { + next_id++; + } + + retry--; + + } while (retry > 0); + + return ERROR; + + NOTE: + + random() is a PRNG that returns a pseudorandom unsigned integer + number of appropriate size. Beware that "adapting" the length of + the output of random() with a modulo operator (e.g., C language's + "%") may change the distribution of the PRNG. To preserve a + uniform distribution, the rejection sampling technique + [Romailler2020] can be used. + + suitable_id() is a function that checks, if possible and + desirable, whether a candidate numeric identifier is suitable + (e.g., whether it is in use or has been recently employed). + Depending on how/where the numeric identifier is used, it may or + may not be possible (or even desirable) to check whether the + numeric identifier is suitable. + + All the variables (in this algorithm and all the others algorithms + discussed in this document) are unsigned integers. + + When an identifier is found to be unsuitable, this algorithm selects + the next available numeric identifier in sequence. Thus, even when + this algorithm selects numeric identifiers randomly, it is biased + towards the first available numeric identifier after a sequence of + unavailable numeric identifiers. For example, if this algorithm is + employed for transport-protocol ephemeral port randomization + [RFC6056] and the local list of unsuitable port numbers (e.g., + registered port numbers that should not be used for ephemeral ports) + is significant, an attacker may actually have a significantly better + chance of guessing an ephemeral port number. + + Assuming the randomness requirements for the PRNG are met (see + [RFC4086]), this algorithm does not suffer from any of the issues + discussed in Section 8. + +7.1.2. Another Simple Randomization Algorithm + + The following pseudocode illustrates another algorithm for selecting + a random transient numeric identifier where, in the event a selected + identifier is found to be unsuitable (e.g., already in use), another + identifier is randomly selected: + + /* Transient Numeric ID selection function */ + + id_range = max_id - min_id + 1; + retry = id_range; + + do { + next_id = min_id + (random() % id_range); + + if (suitable_id(next_id)) { + return next_id; + } + + retry--; + + } while (retry > 0); + + return ERROR; + + NOTE: + + random() is a PRNG that returns a pseudorandom unsigned integer + number of appropriate size. Beware that "adapting" the length of + the output of random() with a modulo operator (e.g., C language's + "%") may change the distribution of the PRNG. To preserve a + uniform distribution, the rejection sampling technique + [Romailler2020] can be used. + + suitable_id() is a function that checks, if possible and + desirable, whether a candidate numeric identifier is suitable + (e.g., if it is not already in use). Depending on how/where the + numeric identifier is used, it may or may not be possible (or even + desirable) to check whether the numeric identifier is in use (or + whether it has been recently employed). + + When an identifier is found to be unsuitable, this algorithm selects + another random numeric identifier. Thus, this algorithm might be + unable to select a transient numeric identifier (i.e., return + "ERROR"), even if there are suitable identifiers available, in cases + where a large number of identifiers are found to be unsuitable (e.g., + "in use"). + + Assuming the randomness requirements for the PRNG are met (see + [RFC4086]), this algorithm does not suffer from any of the issues + discussed in Section 8. + +7.2. Category #2: Uniqueness (Hard Failure) + + One of the most trivial approaches for generating a unique transient + numeric identifier (with a hard failure severity) is to reduce the + identifier reuse frequency by generating the numeric identifiers with + a monotonically increasing function (e.g., linear). As a result, any + of the algorithms described in Section 7.4 ("Category #4: Uniqueness, + Monotonically Increasing within Context (Hard Failure)") can be + readily employed for complying with the requirements of this + transient numeric identifier category. + + In cases where suitability (e.g., uniqueness) of the selected + identifiers can be definitely assessed by the local system, any of + the algorithms described in Section 7.1 ("Category #1: Uniqueness + (Soft Failure)") can be readily employed for complying with the + requirements of this numeric identifier category. + + | NOTE: In the case of, e.g., TCP ephemeral ports or TCP ISNs, a + | transient numeric identifier that might seem suitable from the + | perspective of the local system might actually be unsuitable + | from the perspective of the remote system (e.g., because there + | is state associated with the selected identifier at the remote + | system). Therefore, in such cases, it is not possible to + | employ the algorithms from Section 7.1 ("Category #1: + | Uniqueness (Soft Failure)"). + +7.3. Category #3: Uniqueness, Stable within Context (Soft Failure) + + The goal of the following algorithm is to produce identifiers that + are stable for a given context (identified by "CONTEXT") but that + change when the aforementioned context changes. + + In order to avoid storing the transient numeric identifiers computed + for each CONTEXT in memory, the following algorithm employs a + calculated technique (as opposed to keeping state in memory) to + generate a stable transient numeric identifier for each given + context. + + /* Transient Numeric ID selection function */ + + id_range = max_id - min_id + 1; + + retry = 0; + + do { + offset = F(CONTEXT, retry, secret_key); + next_id = min_id + (offset % id_range); + + if (suitable_id(next_id)) { + return next_id; + } + + retry++; + + } while (retry <= MAX_RETRIES); + + return ERROR; + + NOTE: + + CONTEXT is the concatenation of all the elements that define a + given context. + + F() is a pseudorandom function (PRF). It must not be computable + from the outside (without knowledge of the secret key). F() must + also be difficult to reverse, such that it resists attempts to + obtain the secret key, even when given samples of the output of + F() and knowledge or control of the other input parameters. F() + should produce an output of at least as many bits as required for + the transient numeric identifier. SipHash-2-4 (128-bit key, + 64-bit output) [SipHash] and BLAKE3 (256-bit key, arbitrary-length + output) [BLAKE3] are two possible options for F(). Alternatively, + F() could be implemented with a keyed hash message authentication + code (HMAC) [RFC2104]. HMAC-SHA-256 [FIPS-SHS] would be one + possible option for such implementation alternative. Note: Use of + HMAC-MD5 [RFC1321] or HMAC-SHA1 [FIPS-SHS] are not recommended for + F() [RFC6151] [RFC6194]. The result of F() is no more secure than + the secret key, and therefore, "secret_key" must be unknown to the + attacker and must be of a reasonable length. "secret_key" must + remain stable for a given CONTEXT, since otherwise, the numeric + identifiers generated by this algorithm would not have the desired + stability properties (i.e., stable for a given CONTEXT). In most + cases, "secret_key" should be selected with a PRNG (see [RFC4086] + for recommendations on choosing secrets) at an appropriate time + and stored in stable or volatile storage (as necessary) for future + use. + + suitable_id() checks whether a candidate numeric identifier has + suitable uniqueness properties. + + In this algorithm, the function F() provides a stateless and stable + per-CONTEXT offset, where CONTEXT is the concatenation of all the + elements that define the given context. + + For example, if this algorithm is expected to produce IPv6 IIDs that + are unique per network interface and Stateless Address + Autoconfiguration (SLAAC) prefix, CONTEXT should be the concatenation + of, e.g., the network interface index and the SLAAC autoconfiguration + prefix (please see [RFC7217] for an implementation of this algorithm + for generation of stable IPv6 addresses). + + The result of F() is stored in the variable "offset", which may take + any value within the storage type range, since we are restricting the + resulting identifier to be in the range [min_id, max_id] in a similar + way as in the algorithm described in Section 7.1.1. + + As noted above, suitable_id() checks whether a candidate numeric + identifier has suitable uniqueness properties. Collisions (i.e., an + identifier that is not unique) are recovered by incrementing the + "retry" variable and recomputing F(), up to a maximum of MAX_RETRIES + times. However, recovering from collisions will usually result in + identifiers that fail to remain constant for the specified context. + This is normally acceptable when the probability of collisions is + small, as in the case of, e.g., IPv6 IIDs resulting from SLAAC + [RFC7217] [RFC8981]. + + For obvious reasons, the transient numeric identifiers generated with + this algorithm allow for network activity correlation and + fingerprinting within "CONTEXT". However, this is essentially a + design goal of this category of transient numeric identifiers. + +7.4. Category #4: Uniqueness, Monotonically Increasing within Context + (Hard Failure) + +7.4.1. Per-Context Counter Algorithm + + One possible way of selecting unique monotonically increasing + identifiers (per context) is to employ a per-context counter. Such + an algorithm could be described as follows: + + /* Transient Numeric ID selection function */ + + id_range = max_id - min_id + 1; + retry = id_range; + id_inc = increment() % id_range; + + if( (next_id = lookup_counter(CONTEXT)) == ERROR){ + next_id = min_id + random() % id_range; + } + + do { + if ( (max_id - next_id) >= id_inc){ + next_id = next_id + id_inc; + } + else { + next_id = min_id + id_inc - (max_id - next_id); + } + + if (suitable_id(next_id)){ + store_counter(CONTEXT, next_id); + return next_id; + } + + retry = retry - id_inc; + + } while (retry > 0); + + return ERROR; + + NOTE: + + CONTEXT is the concatenation of all the elements that define a + given context. + + increment() returns a small integer that is employed to increment + the current counter value to obtain the next transient numeric + identifier. This value must be larger than or equal to 1, and + much smaller than the number of possible values for the numeric + identifiers (i.e., "id_range"). Most implementations of this + algorithm employ a constant increment of 1. Using a value other + than 1 can help mitigate some information leakages (please see + below) at the expense of a possible increase in the numeric + identifier reuse frequency. The code above makes sure that the + increment employed in the algorithm (id_inc) is always smaller + than the number of possible values for the numeric identifiers + (i.e., "max_id - min_d + 1"). However, as noted above, this value + must also be much smaller than the number of possible values for + the numeric identifiers. + + lookup_counter() is a function that returns the current counter + for a given context or an error condition if that counter does not + exist. + + random() is a PRNG that returns a pseudorandom unsigned integer + number of appropriate size. Beware that "adapting" the length of + the output of random() with a modulo operator (e.g., C language's + "%") may change the distribution of the PRNG. To preserve a + uniform distribution, the rejection sampling technique + [Romailler2020] can be used. + + store_counter() is a function that saves a counter value for a + given context. + + suitable_id() checks whether a candidate numeric identifier has + suitable uniqueness properties. + + Essentially, whenever a new identifier is to be selected, the + algorithm checks whether a counter for the corresponding context + exists. If it does, the value of such counter is incremented to + obtain the new transient numeric identifier, and the counter is + updated. If no counter exists for such context, a new counter is + created and initialized to a random value and used as the selected + transient numeric identifier. This algorithm produces a per-context + counter, which results in one monotonically increasing function for + each context. Since each counter is initialized to a random value, + the resulting values are unpredictable by an off-path attacker. + + The choice of id_inc has implications on both the security and + privacy properties of the resulting identifiers and also on the + corresponding interoperability properties. On one hand, minimizing + the increments generally minimizes the identifier reuse frequency, + albeit at increased predictability. On the other hand, if the + increments are randomized, predictability of the resulting + identifiers is reduced, and the information leakage produced by + global constant increments is mitigated. However, using larger + increments than necessary can result in higher numeric identifier + reuse frequency. + + This algorithm has the following drawbacks: + + * It requires an implementation to store each per-context counter in + memory. If, as a result of resource management, the counter for a + given context must be removed, the last transient numeric + identifier value used for that context will be lost. Thus, if an + identifier subsequently needs to be generated for the same + context, the corresponding counter will need to be recreated and + reinitialized to a random value, thus possibly leading to reuse/ + collision of numeric identifiers. + + * Keeping one counter for each possible "context" may in some cases + be considered too onerous in terms of memory requirements. + + Otherwise, the identifiers produced by this algorithm do not suffer + from the other issues discussed in Section 8. + +7.4.2. Simple PRF-Based Algorithm + + The goal of this algorithm is to produce monotonically increasing + transient numeric identifiers (for each given context) with a + randomized initial value. For example, if the identifiers being + generated must be monotonically increasing for each {Source Address, + Destination Address} set, then each possible combination of {Source + Address, Destination Address} should have a separate monotonically + increasing sequence that starts at a different random value. + + Instead of maintaining a per-context counter (as in the algorithm + from Section 7.4.1), the following algorithm employs a calculated + technique to maintain a random offset for each possible context. + + /* Initialization code */ + counter = 0; + + /* Transient Numeric ID selection function */ + + id_range = max_id - min_id + 1; + id_inc = increment() % id_range; + offset = F(CONTEXT, secret_key); + retry = id_range; + + do { + next_id = min_id + (offset + counter) % id_range; + counter = counter + id_inc; + + if (suitable_id(next_id)) { + return next_id; + } + + retry = retry - id_inc; + + } while (retry > 0); + + return ERROR; + + NOTE: + + CONTEXT is the concatenation of all the elements that define a + given context. For example, if this algorithm is expected to + produce identifiers that are monotonically increasing for each set + {Source Address, Destination Address}, CONTEXT should be the + concatenation of Source Address and Destination Address. + + increment() has the same properties and requirements as those + specified for increment() in Section 7.4.1. + + F() is a PRF, with the same properties as those specified for F() + in Section 7.3. + + suitable_id() checks whether a candidate numeric identifier has + suitable uniqueness properties. + + In the algorithm above, the function F() provides a stateless, + stable, and unpredictable offset for each given context (as + identified by "CONTEXT"). Both the "offset" and "counter" variables + may take any value within the storage type range since we are + restricting the resulting identifier to be in the range [min_id, + max_id] in a similar way as in the algorithm described in + Section 7.1.1. This allows us to simply increment the "counter" + variable and rely on the unsigned integer to wrap around. + + The result of F() is no more secure than the secret key, and + therefore, "secret_key" must be unknown to the attacker and must be + of a reasonable length. "secret_key" must remain stable for a given + CONTEXT, since otherwise, the numeric identifiers generated by this + algorithm would not have the desired properties (i.e., monotonically + increasing for a given CONTEXT). In most cases, "secret_key" should + be selected with a PRNG (see [RFC4086] for recommendations on + choosing secrets) at an appropriate time and stored in stable or + volatile storage (as necessary) for future use. + + It should be noted that, since this algorithm uses a global counter + ("counter") for selecting identifiers (i.e., all counters share the + same increment space), this algorithm results in an information + leakage (as described in Section 8.2). For example, if this + algorithm was used for selecting TCP ephemeral ports and an attacker + could force a client to periodically establish a new TCP connection + to an attacker-controlled system (or through an attacker-observable + routing path), the attacker could subtract consecutive Source Port + values to obtain the number of outgoing TCP connections established + globally by the victim host within that time period (up to wrap- + around issues and five-tuple collisions, of course). This + information leakage could be partially mitigated by employing small + random values for the increments (i.e., increment() function), + instead of having increment() return the constant "1". + + We nevertheless note that an improved mitigation of this information + leakage could be more successfully achieved by employing the + algorithm from Section 7.4.3, instead. + +7.4.3. Double-PRF Algorithm + + A trade-off between maintaining a single global "counter" variable + and maintaining 2**N "counter" variables (where N is the width of the + result of F()) could be achieved as follows. The system would keep + an array of TABLE_LENGTH values, which would provide a separation of + the increment space into multiple buckets. This improvement could be + incorporated into the algorithm from Section 7.4.2 as follows: + + /* Initialization code */ + + for(i = 0; i < TABLE_LENGTH; i++) { + table[i] = random(); + } + + /* Transient Numeric ID selection function */ + + id_range = max_id - min_id + 1; + id_inc = increment() % id_range; + offset = F(CONTEXT, secret_key1); + index = G(CONTEXT, secret_key2) % TABLE_LENGTH; + retry = id_range; + + do { + next_id = min_id + (offset + table[index]) % id_range; + table[index] = table[index] + id_inc; + + if (suitable_id(next_id)) { + return next_id; + } + + retry = retry - id_inc; + + } while (retry > 0); + + return ERROR; + + NOTE: + + increment() has the same properties and requirements as those + specified for increment() in Section 7.4.1. + + Both F() and G() are PRFs, with the same properties as those + required for F() in Section 7.3. The results of F() and G() are + no more secure than their respective secret keys ("secret_key1" + and "secret_key2", respectively), and therefore, both secret keys + must be unknown to the attacker and must be of a reasonable + length. Both secret keys must remain stable for the given + CONTEXT, since otherwise, the transient numeric identifiers + generated by this algorithm would not have the desired properties + (i.e., monotonically increasing for a given CONTEXT). In most + cases, both secret keys should be selected with a PRNG (see + [RFC4086] for recommendations on choosing secrets) at an + appropriate time and stored in stable or volatile storage (as + necessary) for future use. + + "table[]" could be initialized with random values, as indicated by + the initialization code in the pseudocode above. + + The "table[]" array assures that successive transient numeric + identifiers for a given context will be monotonically increasing. + Since the increment space is separated into TABLE_LENGTH different + spaces, the identifier reuse frequency will be (probabilistically) + lower than that of the algorithm in Section 7.4.2. That is, the + generation of an identifier for one given context will not + necessarily result in increments in the identifier sequence of other + contexts. It is interesting to note that the size of "table[]" does + not limit the number of different identifier sequences but rather + separates the *increment space* into TABLE_LENGTH different spaces. + The selected transient numeric identifier sequence will be obtained + by adding the corresponding entry from "table[]" to the value in the + "offset" variable, which selects the actual identifier sequence space + (as in the algorithm from Section 7.4.2). + + An attacker can perform traffic analysis for any "increment space" + (i.e., context) into which the attacker has "visibility" -- namely, + the attacker can force a system to generate identifiers for + G(CONTEXT, secret_key2), where the result of G() identifies the + target "increment space". However, the attacker's ability to perform + traffic analysis is very reduced when compared to the simple PRF- + based identifiers (described in Section 7.4.2) and the predictable + linear identifiers (described in Appendix A.1). Additionally, an + implementation can further limit the attacker's ability to perform + traffic analysis by further separating the increment space (that is, + using a larger value for TABLE_LENGTH) and/or by randomizing the + increments (i.e., increment() returning a small random number as + opposed to the constant "1"). + + Otherwise, this algorithm does not suffer from the issues discussed + in Section 8. + +8. Common Vulnerabilities Associated with Transient Numeric Identifiers + +8.1. Network Activity Correlation + + An identifier that is predictable within a given context allows for + network activity correlation within that context. + + For example, a stable IPv6 Interface Identifier allows for network + activity to be correlated within the context in which the Interface + Identifier is stable [RFC7721]. A stable per-network IPv6 Interface + Identifier (as in [RFC7217]) allows for network activity correlation + within a network, whereas a constant IPv6 Interface Identifier (which + remains constant across networks) allows not only network activity + correlation within the same network but also across networks ("host- + tracking"). + + Similarly, an implementation that generates TCP ISNs with a global + counter could allow for fingerprinting and network activity + correlation across networks, since an attacker could passively infer + the identity of the victim based on the TCP ISNs employed for + subsequent communication instances. Similarly, an implementation + that generates predictable IPv6 Identification values could be + subject to fingerprinting attacks (see, e.g., [Bellovin2002]). + +8.2. Information Leakage + + Transient numeric identifiers that result in specific patterns can + produce an information leakage to other communicating entities. For + example, it is common to generate transient numeric identifiers with + an algorithm such as: + + ID = offset(CONTEXT) + mono(CONTEXT); + + This generic expression generates identifiers by adding a + monotonically increasing function (e.g., linear) to a randomized + offset. offset() is constant within a given context, whereas mono() + produces a monotonically increasing sequence for the given context. + Identifiers generated with this expression will generally be + predictable within CONTEXT. + + + The predictability of mono(), irrespective of the predictability of + offset(), can leak information that may be of use to attackers. For + example, a node that selects transport-protocol ephemeral port + numbers, as in: + + ephemeral_port = offset(IP_Dst_Addr) + mono() + + that is, with a per-destination offset but a global mono() function + (e.g., a global counter), will leak information about the total + number of outgoing connections that have been issued by the + vulnerable implementation. + + + Similarly, a node that generates IPv6 Identification values as in: + + ID = offset(IP_Src_Addr, IP_Dst_Addr) + mono() + + will leak out information about the total number of fragmented + packets that have been transmitted by the vulnerable implementation. + The vulnerabilities described in [Sanfilippo1998a], + [Sanfilippo1998b], and [Sanfilippo1999] are all associated with the + use of a global mono() function (i.e., with a global and constant + "CONTEXT") -- particularly when it is a linear function (constant + increments of 1). + + Predicting transient numeric identifiers can be of help for other + types of attacks. For example, predictable TCP ISNs can open the + door to trivial connection-reset and data injection attacks (see + Section 8.6). + +8.3. Fingerprinting + + Fingerprinting is the capability of an attacker to identify or + reidentify a visiting user, user agent, or device via configuration + settings or other observable characteristics. Observable protocol + objects and characteristics can be employed to identify/reidentify + various entities. These entities can range from the underlying + hardware or operating system (OS) (vendor, type, and version) to the + user. [EFF] illustrates web-browser-based fingerprinting, but + similar techniques can be applied at other layers and protocols, + whether alternatively or in conjunction with it. + + Transient numeric identifiers are one of the observable protocol + components that could be leveraged for fingerprinting purposes. That + is, an attacker could sample transient numeric identifiers to infer + the algorithm (and its associated parameters, if any) for generating + such identifiers, possibly revealing the underlying OS vendor, type, + and version. This information could possibly be further leveraged in + conjunction with other fingerprinting techniques and sources. + + Evasion of protocol-stack fingerprinting can prove to be a very + difficult task, i.e., most systems make use of a wide variety of + protocols, each of which have a large number of parameters that can + be set to arbitrary values or generated with a variety of algorithms + with multiple parameters. + + | NOTE: General protocol-based fingerprinting is discussed in + | [RFC6973], along with guidelines to mitigate the associated + | vulnerability. [Fyodor1998] and [Fyodor2006] are classic + | references on OS detection via TCP/IP stack fingerprinting. + | Network Mapper [nmap] is probably the most popular tool for + | remote OS identification via active TCP/IP stack + | fingerprinting. p0f [Zalewski2012], on the other hand, is a + | tool for performing remote OS detection via passive TCP/IP + | stack fingerprinting. Finally, [TBIT] is a TCP fingerprinting + | tool that aims at characterizing the behavior of a remote TCP + | peer based on active probes, which has been widely used in the + | research community. + + Algorithms that, from the perspective of an observer (e.g., the + legitimate communicating peer), result in specific values or patterns + will allow for at least some level of fingerprinting. For example, + the algorithm from Section 7.3 will typically allow fingerprinting + within the context where the resulting identifiers are stable. + Similarly, the algorithms from Section 7.4 will result in + monotonically increasing sequences within a given context, thus + allowing for at least some level of fingerprinting (when the other + communicating entity can correlate different sampled identifiers as + belonging to the same monotonically increasing sequence). + + Thus, where possible, algorithms from Section 7.1 should be preferred + over algorithms that result in specific values or patterns. + +8.4. Exploitation of the Semantics of Transient Numeric Identifiers + + Identifiers that are not semantically opaque tend to be more + predictable than semantically opaque identifiers. For example, a + Media Access Control (MAC) address contains an Organizationally + Unique Identifier (OUI), which may identify the vendor that + manufactured the corresponding network interface card. This can be + leveraged by an attacker trying to "guess" MAC addresses, who has + some knowledge about the possible Network Interface Card (NIC) + vendor. + + [RFC7707] discusses a number of techniques to reduce the search space + when performing IPv6 address-scanning attacks by leveraging the + semantics of IPv6 IIDs. + +8.5. Exploitation of Collisions of Transient Numeric Identifiers + + In many cases, the collision of transient network identifiers can + have a hard failure severity (or result in a hard failure severity if + an attacker can cause multiple collisions deterministically, one + after another). For example, predictable IP Identification values + open the door to Denial of Service (DoS) attacks (see, e.g., + [RFC5722].). + +8.6. Exploitation of Predictable Transient Numeric Identifiers for + Injection Attacks + + Some protocols rely on "sequence numbers" for the validation of + incoming packets. For example, TCP employs sequence numbers for + reassembling TCP segments, while IPv4 and IPv6 employ Identification + values for reassembling IPv4 and IPv6 fragments (respectively). + Lacking built-in cryptographic mechanisms for validating packets, + these protocols are therefore vulnerable to on-path data (see, e.g., + [Joncheray1995]) and/or control-information (see, e.g., [RFC4953] and + [RFC5927]) injection attacks. The extent to which these protocols + may resist off-path (i.e., "blind") injection attacks depends on + whether the associated "sequence numbers" are predictable and the + effort required to successfully predict a valid "sequence number" + (see, e.g., [RFC4953] and [RFC5927]). + + We note that the use of unpredictable "sequence numbers" is a + completely ineffective mitigation for on-path injection attacks and + also a mostly ineffective mitigation for off-path (i.e., "blind") + injection attacks. However, many legacy protocols (such as TCP) do + not incorporate cryptographic mitigations as part of the core + protocol but rather as optional features (see, e.g., [RFC5925]), if + available at all. Additionally, ad hoc use of cryptographic + mitigations might not be sufficient to relieve a protocol + implementation of generating appropriate transient numeric + identifiers. For example, use of the Transport Layer Security (TLS) + protocol [RFC8446] with TCP will protect the application protocol but + will not help to mitigate, e.g., TCP-based connection-reset attacks + (see, e.g., [RFC4953]). Similarly, use of SEcure Neighbor Discovery + (SEND) [RFC3971] will still imply reliance on the successful + reassembly of IPv6 fragments in those cases where SEND packets do not + fit into the link Maximum Transmission Unit (MTU) (see [RFC6980]). + +8.7. Cryptanalysis + + A number of algorithms discussed in this document (such as those + described in Sections 7.4.2 and 7.4.3) rely on PRFs. Implementations + that employ weak PRFs or keys of inappropriate size can be subject to + cryptanalysis, where an attacker can obtain the secret key employed + for the PRF, predict numeric identifiers, etc. + + Furthermore, an implementation that overloads the semantics of the + secret key can result in more trivial cryptanalysis, possibly + resulting in the leakage of the value employed for the secret key. + + | NOTE: [IPID-DEV] describes two vulnerable transient numeric + | identifier generators that employ cryptographically weak hash + | functions. Additionally, one of such implementations employs + | 32 bits of a kernel address as the secret key for a hash + | function, and therefore, successful cryptanalysis leaks the + | aforementioned kernel address, allowing for Kernel Address + | Space Layout Randomization (KASLR) [KASLR] bypass. + +9. Vulnerability Assessment of Transient Numeric Identifiers + + The following subsections analyze possible vulnerabilities associated + with the algorithms described in Section 7. + +9.1. Category #1: Uniqueness (Soft Failure) + + Possible vulnerabilities associated with the algorithms from + Section 7.1 include the following: + + * use of flawed PRNGs (please see, e.g., [Zalewski2001], + [Zalewski2002], [Klein2007], and [CVEs]) + + * inadvertently affecting the distribution of an otherwise suitable + PRNG (please see, e.g., [Romailler2020]) + + Where available, CSPRNGs should be preferred over regular PRNGs, such + as, e.g., POSIX random(3) implementations. In scenarios where a + CSPRNG is not readily available, a security and privacy assessment of + employing a regular PRNG should be performed, supporting the + implementation decision. + + | NOTE: Please see [RFC4086] regarding randomness requirements + | for security. [Aumasson2018], [Press1992], and [Knuth1983] + | discuss theoretical and practical aspects of random number + | generation and provide guidance on how to evaluate PRNGs. + + When employing a PRNG, many implementations "adapt" the length of its + output with a modulo operator (e.g., C language's "%"), possibly + changing the distribution of the output of the PRNG. + + For example, consider an implementation that employs the following + code: + + id = random() % 50000; + + This example implementation means to obtain a transient numeric + identifier in the range 0-49999. If random() produces, e.g., a + pseudorandom number of 16 bits (with uniform distribution), the + selected transient numeric identifier will have a nonuniform + distribution with the numbers in the range 0-15535 having double + frequency than the numbers in the range 15536-49999. + + + | NOTE: For example, in our sample code, both an output of 10 and + | output of 50010 from the random() function will result in an + | "id" value of 10. + + This effect is reduced if the PRNG produces an output that is much + longer than the length implied by the modulo operation. We note that + to preserve a uniform distribution, the rejection sampling technique + [Romailler2020] can be used. + + Use of algorithms other than PRNGs for generating identifiers of this + category is discouraged. + +9.2. Category #2: Uniqueness (Hard Failure) + + As noted in Section 7.2, this category can employ the same algorithms + as Category #4, since a monotonically increasing sequence tends to + minimize the transient numeric identifier reuse frequency. + Therefore, the vulnerability analysis in Section 9.4 also applies to + this category. + + Additionally, as noted in Section 7.2, some transient numeric + identifiers of this category might be able to use the algorithms from + Section 7.1, in which case the same considerations as in Section 9.1 + would apply. + +9.3. Category #3: Uniqueness, Stable within Context (Soft Failure) + + Possible vulnerabilities associated with the algorithms from + Section 7.3 are the following: + + * Use of weak PRFs or inappropriate secret keys (whether + inappropriate selection or inappropriate size) could allow for + cryptanalysis, which could eventually be exploited by an attacker + to predict future transient numeric identifiers. + + * Since the algorithm generates a unique and stable identifier + within a specified context, it may allow for network activity + correlation and fingerprinting within the specified context. + +9.4. Category #4: Uniqueness, Monotonically Increasing within Context + (Hard Failure) + + The algorithm described in Section 7.4.1 for generating identifiers + of Category #4 will result in an identifiable pattern (i.e., a + monotonically increasing sequence) for the transient numeric + identifiers generated for each CONTEXT, and thus will allow for + fingerprinting and network activity correlation within each CONTEXT. + + On the other hand, a simple way to generalize and analyze the + algorithms described in Sections 7.4.2 and 7.4.3 for generating + identifiers of Category #4 is as follows: + + /* Transient Numeric ID selection function */ + + id_range = max_id - min_id + 1; + retry = id_range; + id_inc = increment() % id_range; + + do { + update_mono(CONTEXT, id_inc); + next_id = min_id + (offset(CONTEXT) + \ + mono(CONTEXT)) % id_range; + + if (suitable_id(next_id)) { + return next_id; + } + + retry = retry - id_inc; + + } while (retry > 0); + + return ERROR; + + NOTE: + + increment() returns a small integer that is employed to generate a + monotonically increasing function. Most implementations employ a + constant value for "increment()" (usually 1). The value returned + by increment() must be much smaller than the value computed for + "id_range". + + update_mono(CONTEXT, id_inc) increments the counter corresponding + to CONTEXT by "id_inc". + + mono(CONTEXT) reads the counter corresponding to CONTEXT. + + Essentially, an identifier (next_id) is generated by adding a + monotonically increasing function (mono()) to an offset value, which + is unknown to the attacker and stable for given context (CONTEXT). + + The following aspects of the algorithm should be considered: + + * For the most part, it is the offset() function that results in + identifiers that are unpredictable by an off-patch attacker. + While the resulting sequence is known to be monotonically + increasing, the use of a randomized offset value makes the + resulting values unknown to the attacker. + + * The most straightforward "stateless" implementation of offset() is + with a PRF that takes the values that identify the context and a + secret key (not shown in the figure above) as arguments. + + * One possible implementation of mono() would be to have mono() + internally employ a single counter (as in the algorithm from + Section 7.4.2) or map the increments for different contexts into a + number of counters/buckets, such that the number of counters that + need to be maintained in memory is reduced (as in the "Double-PRF + Algorithm" from Section 7.4.3). + + * In all cases, a monotonically increasing function is implemented + by incrementing the previous value of a counter by increment() + units. In the most trivial case, increment() could return the + constant "1". But increment() could also be implemented to return + small random integers such that the increments are unpredictable + (see Appendix A.2 of this document). This represents a trade-off + between the unpredictability of the resulting transient numeric + identifiers and the transient numeric identifier reuse frequency. + + Considering the generic algorithm illustrated above, we can identify + the following possible vulnerabilities: + + * Since the algorithms for this category are similar to those of + Section 9.3, with the addition of a monotonically increasing + function, all the issues discussed in Section 9.3 ("Category #3: + Uniqueness, Stable within Context (Soft Failure)") also apply to + this case. + + * mono() can be correlated to the number of identifiers generated + for a given context (CONTEXT). Thus, if mono() spans more than + the necessary context, the "increments" could be leaked to other + parties, thus disclosing information about the number of + identifiers that have been generated by the algorithm for all + contexts. This information disclosure becomes more evident when + an implementation employs a constant increment of 1. For example, + an implementation where mono() is actually a single global counter + will unnecessarily leak information about the number of + identifiers that have been generated by the algorithm (globally, + for all contexts). [Fyodor2003] describes one example of how such + information leakages can be exploited. We note that limiting the + span of the increment space will require a larger number of + counters to be stored in memory (i.e., a larger value for the + TABLE_LENGTH parameter of the algorithm in Section 7.4.3). + + * Transient numeric identifiers generated with the algorithms + described in Sections 7.4.2 and 7.4.3 will normally allow for + fingerprinting within CONTEXT since, for such context, the + resulting identifiers will have an identifiable pattern (i.e., a + monotonically increasing sequence). + +10. IANA Considerations + + This document has no IANA actions. + +11. Security Considerations + + This entire document is about the security and privacy implications + of transient numeric identifiers. [RFC9416] recommends that protocol + specifications specify the interoperability requirements of their + transient numeric identifiers, perform a vulnerability assessment of + their transient numeric identifiers, and recommend an algorithm for + generating each of their transient numeric identifiers. This + document analyzes possible algorithms (and their implications) that + could be employed to comply with the interoperability requirements of + the most common categories of transient numeric identifiers while + minimizing the associated negative security and privacy implications. + +12. References + +12.1. Normative References + + [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, + DOI 10.17487/RFC0791, September 1981, + <https://www.rfc-editor.org/info/rfc791>. + + [RFC0793] Postel, J., "Transmission Control Protocol", RFC 793, + DOI 10.17487/RFC0793, September 1981, + <https://www.rfc-editor.org/info/rfc793>. + + [RFC1035] Mockapetris, P., "Domain names - implementation and + specification", STD 13, RFC 1035, DOI 10.17487/RFC1035, + November 1987, <https://www.rfc-editor.org/info/rfc1035>. + + [RFC1321] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, + DOI 10.17487/RFC1321, April 1992, + <https://www.rfc-editor.org/info/rfc1321>. + + [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 + (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, + December 1998, <https://www.rfc-editor.org/info/rfc2460>. + + [RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker, + "Randomness Requirements for Security", BCP 106, RFC 4086, + DOI 10.17487/RFC4086, June 2005, + <https://www.rfc-editor.org/info/rfc4086>. + + [RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing + Architecture", RFC 4291, DOI 10.17487/RFC4291, February + 2006, <https://www.rfc-editor.org/info/rfc4291>. + + [RFC4862] Thomson, S., Narten, T., and T. Jinmei, "IPv6 Stateless + Address Autoconfiguration", RFC 4862, + DOI 10.17487/RFC4862, September 2007, + <https://www.rfc-editor.org/info/rfc4862>. + + [RFC5722] Krishnan, S., "Handling of Overlapping IPv6 Fragments", + RFC 5722, DOI 10.17487/RFC5722, December 2009, + <https://www.rfc-editor.org/info/rfc5722>. + + [RFC5905] Mills, D., Martin, J., Ed., Burbank, J., and W. Kasch, + "Network Time Protocol Version 4: Protocol and Algorithms + Specification", RFC 5905, DOI 10.17487/RFC5905, June 2010, + <https://www.rfc-editor.org/info/rfc5905>. + + [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP + Authentication Option", RFC 5925, DOI 10.17487/RFC5925, + June 2010, <https://www.rfc-editor.org/info/rfc5925>. + + [RFC6056] Larsen, M. and F. Gont, "Recommendations for Transport- + Protocol Port Randomization", BCP 156, RFC 6056, + DOI 10.17487/RFC6056, January 2011, + <https://www.rfc-editor.org/info/rfc6056>. + + [RFC6151] Turner, S. and L. Chen, "Updated Security Considerations + for the MD5 Message-Digest and the HMAC-MD5 Algorithms", + RFC 6151, DOI 10.17487/RFC6151, March 2011, + <https://www.rfc-editor.org/info/rfc6151>. + + [RFC6191] Gont, F., "Reducing the TIME-WAIT State Using TCP + Timestamps", BCP 159, RFC 6191, DOI 10.17487/RFC6191, + April 2011, <https://www.rfc-editor.org/info/rfc6191>. + + [RFC6437] Amante, S., Carpenter, B., Jiang, S., and J. Rajahalme, + "IPv6 Flow Label Specification", RFC 6437, + DOI 10.17487/RFC6437, November 2011, + <https://www.rfc-editor.org/info/rfc6437>. + + [RFC6528] Gont, F. and S. Bellovin, "Defending against Sequence + Number Attacks", RFC 6528, DOI 10.17487/RFC6528, February + 2012, <https://www.rfc-editor.org/info/rfc6528>. + + [RFC7217] Gont, F., "A Method for Generating Semantically Opaque + Interface Identifiers with IPv6 Stateless Address + Autoconfiguration (SLAAC)", RFC 7217, + DOI 10.17487/RFC7217, April 2014, + <https://www.rfc-editor.org/info/rfc7217>. + + [RFC7323] Borman, D., Braden, B., Jacobson, V., and R. + Scheffenegger, Ed., "TCP Extensions for High Performance", + RFC 7323, DOI 10.17487/RFC7323, September 2014, + <https://www.rfc-editor.org/info/rfc7323>. + + [RFC8064] Gont, F., Cooper, A., Thaler, D., and W. Liu, + "Recommendation on Stable IPv6 Interface Identifiers", + RFC 8064, DOI 10.17487/RFC8064, February 2017, + <https://www.rfc-editor.org/info/rfc8064>. + + [RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 + (IPv6) Specification", STD 86, RFC 8200, + DOI 10.17487/RFC8200, July 2017, + <https://www.rfc-editor.org/info/rfc8200>. + + [RFC8981] Gont, F., Krishnan, S., Narten, T., and R. Draves, + "Temporary Address Extensions for Stateless Address + Autoconfiguration in IPv6", RFC 8981, + DOI 10.17487/RFC8981, February 2021, + <https://www.rfc-editor.org/info/rfc8981>. + + [RFC9293] Eddy, W., Ed., "Transmission Control Protocol (TCP)", + STD 7, RFC 9293, DOI 10.17487/RFC9293, August 2022, + <https://www.rfc-editor.org/info/rfc9293>. + +12.2. Informative References + + [ARC4RANDOM] + OpenBSD, "arc4random(3)", Library Functions Manual, + September 2019, <https://man.openbsd.org/arc4random>. + + [Aumasson2018] + Aumasson, J-P., "Serious Cryptography: A Practical + Introduction to Modern Encryption", No Starch Press, Inc., + ISBN-10 1-59327-826-8, November 2017. + + [Bellovin1989] + Bellovin, S., "Security Problems in the TCP/IP Protocol + Suite", Computer Communications Review, Vol. 19, No. 2, + pp. 32-48, April 1989, + <https://www.cs.columbia.edu/~smb/papers/ipext.pdf>. + + [Bellovin2002] + Bellovin, S., "A Technique for Counting NATted Hosts", + IMW'02, Marseille, France, ISBN 1-58113-603-X/02/0011, + November 2002, + <https://www.cs.columbia.edu/~smb/papers/fnat.pdf>. + + [BLAKE3] "BLAKE3: one function, fast everywhere", September 2022, + <https://blake3.io/>. + + [C11] ISO/IEC, "Information technology - Programming languages - + C", ISO/IEC 9899:2018, June 2018. + + [CPNI-TCP] Centre for the Protection of National Infrastructure + (CPNI), "Security Assessment of the Transmission Control + Protocol (TCP)", CPNI Technical Note 3/2009, February + 2009, <https://www.si6networks.com/files/publications/tn- + 03-09-security-assessment-TCP.pdf>. + + [CVEs] NVD, "Vulnerability Advisories for PRNGs", + <https://www.gont.com.ar/miscellanea/prng-cves/>. + + [EFF] EFF, "Cover your tracks: See how trackers view your + browser", <https://coveryourtracks.eff.org/>. + + [FIPS-SHS] NIST, "Secure Hash Standard (SHS)", FIPS PUB 180-4, + DOI 10.6028/NIST.FIPS.180-4, August 2015, + <https://nvlpubs.nist.gov/nistpubs/FIPS/ + NIST.FIPS.180-4.pdf>. + + [Fyodor1998] + Fyodor, "Remote OS detection via TCP/IP Stack + FingerPrinting", Phrack Magazine, Volume 8, Issue 54, + December 1998, + <http://www.phrack.org/archives/issues/54/9.txt>. + + [Fyodor2003] + Fyodor, "Idle Scanning and related IPID games", 2003, + <https://nmap.org/presentations/CanSecWest03/CD_Content/ + idlescan_paper/idlescan.html>. + + [Fyodor2006] + Lyon, G., "Chapter 8. Remote OS Detection", January 2009, + <https://nmap.org/book/osdetect.html>. + + [GETENTROPY] + Linux, "getentropy(3)", Linux Programmer's Manual, March + 2021, + <https://man7.org/linux/man-pages/man3/getentropy.3.html>. + + [IANA-PROT] + IANA, "Protocol Registries", + <https://www.iana.org/protocols>. + + [IPID-DEV] Klein, A. and B. Pinkas, "From IP ID to Device ID and + KASLR Bypass (Extended Version)", + DOI 10.48550/arXiv.1906.10478, October 2019, + <https://arxiv.org/pdf/1906.10478.pdf>. + + [Joncheray1995] + Joncheray, L., "Simple Active Attack Against TCP", + Proceedings of the Fifth USENIX UNIX Security Symposium, + June 1995, <https://www.usenix.org/legacy/publications/lib + rary/proceedings/security95/full_papers/joncheray.pdf>. + + [KASLR] PaX Team, "Address Space Layout Randomization", + <https://pax.grsecurity.net/docs/aslr.txt>. + + [Klein2007] + Klein, A., "OpenBSD DNS Cache Poisoning and Multiple O/S + Predictable IP ID Vulnerability", November 2007, + <https://dl.packetstormsecurity.net/papers/attack/OpenBSD_ + DNS_Cache_Poisoning_and_Multiple_OS_Predictable_IP_ID_Vuln + erability.pdf>. + + [Knuth1983] + Knuth, D., "The Art of Computer Programming", Volume 2 + (Seminumerical Algorithms), 2nd Ed., Reading, + Massachusetts, Addison-Wesley Publishing Company, January + 1981. + + [Morris1985] + Morris, R., "A Weakness in the 4.2BSD UNIX TCP/IP + Software", CSTR 117, AT&T Bell Laboratories, Murray Hill, + NJ, February 1985, + <https://pdos.csail.mit.edu/~rtm/papers/117.pdf>. + + [nmap] nmap, "Nmap: Free Security Scanner For Network Exploration + and Audit", 2020, <https://nmap.org/>. + + [POSIX] IEEE, "IEEE Standard for Information Technology -- + Portable Operating System Interface (POSIX(TM)) Base + Specifications, Issue 7", IEEE Std 1003.1-2017, + DOI 10.1109/IEEESTD.2018.8277153, January 2018, + <https://doi.org/10.1109/IEEESTD.2018.8277153>. + + [Press1992] + Press, W., Teukolsky, S., Vetterling, W., and B. Flannery, + "Numerical Recipes in C: The Art of Scientific Computing", + 2nd Ed., Cambridge University Press, ISBN 0-521-43108-5, + December 1992. + + [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- + Hashing for Message Authentication", RFC 2104, + DOI 10.17487/RFC2104, February 1997, + <https://www.rfc-editor.org/info/rfc2104>. + + [RFC3971] Arkko, J., Ed., Kempf, J., Zill, B., and P. Nikander, + "SEcure Neighbor Discovery (SEND)", RFC 3971, + DOI 10.17487/RFC3971, March 2005, + <https://www.rfc-editor.org/info/rfc3971>. + + [RFC4953] Touch, J., "Defending TCP Against Spoofing Attacks", + RFC 4953, DOI 10.17487/RFC4953, July 2007, + <https://www.rfc-editor.org/info/rfc4953>. + + [RFC4963] Heffner, J., Mathis, M., and B. Chandler, "IPv4 Reassembly + Errors at High Data Rates", RFC 4963, + DOI 10.17487/RFC4963, July 2007, + <https://www.rfc-editor.org/info/rfc4963>. + + [RFC5927] Gont, F., "ICMP Attacks against TCP", RFC 5927, + DOI 10.17487/RFC5927, July 2010, + <https://www.rfc-editor.org/info/rfc5927>. + + [RFC6194] Polk, T., Chen, L., Turner, S., and P. Hoffman, "Security + Considerations for the SHA-0 and SHA-1 Message-Digest + Algorithms", RFC 6194, DOI 10.17487/RFC6194, March 2011, + <https://www.rfc-editor.org/info/rfc6194>. + + [RFC6274] Gont, F., "Security Assessment of the Internet Protocol + Version 4", RFC 6274, DOI 10.17487/RFC6274, July 2011, + <https://www.rfc-editor.org/info/rfc6274>. + + [RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J., + Morris, J., Hansen, M., and R. Smith, "Privacy + Considerations for Internet Protocols", RFC 6973, + DOI 10.17487/RFC6973, July 2013, + <https://www.rfc-editor.org/info/rfc6973>. + + [RFC6980] Gont, F., "Security Implications of IPv6 Fragmentation + with IPv6 Neighbor Discovery", RFC 6980, + DOI 10.17487/RFC6980, August 2013, + <https://www.rfc-editor.org/info/rfc6980>. + + [RFC7098] Carpenter, B., Jiang, S., and W. Tarreau, "Using the IPv6 + Flow Label for Load Balancing in Server Farms", RFC 7098, + DOI 10.17487/RFC7098, January 2014, + <https://www.rfc-editor.org/info/rfc7098>. + + [RFC7258] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an + Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May + 2014, <https://www.rfc-editor.org/info/rfc7258>. + + [RFC7707] Gont, F. and T. Chown, "Network Reconnaissance in IPv6 + Networks", RFC 7707, DOI 10.17487/RFC7707, March 2016, + <https://www.rfc-editor.org/info/rfc7707>. + + [RFC7721] Cooper, A., Gont, F., and D. Thaler, "Security and Privacy + Considerations for IPv6 Address Generation Mechanisms", + RFC 7721, DOI 10.17487/RFC7721, March 2016, + <https://www.rfc-editor.org/info/rfc7721>. + + [RFC7739] Gont, F., "Security Implications of Predictable Fragment + Identification Values", RFC 7739, DOI 10.17487/RFC7739, + February 2016, <https://www.rfc-editor.org/info/rfc7739>. + + [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol + Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, + <https://www.rfc-editor.org/info/rfc8446>. + + [RFC8937] Cremers, C., Garratt, L., Smyshlyaev, S., Sullivan, N., + and C. Wood, "Randomness Improvements for Security + Protocols", RFC 8937, DOI 10.17487/RFC8937, October 2020, + <https://www.rfc-editor.org/info/rfc8937>. + + [RFC9414] Gont, F. and I. Arce, "Unfortunate History of Transient + Numeric Identifiers", RFC 9414, DOI 10.17487/RFC9414, July + 2023, <https://www.rfc-editor.org/info/rfc9414>. + + [RFC9416] Gont, F. and I. Arce, "Security Considerations for + Transient Numeric Identifiers Employed in Network + Protocols", BCP 72, RFC 9416, DOI 10.17487/RFC9416, July + 2023, <https://www.rfc-editor.org/info/rfc9416>. + + [Romailler2020] + Romailler, Y., "The Definitive Guide to "Modulo Bias and + How to Avoid It"!", Kudelski Security Research, July 2020, + <https://research.kudelskisecurity.com/2020/07/28/the- + definitive-guide-to-modulo-bias-and-how-to-avoid-it/>. + + [Sanfilippo1998a] + Sanfilippo, S., "about the ip header id", message to the + Bugtraq mailing list, December 1998, + <http://seclists.org/bugtraq/1998/Dec/48>. + + [Sanfilippo1998b] + Sanfilippo, S., "new tcp scan method", message to the + Bugtraq mailing list, 18 December 1998, + <https://seclists.org/bugtraq/1998/Dec/79>. + + [Sanfilippo1999] + Sanfilippo, S., "more ip id", message to the Bugtraq + mailing list, November 1999, + <https://github.com/antirez/hping/raw/master/docs/MORE- + FUN-WITH-IPID>. + + [Schuba1993] + Schuba, C., "Addressing Weakness in the Domain Name System + Protocol", August 1993, + <http://ftp.cerias.purdue.edu/pub/papers/christoph-schuba/ + schuba-DNS-msthesis.pdf>. + + [Shimomura1995] + Shimomura, T., "Technical details of the attack described + by Markoff in NYT", message to the USENET + comp.security.misc newsgroup, 25 January 1995, + <https://www.gont.com.ar/files/post-shimomura-usenet.txt>. + + [Silbersack2005] + Silbersack, M., "Improving TCP/IP security through + randomization without sacrificing interoperability", + EuroBSDCon 2005 Conference, + <https://www.silby.com/eurobsdcon05/ + eurobsdcon_silbersack.pdf>. + + [SipHash] "SipHash: a fast short-input PRF", February 2023, + <https://github.com/veorq/SipHash>. + + [TBIT] TBIT, "TBIT, the TCP Behavior Inference Tool", 2001, + <https://www.icir.org/tbit/>. + + [TCPT-uptime] + McDanel, B., "TCP Timestamping - Obtaining System Uptime + Remotely", message to the Bugtraq mailing list, March + 2001, <https://seclists.org/bugtraq/2001/Mar/182>. + + [Zalewski2001] + Zalewski, M., "Strange Attractors and TCP/IP Sequence + Number Analysis", April 2001, + <https://lcamtuf.coredump.cx/oldtcp/tcpseq.html>. + + [Zalewski2002] + Zalewski, M., "Strange Attractors and TCP/IP Sequence + Number Analysis - One Year Later (2002)", + <https://lcamtuf.coredump.cx/newtcp/>. + + [Zalewski2012] + Zalewski, M., "p0f v3 (3.09b)", + <https://lcamtuf.coredump.cx/p0f.shtml>. + +Appendix A. Algorithms and Techniques with Known Issues + + The following subsections discuss algorithms and techniques with + known negative security and privacy implications. + + | NOTE: As discussed in Section 1, the use of cryptographic + | techniques might allow for the safe use of some of these + | algorithms and techniques. However, this should be evaluated + | on a case-by-case basis. + +A.1. Predictable Linear Identifiers Algorithm + + One of the most trivial ways to achieve uniqueness with a low + identifier reuse frequency is to produce a linear sequence. This + type of algorithm has been employed in the past to generate + identifiers of Categories #1, #2, and #4 (please see Section 6 for an + analysis of these categories). + + For example, the following algorithm has been employed (see, e.g., + [Morris1985], [Shimomura1995], [Silbersack2005], and [CPNI-TCP]) in a + number of operating systems for selecting IP IDs, TCP ephemeral port + numbers, etc.: + + /* Initialization code */ + + next_id = min_id; + id_inc= 1; + + + /* Transient Numeric ID selection function */ + + id_range = max_id - min_id + 1; + retry = id_range; + + do { + if (next_id == max_id) { + next_id = min_id; + } + else { + next_id = next_id + id_inc; + } + + if (suitable_id(next_id)) { + return next_id; + } + + retry--; + + } while (retry > 0); + + return ERROR; + + NOTE: + + suitable_id() checks whether a candidate numeric identifier is + suitable (e.g., whether it is unique or not). + + For obvious reasons, this algorithm results in predictable sequences. + Since a global counter is used to generate the transient numeric + identifiers ("next_id" in the example above), an entity that learns + one numeric identifier can infer past numeric identifiers and predict + future values to be generated by the same algorithm. Since the value + employed for the increments is known (such as "1" in this case), an + attacker can sample two values and learn the number of identifiers + that were generated in between the two sampled values. Furthermore, + if the counter is initialized, to some known value (e.g., when the + system is bootstrapped), the algorithm will leak additional + information, such as the number of transmitted fragmented datagrams + in the case of an IP ID generator [Sanfilippo1998a] or the system + uptime in the case of TCP timestamps [TCPT-uptime]. + +A.2. Random-Increments Algorithm + + This algorithm offers a middle ground between the algorithms that + generate randomized transient numeric identifiers (such as those + described in Sections 7.1.1 and 7.1.2) and those that generate + identifiers with a predictable monotonically increasing function (see + Appendix A.1). + + /* Initialization code */ + + next_id = random(); /* Initialization value */ + id_rinc = 500; /* Determines the trade-off */ + + + /* Transient Numeric ID selection function */ + + id_range = max_id - min_id + 1; + retry = id_range; + + + do { + /* Random increment */ + id_inc = (random() % id_rinc) + 1; + + if ( (max_id - next_id) >= id_inc){ + next_id = next_id + id_inc; + } + else { + next_id = min_id + id_inc - (max_id - next_id); + } + + if (suitable_id(next_id)) { + return next_id; + } + + retry = retry - id_inc; + + } while (retry > 0); + + return ERROR; + + NOTE: + + random() is a PRNG that returns a pseudorandom unsigned integer + number of appropriate size. Beware that "adapting" the length of + the output of random() with a modulo operator (e.g., C language's + "%") may change the distribution of the PRNG. To preserve a + uniform distribution, the rejection sampling technique + [Romailler2020] can be used. + + suitable_id() is a function that checks whether a candidate + identifier is suitable (e.g., whether it is unique or not). + + This algorithm aims at producing a global monotonically increasing + sequence of transient numeric identifiers while avoiding the use of + fixed increments, which would lead to trivially predictable + sequences. The value "id_rinc" allows for direct control of the + trade-off between unpredictability and identifier reuse frequency. + The smaller the value of "id_rinc", the more similar this algorithm + is to a predicable, global linear identifier generation algorithm (as + the one in Appendix A.1). The larger the value of "id_rinc", the + more similar this algorithm is to the algorithm described in + Section 7.1.1 of this document. + + When the identifiers wrap, there is a risk of collisions of transient + numeric identifiers (i.e., identifier reuse). Therefore, "id_rinc" + should be selected according to the following criteria: + + * It should maximize the wrapping time of the identifier space. + + * It should minimize identifier reuse frequency. + + * It should maximize unpredictability. + + Clearly, these are competing goals, and the decision of which value + of "id_rinc" to use is a trade-off. Therefore, the value of + "id_rinc" is at times a configurable parameter so that system + administrators can make the trade-off for themselves. We note that + the alternative algorithms discussed throughout this document offer + better interoperability, security, and privacy properties than this + algorithm, and hence, implementation of this algorithm is + discouraged. + +A.3. Reusing Identifiers Across Different Contexts + + Employing the same identifier across contexts in which stability is + not required (i.e., overloading the semantics of transient numeric + identifiers) usually has negative security and privacy implications. + + For example, in order to generate transient numeric identifiers of + Category #2 or #3, an implementation or specification might be + tempted to employ a source for the numeric identifiers that is known + to provide unique values but that may also be predictable or leak + information related to the entity generating the identifier. This + technique has been employed in the past for, e.g., generating IPv6 + IIDs by reusing the MAC address of the underlying network interface + card. However, as noted in [RFC7721] and [RFC7707], embedding link- + layer addresses in IPv6 IIDs not only results in predictable values + but also leaks information about the manufacturer of the underlying + network interface card, allows for network activity correlation, and + makes address-based scanning attacks feasible. + +Acknowledgements + + The authors would like to thank (in alphabetical order) Bernard + Aboba, Jean-Philippe Aumasson, Steven Bellovin, Luis León Cárdenas + Graide, Spencer Dawkins, Theo de Raadt, Guillermo Gont, Joseph + Lorenzo Hall, Gre Norcie, Colin Perkins, Vincent Roca, Shivan Sahib, + Rich Salz, Martin Thomson, and Michael Tüxen for providing valuable + comments on earlier draft versions of this document. + + The authors would like to thank Shivan Sahib and Christopher Wood for + their guidance during the publication process of this document. + + The authors would like to thank Jean-Philippe Aumasson and Mathew + D. Green (John Hopkins University) for kindly answering a number of + questions. + + The authors would like to thank Diego Armando Maradona for his magic + and inspiration. + +Authors' Addresses + + Fernando Gont + SI6 Networks + Segurola y Habana 4310 7mo piso + Ciudad Autonoma de Buenos Aires + Argentina + Email: fgont@si6networks.com + URI: https://www.si6networks.com + + + Ivan Arce + Quarkslab + Segurola y Habana 4310 7mo piso + Ciudad Autonoma de Buenos Aires + Argentina + Email: iarce@quarkslab.com + URI: https://www.quarkslab.com |