diff options
author | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
---|---|---|
committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
commit | 4bfd864f10b68b71482b35c818559068ef8d5797 (patch) | |
tree | e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc8771.txt | |
parent | ea76e11061bda059ae9f9ad130a9895cc85607db (diff) |
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc8771.txt')
-rw-r--r-- | doc/rfc/rfc8771.txt | 480 |
1 files changed, 480 insertions, 0 deletions
diff --git a/doc/rfc/rfc8771.txt b/doc/rfc/rfc8771.txt new file mode 100644 index 0000000..42dc766 --- /dev/null +++ b/doc/rfc/rfc8771.txt @@ -0,0 +1,480 @@ + + + + +Independent Submission A. Mayrhofer +Request for Comments: 8771 nic.at GmbH +Category: Experimental J. Hague +ISSN: 2070-1721 Sinodun + 1 April 2020 + + +The Internationalized Deliberately Unreadable Network NOtation (I-DUNNO) + +Abstract + + Domain Names were designed for humans, IP addresses were not. But + more than 30 years after the introduction of the DNS, a minority of + mankind persists in invading the realm of machine-to-machine + communication by reading, writing, misspelling, memorizing, + permuting, and confusing IP addresses. This memo describes the + Internationalized Deliberately Unreadable Network NOtation + ("I-DUNNO"), a notation designed to replace current textual + representations of IP addresses with something that is not only more + concise but will also discourage this small, but obviously important, + subset of human activity. + +Status of This Memo + + This document is not an Internet Standards Track specification; it is + published for examination, experimental implementation, and + evaluation. + + This document defines an Experimental Protocol for the Internet + community. This is a contribution to the RFC Series, independently + of any other RFC stream. The RFC Editor has chosen to publish this + document at its discretion and makes no statement about its value for + implementation or deployment. Documents approved for publication by + the RFC Editor are not candidates for any level of Internet Standard; + see Section 2 of RFC 7841. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + https://www.rfc-editor.org/info/rfc8771. + +Copyright Notice + + Copyright (c) 2020 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (https://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. + +Table of Contents + + 1. Introduction + 2. Terminology + 3. The Notation + 3.1. Forming I-DUNNO + 3.2. Deforming I-DUNNO + 4. I-DUNNO Confusion Level Requirements + 4.1. Minimum Confusion Level + 4.2. Satisfactory Confusion Level + 4.3. Delightful Confusion Level + 5. Example + 6. IANA Considerations + 7. Security Considerations + 8. References + 8.1. Normative References + 8.2. Informative References + Authors' Addresses + +1. Introduction + + In Section 2.3 of [RFC0791], the original designers of the Internet + Protocol carefully defined names and addresses as separate + quantities. While they did not explicitly reserve names for human + consumption and addresses for machine use, they did consider the + matter indirectly in their philosophical communal statement: "A name + indicates what we seek." This clearly indicates that names rather + than addresses should be of concern to humans. + + The specification of domain names in [RFC1034], and indeed the + continuing enormous effort put into the Domain Name System, + reinforces the view that humans should use names and leave worrying + about addresses to the machines. RFC 1034 mentions "users" several + times, and even includes the word "humans", even though it is + positioned slightly unfortunately, though perfectly understandably, + in a context of "annoying" and "can wreak havoc" (see Section 5.2.3 + of [RFC1034]). Nevertheless, this is another clear indication that + domain names are made for human use, while IP addresses are for + machine use. + + Given this, and a long error-strewn history of human attempts to + utilize addresses directly, it is obviously desirable that humans + should not meddle with IP addresses. For that reason, it appears + quite logical that a human-readable (textual) representation of IP + addresses was just very vaguely specified in Section 2.1 of + [RFC1123]. Subsequently, a directed effort to further discourage + human use by making IP addresses more confusing was introduced in + [RFC1883] (which was obsoleted by [RFC8200]), and additional options + for human puzzlement were offered in Section 2.2 of [RFC4291]. These + noble early attempts to hamper efforts by humans to read, understand, + or even spell IP addressing schemes were unfortunately severely + compromised in [RFC5952]. + + In order to prevent further damage from human meddling with IP + addresses, there is a clear urgent need for an address notation that + replaces these "Legacy Notations", and efficiently discourages humans + from reading, modifying, or otherwise manipulating IP addresses. + Research in this area long ago recognized the potential in + ab^H^Hperusing the intricacies, inaccuracies, and chaotic disorder of + what humans are pleased to call a "Cultural Technique" (also known as + "Script"), and with a certain inexorable inevitability has focused of + late on the admirable confusion (and thus discouragement) potential + of [UNICODE] as an address notation. In Section 4, we introduce a + framework of Confusion Levels as an aid to the evaluation of the + effectiveness of any Unicode-based scheme in producing notation in a + form designed to be resistant to ready comprehension or, heaven + forfend, mutation of the address, and so effecting the desired + confusion and discouragement. + + The authors welcome [RFC8369] as a major step in the right direction. + However, we have some reservations about the scheme proposed therein: + + * Our analysis of the proposed scheme indicates that, while + impressively concise, it fails to attain more than at best a + Minimum Confusion Level in our classification. + + * Humans, especially younger ones, are becoming skilled at handling + emoji. Over time, this will negatively impact the discouragement + factor. + + * The proposed scheme is specific to IPv6; if a solution to this + problem is to be in any way timely, it must, as a matter of the + highest priority, address IPv4. After all, even taking the + regrettable effects of RFC 5952 into account, IPv6 does at least + remain inherently significantly more confusing and discouraging + than IPv4. + + This document therefore specifies an alternative Unicode-based + notation, the Internationalized Deliberately Unreadable Network + NOtation (I-DUNNO). This notation addresses each of the concerns + outlined above: + + * I-DUNNO can generate Minimum, Satisfactory, or Delightful levels + of confusion. + + * As well as emoji, it takes advantage of other areas of Unicode + confusion. + + * It can be used with IPv4 and IPv6 addresses. + + We concede that I-DUNNO notation is markedly less concise than that + of RFC 8369. However, by permitting multiple code points in the + representation of a single address, I-DUNNO opens up the full + spectrum of Unicode-adjacent code point interaction. This is a + significant factor in allowing I-DUNNO to achieve higher levels of + confusion. I-DUNNO also requires no change to the current size of + Unicode code points, and so its chances of adoption and + implementation are (slightly) higher. + + Note that the use of I-DUNNO in the reverse DNS system is currently + out of scope. The occasional human-induced absence of the magical + one-character sequence U+002E is believed to cause sufficient + disorder there. + + Media Access Control (MAC) addresses are totally out of the question. + +2. Terminology + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and + "OPTIONAL" in this document are to be interpreted as described in + BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all + capitals, as shown here. + + Additional terminology from [RFC6919] MIGHT apply. + +3. The Notation + + I-DUNNO leverages UTF-8 [RFC3629] to obfuscate IP addresses for + humans. UTF-8 uses sequences between 1 and 4 octets to represent + code points as follows: + + +-----------------------+-------------------------------------+ + | Char. number range | UTF-8 octet sequence | + +-----------------------+-------------------------------------+ + | (hexadecimal) | (binary) | + +=======================+=====================================+ + | 0000 0000 - 0000 007F | 0xxxxxxx | + +-----------------------+-------------------------------------+ + | 0000 0080 - 0000 07FF | 110xxxxx 10xxxxxx | + +-----------------------+-------------------------------------+ + | 0000 0800 - 0000 FFFF | 1110xxxx 10xxxxxx 10xxxxxx | + +-----------------------+-------------------------------------+ + | 0001 0000 - 0010 FFFF | 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx | + +-----------------------+-------------------------------------+ + + Table 1 + + I-DUNNO uses that structure to convey addressing information as + follows: + +3.1. Forming I-DUNNO + + In order to form an I-DUNNO based on the Legacy Notation of an IP + address, the following steps are performed: + + 1. The octets of the IP address are written as a bitstring in + network byte order. + + 2. Working from left to right, the bitstring (32 bits for IPv4; 128 + bits for IPv6) is used to generate a list of valid UTF-8 octet + sequences. To allocate a single UTF-8 sequence: + + a. Choose whether to generate a UTF-8 sequence of 1, 2, 3, or 4 + octets. The choice OUGHT TO be guided by the requirement to + generate a satisfactory Minimum Confusion Level (Section 4.1) + (not to be confused with the minimum Satisfactory Confusion + Level (Section 4.2)). Refer to the character number range in + Table 1 in order to identify which octet sequence lengths are + valid for a given bitstring. For example, a 2-octet UTF-8 + sequence requires the next 11 bits to have a value in the + range 0080-07ff. + + b. Allocate bits from the bitstring to fill the vacant positions + 'x' in the UTF-8 sequence (see Table 1) from left to right. + + c. UTF-8 sequences of 1, 2, 3, and 4 octets require 7, 11, 16, + and 21 bits, respectively, from the bitstring. Since the + number of combinations of UTF-8 sequences accommodating + exactly 32 or 128 bits is limited, in sequences where the + number of bits required does not exactly match the number of + available bits, the final UTF-8 sequence MUST be padded with + additional bits once the available address bits are + exhausted. The sequence may therefore require up to 20 bits + of padding. The content of the padding SHOULD be chosen to + maximize the resulting Confusion Level. + + 3. Once the bits in the bitstring are exhausted, the conversion is + complete. The I-DUNNO representation of the address consists of + the Unicode code points described by the list of generated UTF-8 + sequences, and it MAY now be presented to unsuspecting humans. + +3.2. Deforming I-DUNNO + + This section is intentionally omitted. The machines will know how to + do it, and by definition humans SHOULD NOT attempt the process. + +4. I-DUNNO Confusion Level Requirements + + A sequence of characters is considered I-DUNNO only when there's + enough potential to confuse humans. + + Unallocated code points MUST be avoided. While they might appear to + have great confusion power at the moment, there's a minor chance that + a future allocation to a useful, legible character will reduce this + capacity significantly. Worse, in the (unlikely, but not impossible + -- see Section 3.1.3 of [RFC5894]) event of a code point losing its + DISALLOWED property per IDNA2008 [RFC5894], existing I-DUNNOs could + be rendered less than minimally confusing, with disastrous + consequences. + + The following Confusion Levels are defined: + +4.1. Minimum Confusion Level + + As a minimum, a valid I-DUNNO MUST: + + * Contain at least one UTF-8 octet sequence with a length greater + than one octet. + + * Contain at least one character that is DISALLOWED in IDNA2008. No + code point left behind! Note that this allows machines to + distinguish I-DUNNO from Internationalized Domain Name labels. + + I-DUNNOs on this level will at least puzzle most human users with + knowledge of the Legacy Notation. + +4.2. Satisfactory Confusion Level + + An I-DUNNO with Satisfactory Confusion Level MUST adhere to the + Minimum Confusion Level, and additionally contain two of the + following: + + * At least one non-printable character. + + * Characters from at least two different Scripts. + + * A character from the "Symbol" category. + + The Satisfactory Confusion Level will make many human-machine + interfaces beep, blink, silently fail, or any combination thereof. + This is considered sufficient to discourage most humans from + deforming I-DUNNO. + +4.3. Delightful Confusion Level + + An I-DUNNO with Delightful Confusion Level MUST adhere to the + Satisfactory Confusion Level, and additionally contain at least two + of the following: + + * Characters from scripts with different directionalities. + + * Character classified as "Confusables". + + * One or more emoji. + + An I-DUNNO conforming to this level will cause almost all humans to + U+1F926, with the exception of those subscribed to the idna-update + mailing list. + + (We have also considered a further, higher Confusion Level, + tentatively entitled "BReak EXaminatIon or Twiddling" or "BREXIT" + Level Confusion, but currently we have no idea how to go about + actually implementing it.) + +5. Example + + An I-DUNNO based on the Legacy Notation IPv4 address "198.51.100.164" + is formed and validated as follows: First, the Legacy Notation is + written as a string of 32 bits in network byte order: + + 11000110001100110110010010100100 + + Since I-DUNNO requires at least one UTF-8 octet sequence with a + length greater than one octet, we allocate bits in the following + form: + + seq1 | seq2 | seq3 | seq4 + --------+---------+---------+------------ + 1100011 | 0001100 | 1101100 | 10010100100 + + This translates into the following code points: + + +-------------+-------------------------------------------+ + | Bit Seq. | Character Number (Character Name) | + +=============+===========================================+ + | 1100011 | U+0063 (LATIN SMALL LETTER C) | + +-------------+-------------------------------------------+ + | 0001100 | U+000C (FORM FEED (FF)) | + +-------------+-------------------------------------------+ + | 1101100 | U+006C (LATIN SMALL LETTER L) | + +-------------+-------------------------------------------+ + | 10010100100 | U+04A4 (CYRILLIC CAPITAL LIGATURE EN GHE) | + +-------------+-------------------------------------------+ + + Table 2 + + The resulting string MUST be evaluated against the Confusion Level + Requirements before I-DUNNO can be declared. Given the example + above: + + * There is at least one UTF-8 octet sequence with a length greater + than 1 (U+04A4) . + + * There are two IDNA2008 DISALLOWED characters: U+000C (for good + reason!) and U+04A4. + + * There is one non-printable character (U+000C). + + * There are characters from two different Scripts (Latin and + Cyrillic). + + Therefore, the example above constitutes valid I-DUNNO with a + Satisfactory Confusion Level. U+000C in particular has great + potential in environments where I-DUNNOs would be sent to printers. + +6. IANA Considerations + + If this work is standardized, IANA is kindly requested to revoke all + IPv4 and IPv6 address range allocations that do not allow for at + least one I-DUNNO of Delightful Confusion Level. IPv4 prefixes are + more likely to be affected, hence this can easily be marketed as an + effort to foster IPv6 deployment. + + Furthermore, IANA is urged to expand the Internet TLA Registry + [RFC5513] to accommodate Seven-Letter Acronyms (SLA) for obvious + reasons, and register 'I-DUNNO'. For that purpose, U+002D ("-", + HYPHEN-MINUS) SHALL be declared a Letter. + +7. Security Considerations + + I-DUNNO is not a security algorithm. Quite the contrary -- many + humans are known to develop a strong feeling of insecurity when + confronted with I-DUNNO. + + In the tradition of many other RFCs, the evaluation of other security + aspects of I-DUNNO is left as an exercise for the reader. + +8. References + +8.1. Normative References + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, + DOI 10.17487/RFC2119, March 1997, + <https://www.rfc-editor.org/info/rfc2119>. + + [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO + 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November + 2003, <https://www.rfc-editor.org/info/rfc3629>. + + [RFC5894] Klensin, J., "Internationalized Domain Names for + Applications (IDNA): Background, Explanation, and + Rationale", RFC 5894, DOI 10.17487/RFC5894, August 2010, + <https://www.rfc-editor.org/info/rfc5894>. + + [RFC6919] Barnes, R., Kent, S., and E. Rescorla, "Further Key Words + for Use in RFCs to Indicate Requirement Levels", RFC 6919, + DOI 10.17487/RFC6919, April 2013, + <https://www.rfc-editor.org/info/rfc6919>. + + [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC + 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, + May 2017, <https://www.rfc-editor.org/info/rfc8174>. + +8.2. Informative References + + [RFC0791] Postel, J., "Internet Protocol", STD 5, RFC 791, + DOI 10.17487/RFC0791, September 1981, + <https://www.rfc-editor.org/info/rfc791>. + + [RFC1034] Mockapetris, P., "Domain names - concepts and facilities", + STD 13, RFC 1034, DOI 10.17487/RFC1034, November 1987, + <https://www.rfc-editor.org/info/rfc1034>. + + [RFC1123] Braden, R., Ed., "Requirements for Internet Hosts - + Application and Support", STD 3, RFC 1123, + DOI 10.17487/RFC1123, October 1989, + <https://www.rfc-editor.org/info/rfc1123>. + + [RFC1883] Deering, S. and R. Hinden, "Internet Protocol, Version 6 + (IPv6) Specification", RFC 1883, DOI 10.17487/RFC1883, + December 1995, <https://www.rfc-editor.org/info/rfc1883>. + + [RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing + Architecture", RFC 4291, DOI 10.17487/RFC4291, February + 2006, <https://www.rfc-editor.org/info/rfc4291>. + + [RFC5513] Farrel, A., "IANA Considerations for Three Letter + Acronyms", RFC 5513, DOI 10.17487/RFC5513, April 2009, + <https://www.rfc-editor.org/info/rfc5513>. + + [RFC5952] Kawamura, S. and M. Kawashima, "A Recommendation for IPv6 + Address Text Representation", RFC 5952, + DOI 10.17487/RFC5952, August 2010, + <https://www.rfc-editor.org/info/rfc5952>. + + [RFC8200] Deering, S. and R. Hinden, "Internet Protocol, Version 6 + (IPv6) Specification", STD 86, RFC 8200, + DOI 10.17487/RFC8200, July 2017, + <https://www.rfc-editor.org/info/rfc8200>. + + [RFC8369] Kaplan, H., "Internationalizing IPv6 Using 128-Bit + Unicode", RFC 8369, DOI 10.17487/RFC8369, April 2018, + <https://www.rfc-editor.org/info/rfc8369>. + + [UNICODE] The Unicode Consortium, "The Unicode Standard (Current + Version)", 2019, + <http://www.unicode.org/versions/latest/>. + +Authors' Addresses + + Alexander Mayrhofer + nic.at GmbH + + Email: alexander.mayrhofer@nic.at + URI: https://i-dunno.at/ + + + Jim Hague + Sinodun + + Email: jim@sinodun.com + URI: https://www.sinodun.com/ |