Diffstat (limited to 'doc/rfc/rfc6885.txt')
-rw-r--r-- | doc/rfc/rfc6885.txt | 1907 |
1 files changed, 1907 insertions, 0 deletions
diff --git a/doc/rfc/rfc6885.txt b/doc/rfc/rfc6885.txt new file mode 100644 index 0000000..46eb67f --- /dev/null +++ b/doc/rfc/rfc6885.txt @@ -0,0 +1,1907 @@ + + + + + + +Internet Engineering Task Force (IETF) M. Blanchet +Request for Comments: 6885 Viagenie +Category: Informational A. Sullivan +ISSN: 2070-1721 Dyn, Inc. + March 2013 + + + Stringprep Revision and Problem Statement +for the Preparation and Comparison of Internationalized Strings (PRECIS) + +Abstract + + If a protocol expects to compare two strings and is prepared only for + those strings to be ASCII, then using Unicode code points in those + strings requires they be prepared somehow. Internationalizing Domain + Names in Applications (here called IDNA2003) defined and used + Stringprep and Nameprep. Other protocols subsequently defined + Stringprep profiles. A new approach different from Stringprep and + Nameprep is used for a revision of IDNA2003 (called IDNA2008). Other + Stringprep profiles need to be similarly updated, or a replacement of + Stringprep needs to be designed. This document outlines the issues + to be faced by those designing a Stringprep replacement. + +Status of This Memo + + This document is not an Internet Standards Track specification; it is + published for informational purposes. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Not all documents + approved by the IESG are a candidate for any level of Internet + Standard; see Section 2 of RFC 5741. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + http://www.rfc-editor.org/info/rfc6885. 
+ + + + + + + + + + + + + +Blanchet & Sullivan Informational [Page 1] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + +Copyright Notice + + Copyright (c) 2013 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (http://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. Code Components extracted from this document must + include Simplified BSD License text as described in Section 4.e of + the Trust Legal Provisions and are provided without warranty as + described in the Simplified BSD License. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Blanchet & Sullivan Informational [Page 2] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + +Table of Contents + + 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 + 2. Keywords . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 + 3. Conventions . . . . . . . . . . . . . . . . . . . . . . . . . 6 + 4. Stringprep Profiles Limitations . . . . . . . . . . . . . . . 6 + 5. Major Topics for Consideration . . . . . . . . . . . . . . . . 8 + 5.1. Comparison . . . . . . . . . . . . . . . . . . . . . . . . 8 + 5.1.1. Types of Identifiers . . . . . . . . . . . . . . . . . 8 + 5.1.2. Effect of Comparison . . . . . . . . . . . . . . . . . 8 + 5.2. Dealing with Characters . . . . . . . . . . . . . . . . . 9 + 5.2.1. Case Folding, Case Sensitivity, and Case + Preservation . . . . . . . . . . . . . . . . . . . . . 9 + 5.2.2. Stringprep and NFKC . . . . . . . . . . . . . . . . . 9 + 5.2.3. Character Mapping . . . . . . . . . . . . . . . . . . 10 + 5.2.4. Prohibited Characters . . . . . . . . . . . . . . . . 10 + 5.2.5. 
Internal Structure, Delimiters, and Special + Characters . . . . . . . . . . . . . . . . . . . . . . 10 + 5.2.6. Restrictions Because of Glyph Similarity . . . . . . . 11 + 5.3. Where the Data Comes from and Where It Goes . . . . . . . 11 + 5.3.1. User Input and the Source of Protocol Elements . . . . 11 + 5.3.2. User Output . . . . . . . . . . . . . . . . . . . . . 12 + 5.3.3. Operations . . . . . . . . . . . . . . . . . . . . . . 12 + 6. Considerations for Stringprep Replacement . . . . . . . . . . 13 + 7. Security Considerations . . . . . . . . . . . . . . . . . . . 14 + 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 14 + 9. Informative References . . . . . . . . . . . . . . . . . . . . 15 + Appendix A. Classification of Stringprep Profiles . . . . . . . . 19 + Appendix B. Evaluation of Stringprep Profiles . . . . . . . . . . 19 + B.1. iSCSI Stringprep Profile: RFC 3720, RFC 3721, RFC 3722 . . 19 + B.2. SMTP/POP3/ManageSieve Stringprep Profiles: RFC 4954, + RFC 5034, RFC 5804 . . . . . . . . . . . . . . . . . . . . 21 + B.3. IMAP Stringprep Profiles for Usernames: RFC 4314, RFC + 5738 . . . . . . . . . . . . . . . . . . . . . . . . . . . 24 + B.4. IMAP Stringprep Profiles for Passwords: RFC 5738 . . . . . 26 + B.5. Anonymous SASL Stringprep Profiles: RFC 4505 . . . . . . . 28 + B.6. XMPP Stringprep Profiles for Nodeprep: RFC 3920 . . . . . 30 + B.7. XMPP Stringprep Profiles for Resourceprep: RFC 3920 . . . 31 + B.8. EAP Stringprep Profiles: RFC 3748 . . . . . . . . . . . . 33 + + + + + + + + + + + + +Blanchet & Sullivan Informational [Page 3] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + +1. Introduction + + Internationalizing Domain Names in Applications (here called + IDNA2003) [RFC3490] [RFC3491] [RFC3492] and [RFC3454] describes a + mechanism for encoding Unicode labels that make up the + Internationalized Domain Names (IDNs) as standard DNS labels. 
The + labels were processed using a method called Nameprep [RFC3491] and + Punycode [RFC3492]. That method was specific to IDNA2003 but is + generalized as Stringprep [RFC3454]. The general mechanism is used + by other protocols with similar needs but with different constraints + than IDNA2003. + + Stringprep defines a framework within which protocols define their + Stringprep profiles. Some known IETF specifications using Stringprep + are listed below: + + o The Nameprep profile [RFC3490] for use in Internationalized Domain + Names (IDNs); + + o The Inter-Asterisk eXchange (IAX) using Nameprep [RFC5456]; + + o NFSv4 [RFC3530] and NFSv4.1 [RFC5661]; + + o The Internet Small Computer System Interface (iSCSI) profile + [RFC3722] for use in iSCSI names; + + o The Extensible Authentication Protocol (EAP) [RFC3748]; + + o The Nodeprep and Resourceprep profiles [RFC3920] (which was + obsoleted by [RFC6120]) for use in the Extensible Messaging and + Presence Protocol (XMPP), and the XMPP to Common Presence and + Instant Messaging (CPIM) mapping [RFC3922] (the latter of these + relies on the former); + + o The Internationalized Resource Identifier (IRI) and URI in XMPP + [RFC5122]; + + o The Policy MIB profile [RFC4011] for use in the Simple Network + Management Protocol (SNMP); + + o Transport Layer Security (TLS) [RFC4279]; + + o The Lightweight Directory Access Protocol (LDAP) profile [RFC4518] + for use with LDAP [RFC4511] and its authentication methods + [RFC4513]; + + o PKIX subject identification using LDAPprep [RFC4683]; + + + + +Blanchet & Sullivan Informational [Page 4] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + o PKIX Certificate Revocation List (CRL) using LDAPprep [RFC5280]; + + o The Simple Authentication and Security Layer (SASL) [RFC4422] and + SASLprep profile [RFC4013] for use in SASL; + + o Plain SASL using SASLprep [RFC4616]; + + o SMTP Auth using SASLprep [RFC4954]; + + o The Post Office Protocol (POP3) Auth using SASLprep 
[RFC5034]; + + o TLS Secure Remote Password (SRP) using SASLprep [RFC5054]; + + o SASL Salted Challenge Response Authentication Mechanism (SCRAM) + using SASLprep [RFC5802]; + + o Remote management of Sieve using SASLprep [RFC5804]; + + o The Network News Transfer Protocol (NNTP) using SASLprep + [RFC4643]; + + o IMAP4 using SASLprep [RFC4314]; + + o The trace profile [RFC4505] for use with the SASL ANONYMOUS + mechanism; + + o Internet Application Protocol Collation Registry [RFC4790]; + + o The unicode-casemap Unicode Collation [RFC5051]. + + However, a review (see [78PRECIS]) of these protocol specifications + found that they are very similar and can be grouped into a short + number of classes. Moreover, many reuse the same Stringprep profile, + such as the SASL one. + + IDNA2003 was replaced because of some limitations described in + [RFC4690]. The new IDN specification, called IDNA2008 [RFC5890], + [RFC5891], [RFC5892], [RFC5893] was designed based on the + considerations found in [RFC5894]. One of the effects of IDNA2008 is + that Nameprep and Stringprep are not used at all. Instead, an + algorithm based on Unicode properties of code points is defined. + That algorithm generates a stable and complete table of the supported + Unicode code points for each Unicode version. This algorithm uses an + inclusion-based approach, instead of the exclusion-based approach of + Stringprep/Nameprep. That is, IDNA2003 created an explicit list of + excluded or mapped-away characters; anything in Unicode 3.2 that was + not so listed could be assumed to be allowed under the protocol. + + + + +Blanchet & Sullivan Informational [Page 5] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + IDNA2008 begins instead from the assumption that code points are + disallowed and then relies on Unicode properties to derive whether a + given code point actually is allowed in the protocol. 
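The contrast drawn above between the exclusion-based approach of Stringprep/Nameprep and the inclusion-based approach of IDNA2008 can be illustrated with a minimal sketch. The set `EXCLUDED`, the set `ALLOWED_CATEGORIES`, and both function names are hypothetical and chosen only for illustration; in particular, the category set is a heavy simplification and is not the derivation defined in [RFC5892].

```python
import unicodedata

# Hypothetical exclusion list in the style of Stringprep/IDNA2003:
# any code point that is not listed here is implicitly allowed.
EXCLUDED = {"\u0000", "\u00ad", "@"}

# Hypothetical, simplified set of permitted General_Category values in
# the style of IDNA2008. The real rules in RFC 5892 are more involved.
ALLOWED_CATEGORIES = {"Ll", "Lo", "Lm", "Mn", "Mc", "Nd"}


def allowed_by_exclusion(ch: str) -> bool:
    """Exclusion-based: a code point is allowed unless explicitly listed."""
    return ch not in EXCLUDED


def allowed_by_inclusion(ch: str) -> bool:
    """Inclusion-based: a code point is disallowed unless its properties qualify it."""
    return unicodedata.category(ch) in ALLOWED_CATEGORIES


# Print the verdict of each approach for a few sample code points.
for ch in ("a", "A", "\u2603"):
    print(f"U+{ord(ch):04X}", allowed_by_exclusion(ch), allowed_by_inclusion(ch))
```

Under these assumptions, a symbol such as U+2603 passes the exclusion-based check simply because nobody listed it, while the inclusion-based check rejects it because its category was never admitted. The uppercase letter is rejected here only because this sketch omits the category Lu.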
+ + This document lists the shortcomings and issues found by protocols + listed above that defined Stringprep profiles. It also lists the + requirements for any potential replacement of Stringprep. + +2. Keywords + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in [RFC2119]. + + This document uses various internationalization terms, which are + defined and discussed in [RFC6365]. + + Additionally, this document defines the following keyword: + + PRECIS: Preparation and Comparison of Internationalized Strings + +3. Conventions + + A single Unicode code point in this memo is denoted by "U+" followed + by four to six hexadecimal digits, as used in [Unicode61], + Appendix A. + +4. Stringprep Profiles Limitations + + During IETF 77 (March 2010), a BOF discussed the current state of the + protocols that have defined Stringprep profiles [NEWPREP]. The main + conclusions from that discussion were as follows: + + o Stringprep is bound to Version 3.2 of Unicode. Stringprep has not + been updated to new versions of Unicode. Therefore, the protocols + using Stringprep are stuck at Unicode 3.2, and their + specifications need to be updated to support new versions of + Unicode. + + o The protocols would like to not be bound to a specific version of + Unicode, but rather have better Unicode version agility in the way + of IDNA2008. This is important partly because it is usually + impossible for an application to require Unicode 3.2; the + application gets whatever version of Unicode is available on the + host. + + o The protocols require better bidirectional support (bidi) than + currently offered by Stringprep. 
+ + + +Blanchet & Sullivan Informational [Page 6] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + o If the protocols are updated to use a new version of Stringprep or + another framework, then backward compatibility is an important + requirement. For example, Stringprep normalization is based on + and profiles may use Unicode Normalization Form KC (NFKC) [UAX15], + while IDNA2008 mostly uses Unicode Normalization Form C (NFC) + [UAX15]. + + o Identifiers are passed between protocols. For example, the same + username string of code points may be passed between SASL, XMPP, + LDAP, and EAP. Therefore, a common set of rules or classes of + strings are preferred over specific rules for each protocol. + Without real planning in advance, many Stringprep profiles reuse + other profiles, so this goal was accomplished by accident with + Stringprep. + + Protocols that use Stringprep profiles use strings for different + purposes: + + o XMPP uses a different Stringprep profile for each part of the XMPP + address Jabber Identifier (JID): a localpart, which is similar to + a username and used for authentication; a domainpart, which is a + domain name; and a resourcepart, which is less restrictive than + the localpart. + + o iSCSI uses a Stringprep profile for the names of protocol + participants (called initiators and targets). The iSCSI Qualified + Name (IQN) format of iSCSI names contains a reversed DNS domain + name. + + o SASL and LDAP use a Stringprep profile for usernames. + + o LDAP uses a set of Stringprep profiles. + + The apparent judgement of the BOF attendees [NEWPREP] was that it + would be highly desirable to have a replacement of Stringprep, with + similar characteristics to IDNA2008. That replacement should be + defined so that the protocols could use internationalized strings + without a lot of specialized internationalization work, since + internationalization expertise is not available in the respective + protocols or working groups. 
Accordingly, the IESG formed the PRECIS + working group to undertake the task. + + Notwithstanding the desire evident in [NEWPREP] and the chartering of + a working group, IDNA2008 may be a poor model for what other + protocols ought to do, because it is designed to support an old + protocol that is designed to operate on the scale of the entire + Internet. Moreover, IDNA2008 is intended to be deployed without any + + + + +Blanchet & Sullivan Informational [Page 7] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + change to the base DNS protocol. Other protocols may aim at + deployment in more local environments, or may have protocol version + negotiation built in. + +5. Major Topics for Consideration + + This section provides an overview of major topics that a Stringprep + replacement needs to address. The headings correspond roughly with + categories under which known Stringprep-using protocol RFCs have been + evaluated. For the details of those evaluations, see Appendix A. + +5.1. Comparison + +5.1.1. Types of Identifiers + + Following [ID-COMP], it is possible to organize identifiers into + three classes in respect of how they may be compared with one + another: + + Absolute Identifiers: Identifiers that can be compared byte-by-byte + for equality. + + Definite Identifiers: Identifiers that have a well-defined + comparison algorithm on which all parties agree. + + Indefinite Identifiers: Identifiers that have no single comparison + algorithm on which all parties agree. + + Definite Identifiers include cases like the comparison of Unicode + code points in different encodings: they do not match byte for byte + but can all be converted to a single encoding which then does match + byte for byte. Indefinite Identifiers are sometimes algorithmically + comparable by well-specified subsets of parties. For more discussion + of these categories, see [ID-COMP]. + + The section on treating the existing known cases, Appendix A, uses + the categories above. 
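The Definite Identifier case described above, in which the same code points arrive in different encodings, can be sketched as follows. The function name `same_code_points` and the choice of UTF-8 as the common encoding are assumptions made for this illustration only.

```python
def same_code_points(a: bytes, a_encoding: str, b: bytes, b_encoding: str) -> bool:
    """Compare two encoded identifiers by converting both to one encoding."""
    # Decode each byte string with its own encoding, then re-encode both
    # as UTF-8 so that the final comparison is byte for byte.
    canonical_a = a.decode(a_encoding).encode("utf-8")
    canonical_b = b.decode(b_encoding).encode("utf-8")
    return canonical_a == canonical_b


utf8_form = "caf\u00e9".encode("utf-8")
utf16_form = "caf\u00e9".encode("utf-16-be")

print(utf8_form == utf16_form)
print(same_code_points(utf8_form, "utf-8", utf16_form, "utf-16-be"))
```

This sketch addresses only the encoding difference. It does not apply normalization or case folding, so two strings that differ in those respects still compare as unequal.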
+ +5.1.2. Effect of Comparison + + The three classes of comparison style outlined in Section 5.1.1 may + have different effects when applied. It is necessary to evaluate the + effects if a comparison results in a false positive or a false + negative, especially in terms of the consequences to security and + usability. + + + + + + +Blanchet & Sullivan Informational [Page 8] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + +5.2. Dealing with Characters + + This section outlines a range of issues having to do with characters + in the target protocols, the ways in which IDNA2008 might be a good + analogy to other protocols, and ways in which it might be a poor one. + +5.2.1. Case Folding, Case Sensitivity, and Case Preservation + + In IDNA2003, labels are always mapped to lowercase before the + Punycode transformation. In IDNA2008, there is no mapping at all: + input is either a valid U-label or it is not. At the same time, + uppercase characters are by definition not valid U-labels, because + they fall into the Unstable category (category B) of [RFC5892]. + + If there are protocols that require case be preserved, then the + analogy with IDNA2008 will break down. Accordingly, existing + protocols are to be evaluated according to the following criteria: + + 1. Does the protocol use case folding? For all blocks of code + points or just for certain subsets? + + 2. Is the system or protocol case-sensitive? + + 3. Does the system or protocol preserve case? + +5.2.2. Stringprep and NFKC + + Stringprep profiles may use normalization. If they do, they use NFKC + [UAX15] (most profiles do). It is not clear that NFKC is the right + normalization to use in all cases. 
In [UAX15], there is the + following observation regarding Normalization Forms KC and KD: "It is + best to think of these Normalization Forms as being like uppercase or + lowercase mappings: useful in certain contexts for identifying core + meanings, but also performing modifications to the text that may not + always be appropriate." In general, it can be said that NFKC is more + aggressive about finding matches between code points than NFC. For + things like the spelling of users' names, NFKC may not be the best + form to use. At the same time, one of the nice things about NFKC is + that it deals with the width of characters that are otherwise + similar, by canonicalizing half-width to full-width. This mapping + step can be crucial in practice. A replacement for Stringprep + depends on analyzing the different use profiles and considering + whether NFKC or NFC is a better normalization for each profile. + + For the purposes of evaluating an existing example of Stringprep use, + it is helpful to know whether it uses no normalization, NFKC, or NFC. + + + + + +Blanchet & Sullivan Informational [Page 9] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + +5.2.3. Character Mapping + + Along with the case mapping issues raised in Section 5.2.1, there is + the question of whether some characters are mapped either to other + characters or to nothing during Stringprep. [RFC3454], Section 3, + outlines a number of characters that are mapped to nothing, and also + permits Stringprep profiles to define their own mappings. + +5.2.4. Prohibited Characters + + Along with case folding and other character mappings, many protocols + have characters that are simply disallowed. For example, control + characters and special characters such as "@" or "/" may be + prohibited in a protocol. + + One of the primary changes of IDNA2008 is in the way it approaches + Unicode code points, using the new inclusion-based approach (see + Section 1). 
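The steps discussed in Sections 5.2.1 through 5.2.4 (character mapping, case folding, normalization, and prohibition) can be sketched as a single preparation function. This is a rough illustration and not any actual Stringprep profile: the function name `prepare` is hypothetical, `str.casefold()` is used as a stand-in for the case-folding table of [RFC3454], and normalization here uses whatever Unicode version the Python runtime ships rather than Unicode 3.2.

```python
import stringprep
import unicodedata


def prepare(s: str) -> str:
    """Apply a simplified map / fold / normalize / prohibit sequence."""
    # Map: drop code points that RFC 3454 Table B.1 maps to nothing.
    s = "".join(ch for ch in s if not stringprep.in_table_b1(ch))
    # Case fold (assumption: casefold() approximates Table B.2).
    s = s.casefold()
    # Normalize with NFKC, as most Stringprep profiles do.
    s = unicodedata.normalize("NFKC", s)
    # Prohibit: reject control characters (RFC 3454 Tables C.2.1 and C.2.2).
    for ch in s:
        if stringprep.in_table_c21_c22(ch):
            raise ValueError(f"prohibited code point U+{ord(ch):04X}")
    return s


print(prepare("Ex\u00adAMPLE"))  # contains SOFT HYPHEN, U+00AD
print(prepare("\uff21"))         # FULLWIDTH LATIN CAPITAL LETTER A
```

A protocol that must preserve case, or for which NFKC is too aggressive, would need a different sequence. That is the kind of per-profile question the evaluation criteria above are meant to surface.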
+ + Because of the default assumption in IDNA2008 that a code point is + not allowed by the protocol, it has more than one class of "allowed + by the protocol"; this is unlike IDNA2003. While some code points + are disallowed outright, some are allowed only in certain contexts. + The reasons for the context-dependent rules have to do with the way + some characters are used. For instance, the ZERO WIDTH JOINER and + ZERO WIDTH NON-JOINER (ZWJ, U+200D and ZWNJ, U+200C) are allowed with + contextual rules because they are required in some circumstances, yet + are considered punctuation by Unicode and would therefore be + DISALLOWED under the usual IDNA2008 derivation rules. The goal of + IDNA2008 is to provide the widest repertoire of code points possible + and consistent with the traditional DNS "LDH" (letters, digits, + hyphen) rule (see [RFC0952]), trusting to the operators of individual + zones to make sensible (and usually more restrictive) policies for + their zones. + +5.2.5. Internal Structure, Delimiters, and Special Characters + + IDNA2008 has a special problem with delimiters, because the delimiter + "character" in the DNS wire format is not really part of the data. + In DNS, labels are not separated exactly; instead, a label carries + with it an indicator that says how long the label is. When the label + is displayed in presentation format as part of a fully qualified + domain name, the label separator FULL STOP, U+002E (.) is used to + break up the labels. But because that label separator does not + travel with the wire format of the domain name, there is no way to + encode a different, "internationalized" separator in IDNA2008. + + + + + +Blanchet & Sullivan Informational [Page 10] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + Other protocols may include characters with similar special meaning + within the protocol. 
Common characters for these purposes include + FULL STOP, U+002E (.); COMMERCIAL AT, U+0040 (@); HYPHEN-MINUS, + U+002D (-); SOLIDUS, U+002F (/); and LOW LINE, U+005F (_). The mere + inclusion of such a character in the protocol is not enough for it to + be considered similar to another protocol using the same character; + instead, handling of the character must be taken into consideration + as well. + + An important issue to tackle here is whether it is valuable to map to + or from these special characters as part of the Stringprep + replacement. In some locales, the analogue to FULL STOP, U+002E is + some other character, and users may expect to be able to substitute + their normal stop for FULL STOP, U+002E. At the same time, there are + predictability arguments in favor of treating identifiers with FULL + STOP, U+002E in them just the way they are treated under IDNA2008. + +5.2.6. Restrictions Because of Glyph Similarity + + Homoglyphs are similarly (or identically) rendered glyphs of + different code points. For DNS names, homoglyphs may enable + phishing. If a protocol requires some visual comparison by end- + users, then the issue of homoglyphs is to be considered. In the DNS + context, these issues are documented in [RFC5894] and [RFC4690]. + However, IDNA2008 does not have a mechanism to deal with them, + trusting DNS zone operators to enact sensible policies for the subset + of Unicode they wish to support, given their user community. A + similar policy/protocol split may not be desirable in every protocol. + +5.3. Where the Data Comes from and Where It Goes + +5.3.1. User Input and the Source of Protocol Elements + + Some protocol elements are provided by users, and others are not. + Those that are not may presumably be subject to greater restrictions, + whereas those that users provide likely need to permit the broadest + range of code points. The following questions are helpful: + + 1. Do users input the strings directly? + + 2. If so, how? 
(keyboard, stylus, voice, copy-paste, etc.) + + 3. Where do we place the dividing line between user interface and + protocol? (see [RFC5895]) + + + + + + + +Blanchet & Sullivan Informational [Page 11] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + +5.3.2. User Output + + Just as only some protocol elements are expected to be entered + directly by users, only some protocol elements are intended to be + consumed directly by users. It is important to know how users are + expected to be able to consume the protocol elements, because + different environments present different challenges. An element that + is only ever delivered as part of a vCard remains in machine-readable + format, so the problem of visual confusion is not a great one. Is + the protocol element published as part of a vCard, a web directory, + on a business card, or on "the side of a bus"? Do users use the + protocol element as an identifier (which means that they might enter + it again in some other context)? (See also Section 5.2.6.) + +5.3.3. Operations + + Some strings are useful as part of the protocol but are not used as + input to other operations (for instance, purely informative or + descriptive text). Other strings are used directly as input to other + operations (such as cryptographic hash functions), or are used + together with other strings to form new values (such as concatenating + a string with some others to form a unique identifier). + +5.3.3.1. String Classes + + Strings often have a similar function in different protocols. For + instance, many different protocols contain user identifiers or + passwords. A single profile for all such uses might be desirable. + + Often, a string in a protocol is effectively a protocol element from + another protocol. For instance, different systems might use the same + credentials database for authentication. + +5.3.3.2.
Community Considerations + + A Stringprep replacement that does anything more than just update + Stringprep to the latest version of Unicode will probably entail some + changes. It is important to identify the willingness of the + protocol-using community to accept backwards-incompatible changes. + By the same token, it is important to evaluate the desire of the + community for features not available under Stringprep. + +5.3.3.3. Unicode Incompatible Changes + + IDNA2008 uses an algorithm to derive the validity of a Unicode code + point for use under IDNA2008. It does this by using the properties + of each code point to test its validity. + + + + +Blanchet & Sullivan Informational [Page 12] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + This approach depends crucially on the idea that code points, once + valid for a protocol profile, will not later be made invalid. That + is not a guarantee currently provided by Unicode. Properties of code + points may change between versions of Unicode. Rarely, such a change + could cause a given code point to become invalid under a protocol + profile, even though the code point would be valid with an earlier + version of Unicode. This is not merely a theoretical possibility, + because it has occurred [RFC6452]. + + Accordingly, as in IDNA2008, a Stringprep replacement that intends to + be Unicode version agnostic will need to work out a mechanism to + address cases where incompatible changes occur because of new Unicode + versions. + +6. Considerations for Stringprep Replacement + + The above suggests the following guidance: + + o A Stringprep replacement should be defined. + + o The replacement should take an approach similar to IDNA2008 (e.g., + by using properties of code points instead of whitelisting of code + points), in that it enables better Unicode agility. + + o Protocols share similar characteristics of strings. 
Therefore, + defining internationalization preparation algorithms for the + smallest set of string classes may be sufficient for most cases, + providing coherence among a set of related protocols or protocols + where identifiers are exchanged. + + o The sets of string classes need to be evaluated according to the + considerations that make up the headings in Section 5. + + o It is reasonable to limit scope to Unicode code points and rule + the mapping of data from other character encodings outside the + scope of this effort. + + o The replacement ought to at least provide guidance to applications + using the replacement on how to handle protocol incompatibilities + resulting from changes to Unicode. In an ideal world, the + Stringprep replacement would handle the changes automatically, but + it appears that such automatic handling would require magic and + cannot be expected. + + o Compatibility within each protocol between a technique that is + Stringprep-based and the technique's replacement has to be + considered very carefully. + + + + +Blanchet & Sullivan Informational [Page 13] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + Existing deployments already depend on Stringprep profiles. + Therefore, a replacement must consider the effects of any new + strategy on existing deployments. By way of comparison, it is worth + noting that some characters were acceptable in IDNA labels under + IDNA2003, but are not protocol-valid under IDNA2008 (and conversely); + disagreement about what to do during the transition has resulted in + different approaches to mapping. Different implementers may make + different decisions about what to do in such cases; this could have + interoperability effects. It is necessary to trade better support + for different linguistic environments against the potential side + effects of backward incompatibility. + +7.
Security Considerations + + This document merely states what problems are to be solved and does + not define a protocol. There are undoubtedly security implications + of the particular results that will come from the work to be + completed. Moreover, the Stringprep Security Considerations + [RFC3454] Section applies. See also the analysis in the subsections + of Appendix B, below. + +8. Acknowledgements + + This document is the product of the PRECIS IETF Working Group, and + participants in that working group were helpful in addressing issues + with the text. + + Specific contributions came from David Black, Alan DeKok, Simon + Josefsson, Bill McQuillan, Alexey Melnikov, Peter Saint-Andre, Dave + Thaler, and Yoshiro Yoneya. + + Dave Thaler provided the "buckets" insight in Section 5.1.1, central + to the organization of the problem. + + Evaluations of Stringprep profiles that are included in Appendix B + were done by David Black, Alexey Melnikov, Peter Saint-Andre, and + Dave Thaler. + + + + + + + + + + + + + + +Blanchet & Sullivan Informational [Page 14] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + +9. Informative References + + [78PRECIS] Blanchet, M., "PRECIS Framework", Proceedings of IETF + 78, July 2010, <http://www.ietf.org/proceedings/78/ + slides/precis-2.pdf>. + + [ID-COMP] Thaler, D., Ed., "Issues in Identifier Comparison for + Security Purposes", Work in Progress, March 2013. + + [NEWPREP] "Newprep BoF Meeting Minutes", March 2010, + <http://www.ietf.org/proceedings/77/minutes/ + newprep.txt>. + + [RFC0952] Harrenstien, K., Stahl, M., and E. Feinler, "DoD + Internet host table specification", RFC 952, + October 1985. + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of + Internationalized Strings ("stringprep")", RFC 3454, + December 2002. + + [RFC3490] Faltstrom, P., Hoffman, P., and A. 
Costello, + "Internationalizing Domain Names in Applications + (IDNA)", RFC 3490, March 2003. + + [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep + Profile for Internationalized Domain Names (IDN)", + RFC 3491, March 2003. + + [RFC3492] Costello, A., "Punycode: A Bootstring encoding of + Unicode for Internationalized Domain Names in + Applications (IDNA)", RFC 3492, March 2003. + + [RFC3530] Shepler, S., Callaghan, B., Robinson, D., Thurlow, R., + Beame, C., Eisler, M., and D. Noveck, "Network File + System (NFS) version 4 Protocol", RFC 3530, April 2003. + + [RFC3722] Bakke, M., "String Profile for Internet Small Computer + Systems Interface (iSCSI) Names", RFC 3722, April 2004. + + [RFC3748] Aboba, B., Blunk, L., Vollbrecht, J., Carlson, J., and + H. Levkowetz, "Extensible Authentication Protocol + (EAP)", RFC 3748, June 2004. + + + + + +Blanchet & Sullivan Informational [Page 15] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + [RFC3920] Saint-Andre, P., Ed., "Extensible Messaging and Presence + Protocol (XMPP): Core", RFC 3920, October 2004. + + [RFC3922] Saint-Andre, P., "Mapping the Extensible Messaging and + Presence Protocol (XMPP) to Common Presence and Instant + Messaging (CPIM)", RFC 3922, October 2004. + + [RFC4011] Waldbusser, S., Saperia, J., and T. Hongal, "Policy + Based Management MIB", RFC 4011, March 2005. + + [RFC4013] Zeilenga, K., "SASLprep: Stringprep Profile for User + Names and Passwords", RFC 4013, February 2005. + + [RFC4279] Eronen, P. and H. Tschofenig, "Pre-Shared Key + Ciphersuites for Transport Layer Security (TLS)", + RFC 4279, December 2005. + + [RFC4314] Melnikov, A., "IMAP4 Access Control List (ACL) + Extension", RFC 4314, December 2005. + + [RFC4422] Melnikov, A. and K. Zeilenga, "Simple Authentication and + Security Layer (SASL)", RFC 4422, June 2006. + + [RFC4505] Zeilenga, K., "Anonymous Simple Authentication and + Security Layer (SASL) Mechanism", RFC 4505, June 2006. 
+ + [RFC4511] Sermersheim, J., "Lightweight Directory Access Protocol + (LDAP): The Protocol", RFC 4511, June 2006. + + [RFC4513] Harrison, R., "Lightweight Directory Access Protocol + (LDAP): Authentication Methods and Security Mechanisms", + RFC 4513, June 2006. + + [RFC4518] Zeilenga, K., "Lightweight Directory Access Protocol + (LDAP): Internationalized String Preparation", RFC 4518, + June 2006. + + [RFC4616] Zeilenga, K., "The PLAIN Simple Authentication and + Security Layer (SASL) Mechanism", RFC 4616, August 2006. + + [RFC4643] Vinocur, J. and K. Murchison, "Network News Transfer + Protocol (NNTP) Extension for Authentication", RFC 4643, + October 2006. + + [RFC4683] Park, J., Lee, J., Lee, H., Park, S., and T. Polk, + "Internet X.509 Public Key Infrastructure Subject + Identification Method (SIM)", RFC 4683, October 2006. + + + + +Blanchet & Sullivan Informational [Page 16] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + [RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review + and Recommendations for Internationalized Domain Names + (IDNs)", RFC 4690, September 2006. + + [RFC4790] Newman, C., Duerst, M., and A. Gulbrandsen, "Internet + Application Protocol Collation Registry", RFC 4790, + March 2007. + + [RFC4954] Siemborski, R. and A. Melnikov, "SMTP Service Extension + for Authentication", RFC 4954, July 2007. + + [RFC5034] Siemborski, R. and A. Menon-Sen, "The Post Office + Protocol (POP3) Simple Authentication and Security Layer + (SASL) Authentication Mechanism", RFC 5034, July 2007. + + [RFC5051] Crispin, M., "i;unicode-casemap - Simple Unicode + Collation Algorithm", RFC 5051, October 2007. + + [RFC5054] Taylor, D., Wu, T., Mavrogiannopoulos, N., and T. + Perrin, "Using the Secure Remote Password (SRP) Protocol + for TLS Authentication", RFC 5054, November 2007. 
+ + [RFC5122] Saint-Andre, P., "Internationalized Resource Identifiers + (IRIs) and Uniform Resource Identifiers (URIs) for the + Extensible Messaging and Presence Protocol (XMPP)", + RFC 5122, February 2008. + + [RFC5280] Cooper, D., Santesson, S., Farrell, S., Boeyen, S., + Housley, R., and W. Polk, "Internet X.509 Public Key + Infrastructure Certificate and Certificate Revocation + List (CRL) Profile", RFC 5280, May 2008. + + [RFC5456] Spencer, M., Capouch, B., Guy, E., Miller, F., and K. + Shumard, "IAX: Inter-Asterisk eXchange Version 2", + RFC 5456, February 2010. + + [RFC5661] Shepler, S., Eisler, M., and D. Noveck, "Network File + System (NFS) Version 4 Minor Version 1 Protocol", + RFC 5661, January 2010. + + [RFC5802] Newman, C., Menon-Sen, A., Melnikov, A., and N. + Williams, "Salted Challenge Response Authentication + Mechanism (SCRAM) SASL and GSS-API Mechanisms", + RFC 5802, July 2010. + + [RFC5804] Melnikov, A. and T. Martin, "A Protocol for Remotely + Managing Sieve Scripts", RFC 5804, July 2010. + + + + +Blanchet & Sullivan Informational [Page 17] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + [RFC5890] Klensin, J., "Internationalized Domain Names for + Applications (IDNA): Definitions and Document + Framework", RFC 5890, August 2010. + + [RFC5891] Klensin, J., "Internationalized Domain Names in + Applications (IDNA): Protocol", RFC 5891, August 2010. + + [RFC5892] Faltstrom, P., "The Unicode Code Points and + Internationalized Domain Names for Applications (IDNA)", + RFC 5892, August 2010. + + [RFC5893] Alvestrand, H. and C. Karp, "Right-to-Left Scripts for + Internationalized Domain Names for Applications (IDNA)", + RFC 5893, August 2010. + + [RFC5894] Klensin, J., "Internationalized Domain Names for + Applications (IDNA): Background, Explanation, and + Rationale", RFC 5894, August 2010. + + [RFC5895] Resnick, P. and P. 
Hoffman, "Mapping Characters for + Internationalized Domain Names in Applications (IDNA) + 2008", RFC 5895, September 2010. + + [RFC6120] Saint-Andre, P., "Extensible Messaging and Presence + Protocol (XMPP): Core", RFC 6120, March 2011. + + [RFC6365] Hoffman, P. and J. Klensin, "Terminology Used in + Internationalization in the IETF", BCP 166, RFC 6365, + September 2011. + + [RFC6452] Faltstrom, P. and P. Hoffman, "The Unicode Code Points + and Internationalized Domain Names for Applications + (IDNA) - Unicode 6.0", RFC 6452, November 2011. + + [UAX15] "Unicode Standard Annex #15: Unicode Normalization + Forms", UAX 15, September 2009. + + [Unicode61] The Unicode Consortium. The Unicode Standard, Version + 6.1.0, (Mountain View, CA: The Unicode Consortium, 2012. + ISBN 978-1-936213-02-3). + <http://www.unicode.org/versions/Unicode6.1.0/>. + + + + + + + + + + +Blanchet & Sullivan Informational [Page 18] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + +Appendix A. Classification of Stringprep Profiles + + A number of the known cases of Stringprep use were evaluated during + the preparation of this document. The known cases are here described + in two ways. The types of identifiers the protocol uses is first + called out in the ID type column (from Section 5.1.1) using the short + forms "a" for Absolute, "d" for Definite, and "i" for Indefinite. + Next, there is a column that contains an "i" if the protocol string + comes from user input, an "o" if the protocol string becomes user- + facing output, "b" if both are true, and "n" if neither is true. + + +------+--------+-------+ + | RFC | IDtype | User? | + +------+--------+-------+ + | 3722 | a | b | + | 3748 | - | - | + | 3920 | a,d | b | + | 4505 | a | i | + | 4314 | a,d | b | + | 4954 | a,d | b | + | 5034 | a,d | b | + | 5804 | a,d | b | + +------+--------+-------+ + + Table 1 + +Appendix B. 
Evaluation of Stringprep Profiles + + This section summarizes the evaluation of Stringprep profiles that + was done to get a good understanding of how Stringprep is used. + This summary is not normative, nor does it reproduce the evaluations + themselves. Reviewers used a common template so that the evaluations + could be compared coherently. + +B.1. iSCSI Stringprep Profile: RFC 3720, RFC 3721, RFC 3722 + + Description: An iSCSI session consists of an initiator (i.e., host + or server that uses storage) communicating with a target (i.e., a + storage array or other system that provides storage). Both the + iSCSI initiator and target are named by iSCSI names. The iSCSI + Stringprep profile is used for iSCSI names. + + How it is used: iSCSI names identify iSCSI initiators and targets + (see above). They can + also be used to identify SCSI ports (these are software entities + in the iSCSI protocol, not hardware ports) and iSCSI logical units + (storage volumes), although both are unusual in practice. + + + + + +Blanchet & Sullivan Informational [Page 19] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + What entities create these identifiers? Generally, a human user (1) + configures an automated system (2) that generates the names. + Advance configuration of the system is required due to the + embedded use of external unique identifiers (from the DNS or IEEE). + + How is the string input in the system? Keyboard and copy-paste are + common. Copy-paste is common because iSCSI names are long enough + to be problematic for humans to remember, so email, + sneaker-net, text files, etc., are used to avoid typing mistakes. + + Where do we place the dividing line between user interface and + protocol? The iSCSI protocol requires that all + internationalization string preparation occur in the user + interface. The iSCSI protocol treats iSCSI names as opaque + identifiers that are compared byte-by-byte for equality.
iSCSI + names are generally not checked for correct formatting by the + protocol. + + What entities enforce the rules? There are no iSCSI-specific + enforcement entities, although the use of unique identifier + information in the names relies on DNS registrars and the IEEE + Registration Authority. + + Comparison: Byte-by-byte. + + Case Folding, Sensitivity, Preservation: Case folding is required + for the code blocks specified in RFC 3454, Table B.2. The overall + iSCSI naming system (UI + protocol) is case-insensitive. + + What is the impact if the comparison results in a false positive? + Potential access to the wrong storage. + + - If the initiator has no access to the wrong storage, an + authentication failure is the probable result. + + - If the initiator has access to the wrong storage, the resulting + misidentification could result in use of the wrong data and + possible corruption of stored data. + + What is the impact if the comparison results in a false negative? + Denial of authorized storage access. + + What are the security impacts? iSCSI names may be used as the + authentication identities for storage systems. Comparison + problems could result in authentication problems, although note + that authentication failure ameliorates some of the false positive + cases. + + + + +Blanchet & Sullivan Informational [Page 20] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + Normalization: NFKC, as specified by RFC 3454. + + Mapping: Yes, as specified by Table B.1 in RFC 3454. + + Disallowed Characters: Only the following characters are allowed: + - ASCII dash, dot, colon + - ASCII lowercase letters and digits + - Unicode lowercase characters as specified by RFC 3454. + All other characters are disallowed. + + Which other strings or identifiers are these most similar to? + None -- iSCSI names are unique to iSCSI. + + Are these strings or identifiers sometimes the same as strings or + identifiers from other protocols? No. 
+ + Does the identifier have internal structure that needs to be + respected? Yes. ASCII dot, dash, and colon are used for internal + name structure. These are not reserved characters, in that they + can occur in the name in locations other than those used for + structuring purposes (e.g., only the first occurrence of a colon + character is structural, others are not). + + How are users exposed to these strings? How are they published? + iSCSI names appear in server and storage system configuration + interfaces. They also appear in system logs. + + Is the string / identifier used as input to other operations? + Effectively, no. The rarely used port and logical unit names + involve concatenation, which effectively extends a unique iSCSI + name for a target to uniquely identify something within that + target. + + How much tolerance for change from existing Stringprep approach? + Good tolerance; the community would prefer that + internationalization experts solve internationalization problems. + + How strong a desire for change (e.g., for Unicode agility)? Unicode + agility is desired, in principle, as long as nothing significant + breaks. + +B.2. SMTP/POP3/ManageSieve Stringprep Profiles: RFC 4954, RFC 5034, + RFC 5804 + + Description: Authorization identity (user identifier) exchanged + during SASL authentication: AUTH (SMTP/POP3) or AUTHENTICATE + (ManageSieve) command. + + + + +Blanchet & Sullivan Informational [Page 21] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + How It's Used: Used for proxy authorization, e.g., to [lawfully] + impersonate a particular user after a privileged authentication. + + Who Generates It: + - Typically generated by email system administrators using some + tools/conventions, sometimes from some backend database. + - In some setups, human users can register their own usernames + (e.g., webmail self-registration). 
+ + User Input Methods: + - typing or selecting from a list + - copy and paste + - voice input + - in configuration files or on the command line + + Enforcement: Rules enforced by server / add-on service (e.g., + gateway service) on registration of account. + + Comparison Method: "Type 1" (byte-for-byte) or "Type 2" (compare by + a common algorithm that everyone agrees on (e.g., normalize and + then compare the result byte-by-byte)). + + Case Folding, Sensitivity, Preservation: Most likely case-sensitive. + Exact requirements on case-sensitivity/case-preservation depend on + a specific implementation, e.g., an implementation might treat all + user identifiers as case-insensitive (or case-insensitive for + the US-ASCII subset only). + + Impact of Comparison: False positives: an unauthorized user is + allowed email service access (login). False negatives: an + authorized user is denied email service access. + + Normalization: NFKC (as per RFC 4013). + + Mapping: (see Section 2 of RFC 4013 for the full list) Non-ASCII + spaces are mapped to space, etc. + + Disallowed Characters: (see Section 2 of RFC 4013 for the full list) + Unicode Control characters, etc. + + String Classes: Simple username. See Section 2 of RFC 4013 for + details on restrictions. Note that some implementations allow + spaces in these. While implementations are not required to use a + specific format, an authorization identity frequently has the same + format as an email address (and as an Email Address Internationalization + (EAI) email address in the future), or as the left-hand side of an + email address. Note: whatever is recommended for SMTP/POP3/ + + + + +Blanchet & Sullivan Informational [Page 22] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + ManageSieve authorization identities should also be used for IMAP + authorization identities, as IMAP/POP3/SMTP/ManageSieve are + frequently implemented together. + + Internal Structure: None + + User Output: Unlikely, but possible.
For example, if it is the same + as an email address. + + Operations: Sometimes concatenated with other data and then used as + input to a cryptographic hash function. + + How much tolerance for change from existing Stringprep approach? Not + sure. + + Background Information: + In RFC 5034, when describing the POP3 AUTH command: + + The authorization identity generated by the SASL exchange is a + simple username, and SHOULD use the SASLprep profile (see + [RFC4013]) of the StringPrep algorithm (see [RFC3454]) to + prepare these names for matching. If preparation of the + authorization identity fails or results in an empty string + (unless it was transmitted as the empty string), the server + MUST fail the authentication. + + In RFC 4954, when describing the SMTP AUTH command: + + The authorization identity generated by this [SASL] exchange is + a "simple username" (in the sense defined in [SASLprep]), and + both client and server SHOULD (*) use the [SASLprep] profile of + the [StringPrep] algorithm to prepare these names for + transmission or comparison. If preparation of the + authorization identity fails or results in an empty string + (unless it was transmitted as the empty string), the server + MUST fail the authentication. + + (*) Note: Future revision of this specification may change this + requirement to MUST. Currently, the SHOULD is used in order to + avoid breaking the majority of existing implementations. + + + + + + + + + + + +Blanchet & Sullivan Informational [Page 23] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + In RFC 5804, when describing the ManageSieve AUTHENTICATE command: + + The authorization identity generated by this [SASL] exchange is + a "simple username" (in the sense defined in [SASLprep]), and + both client and server MUST use the [SASLprep] profile of the + [StringPrep] algorithm to prepare these names for transmission + or comparison. 
If preparation of the authorization identity + fails or results in an empty string (unless it was transmitted + as the empty string), the server MUST fail the authentication. + +B.3. IMAP Stringprep Profiles for Usernames: RFC 4314, RFC 5738 + + Evaluation Note: These documents have two types of strings (usernames + and passwords), so there are two separate templates. + + Description: "username" parameter to the IMAP LOGIN command, and + identifiers in IMAP Access Control List (ACL) commands. Note that + any valid username is also an IMAP ACL identifier, but IMAP ACL + identifiers can include other things like the name of a group of + users. + + How It's Used: Used for authentication (Usernames), or in IMAP + Access Control Lists (Usernames or Group names). + + Who Generates It: + - Typically generated by email system administrators using some + tools/conventions, sometimes from some backend database. + - In some setups, human users can register their own usernames (e.g., + webmail self-registration). + + User Input Methods: + - typing or selecting from a list + - copy and paste + - voice input + - in configuration files or on the command line + + Enforcement: Rules enforced by server / add-on service (e.g., + gateway service) on registration of account. + + Comparison Method: "Type 1" (byte-for-byte) or "Type 2" (compare by + a common algorithm that everyone agrees on (e.g., normalize and + then compare the result byte-by-byte)). + + Case Folding, Sensitivity, Preservation: Most likely case-sensitive. + Exact requirements on case-sensitivity/case-preservation depend on + a specific implementation, e.g., an implementation might treat all + user identifiers as case-insensitive (or case-insensitive for + the US-ASCII subset only).
+ + + +Blanchet & Sullivan Informational [Page 24] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + Impact of Comparison: False positives: an unauthorized user is + allowed IMAP access (login), privileges improperly granted (e.g., + access to a specific mailbox, ability to manage ACLs for a + mailbox). False negatives: an authorized user is denied IMAP + access, unable to use granted privileges (e.g., access to a + specific mailbox, ability to manage ACLs for a mailbox). + + Normalization: NFKC (as per RFC 4013) + + Mapping: (see Section 2 of RFC 4013 for the full list) Non-ASCII + spaces are mapped to space. + + Disallowed Characters: (see Section 2 of RFC 4013 for the full list) + Unicode Control characters, etc. + + String Classes: Simple username. See Section 2 of RFC 4013 for + details on restrictions. Note that some implementations allow + spaces in these. While IMAP implementations are not required to + use a specific format, an IMAP username frequently has the same + format as an email address (and EAI email address in the future), + or as a left hand side of an email address. Note: whatever is + recommended for the IMAP username should also be used for + ManageSieve, POP3 and SMTP authorization identities, as IMAP/POP3/ + SMTP/ManageSieve are frequently implemented together. + + Internal Structure: None. + + User Output: Unlikely, but possible. For example, if it is the same + as an email address, access control lists (e.g. in IMAP ACL + extension), both when managing membership and listing membership + of existing access control lists. Often shows up as mailbox names + (under Other Users IMAP namespace). + + Operations: Sometimes concatenated with other data and then used as + input to a cryptographic hash function. + + How much tolerance for change from existing Stringprep approach? Not + sure. Non-ASCII IMAP usernames are currently prohibited by IMAP + (RFC 3501). However, they are allowed when used in IMAP ACL + extension. 
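   The "Type 2" comparison described in B.3 (normalize, then compare
   the result byte-by-byte) can be illustrated with a minimal Python
   sketch. It is not a complete SASLprep implementation: it assumes
   that only the mapping, NFKC normalization, and prohibited-output
   steps of RFC 4013 are needed, and it omits the bidirectional and
   unassigned-code-point checks. The function names prepare_username
   and usernames_match are hypothetical and chosen for illustration
   only.

```python
import stringprep
import unicodedata

# Prohibited-output tables listed in RFC 4013, Section 2.3.
_PROHIBITED = (
    stringprep.in_table_c12,      # non-ASCII space characters
    stringprep.in_table_c21_c22,  # ASCII and non-ASCII control characters
    stringprep.in_table_c3,       # private use
    stringprep.in_table_c4,       # non-character code points
    stringprep.in_table_c5,       # surrogate codes
    stringprep.in_table_c6,       # inappropriate for plain text
    stringprep.in_table_c7,       # inappropriate for canonical representation
    stringprep.in_table_c8,       # change display properties / deprecated
    stringprep.in_table_c9,       # tagging characters
)


def prepare_username(raw: str) -> str:
    """Hypothetical helper: a partial SASLprep-style preparation."""
    mapped = []
    for ch in raw:
        if stringprep.in_table_c12(ch):
            mapped.append(" ")        # non-ASCII spaces are mapped to space
        elif not stringprep.in_table_b1(ch):
            mapped.append(ch)         # B.1 characters are mapped to nothing
    # Stringprep is defined against Unicode 3.2, so use that database.
    normalized = unicodedata.ucd_3_2_0.normalize("NFKC", "".join(mapped))
    for ch in normalized:
        if any(check(ch) for check in _PROHIBITED):
            raise ValueError(f"prohibited code point U+{ord(ch):04X}")
    return normalized


def usernames_match(a: str, b: str) -> bool:
    """Type 2 comparison: prepare both strings, then compare the bytes."""
    return prepare_username(a).encode("utf-8") == prepare_username(b).encode("utf-8")
```

   Under these assumptions, two usernames that differ only by a
   non-ASCII space or a compatibility character compare as equal,
   while usernames that differ in case do not, because this
   preparation performs no case folding.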
+ + + + + + + + + + + +Blanchet & Sullivan Informational [Page 25] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + +B.4. IMAP Stringprep Profiles for Passwords: RFC 5738 + + Description: "Password" parameter to the IMAP LOGIN command. + + How It's Used: Used for authentication (Passwords). + + Who Generates It: Either generated by email system administrators + using some tools/conventions, or specified by the human user. + + User Input Methods: + - typing or selecting from a list + - copy and paste + - voice input + - in configuration files or on the command line + + Enforcement: Rules enforced by server / add-on service (e.g., + gateway service or backend database) on registration of account. + + Comparison Method: "Type 1" (byte-for-byte). + + Case Folding, Sensitivity, Preservation: Most likely case-sensitive. + + Impact of Comparison: False positives: an unauthorized user is + allowed IMAP access (login). False negatives: an authorized user + is denied IMAP access. + + Normalization: NFKC (as per RFC 4013). + + Mapping: (see Section 2 of RFC 4013 for the full list) Non-ASCII + spaces are mapped to space. + + Disallowed Characters: (see Section 2 of RFC 4013 for the full list) + Unicode Control characters, etc. + + String Classes: Currently defined as "simple username" (see Section + 2 of RFC 4013 for details on restrictions); however, this is + likely to be a different class from usernames. Note that some + implementations allow spaces in these. Passwords in all + email-related protocols should be treated in the same way. The same + passwords are frequently shared with web, IM, and other + applications. + + Internal Structure: None. + + User Output: Text of email messages (e.g., in "you forgot your + password" email messages), web page / directory, side of the bus / + in ads -- possible.
+ + + + +Blanchet & Sullivan Informational [Page 26] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + Operations: Sometimes concatenated with other data and then used as + input to a cryptographic hash function. Frequently stored as is, + or hashed. + + How much tolerance for change from existing Stringprep approach? Not + sure. Non-ASCII IMAP passwords are currently prohibited by IMAP + (RFC 3501); however, they are likely to be in widespread use. + + Background Information: + RFC 5738, Section 5 ("UTF8=USER Capability"): + + If the "UTF8=USER" capability is advertised, that indicates the + server accepts UTF-8 user names and passwords and applies + SASLprep [RFC4013] to both arguments of the LOGIN command. The + server MUST reject UTF-8 that fails to comply with the formal + syntax in RFC 3629 [RFC3629] or if it encounters Unicode + characters listed in Section 2.3 of SASLprep RFC 4013 + [RFC4013]. + + RFC 4314, Section 3 ("Access control management commands and + responses"): + + Servers, when processing a command that has an identifier as a + parameter (i.e., any of SETACL, DELETEACL, and LISTRIGHTS + commands), SHOULD first prepare the received identifier using + "SASLprep" profile [SASLprep] of the "stringprep" algorithm + [Stringprep]. If the preparation of the identifier fails or + results in an empty string, the server MUST refuse to perform + the command with a BAD response. Note that Section 6 + recommends additional identifier's verification steps. + + RFC 4314, Section 6 ("Security Considerations"): + + This document relies on [SASLprep] to describe steps required + to perform identifier canonicalization (preparation). The + preparation algorithm in SASLprep was specifically designed + such that its output is canonical, and it is well-formed. + However, due to an anomaly [PR29] in the specification of + Unicode normalization, canonical equivalence is not guaranteed + for a select few character sequences. 
Identifiers prepared + with SASLprep can be stored and returned by an ACL server. The + anomaly affects ACL manipulation and evaluation of identifiers + containing the selected character sequences. These sequences, + however, do not appear in well-formed text. In order to + address this problem, an ACL server MAY reject identifiers + containing sequences described in [PR29] by sending the tagged + + + + + +Blanchet & Sullivan Informational [Page 27] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + BAD response. This is in addition to the requirement to reject + identifiers that fail SASLprep preparation as described in + Section 3. + +B.5. Anonymous SASL Stringprep Profiles: RFC 4505 + + Description: RFC 4505 defines a "trace" field: + + Comparison: this field is not intended for comparison (only used for + logging) + + Case folding; case-sensitivity, preserve case: No case folding/ + case-sensitive + + Do users input the strings directly? Yes. Possibly entered in + configuration UIs, or on a command line. Can also be stored in + configuration files. The value can also be automatically + generated by clients (e.g., a fixed string is used, or a user's + email address). + + How users input strings? Keyboard/voice, stylus (pick from a list). + Copy-paste - possibly. + + Normalization: None. + + Disallowed Characters: Control characters are disallowed. (See + Section 3 of RFC 4505). + + Which other strings or identifiers are these most similar to? + RFC 4505 says that the trace "should take one of two forms: an + Internet email address, or an opaque string that does not contain + the '@' (U+0040) character and that can be interpreted by the + system administrator of the client's domain". In practice, this + is a free-form text, so it belongs to a different class from + "email address" or "username". 
+ + Are these strings or identifiers sometimes the same as strings or + identifiers from other protocols (e.g., does an IM system + sometimes use the same credentials database for authentication as + an email system)? Yes: see above. However, there is no strong + need to keep them consistent in the future. + + How are users exposed to these strings, how are they published? + They are not normally exposed or published. + However, the value can be seen in server logs. + + + + + + + +Blanchet & Sullivan Informational [Page 28] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + Impacts of false positives and false negatives: + False positive: a user can be confused with another user. + False negative: a single user is treated as two distinct users. + But note that the trace field is not authenticated, so it can be + easily falsified. + + Tolerance of changes in the community: The community would be + flexible. + + Delimiters: No internal structure, but see comments above about + frequent use of email addresses. + + Background Information: + RFC 4505, Section 2 ("The Anonymous Mechanism"): + + The mechanism consists of a single message from the client to the + server. The client may include in this message trace information + in the form of a string of [UTF-8]-encoded [Unicode] characters + prepared in accordance with [StringPrep] and the "trace" + stringprep profile defined in Section 3 of this document. The + trace information, which has no semantical value, should take one + of two forms: an Internet email address, or an opaque string that + does not contain the '@' (U+0040) character and that can be + interpreted by the system administrator of the client's domain. + For privacy reasons, an Internet email address or other + information identifying the user should only be used with + permission from the user. + + RFC 4505, Section 3 ('The "trace" Profile of "Stringprep"'): + This section defines the "trace" profile of [StringPrep].
This + profile is designed for use with the SASL ANONYMOUS Mechanism. + Specifically, the client is to prepare the <message> production in + accordance with this profile. + + The character repertoire of this profile is Unicode 3.2 [Unicode]. + + No mapping is required by this profile. + + No Unicode normalization is required by this profile. + + The list of unassigned code points for this profile is that + provided in Appendix A of [StringPrep]. Unassigned code points + are not prohibited. + + + + + + + + +Blanchet & Sullivan Informational [Page 29] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + Characters from the following tables of [StringPrep] are + prohibited: + + - C.2.1 (ASCII control characters) + - C.2.2 (Non-ASCII control characters) + - C.3 (Private use characters) + - C.4 (Non-character code points) + - C.5 (Surrogate codes) + - C.6 (Inappropriate for plain text) + - C.8 (Change display properties are deprecated) + - C.9 (Tagging characters) + + No additional characters are prohibited. + + This profile requires bidirectional character checking per Section 6 + of [StringPrep]. + +B.6. XMPP Stringprep Profiles for Nodeprep: RFC 3920 + + Description: Localpart of JabberID ("JID"), as in: + localpart@domainpart/resourcepart + + How It's Used: + - Usernames (e.g., stpeter@jabber.org) + - Chatroom names (e.g., precis@jabber.ietf.org) + - Publish-subscribe nodes + - Bot names + + Who Generates It: + - Typically, end users via an XMPP client + - Sometimes created in an automated fashion + + User Input Methods: + - typing + - copy and paste + - voice input + - clicking a URI/IRI + + Enforcement: Rules enforced by server / add-on service (e.g., + chatroom service) on registration of account, creation of room, + etc. 
+ + Comparison Method: "Type 2" (common algorithm) + + Case Folding, Sensitivity, Preservation: + - Strings are always folded to lowercase + - Case is not preserved + + + + +Blanchet & Sullivan Informational [Page 30] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + Impact of Comparison: + False positives: + - unable to authenticate at server (or authenticate to wrong + account) + - add wrong person to buddy list + - join the wrong chatroom + - improperly grant privileges (e.g., chatroom admin) + - subscribe to wrong pubsub node + - interact with wrong bot + - allow communication with blocked entity + + False negatives: + - unable to authenticate + - unable to add someone to buddy list + - unable to join desired chatroom + - unable to use granted privileges (e.g., chatroom admin) + - unable to subscribe to desired pubsub node + - unable to interact with desired bot + - disallow communication with unblocked entity + + Normalization: NFKC + + Mapping: Spaces are mapped to nothing + + Disallowed Characters: ",&,',/,:,<,>,@ + + String Classes: + - Often similar to generic username + - Often similar to localpart of email address + - Sometimes same as localpart of email address + + Internal Structure: None + + User Output: + - vCard + - email signature + - web page / directory + - text of message (e.g., in a chatroom) + + Operations: Sometimes concatenated with other data and then used as + input to a cryptographic hash function + +B.7. 
XMPP Stringprep Profiles for Resourceprep: RFC 3920 + + Description: + - Resourcepart of JabberID ("JID"), as in: + localpart@domainpart/resourcepart + - Typically free-form text + + + +Blanchet & Sullivan Informational [Page 31] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + How It's Used: + - Device / session names (e.g., stpeter@jabber.org/Home) + - Nicknames (e.g., precis@jabber.ietf.org/StPeter) + + Who Generates It: + - Often human users via an XMPP client + - Often generated in an automated fashion by client or server + + User Input Methods: + - typing + - copy and paste + - voice input + - clicking a URI/IRI + + Enforcement: Rules enforced by server / add-on service (e.g., + chatroom service) on account login, joining a chatroom, etc. + + Comparison Method: "Type 2" (byte-for-byte) + + Case Folding, Sensitivity, Preservation: + - Strings are never folded + - Case is preserved + + Impact of Comparison: + False positives: + - interact with wrong device (e.g., for file transfer or voice + call) + - interact with wrong chatroom participant + - improperly grant privileges (e.g., chatroom moderator) + - allow communication with blocked entity + False negatives: + - unable to choose desired chatroom nickname + - unable to use granted privileges (e.g., chatroom moderator) + - disallow communication with unblocked entity + + Normalization: NFKC + + Mapping: Spaces are mapped to nothing + + Disallowed Characters: None + + String Classes: Basically a free-form identifier + + Internal Structure: None + + User Output: + - text of message (e.g., in a chatroom) + - device names often not exposed to human users + + + +Blanchet & Sullivan Informational [Page 32] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + Operations: Sometimes concatenated with other data and then used as + input to a cryptographic hash function + +B.8. 
EAP Stringprep Profiles: RFC 3748 + + Description: RFC 3748, Section 5, references Stringprep, but the WG + did not agree with the text (it was added by the IESG) and there are no + known implementations that use Stringprep. The main problem with + that text is that the use of strings is a per-method concept, not + a generic EAP concept, and so RFC 3748 itself does not really use + Stringprep, but individual EAP methods could. As such, the + answers to the template questions are mostly not applicable, but a + few answers are universal across methods. The list of IANA + registered EAP methods is at + <http://www.iana.org/assignments/eap-numbers/eap-numbers.xml>. + + Comparison Methods: n/a (per-method) + + Case Folding, Case-Sensitivity, Case Preservation: n/a (per-method) + + Impact of comparison: A false positive results in unauthorized + network access (and possibly theft of service if someone else is + billed). A false negative results in lack of authorized network + access (no connectivity). + + User input: n/a (per-method) + + Normalization: n/a (per-method) + + Mapping: n/a (per-method) + + Disallowed characters: n/a (per-method) + + String classes: Although some EAP methods may use a syntax similar + to other types of identifiers, EAP mandates that the actual values + must not be assumed to be identifiers usable with anything else. + + Internal structure: n/a (per-method) + + User output: Identifiers are never displayed to humans except perhaps as + they're typed by a human. + + Operations: n/a (per-method) + + + + + + + + +Blanchet & Sullivan Informational [Page 33] + +RFC 6885 Stringprep Revision Problem Statement March 2013 + + + Community considerations: There is no resistance to change for the + base EAP protocol (as noted, the WG didn't want the existing + text). However, actual use of Stringprep, if any, within specific + EAP methods may have resistance. It is currently unknown whether + any EAP methods use Stringprep.
+ +Authors' Addresses + + Marc Blanchet + Viagenie + 246 Aberdeen + Quebec, QC G1R 2E1 + Canada + + EMail: Marc.Blanchet@viagenie.ca + URI: http://viagenie.ca + + + Andrew Sullivan + Dyn, Inc. + 150 Dow St + Manchester, NH 03101 + U.S.A. + + EMail: asullivan@dyn.com + + + + + + + + + + + + + + + + + + + + + + + + + + +Blanchet & Sullivan Informational [Page 34] + |