1 files changed, 1907 insertions, 0 deletions
diff --git a/doc/rfc/rfc6885.txt b/doc/rfc/rfc6885.txt
new file mode 100644
index 0000000..46eb67f
--- /dev/null
+++ b/doc/rfc/rfc6885.txt
@@ -0,0 +1,1907 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF)                       M. Blanchet
+Request for Comments: 6885                                      Viagenie
+Category: Informational                                      A. Sullivan
+ISSN: 2070-1721                                                Dyn, Inc.
+                                                              March 2013
+
+
+               Stringprep Revision and Problem Statement
+for the Preparation and Comparison of Internationalized Strings (PRECIS)
+
+Abstract
+
+   If a protocol expects to compare two strings and is prepared only for
+   those strings to be ASCII, then using Unicode code points in those
+   strings requires they be prepared somehow.  Internationalizing Domain
+   Names in Applications (here called IDNA2003) defined and used
+   Stringprep and Nameprep.  Other protocols subsequently defined
+   Stringprep profiles.  A new approach different from Stringprep and
+   Nameprep is used for a revision of IDNA2003 (called IDNA2008).  Other
+   Stringprep profiles need to be similarly updated, or a replacement of
+   Stringprep needs to be designed.  This document outlines the issues
+   to be faced by those designing a Stringprep replacement.
+
+Status of This Memo
+
+   This document is not an Internet Standards Track specification; it is
+   published for informational purposes.
+
+   This document is a product of the Internet Engineering Task Force
+   (IETF).  It represents the consensus of the IETF community.  It has
+   received public review and has been approved for publication by the
+   Internet Engineering Steering Group (IESG).  Not all documents
+   approved by the IESG are a candidate for any level of Internet
+   Standard; see Section 2 of RFC 5741.
+
+   Information about the current status of this document, any errata,
+   and how to provide feedback on it may be obtained at
+   http://www.rfc-editor.org/info/rfc6885.
+
+
+
+
+
+
+
+
+
+
+
+
+
+Blanchet & Sullivan           Informational                     [Page 1]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+Copyright Notice
+
+   Copyright (c) 2013 IETF Trust and the persons identified as the
+   document authors.  All rights reserved.
+
+   This document is subject to BCP 78 and the IETF Trust's Legal
+   Provisions Relating to IETF Documents
+   (http://trustee.ietf.org/license-info) in effect on the date of
+   publication of this document.  Please review these documents
+   carefully, as they describe your rights and restrictions with respect
+   to this document.  Code Components extracted from this document must
+   include Simplified BSD License text as described in Section 4.e of
+   the Trust Legal Provisions and are provided without warranty as
+   described in the Simplified BSD License.
+
+Table of Contents
+
+   1.  Introduction
+   2.  Keywords
+   3.  Conventions
+   4.  Stringprep Profiles Limitations
+   5.  Major Topics for Consideration
+     5.1.  Comparison
+       5.1.1.  Types of Identifiers
+       5.1.2.  Effect of Comparison
+     5.2.  Dealing with Characters
+       5.2.1.  Case Folding, Case Sensitivity, and Case Preservation
+       5.2.2.  Stringprep and NFKC
+       5.2.3.  Character Mapping
+       5.2.4.  Prohibited Characters
+       5.2.5.  Internal Structure, Delimiters, and Special Characters
+       5.2.6.  Restrictions Because of Glyph Similarity
+     5.3.  Where the Data Comes from and Where It Goes
+       5.3.1.  User Input and the Source of Protocol Elements
+       5.3.2.  User Output
+       5.3.3.  Operations
+   6.  Considerations for Stringprep Replacement
+   7.  Security Considerations
+   8.  Acknowledgements
+   9.  Informative References
+   Appendix A.  Classification of Stringprep Profiles
+   Appendix B.  Evaluation of Stringprep Profiles
+     B.1.  iSCSI Stringprep Profile: RFC 3720, RFC 3721, RFC 3722
+     B.2.  SMTP/POP3/ManageSieve Stringprep Profiles: RFC 4954,
+           RFC 5034, RFC 5804
+     B.3.  IMAP Stringprep Profiles for Usernames: RFC 4314, RFC 5738
+     B.4.  IMAP Stringprep Profiles for Passwords: RFC 5738
+     B.5.  Anonymous SASL Stringprep Profiles: RFC 4505
+     B.6.  XMPP Stringprep Profiles for Nodeprep: RFC 3920
+     B.7.  XMPP Stringprep Profiles for Resourceprep: RFC 3920
+     B.8.  EAP Stringprep Profiles: RFC 3748
+
+
+1.  Introduction
+
+   Internationalizing Domain Names in Applications (here called
+   IDNA2003) [RFC3490] [RFC3491] [RFC3492] [RFC3454] describes a
+   mechanism for encoding the Unicode labels that make up
+   Internationalized Domain Names (IDNs) as standard DNS labels.  The
+   labels are prepared with Nameprep [RFC3491] and then encoded with
+   Punycode [RFC3492].  Nameprep is specific to IDNA2003, but it is a
+   profile of a general mechanism, Stringprep [RFC3454].  That general
+   mechanism is used by other protocols that have similar needs but
+   different constraints than IDNA2003.
+
+   Stringprep defines a framework within which protocols define their
+   Stringprep profiles.  Some known IETF specifications using Stringprep
+   are listed below:
+
+   o  The Nameprep profile [RFC3491] for use in Internationalized Domain
+      Names (IDNs);
+
+   o  The Inter-Asterisk eXchange (IAX) using Nameprep [RFC5456];
+
+   o  NFSv4 [RFC3530] and NFSv4.1 [RFC5661];
+
+   o  The Internet Small Computer System Interface (iSCSI) profile
+      [RFC3722] for use in iSCSI names;
+
+   o  The Extensible Authentication Protocol (EAP) [RFC3748];
+
+   o  The Nodeprep and Resourceprep profiles [RFC3920] (which was
+      obsoleted by [RFC6120]) for use in the Extensible Messaging and
+      Presence Protocol (XMPP), and the XMPP to Common Presence and
+      Instant Messaging (CPIM) mapping [RFC3922] (the latter of these
+      relies on the former);
+
+   o  The Internationalized Resource Identifier (IRI) and URI in XMPP
+      [RFC5122];
+
+   o  The Policy MIB profile [RFC4011] for use in the Simple Network
+      Management Protocol (SNMP);
+
+   o  Transport Layer Security (TLS) [RFC4279];
+
+   o  The Lightweight Directory Access Protocol (LDAP) profile [RFC4518]
+      for use with LDAP [RFC4511] and its authentication methods
+      [RFC4513];
+
+   o  PKIX subject identification using LDAPprep [RFC4683];
+
+   o  PKIX Certificate Revocation List (CRL) using LDAPprep [RFC5280];
+
+   o  The Simple Authentication and Security Layer (SASL) [RFC4422] and
+      SASLprep profile [RFC4013] for use in SASL;
+
+   o  Plain SASL using SASLprep [RFC4616];
+
+   o  SMTP Auth using SASLprep [RFC4954];
+
+   o  The Post Office Protocol (POP3) Auth using SASLprep [RFC5034];
+
+   o  TLS Secure Remote Password (SRP) using SASLprep [RFC5054];
+
+   o  SASL Salted Challenge Response Authentication Mechanism (SCRAM)
+      using SASLprep [RFC5802];
+
+   o  Remote management of Sieve using SASLprep [RFC5804];
+
+   o  The Network News Transfer Protocol (NNTP) using SASLprep
+      [RFC4643];
+
+   o  IMAP4 using SASLprep [RFC4314];
+
+   o  The trace profile [RFC4505] for use with the SASL ANONYMOUS
+      mechanism;
+
+   o  Internet Application Protocol Collation Registry [RFC4790];
+
+   o  The unicode-casemap Unicode Collation [RFC5051].
+
+   However, a review (see [78PRECIS]) of these protocol specifications
+   found that they are very similar and can be grouped into a small
+   number of classes.  Moreover, many reuse the same Stringprep profile,
+   such as the SASL one.
+
+   IDNA2003 was replaced because of some limitations described in
+   [RFC4690].  The new IDN specification, called IDNA2008 [RFC5890],
+   [RFC5891], [RFC5892], [RFC5893] was designed based on the
+   considerations found in [RFC5894].  One of the effects of IDNA2008 is
+   that Nameprep and Stringprep are not used at all.  Instead, an
+   algorithm based on Unicode properties of code points is defined.
+   That algorithm generates a stable and complete table of the supported
+   Unicode code points for each Unicode version.  This algorithm uses an
+   inclusion-based approach, instead of the exclusion-based approach of
+   Stringprep/Nameprep.  That is, IDNA2003 created an explicit list of
+   excluded or mapped-away characters; anything in Unicode 3.2 that was
+   not so listed could be assumed to be allowed under the protocol.
+   IDNA2008 begins instead from the assumption that code points are
+   disallowed and then relies on Unicode properties to derive whether a
+   given code point actually is allowed in the protocol.
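+
+   As an illustration only, the contrast between the two approaches
+   can be sketched in Python.  The EXCLUDED set and the
+   ALLOWED_CATEGORIES set below are hypothetical and heavily
+   simplified; they are not the IDNA2003 or IDNA2008 rules.  The
+   property lookup uses whatever Unicode version the Python runtime
+   ships (unicodedata.unidata_version).
+
+```python
+import unicodedata
+
+# Hypothetical exclusion list: everything not named here is allowed.
+EXCLUDED = {"\u0020", "\u0040", "\u002F"}  # SPACE, "@", "/"
+
+# Hypothetical inclusion rule: only these General Categories pass.
+ALLOWED_CATEGORIES = {"Ll", "Lo", "Lm", "Mn", "Mc", "Nd"}
+
+
+def allowed_by_exclusion(ch: str) -> bool:
+    """Exclusion-based: a code point is allowed unless it is listed."""
+    return ch not in EXCLUDED
+
+
+def allowed_by_inclusion(ch: str) -> bool:
+    """Inclusion-based: a code point is disallowed unless its Unicode
+    properties (here, only the General Category) admit it."""
+    return unicodedata.category(ch) in ALLOWED_CATEGORIES
+
+
+# U+2603 SNOWMAN is a symbol (General Category So).  The exclusion
+# list never mentions it, so it is allowed there; the property-based
+# rule rejects it without needing a per-character table entry.
+snowman = "\u2603"
+print(allowed_by_exclusion(snowman), allowed_by_inclusion(snowman))
+```
+
+   In this sketch, a code point added by a later Unicode version is
+   classified by the inclusion rule as soon as its properties are
+   known, whereas the exclusion list has to be revised by hand.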
+
+   This document lists the shortcomings and issues found by protocols
+   listed above that defined Stringprep profiles.  It also lists the
+   requirements for any potential replacement of Stringprep.
+
+2.  Keywords
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+   document are to be interpreted as described in [RFC2119].
+
+   This document uses various internationalization terms, which are
+   defined and discussed in [RFC6365].
+
+   Additionally, this document defines the following keyword:
+
+      PRECIS: Preparation and Comparison of Internationalized Strings
+
+3.  Conventions
+
+   A single Unicode code point in this memo is denoted by "U+" followed
+   by four to six hexadecimal digits, as used in [Unicode61],
+   Appendix A.
+
+4.  Stringprep Profiles Limitations
+
+   During IETF 77 (March 2010), a BOF discussed the current state of the
+   protocols that have defined Stringprep profiles [NEWPREP].  The main
+   conclusions from that discussion were as follows:
+
+   o  Stringprep is bound to Version 3.2 of Unicode.  Stringprep has not
+      been updated to new versions of Unicode.  Therefore, the protocols
+      using Stringprep are stuck at Unicode 3.2, and their
+      specifications need to be updated to support new versions of
+      Unicode.
+
+   o  The protocols would like to not be bound to a specific version of
+      Unicode, but rather have better Unicode version agility, in the
+      manner of IDNA2008.  This is important partly because it is
+      usually impossible for an application to require Unicode 3.2; the
+      application gets whatever version of Unicode is available on the
+      host.
+
+   o  The protocols require better bidirectional support (bidi) than
+      currently offered by Stringprep.
+
+   o  If the protocols are updated to use a new version of Stringprep or
+      another framework, then backward compatibility is an important
+      requirement.  For example, Stringprep normalization is based on
+      Unicode Normalization Form KC (NFKC) [UAX15], which profiles may
+      use, while IDNA2008 mostly uses Unicode Normalization Form C (NFC)
+      [UAX15]; the two forms can yield different strings from the same
+      input.
+
+   o  Identifiers are passed between protocols.  For example, the same
+      username string of code points may be passed between SASL, XMPP,
+      LDAP, and EAP.  Therefore, a common set of rules or classes of
+      strings is preferred over specific rules for each protocol.
+      Without real planning in advance, many Stringprep profiles reuse
+      other profiles, so this goal was accomplished by accident with
+      Stringprep.
+
+   Protocols that use Stringprep profiles use strings for different
+   purposes:
+
+   o  XMPP uses a different Stringprep profile for each part of the XMPP
+      address Jabber Identifier (JID): a localpart, which is similar to
+      a username and used for authentication; a domainpart, which is a
+      domain name; and a resourcepart, which is less restrictive than
+      the localpart.
+
+   o  iSCSI uses a Stringprep profile for the names of protocol
+      participants (called initiators and targets).  The iSCSI Qualified
+      Name (IQN) format of iSCSI names contains a reversed DNS domain
+      name.
+
+   o  SASL and LDAP use a Stringprep profile for usernames.
+
+   o  LDAP uses a set of Stringprep profiles.
+
+   The apparent judgement of the BOF attendees [NEWPREP] was that it
+   would be highly desirable to have a replacement of Stringprep, with
+   similar characteristics to IDNA2008.  That replacement should be
+   defined so that the protocols could use internationalized strings
+   without a lot of specialized internationalization work, since
+   internationalization expertise is not available in the respective
+   protocols or working groups.  Accordingly, the IESG formed the PRECIS
+   working group to undertake the task.
+
+   Notwithstanding the desire evident in [NEWPREP] and the chartering of
+   a working group, IDNA2008 may be a poor model for what other
+   protocols ought to do, because it is designed to support an old
+   protocol that is designed to operate on the scale of the entire
+   Internet.  Moreover, IDNA2008 is intended to be deployed without any
+   change to the base DNS protocol.  Other protocols may aim at
+   deployment in more local environments, or may have protocol version
+   negotiation built in.
+
+5.  Major Topics for Consideration
+
+   This section provides an overview of major topics that a Stringprep
+   replacement needs to address.  The headings correspond roughly with
+   categories under which known Stringprep-using protocol RFCs have been
+   evaluated.  For the details of those evaluations, see Appendix B.
+
+5.1.  Comparison
+
+5.1.1.  Types of Identifiers
+
+   Following [ID-COMP], it is possible to organize identifiers into
+   three classes in respect of how they may be compared with one
+   another:
+
+   Absolute Identifiers:  Identifiers that can be compared byte-by-byte
+      for equality.
+
+   Definite Identifiers:  Identifiers that have a well-defined
+      comparison algorithm on which all parties agree.
+
+   Indefinite Identifiers:  Identifiers that have no single comparison
+      algorithm on which all parties agree.
+
+   Definite Identifiers include cases like the comparison of Unicode
+   code points in different encodings: they do not match byte for byte
+   but can all be converted to a single encoding which then does match
+   byte for byte.  Indefinite Identifiers are sometimes algorithmically
+   comparable by well-specified subsets of parties.  For more discussion
+   of these categories, see [ID-COMP].
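+
+   A minimal Python sketch of the Definite Identifier case described
+   above follows.  The function name definite_equal and the sample
+   string are assumptions made for illustration; they do not come
+   from [ID-COMP].
+
+```python
+def definite_equal(left: bytes, left_encoding: str,
+                   right: bytes, right_encoding: str) -> bool:
+    """Compare two encoded identifiers by converting both to a single
+    form (here, a Python str of code points) before testing equality."""
+    return left.decode(left_encoding) == right.decode(right_encoding)
+
+
+name = "caf\u00e9"  # ends with U+00E9 LATIN SMALL LETTER E WITH ACUTE
+as_utf8 = name.encode("utf-8")
+as_utf16 = name.encode("utf-16-be")
+
+# The same code points in two encodings do not match byte for byte...
+bytes_match = as_utf8 == as_utf16
+# ...but they do match once both are converted to a common form.
+converted_match = definite_equal(as_utf8, "utf-8",
+                                 as_utf16, "utf-16-be")
+print(bytes_match, converted_match)
+```
+
+   The sketch covers only differences of encoding.  It performs no
+   normalization or case folding, which a real comparison algorithm
+   might also have to specify.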
+
+   The section on treating the existing known cases, Appendix A, uses
+   the categories above.
+
+5.1.2.  Effect of Comparison
+
+   The three classes of comparison style outlined in Section 5.1.1 may
+   have different effects when applied.  It is necessary to evaluate the
+   effects if a comparison results in a false positive or a false
+   negative, especially in terms of the consequences to security and
+   usability.
+
+5.2.  Dealing with Characters
+
+   This section outlines a range of issues having to do with characters
+   in the target protocols, the ways in which IDNA2008 might be a good
+   analogy to other protocols, and ways in which it might be a poor one.
+
+5.2.1.  Case Folding, Case Sensitivity, and Case Preservation
+
+   In IDNA2003, labels are always mapped to lowercase before the
+   Punycode transformation.  In IDNA2008, there is no mapping at all:
+   input is either a valid U-label or it is not.  At the same time,
+   uppercase characters are by definition not valid U-labels, because
+   they fall into the Unstable category (category B) of [RFC5892].
+
+   If there are protocols that require case be preserved, then the
+   analogy with IDNA2008 will break down.  Accordingly, existing
+   protocols are to be evaluated according to the following criteria:
+
+   1.  Does the protocol use case folding?  For all blocks of code
+       points or just for certain subsets?
+
+   2.  Is the system or protocol case-sensitive?
+
+   3.  Does the system or protocol preserve case?
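+
+   A system can compare without regard to case and still preserve
+   case, as the following Python sketch suggests.  It assumes that
+   Python's str.casefold(), which applies the case folding of the
+   Unicode version bundled with the interpreter, is an acceptable
+   stand-in for a protocol's folding rule; it is not the Stringprep
+   mapping of [RFC3454], Appendix B.2.  The names caseless_equal and
+   CasePreservingRegistry are hypothetical.
+
+```python
+def caseless_equal(left: str, right: str) -> bool:
+    """Compare two strings after Unicode case folding."""
+    return left.casefold() == right.casefold()
+
+
+class CasePreservingRegistry:
+    """Case-insensitive lookup that keeps the spelling first stored."""
+
+    def __init__(self) -> None:
+        self._entries = {}
+
+    def add(self, name: str) -> None:
+        # Key on the folded form; keep the original spelling as value.
+        self._entries.setdefault(name.casefold(), name)
+
+    def lookup(self, name: str):
+        # Returns the stored spelling, or None if there is no match.
+        return self._entries.get(name.casefold())
+
+
+registry = CasePreservingRegistry()
+registry.add("Alice")
+print(registry.lookup("ALICE"))
+
+# Folding is not the same as lowercasing: U+00DF folds to "ss".
+print(caseless_equal("STRASSE", "stra\u00dfe"))
+print("STRASSE".lower() == "stra\u00dfe".lower())
+```
+
+   The last two lines show why the first question asks about folding
+   specifically: a comparison based on lowercasing alone treats the
+   two spellings as different.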
+
+5.2.2.  Stringprep and NFKC
+
+   Stringprep profiles may use normalization.  If they do, they use NFKC
+   [UAX15] (most profiles do).  It is not clear that NFKC is the right
+   normalization to use in all cases.  In [UAX15], there is the
+   following observation regarding Normalization Forms KC and KD: "It is
+   best to think of these Normalization Forms as being like uppercase or
+   lowercase mappings: useful in certain contexts for identifying core
+   meanings, but also performing modifications to the text that may not
+   always be appropriate."  In general, it can be said that NFKC is more
+   aggressive about finding matches between code points than NFC.  For
+   things like the spelling of users' names, NFKC may not be the best
+   form to use.  At the same time, one of the nice things about NFKC is
+   that it deals with the width of characters that are otherwise
+   similar, by mapping half-width and full-width variants to a single
+   form.  This mapping step can be crucial in practice.  A replacement
+   for Stringprep depends on analyzing the different use profiles and
+   considering whether NFKC or NFC is a better normalization for each
+   profile.
+
+   For the purposes of evaluating an existing example of Stringprep use,
+   it is helpful to know whether it uses no normalization, NFKC, or NFC.
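+
+   The difference between the two forms can be observed with the
+   unicodedata module of the Python standard library.  This is a
+   minimal sketch; the helper name normalize_forms is hypothetical,
+   and the results reflect the Unicode version bundled with the
+   Python runtime rather than Unicode 3.2.
+
+```python
+import unicodedata
+
+
+def normalize_forms(text: str) -> dict:
+    """Return the NFC and NFKC forms of text, keyed by form name."""
+    return {form: unicodedata.normalize(form, text)
+            for form in ("NFC", "NFKC")}
+
+
+# U+FF21 FULLWIDTH LATIN CAPITAL LETTER A (compatibility character).
+fullwidth_a = normalize_forms("\uff21")
+# U+FF76 HALFWIDTH KATAKANA LETTER KA (compatibility character).
+halfwidth_ka = normalize_forms("\uff76")
+# "e" + U+0301 COMBINING ACUTE ACCENT (canonical equivalent of U+00E9).
+decomposed_e = normalize_forms("e\u0301")
+
+for label, forms in (("fullwidth A", fullwidth_a),
+                     ("halfwidth KA", halfwidth_ka),
+                     ("e + acute", decomposed_e)):
+    # Show each result as a list of code points in hexadecimal.
+    print(label, {name: [hex(ord(c)) for c in value]
+                  for name, value in forms.items()})
+```
+
+   Only NFKC changes the two compatibility characters: it maps the
+   fullwidth Latin letter to U+0041 and the halfwidth katakana to
+   U+30AB.  Both forms compose the accented letter into U+00E9.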
+
+5.2.3.  Character Mapping
+
+   Along with the case mapping issues raised in Section 5.2.1, there is
+   the question of whether some characters are mapped either to other
+   characters or to nothing during Stringprep.  [RFC3454], Section 3,
+   outlines a number of characters that are mapped to nothing, and also
+   permits Stringprep profiles to define their own mappings.
+
+5.2.4.  Prohibited Characters
+
+   Along with case folding and other character mappings, many protocols
+   have characters that are simply disallowed.  For example, control
+   characters and special characters such as "@" or "/" may be
+   prohibited in a protocol.
+
+   One of the primary changes of IDNA2008 is in the way it approaches
+   Unicode code points, using the new inclusion-based approach (see
+   Section 1).
+
+   Because of the default assumption in IDNA2008 that a code point is
+   not allowed by the protocol, it has more than one class of "allowed
+   by the protocol"; this is unlike IDNA2003.  While some code points
+   are disallowed outright, some are allowed only in certain contexts.
+   The reasons for the context-dependent rules have to do with the way
+   some characters are used.  For instance, the ZERO WIDTH JOINER and
+   ZERO WIDTH NON-JOINER (ZWJ, U+200D and ZWNJ, U+200C) are allowed with
+   contextual rules because they are required in some circumstances, yet
+   are format characters (General Category Cf) in Unicode and would
+   therefore be DISALLOWED under the usual IDNA2008 derivation rules.
+   The goal of IDNA2008 is to provide the widest repertoire of code
+   points that is consistent with the traditional DNS "LDH" (letters,
+   digits, hyphen) rule (see [RFC0952]), trusting the operators of
+   individual zones to make sensible (and usually more restrictive)
+   policies for their zones.
+
+5.2.5.  Internal Structure, Delimiters, and Special Characters
+
+   IDNA2008 has a special problem with delimiters, because the delimiter
+   "character" in the DNS wire format is not really part of the data.
+   In DNS, labels are not separated exactly; instead, a label carries
+   with it an indicator that says how long the label is.  When the label
+   is displayed in presentation format as part of a fully qualified
+   domain name, the label separator FULL STOP, U+002E (.) is used to
+   break up the labels.  But because that label separator does not
+   travel with the wire format of the domain name, there is no way to
+   encode a different, "internationalized" separator in IDNA2008.
+
+   Other protocols may include characters with similar special meaning
+   within the protocol.  Common characters for these purposes include
+   FULL STOP, U+002E (.); COMMERCIAL AT, U+0040 (@); HYPHEN-MINUS,
+   U+002D (-); SOLIDUS, U+002F (/); and LOW LINE, U+005F (_).  The mere
+   inclusion of such a character in the protocol is not enough for it to
+   be considered similar to another protocol using the same character;
+   instead, handling of the character must be taken into consideration
+   as well.
+
+   An important issue to tackle here is whether it is valuable to map to
+   or from these special characters as part of the Stringprep
+   replacement.  In some locales, the analogue to FULL STOP, U+002E is
+   some other character, and users may expect to be able to substitute
+   their normal stop for FULL STOP, U+002E.  At the same time, there are
+   predictability arguments in favor of treating identifiers with FULL
+   STOP, U+002E in them just the way they are treated under IDNA2008.
+
+5.2.6.  Restrictions Because of Glyph Similarity
+
+   Homoglyphs are similarly (or identically) rendered glyphs of
+   different code points.  For DNS names, homoglyphs may enable
+   phishing.  If a protocol requires some visual comparison by end-
+   users, then the issue of homoglyphs is to be considered.  In the DNS
+   context, these issues are documented in [RFC5894] and [RFC4690].
+   However, IDNA2008 does not have a mechanism to deal with them,
+   trusting DNS zone operators to enact sensible policies for the subset
+   of Unicode they wish to support, given their user community.  A
+   similar policy/protocol split may not be desirable in every protocol.
+
+5.3.  Where the Data Comes from and Where It Goes
+
+5.3.1.  User Input and the Source of Protocol Elements
+
+   Some protocol elements are provided by users, and others are not.
+   Those that are not may presumably be subject to greater restrictions,
+   whereas those that users provide likely need to permit the broadest
+   range of code points.  The following questions are helpful:
+
+   1.  Do users input the strings directly?
+
+   2.  If so, how? (keyboard, stylus, voice, copy-paste, etc.)
+
+   3.  Where do we place the dividing line between user interface and
+       protocol? (see [RFC5895])
+
+5.3.2.  User Output
+
+   Just as only some protocol elements are expected to be entered
+   directly by users, only some protocol elements are intended to be
+   consumed directly by users.  It is important to know how users are
+   expected to be able to consume the protocol elements, because
+   different environments present different challenges.  An element that
+   is only ever delivered as part of a vCard remains in machine-readable
+   format, so the problem of visual confusion is not a great one.  Is
+   the protocol element published as part of a vCard, in a web
+   directory, on a business card, or on "the side of a bus"?  Do users
+   use the protocol element as an identifier (which means that they
+   might enter it again in some other context)?  (See also
+   Section 5.2.6.)
+
+5.3.3.  Operations
+
+   Some strings are useful as part of the protocol but are not used as
+   input to other operations (for instance, purely informative or
+   descriptive text).  Other strings are used directly as input to other
+   operations (such as cryptographic hash functions) or are combined
+   with other strings (such as by concatenating a string with some
+   others to form a unique identifier).
+
+5.3.3.1.  String Classes
+
+   Strings often have a similar function in different protocols.  For
+   instance, many different protocols contain user identifiers or
+   passwords.  A single profile for all such uses might be desirable.
+
+   Often, a string in a protocol is effectively a protocol element from
+   another protocol.  For instance, different systems might use the same
+   credentials database for authentication.
+
+5.3.3.2.  Community Considerations
+
+   A Stringprep replacement that does anything more than just update
+   Stringprep to the latest version of Unicode will probably entail some
+   changes.  It is important to identify the willingness of the
+   protocol-using community to accept backwards-incompatible changes.
+   By the same token, it is important to evaluate the desire of the
+   community for features not available under Stringprep.
+
+5.3.3.3.  Unicode Incompatible Changes
+
+   IDNA2008 uses an algorithm to derive the validity of a Unicode code
+   point for use under IDNA2008.  It does this by using the properties
+   of each code point to test its validity.
+
+   This approach depends crucially on the idea that code points, once
+   valid for a protocol profile, will not later be made invalid.  That
+   is not a guarantee currently provided by Unicode.  Properties of code
+   points may change between versions of Unicode.  Rarely, such a change
+   could cause a given code point to become invalid under a protocol
+   profile, even though the code point would be valid with an earlier
+   version of Unicode.  This is not merely a theoretical possibility,
+   because it has occurred [RFC6452].
+
+   Accordingly, as in IDNA2008, a Stringprep replacement that intends to
+   be Unicode version agnostic will need to work out a mechanism to
+   address cases where incompatible changes occur because of new Unicode
+   versions.
+
+6.  Considerations for Stringprep Replacement
+
+   The above suggests the following guidance:
+
+   o  A Stringprep replacement should be defined.
+
+   o  The replacement should take an approach similar to IDNA2008 (e.g.,
+      by using properties of code points instead of whitelisting of code
+      points), in that it enables better Unicode agility.
+
+   o  Protocols share similar characteristics of strings.  Therefore,
+      defining internationalization preparation algorithms for the
+      smallest set of string classes may be sufficient for most cases,
+      providing coherence among a set of related protocols or protocols
+      where identifiers are exchanged.
+
+   o  The sets of string classes need to be evaluated according to the
+      considerations that make up the headings in Section 5.
+
+   o  It is reasonable to limit scope to Unicode code points and rule
+      the mapping of data from other character encodings outside the
+      scope of this effort.
+
+   o  The replacement ought to at least provide guidance to applications
+      using the replacement on how to handle protocol incompatibilities
+      resulting from changes to Unicode.  In an ideal world, the
+      Stringprep replacement would handle the changes automatically, but
+      it appears that such automatic handling would require magic and
+      cannot be expected.
+
+   o  Compatibility within each protocol between a technique that is
+      Stringprep-based and the technique's replacement has to be
+      considered very carefully.
+
+   Existing deployments already depend on Stringprep profiles.
+   Therefore, a replacement must consider the effects of any new
+   strategy on existing deployments.  By way of comparison, it is worth
+   noting that some characters were acceptable in IDNA labels under
+   IDNA2003, but are not protocol-valid under IDNA2008 (and conversely);
+   disagreement about what to do during the transition has resulted in
+   different approaches to mapping.  Different implementers may make
+   different decisions about what to do in such cases; this could have
+   interoperability effects.  It is necessary to trade better support
+   for different linguistic environments against the potential side
+   effects of backward incompatibility.
+
+7.  Security Considerations
+
+   This document merely states what problems are to be solved and does
+   not define a protocol.  There are undoubtedly security implications
+   of the particular results that will come from the work to be
+   completed.  Moreover, the Security Considerations section of
+   Stringprep [RFC3454] applies.  See also the analysis in the
+   subsections of Appendix B, below.
+
+8.  Acknowledgements
+
+   This document is the product of the PRECIS IETF Working Group, and
+   participants in that working group were helpful in addressing issues
+   with the text.
+
+   Specific contributions came from David Black, Alan DeKok, Simon
+   Josefsson, Bill McQuillan, Alexey Melnikov, Peter Saint-Andre, Dave
+   Thaler, and Yoshiro Yoneya.
+
+   Dave Thaler provided the "buckets" insight in Section 5.1.1, central
+   to the organization of the problem.
+
+   Evaluations of Stringprep profiles that are included in Appendix B
+   were done by David Black, Alexey Melnikov, Peter Saint-Andre, and
+   Dave Thaler.
+
+9.  Informative References
+
+   [78PRECIS]   Blanchet, M., "PRECIS Framework", Proceedings of IETF
+                78, July 2010, <http://www.ietf.org/proceedings/78/
+                slides/precis-2.pdf>.
+
+   [ID-COMP]    Thaler, D., Ed., "Issues in Identifier Comparison for
+                Security Purposes", Work in Progress, March 2013.
+
+   [NEWPREP]    "Newprep BoF Meeting Minutes", March 2010,
+                <http://www.ietf.org/proceedings/77/minutes/
+                newprep.txt>.
+
+   [RFC0952]    Harrenstien, K., Stahl, M., and E. Feinler, "DoD
+                Internet host table specification", RFC 952,
+                October 1985.
+
+   [RFC2119]    Bradner, S., "Key words for use in RFCs to Indicate
+                Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+   [RFC3454]    Hoffman, P. and M. Blanchet, "Preparation of
+                Internationalized Strings ("stringprep")", RFC 3454,
+                December 2002.
+
+   [RFC3490]    Faltstrom, P., Hoffman, P., and A. Costello,
+                "Internationalizing Domain Names in Applications
+                (IDNA)", RFC 3490, March 2003.
+
+   [RFC3491]    Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
+                Profile for Internationalized Domain Names (IDN)",
+                RFC 3491, March 2003.
+
+   [RFC3492]    Costello, A., "Punycode: A Bootstring encoding of
+                Unicode for Internationalized Domain Names in
+                Applications (IDNA)", RFC 3492, March 2003.
+
+   [RFC3530]    Shepler, S., Callaghan, B., Robinson, D., Thurlow, R.,
+                Beame, C., Eisler, M., and D. Noveck, "Network File
+                System (NFS) version 4 Protocol", RFC 3530, April 2003.
+
+   [RFC3722]    Bakke, M., "String Profile for Internet Small Computer
+                Systems Interface (iSCSI) Names", RFC 3722, April 2004.
+
+   [RFC3748]    Aboba, B., Blunk, L., Vollbrecht, J., Carlson, J., and
+                H. Levkowetz, "Extensible Authentication Protocol
+                (EAP)", RFC 3748, June 2004.
+
+
+
+
+
+Blanchet & Sullivan           Informational                    [Page 15]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+   [RFC3920]    Saint-Andre, P., Ed., "Extensible Messaging and Presence
+                Protocol (XMPP): Core", RFC 3920, October 2004.
+
+   [RFC3922]    Saint-Andre, P., "Mapping the Extensible Messaging and
+                Presence Protocol (XMPP) to Common Presence and Instant
+                Messaging (CPIM)", RFC 3922, October 2004.
+
+   [RFC4011]    Waldbusser, S., Saperia, J., and T. Hongal, "Policy
+                Based Management MIB", RFC 4011, March 2005.
+
+   [RFC4013]    Zeilenga, K., "SASLprep: Stringprep Profile for User
+                Names and Passwords", RFC 4013, February 2005.
+
+   [RFC4279]    Eronen, P. and H. Tschofenig, "Pre-Shared Key
+                Ciphersuites for Transport Layer Security (TLS)",
+                RFC 4279, December 2005.
+
+   [RFC4314]    Melnikov, A., "IMAP4 Access Control List (ACL)
+                Extension", RFC 4314, December 2005.
+
+   [RFC4422]    Melnikov, A. and K. Zeilenga, "Simple Authentication and
+                Security Layer (SASL)", RFC 4422, June 2006.
+
+   [RFC4505]    Zeilenga, K., "Anonymous Simple Authentication and
+                Security Layer (SASL) Mechanism", RFC 4505, June 2006.
+
+   [RFC4511]    Sermersheim, J., "Lightweight Directory Access Protocol
+                (LDAP): The Protocol", RFC 4511, June 2006.
+
+   [RFC4513]    Harrison, R., "Lightweight Directory Access Protocol
+                (LDAP): Authentication Methods and Security Mechanisms",
+                RFC 4513, June 2006.
+
+   [RFC4518]    Zeilenga, K., "Lightweight Directory Access Protocol
+                (LDAP): Internationalized String Preparation", RFC 4518,
+                June 2006.
+
+   [RFC4616]    Zeilenga, K., "The PLAIN Simple Authentication and
+                Security Layer (SASL) Mechanism", RFC 4616, August 2006.
+
+   [RFC4643]    Vinocur, J. and K. Murchison, "Network News Transfer
+                Protocol (NNTP) Extension for Authentication", RFC 4643,
+                October 2006.
+
+   [RFC4683]    Park, J., Lee, J., Lee, H., Park, S., and T. Polk,
+                "Internet X.509 Public Key Infrastructure Subject
+                Identification Method (SIM)", RFC 4683, October 2006.
+
+
+
+
+Blanchet & Sullivan           Informational                    [Page 16]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+   [RFC4690]    Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review
+                and Recommendations for Internationalized Domain Names
+                (IDNs)", RFC 4690, September 2006.
+
+   [RFC4790]    Newman, C., Duerst, M., and A. Gulbrandsen, "Internet
+                Application Protocol Collation Registry", RFC 4790,
+                March 2007.
+
+   [RFC4954]    Siemborski, R. and A. Melnikov, "SMTP Service Extension
+                for Authentication", RFC 4954, July 2007.
+
+   [RFC5034]    Siemborski, R. and A. Menon-Sen, "The Post Office
+                Protocol (POP3) Simple Authentication and Security Layer
+                (SASL) Authentication Mechanism", RFC 5034, July 2007.
+
+   [RFC5051]    Crispin, M., "i;unicode-casemap - Simple Unicode
+                Collation Algorithm", RFC 5051, October 2007.
+
+   [RFC5054]    Taylor, D., Wu, T., Mavrogiannopoulos, N., and T.
+                Perrin, "Using the Secure Remote Password (SRP) Protocol
+                for TLS Authentication", RFC 5054, November 2007.
+
+   [RFC5122]    Saint-Andre, P., "Internationalized Resource Identifiers
+                (IRIs) and Uniform Resource Identifiers (URIs) for the
+                Extensible Messaging and Presence Protocol (XMPP)",
+                RFC 5122, February 2008.
+
+   [RFC5280]    Cooper, D., Santesson, S., Farrell, S., Boeyen, S.,
+                Housley, R., and W. Polk, "Internet X.509 Public Key
+                Infrastructure Certificate and Certificate Revocation
+                List (CRL) Profile", RFC 5280, May 2008.
+
+   [RFC5456]    Spencer, M., Capouch, B., Guy, E., Miller, F., and K.
+                Shumard, "IAX: Inter-Asterisk eXchange Version 2",
+                RFC 5456, February 2010.
+
+   [RFC5661]    Shepler, S., Eisler, M., and D. Noveck, "Network File
+                System (NFS) Version 4 Minor Version 1 Protocol",
+                RFC 5661, January 2010.
+
+   [RFC5802]    Newman, C., Menon-Sen, A., Melnikov, A., and N.
+                Williams, "Salted Challenge Response Authentication
+                Mechanism (SCRAM) SASL and GSS-API Mechanisms",
+                RFC 5802, July 2010.
+
+   [RFC5804]    Melnikov, A. and T. Martin, "A Protocol for Remotely
+                Managing Sieve Scripts", RFC 5804, July 2010.
+
+
+
+
+Blanchet & Sullivan           Informational                    [Page 17]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+   [RFC5890]    Klensin, J., "Internationalized Domain Names for
+                Applications (IDNA): Definitions and Document
+                Framework", RFC 5890, August 2010.
+
+   [RFC5891]    Klensin, J., "Internationalized Domain Names in
+                Applications (IDNA): Protocol", RFC 5891, August 2010.
+
+   [RFC5892]    Faltstrom, P., "The Unicode Code Points and
+                Internationalized Domain Names for Applications (IDNA)",
+                RFC 5892, August 2010.
+
+   [RFC5893]    Alvestrand, H. and C. Karp, "Right-to-Left Scripts for
+                Internationalized Domain Names for Applications (IDNA)",
+                RFC 5893, August 2010.
+
+   [RFC5894]    Klensin, J., "Internationalized Domain Names for
+                Applications (IDNA): Background, Explanation, and
+                Rationale", RFC 5894, August 2010.
+
+   [RFC5895]    Resnick, P. and P. Hoffman, "Mapping Characters for
+                Internationalized Domain Names in Applications (IDNA)
+                2008", RFC 5895, September 2010.
+
+   [RFC6120]    Saint-Andre, P., "Extensible Messaging and Presence
+                Protocol (XMPP): Core", RFC 6120, March 2011.
+
+   [RFC6365]    Hoffman, P. and J. Klensin, "Terminology Used in
+                Internationalization in the IETF", BCP 166, RFC 6365,
+                September 2011.
+
+   [RFC6452]    Faltstrom, P. and P. Hoffman, "The Unicode Code Points
+                and Internationalized Domain Names for Applications
+                (IDNA) - Unicode 6.0", RFC 6452, November 2011.
+
+   [UAX15]      "Unicode Standard Annex #15: Unicode Normalization
+                Forms", UAX 15, September 2009.
+
+   [Unicode61]  The Unicode Consortium.  The Unicode Standard, Version
+                6.1.0, (Mountain View, CA: The Unicode Consortium, 2012.
+                ISBN 978-1-936213-02-3).
+                <http://www.unicode.org/versions/Unicode6.1.0/>.
+
+
+
+
+
+
+
+
+
+
+Blanchet & Sullivan           Informational                    [Page 18]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+Appendix A.  Classification of Stringprep Profiles
+
+   A number of the known cases of Stringprep use were evaluated during
+   the preparation of this document.  The known cases are here described
+   in two ways.  The types of identifiers the protocol uses are first
+   called out in the ID type column (from Section 5.1.1) using the short
+   forms "a" for Absolute, "d" for Definite, and "i" for Indefinite.
+   Next, there is a column that contains an "i" if the protocol string
+   comes from user input, an "o" if the protocol string becomes user-
+   facing output, "b" if both are true, and "n" if neither is true.
+
+                         +------+--------+-------+
+                         |  RFC | IDtype | User? |
+                         +------+--------+-------+
+                         | 3722 |    a   |   b   |
+                         | 3748 |    -   |   -   |
+                         | 3920 |   a,d  |   b   |
+                         | 4505 |    a   |   i   |
+                         | 4314 |   a,d  |   b   |
+                         | 4954 |   a,d  |   b   |
+                         | 5034 |   a,d  |   b   |
+                         | 5804 |   a,d  |   b   |
+                         +------+--------+-------+
+
+                                  Table 1
+
+Appendix B.  Evaluation of Stringprep Profiles
+
+   This section is a summary of the evaluation of Stringprep profiles
+   that was done to get a good understanding of the usage of
+   Stringprep.  This summary is by no means normative, nor does it
+   reproduce the actual evaluations themselves.  A template was used
+   so that reviewers would produce a coherent view across all
+   evaluations.
+
+B.1.  iSCSI Stringprep Profile: RFC 3720, RFC 3721, RFC 3722
+
+   Description:  An iSCSI session consists of an initiator (i.e., host
+      or server that uses storage) communicating with a target (i.e., a
+      storage array or other system that provides storage).  Both the
+      iSCSI initiator and target are named by iSCSI names.  The iSCSI
+      Stringprep profile is used for iSCSI names.
+
+   How it is used:  iSCSI initiators and targets (see above).  They can
+      also be used to identify SCSI ports (these are software entities
+      in the iSCSI protocol, not hardware ports) and iSCSI logical units
+      (storage volumes), although both are unusual in practice.
+
+
+
+
+
+Blanchet & Sullivan           Informational                    [Page 19]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+   What entities create these identifiers?  Generally, a human user (1)
+      configures an automated system (2) that generates the names.
+      Advance configuration of the system is required due to the
+      embedded use of an external unique identifier (from the DNS or
+      IEEE).
+
+   How is the string input in the system?  Keyboard and copy-paste are
+      common.  Copy-paste is common because iSCSI names are long enough
+      to be problematic for humans to remember, causing use of email,
+      sneaker-net, text files, etc., to avoid typing mistakes.
+
+   Where do we place the dividing line between user interface and
+      protocol?  The iSCSI protocol requires that all
+      internationalization string preparation occur in the user
+      interface.  The iSCSI protocol treats iSCSI names as opaque
+      identifiers that are compared byte-by-byte for equality.  iSCSI
+      names are generally not checked for correct formatting by the
+      protocol.
+
+   What entities enforce the rules?  There are no iSCSI-specific
+      enforcement entities, although the use of unique identifier
+      information in the names relies on DNS registrars and the IEEE
+      Registration Authority.
+
+   Comparison:  Byte-by-byte.
+
+   Case Folding, Sensitivity, Preservation:  Case folding is required
+      for the code blocks specified in RFC 3454, Table B.2.  The overall
+      iSCSI naming system (UI + protocol) is case-insensitive.
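+
+      The split between user-interface preparation and protocol
+      comparison can be sketched as follows.  This is an illustration
+      under stated assumptions, not the iSCSI profile itself: it uses
+      Python's standard stringprep tables for the B.1 and B.2 mappings
+      and NFKC, omits the prohibited-character checks, and the function
+      names and the sample name are hypothetical.
+

```python
import stringprep
import unicodedata


def prepare_iscsi_name(name: str) -> str:
    """User-interface step: map, case-fold, and normalize a name."""
    mapped = "".join(
        stringprep.map_table_b2(ch)        # RFC 3454 Table B.2 case folding
        for ch in name
        if not stringprep.in_table_b1(ch)  # Table B.1: mapped to nothing
    )
    # Stringprep is defined on Unicode 3.2, hence the frozen database.
    return unicodedata.ucd_3_2_0.normalize("NFKC", mapped)


def protocol_equal(a: bytes, b: bytes) -> bool:
    """Protocol step: opaque byte-by-byte comparison, no further checks."""
    return a == b


typed = prepare_iscsi_name("iqn.2001-04.com.Example:Storage.Disk1").encode("utf-8")
stored = prepare_iscsi_name("iqn.2001-04.com.example:storage.disk1").encode("utf-8")
print(protocol_equal(typed, stored))
```

+
+      Without the preparation step, the same two spellings would be
+      different byte strings and the protocol would treat them as
+      different names.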
+
+   What is the impact if the comparison results in a false positive?
+      Potential access to the wrong storage.
+
+      -  If the initiator has no access to the wrong storage, an
+         authentication failure is the probable result.
+
+      -  If the initiator has access to the wrong storage, the resulting
+         misidentification could result in use of the wrong data and
+         possible corruption of stored data.
+
+   What is the impact if the comparison results in a false negative?
+      Denial of authorized storage access.
+
+   What are the security impacts?  iSCSI names may be used as the
+      authentication identities for storage systems.  Comparison
+      problems could result in authentication problems, although note
+      that authentication failure ameliorates some of the false positive
+      cases.
+
+
+
+
+Blanchet & Sullivan           Informational                    [Page 20]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+   Normalization:  NFKC, as specified by RFC 3454.
+
+   Mapping:  Yes, as specified by Table B.1 in RFC 3454.
+
+   Disallowed Characters:  Only the following characters are allowed:
+      -  ASCII dash, dot, colon
+      -  ASCII lowercase letters and digits
+      -  Unicode lowercase characters as specified by RFC 3454.
+      All other characters are disallowed.
+
+   Which other strings or identifiers are these most similar to?
+      None -- iSCSI names are unique to iSCSI.
+
+   Are these strings or identifiers sometimes the same as strings or
+      identifiers from other protocols?  No.
+
+   Does the identifier have internal structure that needs to be
+      respected?  Yes.  ASCII dot, dash, and colon are used for internal
+      name structure.  These are not reserved characters, in that they
+      can occur in the name in locations other than those used for
+      structuring purposes (e.g., only the first occurrence of a colon
+      character is structural, others are not).
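+
+      A short sketch of the "first colon only" rule described above.
+      The helper name and the sample name are hypothetical; the point
+      is only that later colons stay part of the right-hand portion.
+

```python
def split_iscsi_name(name: str) -> tuple[str, str]:
    """Split at the first colon only; later colons are ordinary characters."""
    prefix, _, suffix = name.partition(":")
    return prefix, suffix


print(split_iscsi_name("iqn.2001-04.com.example:storage:disk1"))
```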
+
+   How are users exposed to these strings?  How are they published?
+      iSCSI names appear in server and storage system configuration
+      interfaces.  They also appear in system logs.
+
+   Is the string / identifier used as input to other operations?
+      Effectively, no.  The rarely used port and logical unit names
+      involve concatenation, which effectively extends a unique iSCSI
+      name for a target to uniquely identify something within that
+      target.
+
+   How much tolerance for change from existing Stringprep approach?
+      Good tolerance; the community would prefer that
+      internationalization experts solve internationalization problems.
+
+   How strong a desire for change (e.g., for Unicode agility)?  Unicode
+      agility is desired, in principle, as long as nothing significant
+      breaks.
+
+B.2.  SMTP/POP3/ManageSieve Stringprep Profiles: RFC 4954, RFC 5034,
+      RFC 5804
+
+   Description:  Authorization identity (user identifier) exchanged
+      during SASL authentication: AUTH (SMTP/POP3) or AUTHENTICATE
+      (ManageSieve) command.
+
+
+
+
+Blanchet & Sullivan           Informational                    [Page 21]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+   How It's Used:  Used for proxy authorization, e.g., to [lawfully]
+      impersonate a particular user after a privileged authentication.
+
+   Who Generates It:
+      -  Typically generated by email system administrators using some
+         tools/conventions, sometimes from some backend database.
+      -  In some setups, human users can register their own usernames
+         (e.g., webmail self-registration).
+
+   User Input Methods:
+      -  typing or selecting from a list
+      -  copy and paste
+      -  voice input
+      -  in configuration files or on the command line
+
+   Enforcement:  Rules enforced by server / add-on service (e.g.,
+      gateway service) on registration of account.
+
+   Comparison Method:  "Type 1" (byte-for-byte) or "Type 2" (compare by
+      a common algorithm that everyone agrees on, e.g., normalize and
+      then compare the result byte-by-byte).
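+
+      As a minimal sketch of the two comparison types (an illustration
+      only; the function names are hypothetical, and NFKC is assumed
+      as the agreed algorithm because this profile normalizes with
+      NFKC):
+

```python
import unicodedata


def type1_equal(a: str, b: str) -> bool:
    """Type 1: compare the encoded strings byte-for-byte."""
    return a.encode("utf-8") == b.encode("utf-8")


def type2_equal(a: str, b: str) -> bool:
    """Type 2: apply the agreed algorithm first, then compare bytes."""
    na = unicodedata.normalize("NFKC", a)
    nb = unicodedata.normalize("NFKC", b)
    return na.encode("utf-8") == nb.encode("utf-8")


composed = "jos\u00e9"     # precomposed U+00E9
decomposed = "jose\u0301"  # "e" followed by U+0301 COMBINING ACUTE ACCENT

print(type1_equal(composed, decomposed))
print(type2_equal(composed, decomposed))
```

+
+      The two spellings differ under "Type 1" but match under
+      "Type 2", which is the kind of false negative that normalization
+      is meant to avoid.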
+
+   Case Folding, Sensitivity, Preservation:  Most likely case-sensitive.
+      Exact requirements on case-sensitivity/case-preservation depend on
+      a specific implementation, e.g., an implementation might treat all
+      user identifiers as case-insensitive (or case-insensitive for
+      US-ASCII subset only).
+
+   Impact of Comparison:  False positives: an unauthorized user is
+      allowed email service access (login).  False negatives: an
+      authorized user is denied email service access.
+
+   Normalization:  NFKC (as per RFC 4013).
+
+   Mapping:  (see Section 2 of RFC 4013 for the full list) Non-ASCII
+      spaces are mapped to space, etc.
+
+   Disallowed Characters:  (see Section 2 of RFC 4013 for the full list)
+      Unicode Control characters, etc.
+
+   String Classes:  Simple username.  See Section 2 of RFC 4013 for
+      details on restrictions.  Note that some implementations allow
+      spaces in these.  While implementations are not required to use a
+      specific format, an authorization identity frequently has the same
+      format as an email address (and Email Address Internationalization
+      (EAI) email address in the future), or as the left-hand side of
+      an email address.  Note: whatever is recommended for SMTP/POP/
+
+
+
+
+Blanchet & Sullivan           Informational                    [Page 22]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+      ManageSieve authorization identity should also be used for IMAP
+      authorization identities, as IMAP/POP3/SMTP/ManageSieve are
+      frequently implemented together.
+
+   Internal Structure:  None
+
+   User Output:  Unlikely, but possible.  For example, if it is the same
+      as an email address.
+
+   Operations:  Sometimes concatenated with other data and then used as
+      input to a cryptographic hash function.
+
+   How much tolerance for change from existing Stringprep approach?  Not
+      sure.
+
+   Background Information:
+      In RFC 5034, when describing the POP3 AUTH command:
+
+         The authorization identity generated by the SASL exchange is a
+         simple username, and SHOULD use the SASLprep profile (see
+         [RFC4013]) of the StringPrep algorithm (see [RFC3454]) to
+         prepare these names for matching.  If preparation of the
+         authorization identity fails or results in an empty string
+         (unless it was transmitted as the empty string), the server
+         MUST fail the authentication.
+
+      In RFC 4954, when describing the SMTP AUTH command:
+
+         The authorization identity generated by this [SASL] exchange is
+         a "simple username" (in the sense defined in [SASLprep]), and
+         both client and server SHOULD (*) use the [SASLprep] profile of
+         the [StringPrep] algorithm to prepare these names for
+         transmission or comparison.  If preparation of the
+         authorization identity fails or results in an empty string
+         (unless it was transmitted as the empty string), the server
+         MUST fail the authentication.
+
+         (*) Note: Future revision of this specification may change this
+         requirement to MUST.  Currently, the SHOULD is used in order to
+         avoid breaking the majority of existing implementations.
+
+
+
+
+
+
+
+
+
+
+
+Blanchet & Sullivan           Informational                    [Page 23]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+      In RFC 5804, when describing the ManageSieve AUTHENTICATE command:
+
+         The authorization identity generated by this [SASL] exchange is
+         a "simple username" (in the sense defined in [SASLprep]), and
+         both client and server MUST use the [SASLprep] profile of the
+         [StringPrep] algorithm to prepare these names for transmission
+         or comparison.  If preparation of the authorization identity
+         fails or results in an empty string (unless it was transmitted
+         as the empty string), the server MUST fail the authentication.
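+
+      The failure rule quoted above (reject if preparation fails or
+      yields an empty string that was not transmitted as empty) can be
+      sketched as follows.  This is a simplified approximation of
+      SASLprep built on Python's standard stringprep tables: the
+      bidirectional and unassigned-code-point checks are omitted, and
+      both function names are hypothetical.
+

```python
import stringprep
import unicodedata

# Prohibited-output tables used by this sketch (RFC 3454 Appendix C).
_PROHIBITED = (
    stringprep.in_table_c12,
    stringprep.in_table_c21_c22,
    stringprep.in_table_c3,
    stringprep.in_table_c4,
    stringprep.in_table_c5,
    stringprep.in_table_c6,
    stringprep.in_table_c7,
    stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def saslprep_sketch(s: str) -> str:
    """Map, normalize (NFKC), and check for prohibited characters."""
    mapped = "".join(
        " " if stringprep.in_table_c12(ch) else ch  # non-ASCII space -> space
        for ch in s
        if not stringprep.in_table_b1(ch)           # mapped to nothing
    )
    out = unicodedata.ucd_3_2_0.normalize("NFKC", mapped)
    if any(check(ch) for ch in out for check in _PROHIBITED):
        raise ValueError("prohibited character in identity")
    return out


def authzid_acceptable(received: str) -> bool:
    """Apply the 'fail on error or unexpected empty result' rule."""
    try:
        prepared = saslprep_sketch(received)
    except ValueError:
        return False
    # An empty result is a failure unless the empty string was transmitted.
    return prepared != "" or received == ""
```

+
+      For example, an identity consisting solely of U+00AD SOFT HYPHEN
+      prepares to the empty string and is therefore rejected, while a
+      transmitted empty string is accepted.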
+
+B.3.  IMAP Stringprep Profiles for Usernames: RFC 4314, RFC 5738
+
+   Evaluation Note:  These documents have 2 types of strings (usernames
+      and passwords), so there are two separate templates.
+
+   Description:  "username" parameter to the IMAP LOGIN command,
+      identifiers in IMAP Access Control List (ACL) commands.  Note that
+      any valid username is also an IMAP ACL identifier, but IMAP ACL
+      identifiers can include other things like the name of a group of
+      users.
+
+   How It's Used:  Used for authentication (Usernames), or in IMAP
+      Access Control Lists (Usernames or Group names).
+
+   Who Generates It:
+      -  Typically generated by email system administrators using some
+         tools/conventions, sometimes from some backend database.
+      -  In some setups, human users can register their own usernames
+         (e.g., webmail self-registration).
+
+   User Input Methods:
+      -  typing or selecting from a list
+      -  copy and paste
+      -  voice input
+      -  in configuration files or on the command line
+
+   Enforcement:  Rules enforced by server / add-on service (e.g.,
+      gateway service) on registration of account.
+
+   Comparison Method:  "Type 1" (byte-for-byte) or "Type 2" (compare by
+      a common algorithm that everyone agrees on, e.g., normalize and
+      then compare the result byte-by-byte).
+
+   Case Folding, Sensitivity, Preservation:  Most likely case-sensitive.
+      Exact requirements on case-sensitivity/case-preservation depend on
+      a specific implementation, e.g., an implementation might treat all
+      user identifiers as case-insensitive (or case-insensitive for
+      US-ASCII subset only).
+
+
+
+Blanchet & Sullivan           Informational                    [Page 24]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+   Impact of Comparison:  False positives: an unauthorized user is
+      allowed IMAP access (login), privileges improperly granted (e.g.,
+      access to a specific mailbox, ability to manage ACLs for a
+      mailbox).  False negatives: an authorized user is denied IMAP
+      access, unable to use granted privileges (e.g., access to a
+      specific mailbox, ability to manage ACLs for a mailbox).
+
+   Normalization:  NFKC (as per RFC 4013)
+
+   Mapping:  (see Section 2 of RFC 4013 for the full list) Non-ASCII
+      spaces are mapped to space.
+
+   Disallowed Characters:  (see Section 2 of RFC 4013 for the full list)
+      Unicode Control characters, etc.
+
+   String Classes:  Simple username.  See Section 2 of RFC 4013 for
+      details on restrictions.  Note that some implementations allow
+      spaces in these.  While IMAP implementations are not required to
+      use a specific format, an IMAP username frequently has the same
+      format as an email address (and EAI email address in the future),
+      or as the left-hand side of an email address.  Note: whatever is
+      recommended for the IMAP username should also be used for
+      ManageSieve, POP3 and SMTP authorization identities, as IMAP/POP3/
+      SMTP/ManageSieve are frequently implemented together.
+
+   Internal Structure:  None.
+
+   User Output:  Unlikely, but possible.  For example, if it is the same
+      as an email address, or in access control lists (e.g., in the
+      IMAP ACL extension), both when managing membership and when
+      listing membership of existing access control lists.  Often shows
+      up as mailbox names (under the Other Users IMAP namespace).
+
+   Operations:  Sometimes concatenated with other data and then used as
+      input to a cryptographic hash function.
+
+   How much tolerance for change from existing Stringprep approach?  Not
+      sure.  Non-ASCII IMAP usernames are currently prohibited by IMAP
+      (RFC 3501).  However, they are allowed when used in the IMAP ACL
+      extension.
+
+
+
+
+
+
+
+
+
+
+
+Blanchet & Sullivan           Informational                    [Page 25]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+B.4.  IMAP Stringprep Profiles for Passwords: RFC 5738
+
+   Description:  "Password" parameter to the IMAP LOGIN command.
+
+   How It's Used:  Used for authentication (Passwords).
+
+   Who Generates It:  Either generated by email system administrators
+      using some tools/conventions, or specified by the human user.
+
+   User Input Methods:
+      -  typing or selecting from a list
+      -  copy and paste
+      -  voice input
+      -  in configuration files or on the command line
+
+   Enforcement:  Rules enforced by server / add-on service (e.g.,
+      gateway service or backend database) on registration of account.
+
+   Comparison Method:  "Type 1" (byte-for-byte).
+
+   Case Folding, Sensitivity, Preservation:  Most likely case-sensitive.
+
+   Impact of Comparison:  False positives: an unauthorized user is
+      allowed IMAP access (login).  False negatives: an authorized user
+      is denied IMAP access.
+
+   Normalization:  NFKC (as per RFC 4013).
+
+   Mapping:  (see Section 2 of RFC 4013 for the full list) Non-ASCII
+      spaces are mapped to space.
+
+   Disallowed Characters:  (see Section 2 of RFC 4013 for the full list)
+      Unicode Control characters, etc.
+
+   String Classes:  Currently defined as "simple username" (see Section
+      2 of RFC 4013 for details on restrictions); however, this is
+      likely to be a different class from usernames.  Note that some
+      implementations allow spaces in these.  Passwords in all
+      email-related protocols should be treated in the same way.  The
+      same passwords are frequently shared with web, IM, and other
+      applications.
+
+   Internal Structure:  None.
+
+   User Output:  Text of email messages (e.g. in "you forgot your
+      password" email messages), web page / directory, side of the bus /
+      in ads -- possible.
+
+
+
+
+Blanchet & Sullivan           Informational                    [Page 26]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+   Operations:  Sometimes concatenated with other data and then used as
+      input to a cryptographic hash function.  Frequently stored as is,
+      or hashed.
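+
+      The sketch below illustrates why preparation matters when a
+      password is concatenated with other data and hashed: two
+      canonically equivalent spellings hash differently unless both
+      are normalized first.  The salt, the function name, and the use
+      of plain SHA-256 are assumptions made for illustration; a real
+      deployment would use a dedicated password-hashing scheme.
+

```python
import hashlib
import unicodedata


def password_digest(password: str, salt: bytes, normalize: bool) -> str:
    """Hash salt + password, optionally normalizing to NFKC first."""
    if normalize:
        password = unicodedata.normalize("NFKC", password)
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


salt = b"example-salt"     # hypothetical value, for illustration only
composed = "caf\u00e9"     # precomposed U+00E9
decomposed = "cafe\u0301"  # "e" followed by U+0301 COMBINING ACUTE ACCENT

raw_match = (password_digest(composed, salt, False)
             == password_digest(decomposed, salt, False))
prepared_match = (password_digest(composed, salt, True)
                  == password_digest(decomposed, salt, True))
print(raw_match, prepared_match)
```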
+
+   How much tolerance for change from existing Stringprep approach?  Not
+      sure.  Non-ASCII IMAP passwords are currently prohibited by IMAP
+      (RFC 3501); however, they are likely to be in widespread use.
+
+   Background Information:
+      RFC 5738, Section 5 ("UTF8=USER Capability"):
+
+         If the "UTF8=USER" capability is advertised, that indicates the
+         server accepts UTF-8 user names and passwords and applies
+         SASLprep [RFC4013] to both arguments of the LOGIN command.  The
+         server MUST reject UTF-8 that fails to comply with the formal
+         syntax in RFC 3629 [RFC3629] or if it encounters Unicode
+         characters listed in Section 2.3 of SASLprep RFC 4013
+         [RFC4013].
+
+      RFC 4314, Section 3 ("Access control management commands and
+      responses"):
+
+         Servers, when processing a command that has an identifier as a
+         parameter (i.e., any of SETACL, DELETEACL, and LISTRIGHTS
+         commands), SHOULD first prepare the received identifier using
+         "SASLprep" profile [SASLprep] of the "stringprep" algorithm
+         [Stringprep].  If the preparation of the identifier fails or
+         results in an empty string, the server MUST refuse to perform
+         the command with a BAD response.  Note that Section 6
+         recommends additional identifier's verification steps.
+
+      RFC 4314, Section 6 ("Security Considerations"):
+
+         This document relies on [SASLprep] to describe steps required
+         to perform identifier canonicalization (preparation).  The
+         preparation algorithm in SASLprep was specifically designed
+         such that its output is canonical, and it is well-formed.
+         However, due to an anomaly [PR29] in the specification of
+         Unicode normalization, canonical equivalence is not guaranteed
+         for a select few character sequences.  Identifiers prepared
+         with SASLprep can be stored and returned by an ACL server.  The
+         anomaly affects ACL manipulation and evaluation of identifiers
+         containing the selected character sequences.  These sequences,
+         however, do not appear in well-formed text.  In order to
+         address this problem, an ACL server MAY reject identifiers
+         containing sequences described in [PR29] by sending the tagged
+
+
+
+
+
+Blanchet & Sullivan           Informational                    [Page 27]
+
+RFC 6885          Stringprep Revision Problem Statement       March 2013
+
+
+         BAD response.  This is in addition to the requirement to reject
+         identifiers that fail SASLprep preparation as described in
+         Section 3.
+
+B.5.  Anonymous SASL Stringprep Profiles: RFC 4505
+
+   Description:  RFC 4505 defines a "trace" field.
+
+   Comparison:  This field is not intended for comparison (it is only
+      used for logging).
+
+   Case folding; case-sensitivity, preserve case:  No case folding/
+      case-sensitive
+
+   Do users input the strings directly?  Yes.  Possibly entered in
+      configuration UIs, or on a command line.  Can also be stored in
+      configuration files.  The value can also be automatically
+      generated by clients (e.g., a fixed string is used, or a user's
+      email address).
+
+   How do users input strings?  Keyboard, voice, or stylus (pick from a
+      list).  Possibly copy-paste.
+
+   Normalization:  None.
+
+   Disallowed Characters:  Control characters are disallowed (see
+      Section 3 of RFC 4505).
+
+   Which other strings or identifiers are these most similar to?
+      RFC 4505 says that the trace "should take one of two forms: an
+      Internet email address, or an opaque string that does not contain
+      the '@' (U+0040) character and that can be interpreted by the
+      system administrator of the client's domain".  In practice, this
+      is a free-form text, so it belongs to a different class from
+      "email address" or "username".
+
+   Are these strings or identifiers sometimes the same as strings or
+      identifiers from other protocols (e.g., does an IM system
+      sometimes use the same credentials database for authentication as
+      an email system)?  Yes: see above.  However, there is no strong
+      need to keep them consistent in the future.
+
+   How are users exposed to these strings?  How are they published?
+      They are not published.  However, the value can be seen in server
+      logs.
+
+   Impacts of false positives and false negatives:
+      False positive: a user can be confused with another user.
+      False negative: a single user is treated as two distinct users.
+      But note that the trace field is not authenticated, so it can be
+      easily falsified.
+
+   Tolerance of changes in the community:  The community would be
+      flexible.
+
+   Delimiters:  No internal structure, but see comments above about
+      frequent use of email addresses.
+
+   Background Information:
+      RFC 4505, Section 2 ("The Anonymous Mechanism"):
+
+      The mechanism consists of a single message from the client to the
+      server.  The client may include in this message trace information
+      in the form of a string of [UTF-8]-encoded [Unicode] characters
+      prepared in accordance with [StringPrep] and the "trace"
+      stringprep profile defined in Section 3 of this document.  The
+      trace information, which has no semantical value, should take one
+      of two forms: an Internet email address, or an opaque string that
+      does not contain the '@' (U+0040) character and that can be
+      interpreted by the system administrator of the client's domain.
+      For privacy reasons, an Internet email address or other
+      information identifying the user should only be used with
+      permission from the user.
+
+      RFC 4505, Section 3 ('The "trace" Profile of "Stringprep"'):
+
+      This section defines the "trace" profile of [StringPrep].  This
+      profile is designed for use with the SASL ANONYMOUS Mechanism.
+      Specifically, the client is to prepare the <message> production in
+      accordance with this profile.
+
+      The character repertoire of this profile is Unicode 3.2 [Unicode].
+
+      No mapping is required by this profile.
+
+      No Unicode normalization is required by this profile.
+
+      The list of unassigned code points for this profile is that
+      provided in Appendix A of [StringPrep].  Unassigned code points
+      are not prohibited.
+
+      Characters from the following tables of [StringPrep] are
+      prohibited:
+
+         - C.2.1 (ASCII control characters)
+         - C.2.2 (Non-ASCII control characters)
+         - C.3 (Private use characters)
+         - C.4 (Non-character code points)
+         - C.5 (Surrogate codes)
+         - C.6 (Inappropriate for plain text)
+         - C.8 (Change display properties are deprecated)
+         - C.9 (Tagging characters)
+
+      No additional characters are prohibited.
+
+      This profile requires bidirectional character checking per
+      Section 6 of [StringPrep].
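+
+   As an illustration only, the checks quoted above can be sketched in
+   Python with the standard "stringprep" module, whose table lookups
+   follow [StringPrep].  The function name check_trace is hypothetical
+   and appears in neither RFC 4505 nor this document.  Because the
+   profile requires no mapping and no normalization, the sketch only
+   validates its input and returns it unchanged.
+
```python
import stringprep

# Tables prohibited by the "trace" profile, as listed in the quoted text.
PROHIBITED_TABLES = (
    stringprep.in_table_c21,  # ASCII control characters
    stringprep.in_table_c22,  # Non-ASCII control characters
    stringprep.in_table_c3,   # Private use characters
    stringprep.in_table_c4,   # Non-character code points
    stringprep.in_table_c5,   # Surrogate codes
    stringprep.in_table_c6,   # Inappropriate for plain text
    stringprep.in_table_c8,   # Table C.8
    stringprep.in_table_c9,   # Tagging characters
)


def check_trace(trace: str) -> str:
    """Hypothetical helper: return `trace` unchanged or raise ValueError."""
    # No mapping and no normalization: the string is only checked.
    for ch in trace:
        if any(in_table(ch) for in_table in PROHIBITED_TABLES):
            raise ValueError(f"prohibited code point U+{ord(ch):04X}")

    # Bidirectional check, Section 6 of Stringprep: a string containing
    # any RandALCat character (table D.1) must contain no LCat character
    # (table D.2) and must both start and end with a RandALCat character.
    if any(stringprep.in_table_d1(ch) for ch in trace):
        if any(stringprep.in_table_d2(ch) for ch in trace):
            raise ValueError("mixes RandALCat and LCat characters")
        if not (stringprep.in_table_d1(trace[0])
                and stringprep.in_table_d1(trace[-1])):
            raise ValueError("must start and end with RandALCat")

    # Unassigned code points (table A.1) are deliberately not rejected.
    return trace
```
+
+   For example, check_trace("user@example.com") returns its argument,
+   while a string containing U+0000 raises ValueError.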
+
+B.6.  XMPP Stringprep Profiles for Nodeprep: RFC 3920
+
+   Description:  Localpart of JabberID ("JID"), as in:
+      localpart@domainpart/resourcepart
+
+   How It's Used:
+      -  Usernames (e.g., stpeter@jabber.org)
+      -  Chatroom names (e.g., precis@jabber.ietf.org)
+      -  Publish-subscribe nodes
+      -  Bot names
+
+   Who Generates It:
+      -  Typically, end users via an XMPP client
+      -  Sometimes created in an automated fashion
+
+   User Input Methods:
+      -  typing
+      -  copy and paste
+      -  voice input
+      -  clicking a URI/IRI
+
+   Enforcement:  Rules enforced by server / add-on service (e.g.,
+      chatroom service) on registration of account, creation of room,
+      etc.
+
+   Comparison Method:  "Type 2" (common algorithm)
+
+   Case Folding, Sensitivity, Preservation:
+      -  Strings are always folded to lowercase
+      -  Case is not preserved
+
+   Impact of Comparison:
+      False positives:
+      -  unable to authenticate at server (or authenticate to wrong
+         account)
+      -  add wrong person to buddy list
+      -  join the wrong chatroom
+      -  improperly grant privileges (e.g., chatroom admin)
+      -  subscribe to wrong pubsub node
+      -  interact with wrong bot
+      -  allow communication with blocked entity
+
+      False negatives:
+      -  unable to authenticate
+      -  unable to add someone to buddy list
+      -  unable to join desired chatroom
+      -  unable to use granted privileges (e.g., chatroom admin)
+      -  unable to subscribe to desired pubsub node
+      -  unable to interact with desired bot
+      -  disallow communication with unblocked entity
+
+   Normalization:  NFKC
+
+   Mapping:  Spaces are mapped to nothing
+
+   Disallowed Characters:  ", &, ', /, :, <, >, @
+
+   String Classes:
+      -  Often similar to generic username
+      -  Often similar to localpart of email address
+      -  Sometimes same as localpart of email address
+
+   Internal Structure:  None
+
+   User Output:
+      -  vCard
+      -  email signature
+      -  web page / directory
+      -  text of message (e.g., in a chatroom)
+
+   Operations:  Sometimes concatenated with other data and then used as
+      input to a cryptographic hash function
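+
+   The localpart handling summarized in this section can be sketched
+   roughly as follows, using Python's standard "stringprep" module.
+   This is a simplified illustration, not the Nodeprep profile itself:
+   the helper names are hypothetical, the use of Stringprep table B.1
+   for "mapped to nothing" and table B.2 for case folding is an
+   assumption, and any further prohibited-output tables or
+   bidirectional checks specified in RFC 3920 are omitted.
+
```python
import stringprep
from unicodedata import ucd_3_2_0  # Stringprep is defined over Unicode 3.2

# Characters listed under "Disallowed Characters" for the localpart.
DISALLOWED = set("\"&'/:<>@")


def prepare_localpart(raw: str) -> str:
    """Hypothetical helper approximating the rules summarized above."""
    mapped = []
    for ch in raw:
        # Assumption: "mapped to nothing" corresponds to table B.1.
        if stringprep.in_table_b1(ch):
            continue
        # Assumption: case folding corresponds to table B.2, the
        # folding intended for use together with NFKC.
        mapped.append(stringprep.map_table_b2(ch))
    prepared = ucd_3_2_0.normalize("NFKC", "".join(mapped))

    for ch in prepared:
        if ch in DISALLOWED:
            raise ValueError(f"disallowed character {ch!r} in localpart")
    return prepared


def same_localpart(a: str, b: str) -> bool:
    """Compare two localparts after preparing both with one algorithm."""
    return prepare_localpart(a) == prepare_localpart(b)
```
+
+   Under these assumptions, "StPeter" and "stpeter" compare as equal
+   because case is folded and not preserved, and a localpart that
+   contains "@" is rejected.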
+
+B.7.  XMPP Stringprep Profiles for Resourceprep: RFC 3920
+
+   Description:
+      -  Resourcepart of JabberID ("JID"), as in:
+         localpart@domainpart/resourcepart
+      -  Typically free-form text
+
+   How It's Used:
+      -  Device / session names (e.g., stpeter@jabber.org/Home)
+      -  Nicknames (e.g., precis@jabber.ietf.org/StPeter)
+
+   Who Generates It:
+      -  Often human users via an XMPP client
+      -  Often generated in an automated fashion by client or server
+
+   User Input Methods:
+      -  typing
+      -  copy and paste
+      -  voice input
+      -  clicking a URI/IRI
+
+   Enforcement:  Rules enforced by server / add-on service (e.g.,
+      chatroom service) on account login, joining a chatroom, etc.
+
+   Comparison Method:  "Type 2" (byte-for-byte)
+
+   Case Folding, Sensitivity, Preservation:
+      -  Strings are never folded
+      -  Case is preserved
+
+   Impact of Comparison:
+      False positives:
+      -  interact with wrong device (e.g., for file transfer or voice
+         call)
+      -  interact with wrong chatroom participant
+      -  improperly grant privileges (e.g., chatroom moderator)
+      -  allow communication with blocked entity
+
+      False negatives:
+      -  unable to choose desired chatroom nickname
+      -  unable to use granted privileges (e.g., chatroom moderator)
+      -  disallow communication with unblocked entity
+
+   Normalization:  NFKC
+
+   Mapping:  Spaces are mapped to nothing
+
+   Disallowed Characters:  None
+
+   String Classes:  Basically a free-form identifier
+
+   Internal Structure:  None
+
+   User Output:
+      -  text of message (e.g., in a chatroom)
+      -  device names often not exposed to human users
+
+   Operations:  Sometimes concatenated with other data and then used as
+      input to a cryptographic hash function
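+
+   By contrast with the localpart, the resourcepart is never case
+   folded.  A minimal Python sketch of the behavior summarized in this
+   section follows.  The helper names are hypothetical, the use of
+   Stringprep table B.1 for "mapped to nothing" is an assumption, and
+   any prohibited-output or bidirectional checks that the full
+   Resourceprep profile in RFC 3920 specifies are omitted.
+
```python
import stringprep
from unicodedata import ucd_3_2_0  # Stringprep is defined over Unicode 3.2


def prepare_resourcepart(raw: str) -> str:
    """Hypothetical helper approximating the rules summarized above."""
    # Assumption: "mapped to nothing" corresponds to table B.1.
    kept = "".join(ch for ch in raw if not stringprep.in_table_b1(ch))
    # NFKC normalization, with no case folding: case is preserved.
    return ucd_3_2_0.normalize("NFKC", kept)


def same_resourcepart(a: str, b: str) -> bool:
    """Compare the prepared forms byte for byte in their UTF-8 encoding."""
    left = prepare_resourcepart(a).encode("utf-8")
    right = prepare_resourcepart(b).encode("utf-8")
    return left == right
```
+
+   Under these assumptions, "Home" and "home" are different
+   resourceparts, while a fullwidth "H" (U+FF28) followed by "ome"
+   compares equal to "Home" because NFKC replaces the compatibility
+   character before the comparison.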
+
+B.8.  EAP Stringprep Profiles: RFC 3748
+
+   Description:  RFC 3748, Section 5, references Stringprep, but the WG
+      did not agree with the text (it was added by the IESG), and there
+      are no known implementations that use Stringprep.  The main
+      problem with that text is that the use of strings is a per-method
+      concept, not a generic EAP concept, so RFC 3748 itself does not
+      really use Stringprep, but individual EAP methods could.  As such,
+      the answers to the template questions are mostly not applicable,
+      but a few answers are universal across methods.  The list of
+      IANA-registered EAP methods is at
+      <http://www.iana.org/assignments/eap-numbers/eap-numbers.xml>.
+
+   Comparison Methods:  n/a (per-method)
+
+   Case Folding, Case-Sensitivity, Case Preservation:  n/a (per-method)
+
+   Impact of comparison:  A false positive results in unauthorized
+      network access (and possibly theft of service if someone else is
+      billed).  A false negative results in lack of authorized network
+      access (no connectivity).
+
+   User input:  n/a (per-method)
+
+   Normalization:  n/a (per-method)
+
+   Mapping:  n/a (per-method)
+
+   Disallowed characters:  n/a (per-method)
+
+   String classes:  Although some EAP methods may use a syntax similar
+      to other types of identifiers, EAP mandates that the actual values
+      must not be assumed to be identifiers usable with anything else.
+
+   Internal structure:  n/a (per-method)
+
+   User output:  Identifiers are never displayed to humans, except
+      perhaps as they are typed by a human.
+
+   Operations:  n/a (per-method)
+
+   Community considerations:  There is no resistance to change for the
+      base EAP protocol (as noted, the WG didn't want the existing
+      text).  However, changes to actual uses of Stringprep, if any,
+      within specific EAP methods may meet resistance.  It is currently
+      unknown whether any EAP methods use Stringprep.
+
+Authors' Addresses
+
+   Marc Blanchet
+   Viagenie
+   246 Aberdeen
+   Quebec, QC  G1R 2E1
+   Canada
+
+   EMail: Marc.Blanchet@viagenie.ca
+   URI:   http://viagenie.ca
+
+
+   Andrew Sullivan
+   Dyn, Inc.
+   150 Dow St
+   Manchester, NH  03101
+   U.S.A.
+
+   EMail: asullivan@dyn.com
+