Diffstat (limited to 'doc/rfc/rfc6943.txt')
| -rw-r--r-- | doc/rfc/rfc6943.txt | 1459 | 
1 files changed, 1459 insertions, 0 deletions
| diff --git a/doc/rfc/rfc6943.txt b/doc/rfc/rfc6943.txt new file mode 100644 index 0000000..16e9098 --- /dev/null +++ b/doc/rfc/rfc6943.txt @@ -0,0 +1,1459 @@ + + + + + + +Internet Architecture Board (IAB)                         D. Thaler, Ed. +Request for Comments: 6943                                     Microsoft +Category: Informational                                         May 2013 +ISSN: 2070-1721 + + +         Issues in Identifier Comparison for Security Purposes + +Abstract + +   Identifiers such as hostnames, URIs, IP addresses, and email +   addresses are often used in security contexts to identify security +   principals and resources.  In such contexts, an identifier presented +   via some protocol is often compared using some policy to make +   security decisions such as whether the security principal may access +   the resource, what level of authentication or encryption is required, +   etc.  If the parties involved in a security decision use different +   algorithms to compare identifiers, then failure scenarios ranging +   from denial of service to elevation of privilege can result.  This +   document provides a discussion of these issues that designers should +   consider when defining identifiers and protocols, and when +   constructing architectures that use multiple protocols. + +Status of This Memo + +   This document is not an Internet Standards Track specification; it is +   published for informational purposes. + +   This document is a product of the Internet Architecture Board (IAB) +   and represents information that the IAB has deemed valuable to +   provide for permanent record.  It represents the consensus of the +   Internet Architecture Board (IAB).  Documents approved for +   publication by the IAB are not a candidate for any level of Internet +   Standard; see Section 2 of RFC 5741. 
+ +   Information about the current status of this document, any errata, +   and how to provide feedback on it may be obtained at +   http://www.rfc-editor.org/info/rfc6943. + + + + + + + + + + + + + +Thaler                        Informational                     [Page 1] + +RFC 6943                  Identifier Comparison                 May 2013 + + +Copyright Notice + +   Copyright (c) 2013 IETF Trust and the persons identified as the +   document authors.  All rights reserved. + +   This document is subject to BCP 78 and the IETF Trust's Legal +   Provisions Relating to IETF Documents +   (http://trustee.ietf.org/license-info) in effect on the date of +   publication of this document.  Please review these documents +   carefully, as they describe your rights and restrictions with respect +   to this document. + +Table of Contents + +   1. Introduction ....................................................3 +      1.1. Classes of Identifiers .....................................5 +      1.2. Canonicalization ...........................................5 +   2. Identifier Use in Security Policies and Decisions ...............6 +      2.1. False Positives and Negatives ..............................7 +      2.2. Hypothetical Example .......................................8 +   3. Comparison Issues with Common Identifiers .......................9 +      3.1. Hostnames ..................................................9 +           3.1.1. IPv4 Literals ......................................11 +           3.1.2. IPv6 Literals ......................................12 +           3.1.3. Internationalization ...............................13 +           3.1.4. Resolution for Comparison ..........................14 +      3.2. Port Numbers and Service Names ............................14 +      3.3. URIs ......................................................15 +           3.3.1. Scheme Component ...................................16 +           3.3.2. 
Authority Component ................................16 +           3.3.3. Path Component .....................................17 +           3.3.4. Query Component ....................................17 +           3.3.5. Fragment Component .................................17 +           3.3.6. Resolution for Comparison ..........................18 +      3.4. Email Address-Like Identifiers ............................18 +   4. General Issues .................................................19 +      4.1. Conflation ................................................19 +      4.2. Internationalization ......................................20 +      4.3. Scope .....................................................21 +      4.4. Temporality ...............................................21 +   5. Security Considerations ........................................22 +   6. Acknowledgements ...............................................22 +   7. IAB Members at the Time of Approval ............................23 +   8. Informative References .........................................23 + + + + + + + +Thaler                        Informational                     [Page 2] + +RFC 6943                  Identifier Comparison                 May 2013 + + +1.  Introduction + +   In computing and the Internet, various types of "identifiers" are +   used to identify humans, devices, content, etc.  This document +   provides a discussion of some security issues that designers should +   consider when defining identifiers and protocols, and when +   constructing architectures that use multiple protocols.  Before +   discussing these security issues, we first give some background on +   some typical processes involving identifiers.  Terms such as +   "identifier", "identity", and "principal" are used as defined in +   [RFC4949]. + +   As depicted in Figure 1, there are multiple processes relevant to our +   discussion. + +   1.  An identifier is first generated.  
If the identifier is intended +       to be unique, the generation process must include some mechanism, +       such as allocation by a central authority or verification among +       the members of a distributed authority, to help ensure +       uniqueness.  However, the notion of "unique" involves determining +       whether a putative identifier matches any other identifier that +       has already been allocated.  As we will see, for many types of +       identifiers, this is not simply an exact binary match. + +       After generating the identifier, it is often stored in two +       locations: with the requester or "holder" of the identifier, and +       with some repository of identifiers (e.g., DNS).  For example, if +       the identifier was allocated by a central authority, the +       repository might be that authority.  If the identifier identifies +       a device or content on a device, the repository might be that +       device. + +   2.  The identifier is distributed, either by the holder of the +       identifier or by a repository of identifiers, to others who could +       use the identifier.  This distribution might be electronic, but +       sometimes it is via other channels such as voice, business card, +       billboard, or other form of advertisement.  The identifier itself +       might be distributed directly, or it might be used to generate a +       portion of another type of identifier that is then distributed. +       For example, a URI or email address might include a server name, +       and hence distributing the URI or email address also inherently +       distributes the server name. + +   3.  The identifier is used by some party.  Generally, the user +       supplies the identifier, which is (directly or indirectly) sent +       to the repository of identifiers.  The repository of identifiers +       must then attempt to match the user-supplied identifier with an +       identifier in its repository. 
+
+
+
+Thaler                        Informational                     [Page 3]
+
+RFC 6943                  Identifier Comparison                 May 2013
+
+
+       For example, using an email address to send email to the holder
+       of an identifier may result in the email arriving at the holder's
+       email server, which has access to the mail stores.
+
+                          +------------+
+                          |  Holder of |     1. Generation
+                          | identifier +<---------+
+                          +----+-------+          |
+                               |                  | Match
+                               |                  v/
+                               |          +-------+-------+
+                               +----------+ Repository of |
+                               |          |  identifiers  |
+                               |          +-------+-------+
+               2. Distribution |                  ^\
+                               |                  | Match
+                               v                  |
+                     +---------+-------+          |
+                     |      User of    |          |
+                     |    identifier   +----------+
+                     +-----------------+    3. Use
+
+                  Figure 1: Typical Identifier Processes
+
+   Another variation is where a user is given the identifier of a
+   resource (e.g., a web site) to access securely, sometimes known as a
+   "reference identifier" [RFC6125], and the server hosting the resource
+   then presents its identity at the time of use.  In this case, the
+   user application attempts to match the presented identity against the
+   reference identifier.
+
+   One key aspect is that the identifier values passed in generation,
+   distribution, and use may all be in different forms.
For example, an +   identifier might be exchanged in printed form at generation time, +   distributed to a user via voice, and then used electronically.  As +   such, the match process can be complicated. + +   Furthermore, in many cases, the relationship between holder, +   repositories, and users may be more involved.  For example, when a +   hierarchy of web caches exists, each cache is itself a repository of +   a sort, and the match process is usually intended to be the same as +   on the origin server. + +   Another aspect to keep in mind is that there can be multiple +   identifiers that refer to the same object (i.e., resource, human, +   device, etc.).  For example, a human might have a passport number and +   a drivers license number, and an RFC might be available at multiple +   locations (rfc-editor.org and ietf.org).  In this document, we focus + + + +Thaler                        Informational                     [Page 4] + +RFC 6943                  Identifier Comparison                 May 2013 + + +   on comparing two identifiers to see whether they are the same +   identifier, rather than comparing two different identifiers to see +   whether they refer to the same entity (although a few issues with the +   latter are touched on in several places, such as Sections 3.1.4 and +   3.3.6). + +1.1.  Classes of Identifiers + +   In this document, we will refer to the following classes of +   identifiers: + +   o  Absolute: identifiers that can be compared byte-by-byte for +      equality.  Two identifiers that have different bytes are defined +      to be different.  For example, binary IP addresses are in this +      class. + +   o  Definite: identifiers that have a single well-defined comparison +      algorithm.  
For example, URI scheme names are required to be +      US-ASCII [USASCII] and are defined to match in a case-insensitive +      way; the comparison is thus definite, since there is a well- +      specified algorithm (Section 9.2.1 of [RFC4790]) on how to do a +      case-insensitive match among ASCII strings. + +   o  Indefinite: identifiers that have no single well-defined +      comparison algorithm.  For example, human names are in this class. +      Everyone might want the comparison to be tailored for their +      locale, for some definition of "locale".  In some cases, there may +      be limited subsets of parties that might be able to agree (e.g., +      ASCII users might all agree on a common comparison algorithm, +      whereas users of other Roman-derived scripts, such as Turkish, may +      not), but identifiers often tend to leak out of such limited +      environments. + +1.2.  Canonicalization + +   Perhaps the most common algorithm for comparison involves first +   converting each identifier to a canonical form (a process known as +   "canonicalization" or "normalization") and then testing the resulting +   canonical representations for bitwise equality.  In so doing, it is +   thus critical that all entities involved agree on the same canonical +   form and use the same canonicalization algorithm so that the overall +   comparison process is also the same. + +   Note that in some contexts, such as in internationalization, the +   terms "canonicalization" and "normalization" have a precise meaning. +   In this document, however, we use these terms synonymously in their +   more generic form, to mean conversion to some standard form. 
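The canonicalize-then-compare approach described in this section can be sketched for the Definite example used above, URI scheme names. This is only an illustration under stated assumptions: the helper names `canonical_scheme` and `schemes_match` are hypothetical, and the ASCII-only case mapping stands in for the case-insensitive ASCII match the text refers to. It does not check the full scheme grammar.

```python
def canonical_scheme(scheme: str) -> bytes:
    """Hypothetical helper: canonical form of a URI scheme name.

    Only the ASCII letters A-Z are mapped to a-z, so the result does not
    depend on any locale.  Non-ASCII input is rejected rather than guessed at.
    """
    raw = scheme.encode("ascii")  # raises UnicodeEncodeError on non-ASCII input
    return bytes(b + 32 if 0x41 <= b <= 0x5A else b for b in raw)


def schemes_match(a: str, b: str) -> bool:
    """Compare two scheme names by byte equality of their canonical forms."""
    return canonical_scheme(a) == canonical_scheme(b)


if __name__ == "__main__":
    print(schemes_match("HTTP", "http"))
    print(schemes_match("http", "https"))
```

The comparison is only as reliable as the agreement on `canonical_scheme`: two parties that canonicalize differently (for example, one using a locale-sensitive lowercase operation) can disagree on whether two identifiers match.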
+ + + + +Thaler                        Informational                     [Page 5] + +RFC 6943                  Identifier Comparison                 May 2013 + + +   While the most common method of comparison includes canonicalization, +   comparison can also be done by defining an equivalence algorithm, +   where no single form is canonical.  However, in most cases, a +   canonical form is useful for other purposes, such as output, and so +   in such cases defining a canonical form suffices to define a +   comparison method. + +2.  Identifier Use in Security Policies and Decisions + +   Identifiers such as hostnames, URIs, and email addresses are used in +   security contexts to identify security principals (i.e., entities +   that can be authenticated) and resources as well as other security +   parameters such as types and values of claims.  Those identifiers are +   then used to make security decisions based on an identifier presented +   via some protocol.  For example: + +   o  Authentication: a protocol might match a security principal's +      identifier to look up expected keying material and then match +      keying material. + +   o  Authorization: a protocol might match a resource name against some +      policy.  For example, it might look up an access control list +      (ACL) and then look up the security principal's identifier (or a +      surrogate for it) in that ACL. + +   o  Accounting: a system might create an accounting record for a +      security principal's identifier or resource name, and then might +      later need to match a presented identifier to (for example) add +      new filtering rules based on the records in order to stop an +      attack. + +   If the parties involved in a security decision use different matching +   algorithms for the same identifiers, then failure scenarios ranging +   from denial of service to elevation of privilege can result, as we +   will see. 
+
+   This is especially complicated in cases involving multiple parties
+   and multiple protocols.  For example, there are many scenarios where
+   some form of "security token service" is used to grant to a requester
+   permission to access a resource, where the resource is held by a
+   third party that relies on the security token service (see Figure 2).
+   The protocol used to request permission (e.g., Kerberos or OAuth) may
+   be different from the protocol used to access the resource (e.g.,
+   HTTP).  Opportunities for security problems arise when two protocols
+   define different comparison algorithms for the same type of
+   identifier, or when a protocol is ambiguously specified and two
+   endpoints (e.g., a security token service and a resource holder)
+   implement different algorithms within the same protocol.
+
+
+
+Thaler                        Informational                     [Page 6]
+
+RFC 6943                  Identifier Comparison                 May 2013
+
+
+         +----------+
+         | security |
+         |  token   |
+         | service  |
+         +----------+
+              ^
+              | 1. supply credentials and
+              |    get token for resource
+              |                                             +--------+
+         +----------+  2. supply token and access resource  |resource|
+         |requester |=------------------------------------->| holder |
+         +----------+                                       +--------+
+
+                    Figure 2: Simple Security Exchange
+
+   In many cases, the situation is more complex.  With X.509 Public Key
+   Infrastructure (PKIX) certificates [RFC6125], for example, the name
+   in a certificate gets compared against names in ACLs or other things.
+   In the case of web site security, the name in the certificate gets
+   compared to a portion of the URI that a user may have typed into a
+   browser.
The fact that many different people are doing the typing, +   on many different types of systems, complicates the problem. + +   Add to this the certificate enrollment step, and the certificate +   issuance step, and two more parties have an opportunity to adjust the +   encoding, or worse, the software that supports them might make +   changes that the parties are unaware are happening. + +2.1.  False Positives and Negatives + +   It is first worth discussing in more detail the effects of errors in +   the comparison algorithm.  A "false positive" results when two +   identifiers compare as if they were equal but in reality refer to two +   different objects (e.g., security principals or resources).  When +   privilege is granted on a match, a false positive thus results in an +   elevation of privilege -- for example, allowing execution of an +   operation that should not have been permitted otherwise.  When +   privilege is denied on a match (e.g., matching an entry in a +   block/deny list or a revocation list), a permissible operation is +   denied.  At best, this can cause worse performance (e.g., a cache +   miss or forcing redundant authentication) and at worst can result in +   a denial of service. + + + + + + + + + +Thaler                        Informational                     [Page 7] + +RFC 6943                  Identifier Comparison                 May 2013 + + +   A "false negative" results when two identifiers that in reality refer +   to the same thing compare as if they were different, and the effects +   are the reverse of those for false positives.  That is, when +   privilege is granted on a match, the result is at best worse +   performance and at worst a denial of service; when privilege is +   denied on a match, elevation of privilege results. + +   Figure 3 summarizes these effects. 
+
+                      | "Grant on match"       | "Deny on match"
+       ---------------+------------------------+-----------------------
+       False positive | Elevation of privilege | Denial of service
+       ---------------+------------------------+-----------------------
+       False negative | Denial of service      | Elevation of privilege
+       ---------------+------------------------+-----------------------
+
+           Figure 3: Worst Effects of False Positives/Negatives
+
+   When designing a comparison algorithm, one can typically modify it to
+   increase the likelihood of false positives and decrease the
+   likelihood of false negatives, or vice versa.  Which outcome is
+   better depends on the context.
+
+   Elevation of privilege is almost always seen as far worse than denial
+   of service.  Hence, for URIs, for example, Section 6.1 of [RFC3986]
+   states that "comparison methods are designed to minimize false
+   negatives while strictly avoiding false positives".
+
+   Thus, URIs were defined with a "grant privilege on match" paradigm in
+   mind, where it is critical to prevent elevation of privilege while
+   minimizing denial of service.  Using URIs in a "deny privilege on
+   match" system can thus be problematic.
+
+2.2.  Hypothetical Example
+
+   In this example, both security principals and resources are
+   identified using URIs.  Foo Corp has paid example.com for access to
+   the Stuff service.  Foo Corp allows its employees to create accounts
+   on the Stuff service.  Alice gets the account
+   "http://example.com/Stuff/FooCorp/alice" and Bob gets
+   "http://example.com/Stuff/FooCorp/bob".  It turns out, however, that
+   Foo Corp's URI canonicalizer includes URI fragment components in
+   comparisons whereas example.com's does not, and Foo Corp does not
+   disallow the # character in the account name.
So Chuck, who is a +   malicious employee of Foo Corp, asks to create an account at +   example.com with the name alice#stuff.  Foo Corp's URI logic checks +   its records for accounts it has created with stuff and sees that +   there is no account with the name alice#stuff.  Hence, in its + + + +Thaler                        Informational                     [Page 8] + +RFC 6943                  Identifier Comparison                 May 2013 + + +   records, it associates the account alice#stuff with Chuck and will +   only issue tokens good for use with +   "http://example.com/Stuff/FooCorp/alice#stuff" to Chuck. + +   Chuck, the attacker, goes to a security token service at Foo Corp and +   asks for a security token good for +   "http://example.com/Stuff/FooCorp/alice#stuff".  Foo Corp issues the +   token, since Chuck is the legitimate owner (in Foo Corp's view) of +   the alice#stuff account.  Chuck then submits the security token in a +   request to "http://example.com/Stuff/FooCorp/alice". + +   But example.com uses a URI canonicalizer that, for the purposes of +   checking equality, ignores fragments.  So when example.com looks in +   the security token to see if the requester has permission from Foo +   Corp to access the given account, it successfully matches the URI in +   the security token, "http://example.com/Stuff/FooCorp/alice#stuff", +   with the requested resource name +   "http://example.com/Stuff/FooCorp/alice". + +   Leveraging the inconsistencies in the canonicalizers used by Foo Corp +   and example.com, Chuck is able to successfully launch an elevation- +   of-privilege attack and access Alice's resource. + +   Furthermore, consider an attacker using a similar corporation, such +   as "foocorp" (or any variation containing a non-ASCII character that +   some humans might expect to represent the same corporation).  
If the +   resource holder treats them as different but the security token +   service treats them as the same, then elevation of privilege can +   occur in this scenario as well. + +3.  Comparison Issues with Common Identifiers + +   In this section, we walk through a number of common types of +   identifiers and discuss various issues related to comparison that may +   affect security whenever they are used to identify security +   principals or resources.  These examples illustrate common patterns +   that may arise with other types of identifiers. + +3.1.  Hostnames + +   Hostnames (composed of dot-separated labels) are commonly used either +   directly as identifiers, or as components in identifiers such as in +   URIs and email addresses.  Another example is in Sections 7.2 and 7.3 +   of [RFC5280] (and updated in Section 3 of [RFC6818]), which specify +   use in PKIX certificates. + +   In this section, we discuss a number of issues in comparing strings +   that appear to be some form of hostname. + + + +Thaler                        Informational                     [Page 9] + +RFC 6943                  Identifier Comparison                 May 2013 + + +   It is first worth pointing out that the term "hostname" itself is +   often ambiguous, and hence it is important that any use clarify which +   definition is intended.  Some examples of definitions include: + +   a.  A Fully Qualified Domain Name (FQDN), + +   b.  An FQDN that is associated with address records in the DNS, + +   c.  The leftmost label in an FQDN, or + +   d.  The leftmost label in an FQDN that is associated with address +       records. + +   The use of different definitions in different places results in +   questions such as whether "example" and "example.com" are considered +   equal or not, and hence it is important when writing new +   specifications to be clear about which definition is meant. 
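The effect of choosing different "hostname" definitions can be sketched as follows. The two key functions are hypothetical and only model definitions (a) and (c) from the list above; they do not consult the DNS, so definitions (b) and (d) are not covered.

```python
def fqdn_key(name: str) -> str:
    """Definition (a): the whole name is the identifier (ASCII case folded)."""
    return name.lower()


def leftmost_label_key(name: str) -> str:
    """Definition (c): only the leftmost label is the identifier."""
    return name.lower().split(".", 1)[0]


if __name__ == "__main__":
    for a, b in [("example", "example.com"), ("example.com", "example.net")]:
        print(a, b,
              "FQDN match:", fqdn_key(a) == fqdn_key(b),
              "leftmost-label match:", leftmost_label_key(a) == leftmost_label_key(b))
```

Under the leftmost-label definition, "example.com" and "example.net" also compare as equal, which would be a false positive if the two names are held by different parties.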
+ +   Section 3 of [RFC6055] discusses the differences between a "hostname" +   and a "DNS name", where the former is a subset of the latter by using +   a restricted set of characters (letters, digits, and hyphens).  If +   one canonicalizer uses the "DNS name" definition whereas another uses +   a "hostname" definition, a name might be valid in the former but +   invalid in the latter.  As long as invalid identifiers are denied +   privilege, this difference will not result in elevation of privilege. + +   Section 3.1 of [RFC1034] discusses the difference between a +   "complete" domain name, which ends with a dot (such as +   "example.com."), and a multi-label relative name such as +   "example.com" that assumes the root (".") is in the suffix search +   list.  In most contexts, these are considered equal, but there may be +   issues if different entities in a security architecture have +   different interpretations of a relative domain name. + +   [IAB1123] briefly discusses issues with the ambiguity around whether +   a label will be "alphabetic" -- including, among other issues, how +   "alphabetic" should be interpreted in an internationalized +   environment -- and whether a hostname can be interpreted as an IP +   address.  We explore this last issue in more detail below. + + + + + + + + + + + + +Thaler                        Informational                    [Page 10] + +RFC 6943                  Identifier Comparison                 May 2013 + + +3.1.1.  IPv4 Literals + +   Section 2.1 of [RFC1123] states: + +      Whenever a user inputs the identity of an Internet host, it SHOULD +      be possible to enter either (1) a host domain name or (2) an IP +      address in dotted-decimal ("#.#.#.#") form.  The host SHOULD check +      the string syntactically for a dotted-decimal number before +      looking it up in the Domain Name System. 
+ +   and + +      This last requirement is not intended to specify the complete +      syntactic form for entering a dotted-decimal host number; that is +      considered to be a user-interface issue. + +   In specifying the inet_addr() API, the Portable Operating System +   Interface (POSIX) standard [IEEE-1003.1] defines "IPv4 dotted decimal +   notation" as allowing not only strings of the form "10.0.1.2" but +   also allowing octal and hexadecimal, and addresses with less than +   four parts.  For example, "10.0.258", "0xA000102", and "012.0x102" +   all represent the same IPv4 address in standard "IPv4 dotted decimal" +   notation.  We will refer to this as the "loose" syntax of an IPv4 +   address literal. + +   In Section 6.1 of [RFC3493], getaddrinfo() is defined to support the +   same (loose) syntax as inet_addr(): + +      If the specified address family is AF_INET or AF_UNSPEC, address +      strings using Internet standard dot notation as specified in +      inet_addr() are valid. + +   In contrast, Section 6.3 of the same RFC states, specifying +   inet_pton(): + +      If the af argument of inet_pton() is AF_INET, the src string shall +      be in the standard IPv4 dotted-decimal form: + +            ddd.ddd.ddd.ddd + +      where "ddd" is a one to three digit decimal number between 0 and +      255.  The inet_pton() function does not accept other formats (such +      as the octal numbers, hexadecimal numbers, and fewer than four +      numbers that inet_addr() accepts). + + + + + + + +Thaler                        Informational                    [Page 11] + +RFC 6943                  Identifier Comparison                 May 2013 + + +   As shown above, inet_pton() uses what we will refer to as the +   "strict" form of an IPv4 address literal.  Some platforms also use +   the strict form with getaddrinfo() when the AI_NUMERICHOST flag is +   passed to it. 
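The difference between the loose and strict forms can be illustrated with two small parsers. This is a sketch written for this discussion, not the code of any platform's inet_addr() or inet_pton(); the function names `parse_loose` and `parse_strict` are hypothetical, and corner cases (such as a bare "0x") may be treated differently by real implementations.

```python
import re
from typing import Optional

# Strict form: exactly four decimal fields of one to three ASCII digits.
_STRICT = re.compile(r"\A([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\Z")


def parse_strict(text: str) -> Optional[bytes]:
    """Return the 4-byte address for a strict dotted-decimal literal, else None."""
    m = _STRICT.match(text)
    if m is None:
        return None
    octets = [int(g) for g in m.groups()]  # fields are always read as decimal
    return bytes(octets) if all(o <= 255 for o in octets) else None


def _parse_number(part: str) -> Optional[int]:
    """Read one field as hex (0x prefix), octal (leading 0), or decimal."""
    if part[:2].lower() == "0x":
        digits, base = part[2:], 16
    elif len(part) > 1 and part[0] == "0":
        digits, base = part[1:], 8
    else:
        digits, base = part, 10
    valid = "0123456789abcdef"[:base]
    if not digits or any(c not in valid for c in digits.lower()):
        return None
    return int(digits, base)


def parse_loose(text: str) -> Optional[bytes]:
    """Return the 4-byte address for a loose literal of one to four parts."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        return None
    values = [_parse_number(p) for p in parts]
    if any(v is None for v in values):
        return None
    *leading, last = values
    if any(v > 0xFF for v in leading):
        return None
    tail_bytes = 4 - len(leading)  # the last part fills all remaining bytes
    if last >= 1 << (8 * tail_bytes):
        return None
    return bytes(leading) + last.to_bytes(tail_bytes, "big")


if __name__ == "__main__":
    for s in ("10.0.1.2", "10.0.258", "0xA000102", "012.0x102"):
        print(s, parse_loose(s), parse_strict(s))
```

With these definitions, all four strings from the example above yield the same four bytes under the loose parser, while the strict parser accepts only "10.0.1.2". A component that uses one parser to make a decision and another to act on it can therefore disagree about which host a string names.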
+ +   Both the strict and loose forms are standard forms, and hence a +   protocol specification is still ambiguous if it simply defines a +   string to be in the "standard IPv4 dotted decimal form".  And, as a +   result of these differences, names such as "10.11.12" are ambiguous +   as to whether they are an IP address or a hostname, and even +   "10.11.12.13" can be ambiguous because of the "SHOULD" in the above +   text from RFC 1123, making it optional whether to treat it as an +   address or a DNS name. + +   Protocols and data formats that can use addresses in string form for +   security purposes need to resolve these ambiguities.  For example, +   for the host component of URIs, Section 3.2.2 of [RFC3986] resolves +   the first ambiguity by only allowing the strict form and resolves the +   second ambiguity by specifying that it is considered an IPv4 address +   literal.  New protocols and data formats should similarly consider +   using the strict form rather than the loose form in order to better +   match user expectations. + +   A string might be valid under the "loose" definition but invalid +   under the "strict" definition.  As long as invalid identifiers are +   denied privilege, this difference will not result in elevation of +   privilege.  Some protocols, however, use strings that can be either +   an IP address literal or a hostname.  Such strings are at best +   Definite identifiers, and often turn out to be Indefinite +   identifiers.  (See Section 4.1 for more discussion.) + +3.1.2.  IPv6 Literals + +   IPv6 addresses similarly have a wide variety of alternate but +   semantically identical string representations, as defined in +   Section 2.2 of [RFC4291] and Section 2 of [RFC6874].  As discussed in +   Section 3.2.5 of [RFC5952], this fact causes problems in security +   contexts if comparison (such as in PKIX certificates) is done between +   strings rather than between the binary representations of addresses. 
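As a brief illustration of the point above, the following sketch compares two IPv6 literals first as strings and then as binary addresses, using the `ipaddress` module from the Python standard library. The helper name `same_ipv6_address` and the sample addresses (taken from the 2001:db8::/32 documentation prefix) are assumptions made for this example.

```python
import ipaddress


def same_ipv6_address(a: str, b: str) -> bool:
    """Compare two IPv6 literals by their 16-byte binary representations."""
    return ipaddress.IPv6Address(a).packed == ipaddress.IPv6Address(b).packed


if __name__ == "__main__":
    short = "2001:db8::1"
    long_ = "2001:0DB8:0000:0000:0000:0000:0000:0001"
    print(short == long_)                   # string comparison
    print(same_ipv6_address(short, long_))  # binary comparison
```

The string comparison treats the two literals as different identifiers, whereas the binary comparison treats them as the same address. A policy that stores one spelling and is presented with the other gives a false negative if it compares strings.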
+ +   [RFC5952] specified a recommended canonical string format as an +   attempt to solve this problem, but it may not be ubiquitously +   supported at present.  And, when strings can contain non-ASCII +   characters, the same issues (and more, since hexadecimal and colons +   are allowed) arise as with IPv4 literals. + + + + + + +Thaler                        Informational                    [Page 12] + +RFC 6943                  Identifier Comparison                 May 2013 + + +   Whereas (binary) IPv6 addresses are Absolute identifiers, IPv6 +   address literals are Definite identifiers, since string-to-address +   conversion for IPv6 address literals is unambiguous. + +3.1.3.  Internationalization + +   The IETF policy on character sets and languages [RFC2277] requires +   support for UTF-8 in protocols, and as a result many protocols now do +   support non-ASCII characters.  When a hostname is sent in a UTF-8 +   field, there are a number of ways it may be encoded.  For example, +   hostname labels might be encoded directly in UTF-8, or they might +   first be Punycode-encoded [RFC3492] or even percent-encoded from +   UTF-8. + +   For example, in URIs, Section 3.2.2 of [RFC3986] specifically allows +   for the use of percent-encoded UTF-8 characters in the hostname as +   well as the use of Internationalized Domain Names in Applications +   (IDNA) encoding [RFC3490] using the Punycode algorithm. + +   Percent-encoding is unambiguous for hostnames, since the percent +   character cannot appear in the strict definition of a "hostname", +   though it can appear in a DNS name. + +   Punycode-encoded labels (or "A-labels"), on the other hand, can be +   ambiguous if hosts are actually allowed to be named with a name +   starting with "xn--", and false positives can result.  While this may +   be extremely unlikely for normal scenarios, it nevertheless provides +   a possible vector for an attacker. 
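To make the encoding variants above concrete, the following sketch shows one hostname written directly in Unicode, percent-encoded from UTF-8, and as an A-label, and one possible way to reduce all three to a single form. It relies on Python's built-in "idna" codec, which implements the IDNA2003 rules of [RFC3490]; the helper name `to_a_label_form` and the sample name are assumptions for illustration. Whether these forms ought to match at all is a policy decision that the sketch does not settle.

```python
from urllib.parse import unquote


def to_a_label_form(host: str) -> bytes:
    """Hypothetical canonicalizer: percent-decode, then convert to A-labels."""
    decoded = unquote(host)                 # percent-decoding, UTF-8 by default
    return decoded.encode("idna").lower()   # IDNA2003 ToASCII, then ASCII case folding


if __name__ == "__main__":
    forms = [
        "b\u00fccher.example",       # Unicode label, as sent in a UTF-8 field
        "b%C3%BCcher.example",       # percent-encoded UTF-8
        "xn--bcher-kva.example",     # Punycode-encoded A-label
    ]
    for f in forms:
        print(repr(f), to_a_label_form(f))
```

The three input strings are pairwise different as raw strings, so a comparator that does no conversion treats them as three identifiers. A comparator that applies a conversion such as the one sketched here treats them as one, and it also matches a host that is literally named with the "xn--" string.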
+ +   A hostname comparator thus needs to decide whether a Punycode-encoded +   label should or should not be considered a valid hostname label, and +   if so, then whether it should match a label encoded in some other +   form such as a percent-encoded Unicode label (U-label). + +   For example, Section 3 of "Transport Layer Security (TLS) Extensions: +   Extension Definitions" [RFC6066] states: + +      "HostName" contains the fully qualified DNS hostname of the +      server, as understood by the client.  The hostname is represented +      as a byte string using ASCII encoding without a trailing dot. +      This allows the support of internationalized domain names through +      the use of A-labels defined in [RFC5890].  DNS hostnames are case- +      insensitive.  The algorithm to compare hostnames is described in +      [RFC5890], Section 2.3.2.4. + +   For some additional discussion of security issues that arise with +   internationalization, see Section 4.2 and [TR36]. + +3.1.4.  Resolution for Comparison + +   Some systems (specifically Java URLs [JAVAURL]) use the rule that if +   two hostnames resolve to the same IP address(es) then the hostnames +   are considered equal.  That is, the canonicalization algorithm +   involves name resolution with an IP address being the canonical form. + +   For example, if resolution was done via DNS, and DNS contained: + +                       example.com.  IN A 10.0.0.6 +                       example.net.  CNAME example.com. +                       example.org.  IN A 10.0.0.6 + +   then the algorithm might treat all three names as equal, even though +   the third name might refer to a different entity. + +   With the introduction of dynamic IP addresses; private IP addresses; +   multiple IP addresses per name; multiple address families (e.g., IPv4 +   vs. 
IPv6); devices that roam to new locations; commonly deployed DNS +   tricks that result in the answer depending on factors such as the +   requester's location and the load on the server whose address is +   returned; etc., this method of comparison cannot be relied upon. +   There is no guarantee that two names for the same host will resolve +   to the same IP addresses; nor that the addresses resolved +   refer to the same entity, such as when the names resolve to private +   IP addresses; nor even that the system has connectivity (and the +   willingness to wait for the delay) to resolve names at the time the +   answer is needed.  The lifetime of the identifier, and of any cached +   state from a previous resolution, also affects security (see +   Section 4.4). + +   In addition, a comparison mechanism that relies on the ability to +   resolve identifiers such as hostnames to other identifiers such as IP +   addresses leaks information about security decisions to outsiders if +   these queries are publicly observable.  (See [PRIVACY-CONS] for a +   deeper discussion of information disclosure.) + +   Finally, it is worth noting that resolving two identifiers to +   determine if they refer to the same entity can be thought of as a use +   of such identifiers, as opposed to actually comparing the identifiers +   themselves, which is the focus of this document. + +3.2.  Port Numbers and Service Names + +   Port numbers and service names are discussed in depth in [RFC6335]. +   Historically, there were port numbers, service names used in SRV +   records, and mnemonic identifiers for assigned port numbers (known as +   port "keywords" at [IANA-PORT]).  The latter two are now unified, and +   various protocols use one or more of these types in strings.  
For +   example, the common syntax used by many URI schemes allows port +   numbers but not service names.  Some implementations of the +   getaddrinfo() API support strings that can be either port numbers or +   port keywords (but not service names). + +   For protocols that use service names that must be resolved, the +   issues are the same as those for resolution of addresses in +   Section 3.1.4.  In addition, Section 5.1 of [RFC6335] clarifies that +   service names/port keywords must contain at least one letter.  This +   prevents confusion with port numbers in strings where both are +   allowed. + +3.3.  URIs + +   This section looks at issues related to using URIs for security +   purposes.  For example, Section 7.4 of [RFC5280] specifies comparison +   of URIs in certificates.  Examples of URIs in security-token-based +   access control systems include WS-*, SAML 2.0 [OASIS-SAMLv2-CORE], +   and OAuth Web Resource Authorization Profiles (WRAP) [OAuth-WRAP]. +   In such systems, a variety of participants in the security +   infrastructure are identified by URIs.  For example, requesters of +   security tokens are sometimes identified with URIs.  The issuers of +   security tokens and the relying parties who are intended to consume +   security tokens are frequently identified by URIs.  Claims in +   security tokens often have their types defined using URIs, and the +   values of the claims can also be URIs. + +   URIs are defined with multiple components, each of which has its own +   rules.  We cover each in turn below.  However, it is also important +   to note that there exist multiple comparison algorithms.  Section 6.2 +   of [RFC3986] states: + +      A variety of methods are used in practice to test URI equivalence. +      These methods fall into a range, distinguished by the amount of +      processing required and the degree to which the probability of +      false negatives is reduced.  As noted above, false negatives +      cannot be eliminated. 
 In practice, their probability can be +      reduced, but this reduction requires more processing and is not +      cost-effective for all applications. + +      If this range of comparison practices is considered as a ladder, +      the following discussion will climb the ladder, starting with +      practices that are cheap but have a relatively higher chance of +      producing false negatives, and proceeding to those that have +      higher computational cost and lower risk of false negatives. + +   The ladder approach has both pros and cons.  On the pro side, it +   allows some uses to optimize for security, and other uses to optimize +   for cost, thus allowing URIs to be applicable to a wide range of +   uses.  A disadvantage is that when different approaches are taken by +   different components in the same system using the same identifiers, +   the inconsistencies can result in security issues. + +3.3.1.  Scheme Component + +   [RFC3986] defines URI schemes as being case-insensitive US-ASCII and +   in Section 6.2.2.1 specifies that scheme names should be normalized +   to lowercase characters. + +   New schemes can be defined over time.  In general, however, two URIs +   with an unrecognized scheme cannot be safely compared.  This is +   because the canonicalization and comparison rules for the other +   components may vary by scheme.  For example, a new URI scheme might +   have a default port of X, and without that knowledge, a comparison +   algorithm cannot know whether "example.com" and "example.com:X" +   should be considered to match in the authority component.  Hence, for +   security purposes, it is safest for unrecognized schemes to be +   treated as invalid identifiers.  
However, if the URIs are only used +   with a "grant access on match" paradigm, then unrecognized schemes +   can be supported by doing a generic case-sensitive comparison, at the +   expense of some false negatives. + +3.3.2.  Authority Component + +   The authority component is scheme-specific, but many schemes follow a +   common syntax that allows for userinfo, host, and port. + +3.3.2.1.  Host + +   Section 3.1 discusses issues with hostnames in general.  In addition, +   Section 3.2.2 of [RFC3986] allows future changes using the IPvFuture +   production.  As with IPv4 and IPv6 literals, IPvFuture formats may +   have issues with multiple semantically identical string +   representations and may also be semantically identical to an IPv4 or +   IPv6 address.  As such, false negatives may be common if IPvFuture is +   used. + +3.3.2.2.  Port + +   See discussion in Section 3.2. + +3.3.2.3.  Userinfo + +   [RFC3986] defines the userinfo production that allows arbitrary data +   about the user of the URI to be placed before '@' signs in URIs.  For +   example, "ftp://alice:bob@example.com/bar" has the value "alice:bob" +   as its userinfo.  When comparing URIs in a security context, one must +   decide whether to treat the userinfo as being significant or not. +   Some URI comparison services, for example, treat +   "ftp://alice:ick@example.com" and "ftp://example.com" as being equal. + +   When the userinfo is treated as being significant, it has additional +   considerations (e.g., whether or not it is case sensitive), which we +   cover in Section 3.4. + +3.3.3.  Path Component + +   [RFC3986] supports the use of path segment values such as "./" or +   "../" for relative URIs.  
As discussed in Section 6.2.2.3 of +   [RFC3986], they are intended only for use within a reference relative +   to some other base URI, but Section 5.2.4 of [RFC3986] nevertheless +   defines an algorithm to remove them as part of URI normalization. + +   Unless a scheme states otherwise, the path component is defined to be +   case sensitive.  However, if the resource is stored and accessed +   using a filesystem using case-insensitive paths, there will be many +   paths that refer to the same resource.  As such, false negatives can +   be common in this case. + +3.3.4.  Query Component + +   There is the question as to whether "http://example.com/foo", +   "http://example.com/foo?", and "http://example.com/foo?bar" are each +   considered equal or different. + +   Similarly, it is unspecified whether the order of values matters. +   For example, should "http://example.com/blah?ick=bick&foo=bar" be +   considered equal to "http://example.com/blah?foo=bar&ick=bick"?  And +   if a domain name is permitted to appear in a query component (e.g., +   in a reference to another URI), the same issues in Section 3.1 apply. + +3.3.5.  Fragment Component + +   Some URI formats include fragment identifiers.  These are typically +   handles to locations within a resource and are used for local +   reference.  A classic example is the use of fragments in HTTP URIs +   where a URI of the form "http://example.com/blah.html#ick" means +   retrieve the resource "http://example.com/blah.html" and, once it has +   arrived locally, find the HTML anchor named "ick" and display that. 
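To make the open questions about queries and fragments concrete, the following sketch uses Python's urllib.parse to implement one possible policy: ignore the fragment, and treat queries as equal regardless of parameter order. The function names are hypothetical, and the policy itself is an assumption chosen for illustration. Other components of a system may legitimately choose differently, which is exactly the source of the inconsistencies discussed above.

```python
from urllib.parse import parse_qsl, urldefrag, urlsplit


def strip_fragment(uri: str) -> str:
    """Return the URI without its fragment component."""
    return urldefrag(uri).url


def same_ignoring_query_order(a: str, b: str) -> bool:
    """Compare two URIs, treating query parameters as an unordered list.

    Note that parse_qsl also percent-decodes names and values, which is
    itself a comparison policy decision.
    """
    pa, pb = urlsplit(strip_fragment(a)), urlsplit(strip_fragment(b))
    qa = sorted(parse_qsl(pa.query, keep_blank_values=True))
    qb = sorted(parse_qsl(pb.query, keep_blank_values=True))
    return (pa.scheme, pa.netloc, pa.path, qa) == (pb.scheme, pb.netloc, pb.path, qb)


reordered = same_ignoring_query_order(
    "http://example.com/blah?ick=bick&foo=bar",
    "http://example.com/blah?foo=bar&ick=bick",
)
empty_query = same_ignoring_query_order(
    "http://example.com/foo", "http://example.com/foo?"
)
extra_query = same_ignoring_query_order(
    "http://example.com/foo", "http://example.com/foo?bar"
)
```

Under this particular policy, the reordered queries match and an empty query matches an absent one, while "?bar" does not match either. A simple string comparison would treat all of these URIs as different.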
+ +   So, for example, when a user clicks on the link +   "http://example.com/blah.html#baz", a browser will check its cache by +   doing a URI comparison for "http://example.com/blah.html" and, if the +   resource is present in the cache, a match is declared. + +   Hence, comparisons for security purposes typically ignore the +   fragment component and treat all fragments as equal to the full +   resource.  However, if one were actually trying to compare the piece +   of a resource that was identified by the fragment identifier, +   ignoring it would result in potential false positives. + +3.3.6.  Resolution for Comparison + +   It may be tempting to define a URI comparison algorithm based on +   whether URIs resolve to the same content, along the lines of +   resolving hostnames as described in Section 3.1.4.  However, such an +   algorithm would result in similar problems, including content that +   dynamically changes over time or that is based on factors such as the +   requester's location, potential lack of external connectivity at the +   time or place that comparison is done, introduction of potentially +   undesirable delay, etc. + +   In addition, as noted in Section 3.1.4, resolution leaks information +   about security decisions to outsiders if the queries are publicly +   observable. + +3.4.  Email Address-Like Identifiers + +   Section 3.4.1 of [RFC5322] defines the syntax of an email address- +   like identifier, and Section 3.2 of [RFC6532] updates it to support +   internationalization.  Section 7.5 of [RFC5280] further discusses the +   use of internationalized email addresses in certificates. 
+ +   Regarding the security impact of internationalized email headers, +   [RFC6532] points to Section 14 of [RFC6530], which contains a +   discussion of many issues resulting from internationalization. + +   Email address-like identifiers have a local part and a domain part. +   The issues with the domain part are essentially the same as with +   hostnames, as covered earlier in Section 3.1. + +   The local part is left for each domain to define.  People quite +   commonly use email addresses as usernames with web sites such as +   banks or shopping sites, but the site doesn't know whether +   foo@example.com is the same person as FOO@example.com.  Thus, email +   address-like identifiers are typically Indefinite identifiers. + +   To avoid false positives, some security mechanisms (such as those +   described in [RFC5280]) compare the local part using an exact match. +   Hence, like URIs, email address-like identifiers are designed for use +   in grant-on-match security schemes, not in deny-on-match schemes. + +   Furthermore, when such identifiers are actually used as email +   addresses, Section 2.4 of [RFC5321] states that the local part of a +   mailbox must be treated as case sensitive, but if a mailbox is stored +   and accessed using a filesystem using case-insensitive paths, there +   may be many paths that refer to the same mailbox.  As such, false +   negatives can be common in this case. + +4.  General Issues + +4.1.  Conflation + +   There are a number of examples (some in the preceding sections) of +   strings that conflate two types of identifiers, using some heuristic +   to try to determine which type of identifier is given.  Similarly, +   two ways of encoding the same type of identifier might be conflated +   within the same string. + +   Some examples include: + +   1.  
A string that might be an IPv4 address literal or an IPv6 address +       literal + +   2.  A string that might be an IP address literal or a hostname + +   3.  A string that might be a port number or a service name + +   4.  A DNS label that might be literal or be Punycode-encoded + +   Strings that allow such conflation can only be considered Definite if +   there exists a well-defined rule to determine which identifier type +   is meant.  One way to do so is to ensure that the valid syntax for +   the two is disjoint (e.g., distinguishing IPv4 vs. IPv6 address +   literals by the use of colons in the latter).  A second way to do so +   is to define a precedence rule that results in some identifiers being +   inaccessible via a conflated string (e.g., a host literally named +   "xn--de-jg4avhby1noc0d" may be inaccessible due to the "xn--" prefix +   denoting the use of Punycode encoding).  In some cases, such +   inaccessible space may be reserved so that the actual set of +   identifiers in use is unambiguous.  For example, Section 2.5.5.2 of +   [RFC4291] defines a range of the IPv6 address space for representing +   IPv4 addresses. + +4.2.  Internationalization + +   In addition to the issues with hostnames discussed in Section 3.1.3, +   there are a number of internationalization issues that apply to many +   types of Definite and Indefinite identifiers. + +   First, there is no DNS mechanism for identifying whether +   non-identical strings would be seen by a human as being equivalent. 
+   There are problematic examples even with US-ASCII (Basic Latin) +   strings, including regional spelling variations such as "color" and +   "colour", and with many non-English cases, including partially +   numeric strings in Arabic script contexts, Chinese strings in +   Simplified and Traditional forms, and so on.  Attempts to produce +   such alternate forms algorithmically could produce false positives +   and hence have an adverse effect on security. + +   Second, some strings are visually confusable with others, and hence +   if a security decision is made by a user based on visual inspection, +   many opportunities for false positives exist.  As such, using visual +   inspection for security is unreliable.  In addition to the security +   issues, visual confusability also adversely affects the usability of +   identifiers distributed via visual media.  Similar issues can arise +   with audible confusability when using audio (e.g., for radio +   distribution, accessibility to the blind, etc.) in place of a visual +   medium.  Furthermore, when strings conflate two types of identifiers +   as discussed in Section 4.1, allowing non-ASCII characters can cause +   one type of identifier to appear to a human as another type of +   identifier.  For example, characters that may look like digits and +   dots may appear to be an IPv4 literal to a human (especially to one +   who might expect digits to appear in his or her native script). +   Hence, conflation often increases the chance of confusability. + +   Determining whether a string is a valid identifier should typically +   be done after, or as part of, canonicalization.  Otherwise, an +   attacker might use the canonicalization algorithm to inject (e.g., +   via percent encoding, Normalization Form KC (NFKC), or non-shortest- +   form UTF-8) delimiters such as '@' in an email address-like +   identifier, or a '.' in a hostname. 
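The ordering problem described above can be illustrated with a small sketch. It assumes a hypothetical rule that a single hostname label must not contain a '.', and it uses NFKC normalization from Python's unicodedata module as the canonicalization step. Both function names are invented for this example.

```python
import unicodedata


def is_single_label_checked_too_early(label: str) -> bool:
    """Flawed: validates the raw input, before canonicalization."""
    return "." not in label


def is_single_label(label: str) -> bool:
    """Safer: canonicalizes first, then validates the canonical form."""
    canonical = unicodedata.normalize("NFKC", label)
    return "." not in canonical


# U+FF0E FULLWIDTH FULL STOP is not '.', but NFKC maps it to '.'.
suspicious = "example\uff0ecom"

passes_early_check = is_single_label_checked_too_early(suspicious)
passes_late_check = is_single_label(suspicious)
canonical_form = unicodedata.normalize("NFKC", suspicious)
```

The early check accepts the input as a single label, yet the canonical form that later components see contains a delimiter. Validating after canonicalization closes that gap.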
+ +   Any case-insensitive comparisons need to define how comparison is +   done, since such comparisons may vary by the locale of the endpoint. +   As such, using case-insensitive comparisons in general often results +   in identifiers being either Indefinite or, if the legal character set +   is restricted (e.g., to US-ASCII), Definite. + +   See also [WEBER] for a more visual discussion of many of these +   issues. + +   Finally, the set of permitted characters and the canonical form of +   the characters (and hence the canonicalization algorithm) sometimes +   vary by protocol today, even when the intent is to use the same +   identifier, such as when one protocol passes identifiers to the +   other.  See [RFC6885] for further discussion. + +4.3.  Scope + +   Another issue arises when an identifier (e.g., "localhost", +   "10.11.12.13", etc.) is not globally unique.  Section 1.1 of +   [RFC3986] states: + +      URIs have a global scope and are interpreted consistently +      regardless of context, though the result of that interpretation +      may be in relation to the end-user's context.  For example, +      "http://localhost/" has the same interpretation for every user of +      that reference, even though the network interface corresponding to +      "localhost" may be different for each end-user: interpretation is +      independent of access. + +   Whenever an identifier that is not globally unique is passed to +   another entity outside of the scope of uniqueness, it will refer to a +   different resource and can result in a false positive.  This problem +   is often addressed by using the identifier together with some other +   unique identifier of the context.  
For example, "alice" may uniquely +   identify a user within a system but must be used with "example.com" +   (as in "alice@example.com") to uniquely identify the context outside +   of that system. + +   It is also worth noting that IPv6 addresses that are not globally +   scoped can be written with, or otherwise associated with, a "zone ID" +   to identify the context (see [RFC4007] for more information). +   However, zone IDs are only unique within a host, so they typically +   narrow, rather than expand, the scope of uniqueness of the resulting +   identifier. + +4.4.  Temporality + +   Often, identifiers are not unique across all time but have some +   lifetime associated with them after which they may be reassigned to +   another entity.  For example, bob@example.com might be assigned to an +   employee of the Example company, but if he leaves and another Bob is +   later hired, the same identifier might be reused.  As another +   example, IP address 203.0.113.1 might be assigned to one subscriber +   and then later reassigned to another subscriber.  Security issues can +   arise if updates are not made in all entities that store the +   identifier (e.g., in an access control list as discussed in +   Section 2, or in a resolution cache as discussed in Section 3.1.4). + +   This issue is similar to the issue of scope discussed in Section 4.3, +   except that the scope of uniqueness is temporal rather than +   topological. + +5.  Security Considerations + +   This entire document is about security considerations. + +   To minimize issues related to elevation of privilege, any system that +   requires the ability to use both deny and allow operations within the +   same identifier space should avoid the use of Indefinite identifiers +   in security comparisons. 
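The asymmetry between allow and deny operations can be sketched as follows, assuming an exact-match comparison and the foo@example.com example from Section 3.4. The function names and the lists are hypothetical, and the sketch further assumes that both spellings refer to the same mailbox, which the comparing party generally cannot know.

```python
def exact_match(a: str, b: str) -> bool:
    """Exact comparison: no false positives, but false negatives are possible."""
    return a == b


def permitted_by_grant_list(identifier: str, grant_list: list[str]) -> bool:
    """Grant-on-match: access only if the identifier matches an entry."""
    return any(exact_match(identifier, entry) for entry in grant_list)


def permitted_by_deny_list(identifier: str, deny_list: list[str]) -> bool:
    """Deny-on-match: access unless the identifier matches an entry."""
    return not any(exact_match(identifier, entry) for entry in deny_list)


listed = ["foo@example.com"]
variant = "FOO@example.com"  # assumed here to name the same mailbox

# False negative under grant-on-match: access is refused (fails safe).
granted = permitted_by_grant_list(variant, listed)

# False negative under deny-on-match: access is allowed (privilege elevation).
slipped_through = permitted_by_deny_list(variant, listed)
```

The same false negative is harmless in the first case and a security failure in the second, which is why mixing both operations over Indefinite identifiers is discouraged.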
+ +   To minimize future security risks, any new identifiers being designed +   should specify an Absolute or Definite comparison algorithm, and if +   extensibility is allowed (e.g., as new schemes in URIs allow), then +   the comparison algorithm should remain invariant so that unrecognized +   extensions can be compared.  That is, security risks can be reduced +   by specifying the comparison algorithm, making sure to resolve any +   ambiguities pointed out in this document (e.g., "standard dotted +   decimal"). + +   Some issues (such as unrecognized extensions) can be mitigated by +   treating such identifiers as invalid.  Validity checking of +   identifiers is further discussed in [RFC3696]. + +   Perhaps the hardest issues arise when multiple protocols are used +   together, such as in Figure 2, where the two protocols are defined or +   implemented using different comparison algorithms.  When constructing +   an architecture that uses multiple such protocols, designers should +   pay attention to any differences in comparison algorithms among the +   protocols in order to fully understand the security risks.  How to +   deal with such security risks in current systems is an area for +   future work. + +6.  Acknowledgements + +   Yaron Goland contributed to the discussion on URIs.  Patrik Faltstrom +   contributed to the background on identifiers.  John Klensin +   contributed text in a number of different sections.  Additional +   helpful feedback and suggestions came from Bernard Aboba, Fred Baker, +   Leslie Daigle, Mark Davis, Jeff Hodges, Bjoern Hoehrmann, Russ +   Housley, Christian Huitema, Magnus Nystrom, Tom Petch, and Chris +   Weber. + +7.  
IAB Members at the Time of Approval + +   Bernard Aboba +   Jari Arkko +   Marc Blanchet +   Ross Callon +   Alissa Cooper +   Spencer Dawkins +   Joel Halpern +   Russ Housley +   David Kessens +   Danny McPherson +   Jon Peterson +   Dave Thaler +   Hannes Tschofenig + +8.  Informative References + +   [IAB1123]  Internet Architecture Board, "IAB Statement: 'The +              interpretation of rules in the ICANN gTLD Applicant +              Guidebook'", February 2012, <http://www.iab.org/documents/ +              correspondence-reports-documents/2012-2/iab-statement-the- +              interpretation-of-rules-in-the-icann-gtld-applicant- +              guidebook>. + +   [IANA-PORT] +              IANA, "Service Name and Transport Protocol Port Number +              Registry", March 2013, +              <http://www.iana.org/assignments/service-names-port- +              numbers/>. + +   [IEEE-1003.1] +              IEEE and The Open Group, "The Open Group Base +              Specifications, Issue 6, IEEE Std 1003.1, 2004 Edition", +              IEEE Std 1003.1, 2004. + +   [JAVAURL]  Oracle, "Class URL", Java(TM) Platform Standard Ed. 7, +              2013, <http://docs.oracle.com/javase/7/docs/api/java/net/ +              URL.html>. + +   [OASIS-SAMLv2-CORE] +              Cantor, S., Ed., Kemp, J., Ed., Philpott, R., Ed., and E. +              Maler, Ed., "Assertions and Protocols for the OASIS +              Security Assertion Markup Language (SAML) V2.0", OASIS +              Standard saml-core-2.0-os, March 2005, +              <http://docs.oasis-open.org/security/saml/v2.0/ +              saml-core-2.0-os.pdf>. + +   [OAuth-WRAP] +              Hardt, D., Ed., Tom, A., Eaton, B., and Y. 
Goland, "OAuth +              Web Resource Authorization Profiles", Work in Progress, +              January 2010. + +   [PRIVACY-CONS] +              Cooper, A., Tschofenig, H., Aboba, B., Peterson, J., +              Morris, J., Hansen, M., and R. Smith, "Privacy +              Considerations for Internet Protocols", Work in Progress, +              April 2013. + +   [RFC1034]  Mockapetris, P., "Domain names - concepts and facilities", +              STD 13, RFC 1034, November 1987. + +   [RFC1123]  Braden, R., "Requirements for Internet Hosts - Application +              and Support", STD 3, RFC 1123, October 1989. + +   [RFC2277]  Alvestrand, H.T., "IETF Policy on Character Sets and +              Languages", BCP 18, RFC 2277, January 1998. + +   [RFC3490]  Faltstrom, P., Hoffman, P., and A. Costello, +              "Internationalizing Domain Names in Applications (IDNA)", +              RFC 3490, March 2003. + +   [RFC3492]  Costello, A., "Punycode: A Bootstring encoding of Unicode +              for Internationalized Domain Names in Applications +              (IDNA)", RFC 3492, March 2003. + +   [RFC3493]  Gilligan, R., Thomson, S., Bound, J., McCann, J., and W. +              Stevens, "Basic Socket Interface Extensions for IPv6", +              RFC 3493, February 2003. + +   [RFC3696]  Klensin, J., "Application Techniques for Checking and +              Transformation of Names", RFC 3696, February 2004. + +   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform +              Resource Identifier (URI): Generic Syntax", STD 66, +              RFC 3986, January 2005. + +   [RFC4007]  Deering, S., Haberman, B., Jinmei, T., Nordmark, E., and +              B. Zill, "IPv6 Scoped Address Architecture", RFC 4007, +              March 2005. + +   [RFC4291]  Hinden, R. and S. Deering, "IP Version 6 Addressing +              Architecture", RFC 4291, February 2006. 
+ +   [RFC4790]  Newman, C., Duerst, M., and A. Gulbrandsen, "Internet +              Application Protocol Collation Registry", RFC 4790, +              March 2007. + +   [RFC4949]  Shirey, R., "Internet Security Glossary, Version 2", +              RFC 4949, August 2007. + +   [RFC5280]  Cooper, D., Santesson, S., Farrell, S., Boeyen, S., +              Housley, R., and W. Polk, "Internet X.509 Public Key +              Infrastructure Certificate and Certificate Revocation List +              (CRL) Profile", RFC 5280, May 2008. + +   [RFC5321]  Klensin, J., "Simple Mail Transfer Protocol", RFC 5321, +              October 2008. + +   [RFC5322]  Resnick, P., Ed., "Internet Message Format", RFC 5322, +              October 2008. + +   [RFC5952]  Kawamura, S. and M. Kawashima, "A Recommendation for IPv6 +              Address Text Representation", RFC 5952, August 2010. + +   [RFC6055]  Thaler, D., Klensin, J., and S. Cheshire, "IAB Thoughts on +              Encodings for Internationalized Domain Names", RFC 6055, +              February 2011. + +   [RFC6066]  Eastlake, D., "Transport Layer Security (TLS) Extensions: +              Extension Definitions", RFC 6066, January 2011. + +   [RFC6125]  Saint-Andre, P. and J. Hodges, "Representation and +              Verification of Domain-Based Application Service Identity +              within Internet Public Key Infrastructure Using X.509 +              (PKIX) Certificates in the Context of Transport Layer +              Security (TLS)", RFC 6125, March 2011. + +   [RFC6335]  Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S. 
+              Cheshire, "Internet Assigned Numbers Authority (IANA) +              Procedures for the Management of the Service Name and +              Transport Protocol Port Number Registry", BCP 165, +              RFC 6335, August 2011. + +   [RFC6530]  Klensin, J. and Y. Ko, "Overview and Framework for +              Internationalized Email", RFC 6530, February 2012. + +   [RFC6532]  Yang, A., Steele, S., and N. Freed, "Internationalized +              Email Headers", RFC 6532, February 2012. + +   [RFC6818]  Yee, P., "Updates to the Internet X.509 Public Key +              Infrastructure Certificate and Certificate Revocation List +              (CRL) Profile", RFC 6818, January 2013. + +   [RFC6874]  Carpenter, B., Cheshire, S., and R. Hinden, "Representing +              IPv6 Zone Identifiers in Address Literals and Uniform +              Resource Identifiers", RFC 6874, February 2013. + +   [RFC6885]  Blanchet, M. and A. Sullivan, "Stringprep Revision and +              Problem Statement for the Preparation and Comparison of +              Internationalized Strings (PRECIS)", RFC 6885, March 2013. + +   [TR36]     Unicode Consortium, "Unicode Security Considerations", +              Unicode Technical Report #36, Revision 11, July 2012, +              <http://www.unicode.org/reports/tr36/>. + +   [USASCII]  American National Standards Institute, "Coded Character +              Sets -- 7-bit American Standard Code for Information +              Interchange (7-bit ASCII)", ANSI X3.4, 1986. + +   [WEBER]    Weber, C., "Attacking Software Globalization", March 2010, +              <http://www.lookout.net/files/ +              Chris_Weber_Character%20Transformations%20v1.7_IUC33.pdf>. 
+ +Author's Address + +   Dave Thaler (editor) +   Microsoft Corporation +   One Microsoft Way +   Redmond, WA  98052 +   USA + +   Phone: +1 425 703 8835 +   EMail: dthaler@microsoft.com