diff options
Diffstat (limited to 'doc/rfc/rfc5891.txt')
-rw-r--r-- | doc/rfc/rfc5891.txt | 955 |
1 files changed, 955 insertions, 0 deletions
diff --git a/doc/rfc/rfc5891.txt b/doc/rfc/rfc5891.txt new file mode 100644 index 0000000..9ab9985 --- /dev/null +++ b/doc/rfc/rfc5891.txt @@ -0,0 +1,955 @@ + + + + + + +Internet Engineering Task Force (IETF) J. Klensin +Request for Comments: 5891 August 2010 +Obsoletes: 3490, 3491 +Updates: 3492 +Category: Standards Track +ISSN: 2070-1721 + + + Internationalized Domain Names in Applications (IDNA): Protocol + +Abstract + + This document is the revised protocol definition for + Internationalized Domain Names (IDNs). The rationale for changes, + the relationship to the older specification, and important + terminology are provided in other documents. This document specifies + the protocol mechanism, called Internationalized Domain Names in + Applications (IDNA), for registering and looking up IDNs in a way + that does not require changes to the DNS itself. IDNA is only meant + for processing domain names, not free text. + +Status of This Memo + + This is an Internet Standards Track document. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Further information on + Internet Standards is available in Section 2 of RFC 5741. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + http://www.rfc-editor.org/info/rfc5891. + + + + + + + + + + + + + + + + + +Klensin Standards Track [Page 1] + +RFC 5891 IDNA2008 Protocol August 2010 + + +Copyright Notice + + Copyright (c) 2010 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (http://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. Code Components extracted from this document must + include Simplified BSD License text as described in Section 4.e of + the Trust Legal Provisions and are provided without warranty as + described in the Simplified BSD License. + + This document may contain material from IETF Documents or IETF + Contributions published or made publicly available before November + 10, 2008. The person(s) controlling the copyright in some of this + material may not have granted the IETF Trust the right to allow + modifications of such material outside the IETF Standards Process. + Without obtaining an adequate license from the person(s) controlling + the copyright in such materials, this document may not be modified + outside the IETF Standards Process, and derivative works of it may + not be created outside the IETF Standards Process, except to format + it for publication as an RFC or to translate it into languages other + than English. + + + + + + + + + + + + + + + + + + + + + + + + + +Klensin Standards Track [Page 2] + +RFC 5891 IDNA2008 Protocol August 2010 + + +Table of Contents + + 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 + 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 + 3. Requirements and Applicability . . . . . . . . . . . . . . . . 5 + 3.1. Requirements . . . . . . . . . . . . . . . . . . . . . . . 5 + 3.2. Applicability . . . . . . . . . . . . . . . . . . . . . . 5 + 3.2.1. DNS Resource Records . . . . . . . . . . . . . . . . . 6 + 3.2.2. Non-Domain-Name Data Types Stored in the DNS . . . . . 6 + 4. Registration Protocol . . . . . . . . . . . . . . . . . . . . 6 + 4.1. Input to IDNA Registration . . . . . . . . . . . . . . . . 7 + 4.2. Permitted Character and Label Validation . . . . . . . . . 7 + 4.2.1. Input Format . . . . . . . . . . . . . . . . . . . . . 7 + 4.2.2. Rejection of Characters That Are Not Permitted . . . . 8 + 4.2.3. Label Validation . . . . . . . . . . . . . . . . . . . 8 + 4.2.4. Registration Validation Requirements . . . . . . . . . 9 + 4.3. Registry Restrictions . . . . . . . . . . . . . . . . . . 9 + 4.4. Punycode Conversion . . . . . . . . . . . . . . . . . . . 9 + 4.5. Insertion in the Zone . . . . . . . . . . . . . . . . . . 10 + 5. Domain Name Lookup Protocol . . . . . . . . . . . . . . . . . 10 + 5.1. Label String Input . . . . . . . . . . . . . . . . . . . . 10 + 5.2. Conversion to Unicode . . . . . . . . . . . . . . . . . . 10 + 5.3. A-label Input . . . . . . . . . . . . . . . . . . . . . . 10 + 5.4. Validation and Character List Testing . . . . . . . . . . 11 + 5.5. Punycode Conversion . . . . . . . . . . . . . . . . . . . 13 + 5.6. DNS Name Resolution . . . . . . . . . . . . . . . . . . . 13 + 6. Security Considerations . . . . . . . . . . . . . . . . . . . 13 + 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13 + 8. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 13 + 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 14 + 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 14 + 10.1. Normative References . . . . . . . . . . . . . . . . . . . 14 + 10.2. Informative References . . . . . . . . . . . . . . . . . . 15 + Appendix A. Summary of Major Changes from IDNA2003 . . . . . . . 17 + + + + + + + + + + + + + + + + + +Klensin Standards Track [Page 3] + +RFC 5891 IDNA2008 Protocol August 2010 + + +1. Introduction + + This document supplies the protocol definition for Internationalized + Domain Names in Applications (IDNA), with the version specified here + known as IDNA2008. Essential definitions and terminology for + understanding this document and a road map of the collection of + documents that make up IDNA2008 appear in a separate Definitions + document [RFC5890]. Appendix A discusses the relationship between + this specification and the earlier version of IDNA (referred to here + as "IDNA2003"). The rationale for these changes, along with + considerable explanatory material and advice to zone administrators + who support IDNs, is provided in another document, known informally + in this series as the "Rationale document" [RFC5894]. + + IDNA works by allowing applications to use certain ASCII [ASCII] + string labels (beginning with a special prefix) to represent + non-ASCII name labels. Lower-layer protocols need not be aware of + this; therefore, IDNA does not change any infrastructure. In + particular, IDNA does not depend on any changes to DNS servers, + resolvers, or DNS protocol elements, because the ASCII name service + provided by the existing DNS can be used for IDNA. + + IDNA applies only to a specific subset of DNS labels. The base DNS + standards [RFC1034] [RFC1035] and their various updates specify how + to combine labels into fully-qualified domain names and parse labels + out of those names. + + This document describes two separate protocols, one for IDN + registration (Section 4) and one for IDN lookup (Section 5). These + two protocols share some terminology, reference data, and operations. + +2. Terminology + + As mentioned above, terminology used as part of the definition of + IDNA appears in the Definitions document [RFC5890]. It is worth + noting that some of this terminology overlaps with, and is consistent + with, that used in Unicode or other character set standards and the + DNS. Readers of this document are assumed to be familiar with the + associated Definitions document and with the DNS-specific terminology + in RFC 1034 [RFC1034]. + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in BCP 14, RFC 2119 + [RFC2119]. + + + + + + +Klensin Standards Track [Page 4] + +RFC 5891 IDNA2008 Protocol August 2010 + + +3. Requirements and Applicability + +3.1. Requirements + + IDNA makes the following requirements: + + 1. Whenever a domain name is put into a domain name slot that is not + IDNA-aware (see Section 2.3.2.6 of the Definitions document + [RFC5890]), it MUST contain only ASCII characters (i.e., its + labels must be either A-labels or NR-LDH labels), unless the DNS + application is not subject to historical recommendations for + "hostname"-style names (see RFC 1034 [RFC1034] and + Section 3.2.1). + + 2. Labels MUST be compared using equivalent forms: either both + A-label forms or both U-label forms. Because A-labels and + U-labels can be transformed into each other without loss of + information, these comparisons are equivalent (however, in + practice, comparison of U-labels requires first verifying that + they actually are U-labels and not just Unicode strings). A pair + of A-labels MUST be compared as case-insensitive ASCII (as with + all comparisons of ASCII DNS labels). U-labels MUST be compared + as-is, without case folding or other intermediate steps. While + it is not necessary to validate labels in order to compare them, + successful comparison does not imply validity. In many cases, + not limited to comparison, validation may be important for other + reasons and SHOULD be performed. + + 3. Labels being registered MUST conform to the requirements of + Section 4. Labels being looked up and the lookup process MUST + conform to the requirements of Section 5. + +3.2. Applicability + + IDNA applies to all domain names in all domain name slots in + protocols except where it is explicitly excluded. It does not apply + to domain name slots that do not use the LDH syntax rules as + described in the Definitions document [RFC5890]. + + Because it uses the DNS, IDNA applies to many protocols that were + specified before it was designed. IDNs occupying domain name slots + in those older protocols MUST be in A-label form until and unless + those protocols and their implementations are explicitly upgraded to + be aware of IDNs and to accept the U-label form. IDNs actually + appearing in DNS queries or responses MUST be A-labels. + + + + + + +Klensin Standards Track [Page 5] + +RFC 5891 IDNA2008 Protocol August 2010 + + + IDNA-aware protocols and implementations MAY accept U-labels, + A-labels, or both as those particular protocols specify. IDNA is not + defined for extended label types (see RFC 2671 [RFC2671], Section 3). + +3.2.1. DNS Resource Records + + IDNA applies only to domain names in the NAME and RDATA fields of DNS + resource records whose CLASS is IN. See the DNS specification + [RFC1035] for precise definitions of these terms. + + The application of IDNA to DNS resource records depends entirely on + the CLASS of the record, and not on the TYPE except as noted below. + This will remain true, even as new TYPEs are defined, unless a new + TYPE defines TYPE-specific rules. Special naming conventions for SRV + records (and "underscore labels" more generally) are incompatible + with IDNA coding as discussed in the Definitions document [RFC5890], + especially Section 2.3.2.3. Of course, underscore labels may be part + of a domain that uses IDN labels at higher levels in the tree. + +3.2.2. Non-Domain-Name Data Types Stored in the DNS + + Although IDNA enables the representation of non-ASCII characters in + domain names, that does not imply that IDNA enables the + representation of non-ASCII characters in other data types that are + stored in domain names, specifically in the RDATA field for types + that have structured RDATA format. For example, an email address + local part is stored in a domain name in the RNAME field as part of + the RDATA of an SOA record (e.g., hostmaster@example.com would be + represented as hostmaster.example.com). IDNA does not update the + existing email standards, which allow only ASCII characters in local + parts. Even though work is in progress to define + internationalization for email addresses [RFC4952], changes to the + email address part of the SOA RDATA would require action in, or + updates to, other standards, specifically those that specify the + format of the SOA RR. + +4. Registration Protocol + + This section defines the model for registering an IDN. The model is + implementation independent; any sequence of steps that produces + exactly the same result for all labels is considered a valid + implementation. + + Note that, while the registration (this section) and lookup protocols + (Section 5) are very similar in most respects, they are not + identical, and implementers should carefully follow the steps + described in this specification. + + + + +Klensin Standards Track [Page 6] + +RFC 5891 IDNA2008 Protocol August 2010 + + +4.1. Input to IDNA Registration + + Registration processes, especially processing by entities (often + called "registrars") who deal with registrants before the request + actually reaches the zone manager ("registry") are outside the scope + of this definition and may differ significantly depending on local + needs. By the time a string enters the IDNA registration process as + described in this specification, it MUST be in Unicode and in + Normalization Form C (NFC [Unicode-UAX15]). Entities responsible for + zone files ("registries") MUST accept only the exact string for which + registration is requested, free of any mappings or local adjustments. + They MAY accept that input in any of three forms: + + 1. As a pair of A-label and U-label. + + 2. As an A-label only. + + 3. As a U-label only. + + The first two of these forms are RECOMMENDED because the use of + A-labels avoids any possibility of ambiguity. The first is normally + preferred over the second because it permits further verification of + user intent (see Section 4.2.1). + +4.2. Permitted Character and Label Validation + +4.2.1. Input Format + + If both the U-label and A-label forms are available, the registry + MUST ensure that the A-label form is in lowercase, perform a + conversion to a U-label, perform the steps and tests described below + on that U-label, and then verify that the A-label produced by the + step in Section 4.4 matches the one provided as input. In addition, + the U-label that was provided as input and the one obtained by + conversion of the A-label MUST match exactly. If, for some reason, + these tests fail, the registration MUST be rejected. + + If only an A-label was provided and the conversion to a U-label is + not performed, the registry MUST still verify that the A-label is + superficially valid, i.e., that it does not violate any of the rules + of Punycode encoding [RFC3492] such as the prohibition on trailing + hyphen-minus, the requirement that all characters be ASCII, and so + on. Strings that appear to be A-labels (e.g., they start with + "xn--") and strings that are supplied to the registry in a context + reserved for A-labels (such as a field in a form to be filled out), + but that are not valid A-labels as described in this paragraph, MUST + NOT be placed in DNS zones that support IDNA. + + + + +Klensin Standards Track [Page 7] + +RFC 5891 IDNA2008 Protocol August 2010 + + + If only an A-label is provided, the conversion to a U-label is not + performed, but the superficial tests described in the previous + paragraph are performed, registration procedures MAY, and usually + will, bypass the tests and actions in the balance of Section 4.2 and + in Sections 4.3 and 4.4. + +4.2.2. Rejection of Characters That Are Not Permitted + + The candidate Unicode string MUST NOT contain characters that appear + in the "DISALLOWED" and "UNASSIGNED" lists specified in the Tables + document [RFC5892]. + +4.2.3. Label Validation + + The proposed label (in the form of a Unicode string, i.e., a string + that at least superficially appears to be a U-label) is then examined + using tests that require examination of more than one character. + Character order is considered to be the on-the-wire order. That + order may not be the same as the display order. + +4.2.3.1. Hyphen Restrictions + + The Unicode string MUST NOT contain "--" (two consecutive hyphens) in + the third and fourth character positions and MUST NOT start or end + with a "-" (hyphen). + +4.2.3.2. Leading Combining Marks + + The Unicode string MUST NOT begin with a combining mark or combining + character (see The Unicode Standard, Section 2.11 [Unicode] for an + exact definition). + +4.2.3.3. Contextual Rules + + The Unicode string MUST NOT contain any characters whose validity is + context-dependent, unless the validity is positively confirmed by a + contextual rule. To check this, each code point identified as + CONTEXTJ or CONTEXTO in the Tables document [RFC5892] MUST have a + non-null rule. If such a code point is missing a rule, the label is + invalid. If the rule exists but the result of applying the rule is + negative or inconclusive, the proposed label is invalid. + +4.2.3.4. Labels Containing Characters Written Right to Left + + If the proposed label contains any characters from scripts that are + written from right to left, it MUST meet the Bidi criteria [RFC5893]. + + + + + +Klensin Standards Track [Page 8] + +RFC 5891 IDNA2008 Protocol August 2010 + + +4.2.4. Registration Validation Requirements + + Strings that contain at least one non-ASCII character, have been + produced by the steps above, whose contents pass all of the tests in + Section 4.2.3, and are 63 or fewer characters long in + ASCII-compatible encoding (ACE) form (see Section 4.4), are U-labels. + + To summarize, tests are made in Section 4.2 for invalid characters, + invalid combinations of characters, for labels that are invalid even + if the characters they contain are valid individually, and for labels + that do not conform to the restrictions for strings containing + right-to-left characters. + +4.3. Registry Restrictions + + In addition to the rules and tests above, there are many reasons why + a registry could reject a label. Registries at all levels of the + DNS, not just the top level, are expected to establish policies about + label registrations. Policies are likely to be informed by the local + languages and the scripts that are used to write them and may depend + on many factors including what characters are in the label (for + example, a label may be rejected based on other labels already + registered). See the Rationale document [RFC5894], Section 3.2, for + further discussion and recommendations about registry policies. + + The string produced by the steps in Section 4.2 is checked and + processed as appropriate to local registry restrictions. Application + of those registry restrictions may result in the rejection of some + labels or the application of special restrictions to others. + +4.4. Punycode Conversion + + The resulting U-label is converted to an A-label (defined in Section + 2.3.2.1 of the Definitions document [RFC5890]). The A-label is the + encoding of the U-label according to the Punycode algorithm [RFC3492] + with the ACE prefix "xn--" added at the beginning of the string. The + resulting string must, of course, conform to the length limits + imposed by the DNS. This document does not update or alter the + Punycode algorithm specified in RFC 3492 in any way. RFC 3492 does + make a non-normative reference to the information about the value and + construction of the ACE prefix that appears in RFC 3490 or Nameprep + [RFC3491]. For consistency and reader convenience, IDNA2008 + effectively updates that reference to point to this document. That + change does not alter the prefix itself. The prefix, "xn--", is the + same in both sets of documents. + + + + + + +Klensin Standards Track [Page 9] + +RFC 5891 IDNA2008 Protocol August 2010 + + + With the exception of the maximum string length test on Punycode + output, the failure conditions identified in the Punycode encoding + procedure cannot occur if the input is a U-label as determined by the + steps in Sections 4.1 through 4.3 above. + +4.5. Insertion in the Zone + + The label is registered in the DNS by inserting the A-label into a + zone. + +5. Domain Name Lookup Protocol + + Lookup is different from registration and different tests are applied + on the client. Although some validity checks are necessary to avoid + serious problems with the protocol, the lookup-side tests are more + permissive and rely on the assumption that names that are present in + the DNS are valid. That assumption is, however, a weak one because + the presence of wildcards in the DNS might cause a string that is not + actually registered in the DNS to be successfully looked up. + +5.1. Label String Input + + The user supplies a string in the local character set, for example, + by typing it, clicking on it, or copying and pasting it from a + resource identifier, e.g., a Uniform Resource Identifier (URI) + [RFC3986] or an Internationalized Resource Identifier (IRI) + [RFC3987], from which the domain name is extracted. Alternately, + some process not directly involving the user may read the string from + a file or obtain it in some other way. Processing in this step and + the one specified in Section 5.2 are local matters, to be + accomplished prior to actual invocation of IDNA. + +5.2. Conversion to Unicode + + The string is converted from the local character set into Unicode, if + it is not already in Unicode. Depending on local needs, this + conversion may involve mapping some characters into other characters + as well as coding conversions. Those issues are discussed in the + mapping-related sections (Sections 4.2, 4.4, 6, and 7.3) of the + Rationale document [RFC5894] and in the separate Mapping document + [IDNA2008-Mapping]. The result MUST be a Unicode string in NFC form. + +5.3. A-label Input + + If the input to this procedure appears to be an A-label (i.e., it + starts in "xn--", interpreted case-insensitively), the lookup + application MAY attempt to convert it to a U-label, first ensuring + that the A-label is entirely in lowercase (converting it to lowercase + + + +Klensin Standards Track [Page 10] + +RFC 5891 IDNA2008 Protocol August 2010 + + + if necessary), and apply the tests of Section 5.4 and the conversion + of Section 5.5 to that form. If the label is converted to Unicode + (i.e., to U-label form) using the Punycode decoding algorithm, then + the processing specified in those two sections MUST be performed, and + the label MUST be rejected if the resulting label is not identical to + the original. See Section 8.1 of the Rationale document [RFC5894] + for additional discussion on this topic. + + Conversion from the A-label and testing that the result is a U-label + SHOULD be performed if the domain name will later be presented to the + user in native character form (this requires that the lookup + application be IDNA-aware). If those steps are not performed, the + lookup process SHOULD at least test to determine that the string is + actually an A-label, examining it for the invalid formats specified + in the Punycode decoding specification. Applications that are not + IDNA-aware will obviously omit that testing; others MAY treat the + string as opaque to avoid the additional processing at the expense of + providing less protection and information to users. + +5.4. Validation and Character List Testing + + As with the registration procedure described in Section 4, the + Unicode string is checked to verify that all characters that appear + in it are valid as input to IDNA lookup processing. As discussed + above and in the Rationale document [RFC5894], the lookup check is + more liberal than the registration one. Labels that have not been + fully evaluated for conformance to the applicable rules are referred + to as "putative" labels as discussed in Section 2.3.2.1 of the + Definitions document [RFC5890]. Putative U-labels with any of the + following characteristics MUST be rejected prior to DNS lookup: + + o Labels that are not in NFC [Unicode-UAX15]. + + o Labels containing "--" (two consecutive hyphens) in the third and + fourth character positions. + + o Labels whose first character is a combining mark (see The Unicode + Standard, Section 2.11 [Unicode]). + + o Labels containing prohibited code points, i.e., those that are + assigned to the "DISALLOWED" category of the Tables document + [RFC5892]. + + o Labels containing code points that are identified in the Tables + document as "CONTEXTJ", i.e., requiring exceptional contextual + rule processing on lookup, but that do not conform to those rules. + Note that this implies that a rule must be defined, not null: a + + + + +Klensin Standards Track [Page 11] + +RFC 5891 IDNA2008 Protocol August 2010 + + + character that requires a contextual rule but for which the rule + is null is treated in this step as having failed to conform to the + rule. + + o Labels containing code points that are identified in the Tables + document as "CONTEXTO", but for which no such rule appears in the + table of rules. Applications resolving DNS names or carrying out + equivalent operations are not required to test contextual rules + for "CONTEXTO" characters, only to verify that a rule is defined + (although they MAY make such tests to provide better protection or + give better information to the user). + + o Labels containing code points that are unassigned in the version + of Unicode being used by the application, i.e., in the UNASSIGNED + category of the Tables document. + + This requirement means that the application must use a list of + unassigned characters that is matched to the version of Unicode + that is being used for the other requirements in this section. It + is not required that the application know which version of Unicode + is being used; that information might be part of the operating + environment in which the application is running. + + In addition, the application SHOULD apply the following test. + + o Verification that the string is compliant with the requirements + for right-to-left characters specified in the Bidi document + [RFC5893]. + + This test may be omitted in special circumstances, such as when the + lookup application knows that the conditions are enforced elsewhere, + because an attempt to look up and resolve such strings will almost + certainly lead to a DNS lookup failure except when wildcards are + present in the zone. However, applying the test is likely to give + much better information about the reason for a lookup failure -- + information that may be usefully passed to the user when that is + feasible -- than DNS resolution failure information alone. + + For all other strings, the lookup application MUST rely on the + presence or absence of labels in the DNS to determine the validity of + those labels and the validity of the characters they contain. If + they are registered, they are presumed to be valid; if they are not, + their possible validity is not relevant. While a lookup application + may reasonably issue warnings about strings it believes may be + problematic, applications that decline to process a string that + conforms to the rules above (i.e., does not look it up in the DNS) + are not in conformance with this protocol. + + + + +Klensin Standards Track [Page 12] + +RFC 5891 IDNA2008 Protocol August 2010 + + +5.5. Punycode Conversion + + The string that has now been validated for lookup is converted to ACE + form by applying the Punycode algorithm to the string and then adding + the ACE prefix ("xn--"). + +5.6. DNS Name Resolution + + The A-label resulting from the conversion in Section 5.5 or supplied + directly (see Section 5.3) is combined with other labels as needed to + form a fully-qualified domain name that is then looked up in the DNS, + using normal DNS resolver procedures. The lookup can obviously + either succeed (returning information) or fail. + +6. Security Considerations + + Security Considerations for this version of IDNA are described in the + Definitions document [RFC5890], except for the special issues + associated with right-to-left scripts and characters. The latter are + discussed in the Bidi document [RFC5893]. + + In order to avoid intentional or accidental attacks from labels that + might be confused with others, special problems in rendering, and so + on, the IDNA model requires that registries exercise care and + thoughtfulness about what labels they choose to permit. That issue + is discussed in Section 4.3 of this document which, in turn, points + to a somewhat more extensive discussion in the Rationale document + [RFC5894]. + +7. IANA Considerations + + IANA actions for this version of IDNA are specified in the Tables + document [RFC5892] and discussed informally in the Rationale document + [RFC5894]. The components of IDNA described in this document do not + require any IANA actions. + +8. Contributors + + While the listed editor held the pen, the original versions of this + document represent the joint work and conclusions of an ad hoc design + team consisting of the editor and, in alphabetic order, Harald + Alvestrand, Tina Dam, Patrik Faltstrom, and Cary Karp. This document + draws significantly on the original version of IDNA [RFC3490] both + conceptually and for specific text. This second-generation version + would not have been possible without the work that went into that + first version and especially the contributions of its authors Patrik + Faltstrom, Paul Hoffman, and Adam Costello. While Faltstrom was + + + + +Klensin Standards Track [Page 13] + +RFC 5891 IDNA2008 Protocol August 2010 + + + actively involved in the creation of this version, Hoffman and + Costello were not and should not be held responsible for any errors + or omissions. + +9. Acknowledgments + + This revision to IDNA would have been impossible without the + accumulated experience since RFC 3490 was published and resulting + comments and complaints of many people in the IETF, ICANN, and other + communities (too many people to list here). Nor would it have been + possible without RFC 3490 itself and the efforts of the Working Group + that defined it. Those people whose contributions are acknowledged + in RFC 3490, RFC 4690 [RFC4690], and the Rationale document [RFC5894] + were particularly important. + + Specific textual changes were incorporated into this document after + suggestions from the other contributors, Stephane Bortzmeyer, Vint + Cerf, Lisa Dusseault, Paul Hoffman, Kent Karlsson, James Mitchell, + Erik van der Poel, Marcos Sanz, Andrew Sullivan, Wil Tan, Ken + Whistler, Chris Wright, and other WG participants and reviewers + including Martin Duerst, James Mitchell, Subramanian Moonesamy, Peter + Saint-Andre, Margaret Wasserman, and Dan Winship who caught specific + errors and recommended corrections. Special thanks are due to Paul + Hoffman for permission to extract material to form the basis for + Appendix A from a draft document that he prepared. + +10. References + +10.1. Normative References + + [RFC1034] Mockapetris, P., "Domain names - concepts and + facilities", STD 13, RFC 1034, November 1987. + + [RFC1035] Mockapetris, P., "Domain names - implementation and + specification", STD 13, RFC 1035, November 1987. + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC3492] Costello, A., "Punycode: A Bootstring encoding of + Unicode for Internationalized Domain Names in + Applications (IDNA)", RFC 3492, March 2003. + + [RFC5890] Klensin, J., "Internationalized Domain Names for + Applications (IDNA): Definitions and Document + Framework", RFC 5890, August 2010. + + + + + +Klensin Standards Track [Page 14] + +RFC 5891 IDNA2008 Protocol August 2010 + + + [RFC5892] Faltstrom, P., Ed., "The Unicode Code Points and + Internationalized Domain Names for Applications (IDNA)", + RFC 5892, August 2010. + + [RFC5893] Alvestrand, H., Ed. and C. Karp, "Right-to-Left Scripts + for Internationalized Domain Names for Applications + (IDNA)", RFC 5893, August 2010. + + [Unicode-UAX15] + The Unicode Consortium, "Unicode Standard Annex #15: + Unicode Normalization Forms", September 2009, + <http://www.unicode.org/reports/tr15/>. + +10.2. Informative References + + [ASCII] American National Standards Institute (formerly United + States of America Standards Institute), "USA Code for + Information Interchange", ANSI X3.4-1968, 1968. ANSI + X3.4-1968 has been replaced by newer versions with + slight modifications, but the 1968 version remains + definitive for the Internet. + + [IDNA2008-Mapping] + Resnick, P. and P. Hoffman, "Mapping Characters in + Internationalized Domain Names for Applications (IDNA)", + Work in Progress, April 2010. + + [RFC2671] Vixie, P., "Extension Mechanisms for DNS (EDNS0)", + RFC 2671, August 1999. + + [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello, + "Internationalizing Domain Names in Applications + (IDNA)", RFC 3490, March 2003. + + [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep + Profile for Internationalized Domain Names (IDN)", + RFC 3491, March 2003. + + [RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform + Resource Identifier (URI): Generic Syntax", STD 66, + RFC 3986, January 2005. + + [RFC3987] Duerst, M. and M. Suignard, "Internationalized Resource + Identifiers (IRIs)", RFC 3987, January 2005. + + [RFC4690] Klensin, J., Faltstrom, P., Karp, C., and IAB, "Review + and Recommendations for Internationalized Domain Names + (IDNs)", RFC 4690, September 2006. + + + +Klensin Standards Track [Page 15] + +RFC 5891 IDNA2008 Protocol August 2010 + + + [RFC4952] Klensin, J. and Y. Ko, "Overview and Framework for + Internationalized Email", RFC 4952, July 2007. + + [RFC5894] Klensin, J., "Internationalized Domain Names for + Applications (IDNA): Background, Explanation, and + Rationale", RFC 5894, August 2010. + + [Unicode] The Unicode Consortium, "The Unicode Standard, Version + 5.0", 2007. Boston, MA, USA: Addison-Wesley. ISBN + 0-321-48091-0. This printed reference has now been + updated online to reflect additional code points. For + code points, the reference at the time this document was + published is to Unicode 5.2. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Klensin Standards Track [Page 16] + +RFC 5891 IDNA2008 Protocol August 2010 + + +Appendix A. Summary of Major Changes from IDNA2003 + + 1. Update base character set from Unicode 3.2 to Unicode version + agnostic. + + 2. Separate the definitions for the "registration" and "lookup" + activities. + + 3. Disallow symbol and punctuation characters except where special + exceptions are necessary. + + 4. Remove the mapping and normalization steps from the protocol and + have them, instead, done by the applications themselves, + possibly in a local fashion, before invoking the protocol. + + 5. Change the way that the protocol specifies which characters are + allowed in labels from "humans decide what the table of code + points contains" to "decision about code points are based on + Unicode properties plus a small exclusion list created by + humans". + + 6. Introduce the new concept of characters that can be used only in + specific contexts. + + 7. Allow typical words and names in languages such as Dhivehi and + Yiddish to be expressed. + + 8. Make bidirectional domain names (delimited strings of labels, + not just labels standing on their own) display in a less + surprising fashion, whether they appear in obvious domain name + contexts or as part of running text in paragraphs. + + 9. Remove the dot separator from the mandatory part of the + protocol. + + 10. Make some currently valid labels that are not actually IDNA + labels invalid. + +Author's Address + + John C Klensin + 1770 Massachusetts Ave, Ste 322 + Cambridge, MA 02140 + USA + + Phone: +1 617 245 1457 + EMail: john+ietf@jck.com + + + + +Klensin Standards Track [Page 17] + |