1 files changed, 1347 insertions, 0 deletions
diff --git a/doc/rfc/rfc6055.txt b/doc/rfc/rfc6055.txt
new file mode 100644
index 0000000..81e7098
--- /dev/null
+++ b/doc/rfc/rfc6055.txt
@@ -0,0 +1,1347 @@
+
+
+
+
+
+
+Internet Architecture Board (IAB)                              D. Thaler
+Request for Comments: 6055                                     Microsoft
+Updates: 2130                                                 J. Klensin
+Category: Informational
+ISSN: 2070-1721                                              S. Cheshire
+                                                                   Apple
+                                                           February 2011
+
+
+      IAB Thoughts on Encodings for Internationalized Domain Names
+
+Abstract
+
+   This document explores issues with Internationalized Domain Names
+   (IDNs) that result from the use of various encoding schemes such as
+   UTF-8 and the ASCII-Compatible Encoding produced by the Punycode
+   algorithm.  It focuses on the importance of agreeing on a single
+   encoding and how complicated the state of affairs ends up being as a
+   result of using different encodings today.
+
+Status of This Memo
+
+   This document is not an Internet Standards Track specification; it is
+   published for informational purposes.
+
+   This document is a product of the Internet Architecture Board (IAB)
+   and represents information that the IAB has deemed valuable to
+   provide for permanent record.  Documents approved for publication by
+   the IAB are not a candidate for any level of Internet Standard; see
+   Section 2 of RFC 5741.
+
+   Information about the current status of this document, any errata,
+   and how to provide feedback on it may be obtained at
+   http://www.rfc-editor.org/info/rfc6055.
+
+Copyright Notice
+
+   Copyright (c) 2011 IETF Trust and the persons identified as the
+   document authors.  All rights reserved.
+
+   This document is subject to BCP 78 and the IETF Trust's Legal
+   Provisions Relating to IETF Documents
+   (http://trustee.ietf.org/license-info) in effect on the date of
+   publication of this document.  Please review these documents
+   carefully, as they describe your rights and restrictions with respect
+   to this document.
+
+
+
+
+
+Thaler, et al.                Informational                     [Page 1]
+
+RFC 6055                      IDN Encodings                February 2011
+
+
+Table of Contents
+
+   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  2
+     1.1.  APIs . . . . . . . . . . . . . . . . . . . . . . . . . . .  8
+   2.  Use of Non-DNS Protocols . . . . . . . . . . . . . . . . . . .  9
+   3.  Use of Non-ASCII in DNS  . . . . . . . . . . . . . . . . . . . 10
+     3.1.  Examples . . . . . . . . . . . . . . . . . . . . . . . . . 14
+   4.  Recommendations  . . . . . . . . . . . . . . . . . . . . . . . 16
+   5.  Security Considerations  . . . . . . . . . . . . . . . . . . . 18
+   6.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 19
+   7.  IAB Members at the Time of Approval  . . . . . . . . . . . . . 19
+   8.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 20
+     8.1.  Normative References . . . . . . . . . . . . . . . . . . . 20
+     8.2.  Informative References . . . . . . . . . . . . . . . . . . 20
+
+1.  Introduction
+
+   The goal of this document is to explore what can be learned from some
+   current difficulties in implementing Internationalized Domain Names
+   (IDNs).
+
+   A domain name consists of a sequence of labels, conventionally
+   written separated by dots.  An IDN is a domain name that contains one
+   or more labels that, in turn, contain one or more non-ASCII
+   characters.  Just as with plain ASCII domain names, each IDN label
+   must be encoded using some mechanism before it can be transmitted in
+   network packets, stored in memory, stored on disk, etc.  These
+   encodings need to be reversible, but they need not store domain names
+   the same way humans conventionally write them on paper.  For example,
+   when transmitted over the network in DNS packets, domain name labels
+   are *not* separated with dots.
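+
+   A minimal sketch of that last point, in Python, assuming the usual
+   DNS wire layout in which each label is preceded by a single length
+   octet and the name ends with a zero-length root label.  The function
+   name encode_wire_name is hypothetical and is not taken from any
+   resolver library; only plain ASCII input is handled.
+
```python
def encode_wire_name(name: str) -> bytes:
    """Encode a dotted ASCII domain name as length-prefixed labels."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"label length {len(raw)} is outside 1..63")
        out.append(len(raw))  # a length octet, not a dot, marks the label
        out += raw
    out.append(0)             # zero-length label stands for the root
    return bytes(out)


print(encode_wire_name("www.example.com"))
```
+
+   The dots written by humans never appear in the output; the label
+   boundaries are carried by the length octets instead.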
+
+   Internationalized Domain Names for Applications (IDNA), discussed
+   later in this document, is the standard that defines the use and
+   coding of internationalized domain names for use on the public
+   Internet [RFC5890].  An earlier version of IDNA [RFC3490] is now
+   being phased out.  Except where noted, the two versions are
+   approximately the same with regard to the issues discussed in this
+   document.  However, some explanations appeared in the earlier
+   documents that were no longer considered useful when the later
+   revision was created; they are quoted here from the documents in
+   which they appear.  In addition, the terminology of the two versions
+   differs somewhat; this document reflects the terminology of the
+   current version.
+
+   Unicode [Unicode] is a list of characters (including non-spacing
+   marks that are used to form some other characters), where each
+   character is assigned an integer value, called a code point.  In
+
+
+
+Thaler, et al.                Informational                     [Page 2]
+
+RFC 6055                      IDN Encodings                February 2011
+
+
+   simple terms a Unicode string is a string of integer code point
+   values in the range 0 to 1,114,111 (10FFFF in base 16).  These
+   integer code points must be encoded using some mechanism before they
+   can be transmitted in network packets, stored in memory, stored on
+   disk, etc.  Some common ways of encoding these integer code point
+   values in computer systems include UTF-8, UTF-16, and UTF-32.  In
+   addition to the material below, those forms and the tradeoffs among
+   them are discussed in Chapter 2 of The Unicode Standard [Unicode].
+
+   UTF-8 is a mechanism for encoding a Unicode code point in a variable
+   number of 8-bit octets, where an ASCII code point is preserved as-is.
+   Those octets encode a string of integer code point values, which
+   represent a string of Unicode characters.  The authoritative
+   definition of UTF-8 is in Sections 3.9 and 3.10 of The Unicode
+   Standard [Unicode], but a description of UTF-8 encoding can also be
+   found in RFC 3629 [RFC3629].  Descriptions and formulae can also be
+   found in Annex D of ISO/IEC 10646-1 [10646].
+
+   UTF-16 is a mechanism for encoding a Unicode code point in one or two
+   16-bit integers, described in detail in Sections 3.9 and 3.10 of The
+   Unicode Standard [Unicode].  A UTF-16 string encodes a string of
+   integer code point values that represent a string of Unicode
+   characters.
+
+   UTF-32 (formerly UCS-4), also described in Sections 3.9 and 3.10 of
+   The Unicode Standard [Unicode], is a mechanism for encoding a Unicode
+   code point in a single 32-bit integer.  A UTF-32 string is thus a
+   string of 32-bit integer code point values, which represent a string
+   of Unicode characters.
+
+   Note that UTF-16 results in some all-zero octets when code points
+   occur early in the Unicode sequence (for example, every ASCII
+   character), and UTF-32 always has all-zero octets.
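+
+   The following Python sketch, offered only as an illustration, shows
+   the octets that one code point occupies in each of the three forms.
+   The helper name show_encodings and the sample characters are
+   assumptions made here; big-endian byte order is chosen arbitrarily
+   for UTF-16 and UTF-32.
+
```python
def show_encodings(ch: str) -> dict:
    """Return the octets of one character in three encoding forms."""
    return {
        "utf-8": ch.encode("utf-8"),
        "utf-16": ch.encode("utf-16-be"),   # byte order is an assumption
        "utf-32": ch.encode("utf-32-be"),
    }


# Sample code points: ASCII, Latin-1 range, CJK, and one beyond U+FFFF.
for ch in ("a", "\u00e9", "\u65e5", "\U0001F600"):
    encodings = show_encodings(ch)
    zero_octets = {form: octets.count(0) for form, octets in encodings.items()}
    print(f"U+{ord(ch):04X}", encodings, zero_octets)
```
+
+   For "a" (U+0061), UTF-8 yields the single ASCII octet unchanged,
+   while the UTF-16 and UTF-32 forms contain one and three all-zero
+   octets, respectively.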
+
+   IDNA specifies validity of a label, such as what characters it can
+   contain, relationships among them, and so on, in Unicode terms.
+   Valid labels can be in either "U-label" or "A-label" form, with the
+   appropriate one determined by particular protocols or by context.
+   U-label form is a direct representation of the Unicode characters
+   using one of the encoding forms discussed above.  This document
+   discusses UTF-8 strings in many places.  While all U-labels can be
+   represented by UTF-8 strings, not all UTF-8 strings are valid
+   U-labels (see Section 2.3.2 of the IDNA Definitions document
+   [RFC5890] for a discussion of these distinctions).  A-label form uses
+   a compressed, ASCII-compatible encoding (an "ACE" in IDNA and other
+   terminology) produced by an algorithm called Punycode.  U-labels and
+
+
+
+
+
+Thaler, et al.                Informational                     [Page 3]
+
+RFC 6055                      IDN Encodings                February 2011
+
+
+   A-labels are duals of each other: transformations from one to the
+   other do not lose information.  The transformation mechanisms are
+   specified in the IDNA Protocol document [RFC5891].
+
+   Punycode [RFC3492] is thus a mechanism for encoding a Unicode string
+   in an ASCII-compatible encoding, i.e., using only letters, digits,
+   and hyphens from the ASCII character set.  When a Unicode label that
+   is valid under the IDNA rules (a U-label) is encoded with Punycode
+   for IDNA purposes, it is prefixed with "xn--"; the result is called
+   an A-label.  The prefix convention assumes that no other DNS labels
+   (at least no other DNS labels in IDNA-aware applications) are allowed
+   to start with these four characters.  Consequently, when A-label
+   encoding is assumed, any DNS labels beginning with "xn--" now have a
+   different meaning (the Punycode encoding of a label containing one or
+   more non-ASCII characters) or no defined meaning at all (in the case
+   of labels that are not IDNA-compliant, i.e., are not well-formed
+   A-labels).
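+
+   A rough sketch of the prefixing step, using the "punycode" codec
+   from the Python standard library.  The names to_a_label and
+   to_u_label are hypothetical, and the sketch deliberately omits the
+   validity checks that the IDNA Protocol document [RFC5891] specifies,
+   so it should not be read as an IDNA implementation.
+
```python
ACE_PREFIX = "xn--"


def to_a_label(label: str) -> str:
    """Punycode-encode a non-ASCII label and add the ACE prefix."""
    if label.isascii():
        return label  # plain ASCII labels are passed through untouched
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def to_u_label(label: str) -> str:
    """Reverse to_a_label for labels that carry the ACE prefix."""
    if not label.startswith(ACE_PREFIX):
        return label
    return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")


a_label = to_a_label("b\u00fccher")
print(a_label, to_u_label(a_label) == "b\u00fccher")
```
+
+   The round trip loses no information, which is the sense in which
+   the two forms are duals.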
+
+   ISO-2022-JP [RFC1468] is a mechanism for encoding a string of ASCII
+   and Japanese characters, where an ASCII character is preserved as-is.
+   ISO-2022-JP is stateful: special sequences are used to switch between
+   character coding tables.  As a result, if there are lost or mangled
+   characters in a character stream, it is extremely difficult to
+   recover the original stream after such a lost character encoding
+   shift.
+
+   Comparison of Unicode strings is not as easy as comparing ASCII
+   strings.  First, there are a multitude of ways to represent a string
+   of Unicode characters.  Second, in many languages and scripts, the
+   actual definition of "same" is very context-dependent.  Because of
+   this, comparison of two Unicode strings must take into account how
+   the Unicode strings are encoded.  Regardless of the encoding,
+   however, comparison cannot simply be done by comparing the encoded
+   Unicode strings byte by byte.  The only time that is possible is when
+   the strings are both mapped into some canonical form and encoded the
+   same way.
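+
+   As a small illustration of the last two sentences, the Python
+   sketch below compares two spellings of the same character.
+   Normalization Form C (NFC) is used as the canonical form purely as
+   an assumption for the example, since the text above does not name
+   one; same_after_nfc is a hypothetical helper.
+
```python
import unicodedata

composed = "\u00e9"      # U+00E9, e with acute as a single code point
decomposed = "e\u0301"   # U+0065 followed by U+0301 combining acute


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings byte by byte after NFC and UTF-8 encoding."""
    canon_a = unicodedata.normalize("NFC", a).encode("utf-8")
    canon_b = unicodedata.normalize("NFC", b).encode("utf-8")
    return canon_a == canon_b


# The raw encoded forms differ even though both render as the same text.
print(composed.encode("utf-8") == decomposed.encode("utf-8"))
print(same_after_nfc(composed, decomposed))
```
+
+   The byte-by-byte comparison succeeds only once both strings have
+   been mapped to the same form and encoded the same way.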
+
+   In 1996 the IAB sponsored a workshop on character sets and encodings
+   [RFC2130].  This document adds to that discussion and focuses on the
+   importance of agreeing on a single encoding and how complicated the
+   state of affairs ends up being as a result of using different
+   encodings today.
+
+   Different applications, APIs, and protocols use different encoding
+   schemes today.  Many of them were originally defined to use only
+   ASCII.  Internationalized Domain Names for Applications (IDNA)
+   [RFC5890] defines a mechanism that requires changes to applications,
+   but in an attempt not to change APIs or servers, specifies that the
+
+
+
+Thaler, et al.                Informational                     [Page 4]
+
+RFC 6055                      IDN Encodings                February 2011
+
+
+   A-label format is to be used in many contexts.  In some ways this
+   could be seen as not changing the existing APIs, in the sense that
+   the strings being passed to and from the APIs are still apparently
+   ASCII strings.  In other ways it is a very profound change to the
+   existing APIs, because while those strings are still syntactically
+   valid ASCII strings, they no longer mean the same thing that they
+   used to.  What looks like a plain ASCII string to one piece of
+   software or library could be seen by another piece of software or
+   library (with the application of out-of-band information) to be in
+   fact an encoding of a Unicode string.
+
+   Section 1.3 of the original IDNA specification [RFC3490] states:
+
+      The IDNA protocol is contained completely within applications.  It
+      is not a client-server or peer-to-peer protocol: everything is
+      done inside the application itself.  When used with a DNS resolver
+      library, IDNA is inserted as a "shim" between the application and
+      the resolver library.  When used for writing names into a DNS
+      zone, IDNA is used just before the name is committed to the zone.
+
+   Figure 1 depicts a simplistic architecture that a naive reader might
+   assume from the paragraph quoted above.  (A variant of this same
+   picture appears in Section 6 of the original IDNA specification
+   [RFC3490], further strengthening this assumption.)
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Thaler, et al.                Informational                     [Page 5]
+
+RFC 6055                      IDN Encodings                February 2011
+
+
+                +-----------------------------------------+
+                |Host                                     |
+                |             +-------------+             |
+                |             | Application |             |
+                |             +------+------+             |
+                |                    |                    |
+                |               +----+----+               |
+                |               |   DNS   |               |
+                |               | Resolver|               |
+                |               | Library |               |
+                |               +----+----+               |
+                |                    |                    |
+                +-----------------------------------------+
+                                     |
+                            _________|_________
+                           /                   \
+                          /                     \
+                         /                       \
+                        |         Internet        |
+                         \                       /
+                          \                     /
+                           \___________________/
+
+                          Simplistic Architecture
+
+                                 Figure 1
+
+   There are, however, two problems with this simplistic architecture
+   that cause it to differ from reality.
+
+   First, resolver APIs on Operating Systems (OSs) today (Mac OS,
+   Windows, Linux, etc.) are not DNS-specific.  They typically provide a
+   layer of indirection so that the application can work independent of
+   the name resolution mechanism, which could be DNS, mDNS
+   [DNS-MULTICAST], LLMNR [RFC4795], NetBIOS-over-TCP
+   [RFC1001][RFC1002], hosts table [RFC0952], NIS [NIS], or anything
+   else.  For example, "Basic Socket Interface Extensions for IPv6"
+   [RFC3493] specifies the getaddrinfo() API and contains many phrases
+   like "For example, when using the DNS" and "any type of name
+   resolution service (for example, the DNS)".  Importantly, DNS is
+   mentioned only as an example, and the application has no knowledge as
+   to whether DNS or some other protocol will be used.
+
+   Second, even with the DNS protocol, private namespaces (sometimes
+   including private uses of the DNS) do not necessarily use the same
+   character set encoding scheme as the public Internet namespace.
+
+
+
+
+
+Thaler, et al.                Informational                     [Page 6]
+
+RFC 6055                      IDN Encodings                February 2011
+
+
+   We will discuss each of the above issues in subsequent sections.  For
+   reference, Figure 2 depicts a more realistic architecture on typical
+   hosts today (which don't have IDNA inserted as a shim immediately
+   above the DNS resolver library).  More generally, the host may be
+   attached to one or more local networks, each of which may or may not
+   be connected to the public Internet and may or may not have a private
+   namespace.
+
+                +-----------------------------------------+
+                |Host                                     |
+                |             +-------------+             |
+                |             | Application |             |
+                |             +------+------+             |
+                |                    |                    |
+                |             +------+------+             |
+                |             |   Generic   |             |
+                |             |    Name     |             |
+                |             |  Resolution |             |
+                |             |     API     |             |
+                |             +------+------+             |
+                |                    |                    |
+                |   +-----+------+---+--+-------+-----+   |
+                |   |     |      |      |       |     |   |
+                | +-+-++--+--++--+-++---+---++--+--++-+-+ |
+                | |DNS||LLMNR||mDNS||NetBIOS||hosts||...| |
+                | +---++-----++----++-------++-----++---+ |
+                |                                         |
+                +-----------------------------------------+
+                                     |
+                               ______|______
+                              /             \
+                             /               \
+                            /      local      \
+                            \     network     /
+                             \               /
+                              \_____________/
+                                     |
+                            _________|_________
+                           /                   \
+                          /                     \
+                         /                       \
+                        |         Internet        |
+                         \                       /
+                          \                     /
+                           \___________________/
+
+                          Realistic Architecture
+
+                                 Figure 2
+
+
+
+Thaler, et al.                Informational                     [Page 7]
+
+RFC 6055                      IDN Encodings                February 2011
+
+
+1.1.  APIs
+
+   Section 6.2 of the original IDNA specification [RFC3490] states
+   (where ToASCII and ToUnicode below refer to conversions using the
+   Punycode algorithm):
+
+      It is expected that new versions of the resolver libraries in the
+      future will be able to accept domain names in other charsets than
+      ASCII, and application developers might one day pass not only
+      domain names in Unicode, but also in local script to a new API for
+      the resolver libraries in the operating system.  Thus the ToASCII
+      and ToUnicode operations might be performed inside these new
+      versions of the resolver libraries.
+
+   Resolver APIs such as getaddrinfo() and its predecessor
+   gethostbyname() were defined to accept C-Language "char *" arguments,
+   meaning they accept a string of bytes, terminated with a NUL (0)
+   byte.  Because of the use of a NUL octet as a string terminator,
+   this is sufficient for ASCII strings (including A-labels) and even
+   ISO-2022-JP [RFC1468] and UTF-8 strings (unless an implementation
+   artificially precludes them), but not UTF-16 or UTF-32 strings
+   because a NUL octet could appear in the middle of strings using
+   these encodings.  Several operating systems historically used in
+   Japan will accept (and expect) ISO-2022-JP strings in such APIs.
+   Some platforms used worldwide also have new versions of the APIs
+   (e.g., GetAddrInfoW() on Windows) that accept other encoding schemes
+   such as UTF-16.
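+
+   The effect of the terminator can be sketched in Python by cutting a
+   byte string at its first zero octet, which is roughly what a C
+   routine taking a "char *" argument would see.  The helper
+   as_c_string and the sample name are hypothetical.
+
```python
def as_c_string(data: bytes) -> bytes:
    """Return the octets a zero-terminated string reader would see."""
    return data.split(b"\x00", 1)[0]


name = "b\u00fccher.example"
for codec in ("utf-8", "utf-16-le", "utf-32-le"):
    encoded = name.encode(codec)
    # Report the full encoded length next to what survives truncation.
    print(codec, len(encoded), as_c_string(encoded))
```
+
+   The UTF-8 form passes through intact, whereas the UTF-16 and UTF-32
+   forms are cut off after the first character.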
+
+   It is worth noting that an API using C-Language "char *" arguments
+   can distinguish between conventional ASCII "hostname" labels,
+   A-labels, ISO-2022-JP, and UTF-8 labels in names if the coding is
+   known to be one of those four, and the label is intact (no lost or
+   mangled characters).  If a stateful encoding like ISO-2022-JP is
+   used, applications extracting labels from text must take special
+   precautions to be sure that the appropriate state-setting characters
+   are included in the string passed to the API.
+
+   An example method for distinguishing among such codings is as
+   follows:
+
+   o  if the label contains an ESC (0x1B) byte, the label is
+      ISO-2022-JP; otherwise,
+
+   o  if any byte in the label has the high bit set, the label is UTF-8;
+      otherwise,
+
+   o  if the label starts with "xn--", then it is presumed to be an
+      A-label; otherwise,
+
+
+
+Thaler, et al.                Informational                     [Page 8]
+
+RFC 6055                      IDN Encodings                February 2011
+
+
+   o  the label is ASCII (and therefore, by definition, the label is
+      also UTF-8, since ASCII is a subset of UTF-8).
+
+   Again, this assumes that ASCII labels never start with "xn--", and
+   also that UTF-8 strings never contain an ESC character.  Also, the
+   above is merely an illustration; UTF-8 can be detected and
+   distinguished from other 8-bit encodings with good accuracy [MJD].
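+
+   The example method can be sketched in Python as follows.  The
+   function name classify_label and the returned strings are
+   hypothetical, and the sketch inherits the assumptions of the method
+   itself: the coding is one of the four listed and the label is
+   intact.
+
```python
def classify_label(label: bytes) -> str:
    """Guess the coding of one label using the four-step example method."""
    if b"\x1b" in label:                  # an ESC byte signals ISO-2022-JP
        return "ISO-2022-JP"
    if any(octet & 0x80 for octet in label):   # high bit set: UTF-8
        return "UTF-8"
    if label.startswith(b"xn--"):         # ACE prefix: presumed A-label
        return "A-label"
    return "ASCII"                        # which is also valid UTF-8


samples = [
    "\u65e5\u672c".encode("iso2022_jp"),
    "\u65e5\u672c".encode("utf-8"),
    b"xn--bcher-kva",
    b"example",
]
for sample in samples:
    print(sample, classify_label(sample))
```
+
+   The order of the tests matters: ISO-2022-JP text is 7-bit, so the
+   ESC check has to come before the high-bit and prefix checks.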
+
+   It is more difficult or impossible to distinguish the ISO 8859
+   character sets [ISO8859] from each other, because they assign
+   different characters to up to about 90 of the same octet values, and
+   a short string is very unlikely to contain enough characters to allow
+   a receiver to deduce the character set.  Similarly, it is not
+   possible in general to distinguish between ISO-2022-JP and any other
+   encoding based on ISO 2022 code table switching.
+
+   Although it is possible (as in the example above) to distinguish some
+   encodings when not explicitly specified, it is cleaner to have the
+   encodings specified explicitly, such as specifying UTF-16 for
+   GetAddrInfoW(), or specifying explicitly which APIs expect UTF-8
+   strings.
+
+2.  Use of Non-DNS Protocols
+
+   As noted earlier, typical name resolution libraries are not
+   DNS-specific.  Furthermore, some protocols are defined to use
+   encoding forms other than IDNA A-labels.  For example, mDNS
+   [DNS-MULTICAST] specifies that UTF-8 be used.  Indeed, the IETF
+   policy on character sets and languages [RFC2277] (which followed the
+   1996 IAB-sponsored workshop [RFC2130]) states:
+
+      Protocols MUST be able to use the UTF-8 charset, which consists of
+      the ISO 10646 coded character set combined with the UTF-8
+      character encoding scheme, as defined in [10646] Annex R
+      (published in Amendment 2), for all text.
+
+      Protocols MAY specify, in addition, how to use other charsets or
+      other character encoding schemes for ISO 10646, such as UTF-16,
+      but lack of an ability to use UTF-8 is a violation of this policy;
+      such a violation would need a variance procedure ([BCP9] section
+      9) with clear and solid justification in the protocol
+      specification document before being entered into or advanced upon
+      the standards track.
+
+      For existing protocols or protocols that move data from existing
+      datastores, support of other charsets, or even using a default
+      other than UTF-8, may be a requirement.  This is acceptable, but
+      UTF-8 support MUST be possible.
+
+
+
+Thaler, et al.                Informational                     [Page 9]
+
+RFC 6055                      IDN Encodings                February 2011
+
+
+   Applications that convert an IDN to A-label form before calling
+   getaddrinfo() will cause name resolution failures if the Punycode
+   name is used directly in such protocols.  Having libraries or
+   protocols convert from A-labels to the encoding scheme defined by
+   the protocol (e.g., UTF-8) would require changes to APIs and/or
+   servers, which IDNA was intended to avoid.
+
+   As a result, applications that assume that non-ASCII names are
+   resolved using the public DNS and blindly convert them to A-labels,
+   without knowledge of what protocol will be selected by the name
+   resolution library, have problems.  Furthermore, name resolution
+   libraries often try multiple protocols until one succeeds, because
+   they are defined to use a common namespace.  For example, the hosts
+   file [RFC0952], NetBIOS-over-TCP [RFC1001], and DNS [RFC1034] are
+   all defined to be able to share a common syntax.  This means that
+   when an application passes a name to be resolved, resolution may in
+   fact be attempted using multiple protocols, each with a potentially
+   different encoding scheme.  For this to work successfully, the name
+   must be converted to the appropriate encoding scheme only after the
+   choice is made to use that protocol.  In general, this cannot be done
+   by the application since the choice of protocol is not made by the
+   application.
+
+3.  Use of Non-ASCII in DNS
+
+   A common misconception is that DNS only supports names that can be
+   expressed using letters, digits, and hyphens.
+
+   This misconception originally stems from the 1985 definition of an
+   "Internet hostname" (and net, gateway, and domain name) for use in
+   the "hosts" file [RFC0952].  An Internet hostname was defined therein
+   as including only letters, digits, and hyphens, where uppercase and
+   lowercase letters were to be treated as identical.  The DNS
+   specification [RFC1034], Section 3.5 entitled "Preferred name syntax"
+   then repeated this definition in 1987, saying that this "syntax will
+   result in fewer problems with many applications that use domain names
+   (e.g., mail, TELNET)".
+
+   This left confusion as to whether the "preferred" name syntax was a
+   mandatory restriction in DNS, or merely "preferred".
+
+   The definition of an Internet hostname was updated in 1989
+   ([RFC1123], Section 2.1) to allow names starting with a digit.
+   However, it did not address the increasing confusion as to whether
+   all names in DNS are "hostnames", or whether a "hostname" is merely a
+   special case of a DNS name.
+
+
+
+
+
+Thaler, et al.                Informational                    [Page 10]
+
+RFC 6055                      IDN Encodings                February 2011
+
+
+   By 1997, things had progressed to a state where it was necessary to
+   clarify these areas of confusion.  "Clarifications to the DNS
+   Specification" [RFC2181], Section 11 states:
+
+      The DNS itself places only one restriction on the particular
+      labels that can be used to identify resource records.  That one
+      restriction relates to the length of the label and the full name.
+      The length of any one label is limited to between 1 and 63 octets.
+      A full domain name is limited to 255 octets (including the
+      separators).  The zero length full name is defined as representing
+      the root of the DNS tree, and is typically written and displayed
+      as ".".  Those restrictions aside, any binary string whatever can
+      be used as the label of any resource record.  Similarly, any
+      binary string can serve as the value of any record that includes a
+      domain name as some or all of its value (SOA, NS, MX, PTR, CNAME,
+      and any others that may be added).  Implementations of the DNS
+      protocols must not place any restrictions on the labels that can
+      be used.
+
+   Hence, it clarified that the restriction to letters, digits, and
+   hyphens does not apply to DNS names in general, nor to records that
+   include "domain names".  Hence, the "preferred" name syntax described
+   in the original DNS specification [RFC1034] is indeed merely
+   "preferred", not mandatory.
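+
+   The two length limits quoted above can be checked with a few lines
+   of Python.  The sketch assumes that the 255-octet limit is counted
+   on the wire form, that is, one length octet per label plus the final
+   zero octet for the root; check_dns_limits is a hypothetical name.
+
```python
def check_dns_limits(labels) -> int:
    """Validate label and name lengths; return the wire-form length."""
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError(f"label of {len(label)} octets is not allowed")
    # Assumption: each label costs one length octet, plus one for the root.
    wire_length = sum(1 + len(label) for label in labels) + 1
    if wire_length > 255:
        raise ValueError(f"name of {wire_length} octets exceeds 255")
    return wire_length


print(check_dns_limits([b"www", b"example", b"com"]))
# No character-set test appears anywhere: arbitrary octets are accepted.
print(check_dns_limits([bytes([0xC3, 0xA9, 0x20, 0xFF]), b"example"]))
```
+
+   Nothing in the check looks at which octets a label contains, which
+   mirrors the statement that any binary string can be used as a label.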
+
+   Since there is no restriction even to ASCII, let alone letter-digit-
+   hyphen use, DNS does not violate the subsequent IETF requirement to
+   allow UTF-8 [RFC2277].
+
+   Using UTF-16 or UTF-32 encoding, however, would not be ideal for use
+   in DNS packets or C-Language "char *" APIs because existing software
+   already uses ASCII, and UTF-16 and UTF-32 strings can contain
+   all-zero octets that existing software will interpret as the end of
+   the string.  To use UTF-16 or UTF-32, one would need some way of
+   knowing whether the string was encoded using ASCII, UTF-16, or
+   UTF-32, and indeed for UTF-16 or UTF-32 whether it was big-endian or
+   little-endian encoding.  In contrast, UTF-8 works well because any
+   7-bit ASCII string is also a UTF-8 string representing the same
+   characters.
+
+   If a private namespace is defined to use UTF-8 (and not other
+   encodings such as UTF-16 or UTF-32), there's no need for a mechanism
+   to know whether a string was encoded using ASCII or UTF-8, because
+   (for any string that can be represented using ASCII) the
+   representations are exactly the same.  In other words, for any string
+   that can be represented using ASCII, it doesn't matter whether it is
+   interpreted as ASCII or UTF-8 because both encodings are the same,
+   and for any string that can't be represented using ASCII, it's
+
+
+
+Thaler, et al.                Informational                    [Page 11]
+
+RFC 6055                      IDN Encodings                February 2011
+
+
+   obviously UTF-8.  In addition, unlike UTF-16 and UTF-32, ASCII and
+   UTF-8 are both byte-oriented encodings so the question of big-endian
+   or little-endian encoding doesn't apply.
+
+   While implementations of the DNS protocol must not place any
+   restrictions on the labels that can be used, applications that use
+   the DNS are free to impose whatever restrictions they like, and many
+   have.  The above rules permit a domain name label that contains
+   unusual characters, such as embedded spaces, which many applications
+   consider a bad idea.  For example, the original specification
+   [RFC0821] of the SMTP protocol [RFC5321] constrains the character set
+   usable in email addresses.  There is now an effort underway to define
+   an extension to SMTP to support internationalized email addresses and
+   headers.  See the EAI framework [RFC4952] for more discussion on this
+   topic.
+
+   Shortly after the DNS Clarifications [RFC2181] and IETF character
+   sets and languages policy [RFC2277] were published, the need for
+   internationalized names within private namespaces (i.e., within
+   enterprises) arose.  The current (and past, predating IDNA and the
+   prefixed ACE conventions) practice within enterprises that support
+   other languages is to put UTF-8 names in their internal DNS servers
+   in a private namespace.  For example, "Using the UTF-8 Character Set
+   in the Domain Name System" [UTF8-DNS] was first written in 1997, and
+   was then widely deployed in Windows.  The use of UTF-8 names in DNS
+   was similarly implemented and deployed in Mac OS, simply by virtue of
+   the fact that applications blindly passed UTF-8 strings to the name
+   resolution APIs, the name resolution APIs blindly passed those UTF-8
+   strings to the DNS servers, and the DNS servers correctly answered
+   those queries.  From the user's point of view, everything worked
+   properly without any special new code being written, except that
+   ASCII is matched case-insensitively whereas UTF-8 is not (although
+   some enterprise DNS servers reportedly attempt to do case-insensitive
+   matching on UTF-8 within private namespaces, an action that causes
+   other problems and violates a subsequent prohibition [RFC4343]).
+   Within a private namespace, and especially in light of the IETF UTF-8
+   policy [RFC2277], it was reasonable to assume that binary strings
+   were encoded in UTF-8.
+
+   As implied earlier, there are also issues with mapping strings to
+   some canonical form, independent of the encoding.  Such issues are
+   not discussed in detail in this document.  They are discussed to some
+   extent in, for example, Section 3 of "Unicode Format for Network
+   Interchange" [RFC5198], and are left as opportunities for elaboration
+   in other documents.
+
+   A few years after UTF-8 was already in use in private namespaces in
+   DNS, the strategy of using a reserved prefix and an ASCII-compatible
+   encoding (ACE) was developed for IDNA.  That strategy included the
+   Punycode algorithm, which was developed during the period from 2002
+   [IDN-PUNYCODE] to 2003 [RFC3492] for use in the public DNS
+   namespace.  There were a number of reasons for selecting the
+   prefixed ACE strategy for the public DNS namespace.  One reason was
+   that other encodings such as ISO 8859-1 were also in use in DNS and
+   the various encodings were not necessarily distinguishable from
+   each other.  Another reason was concern about whether the details
+   of IDNA, including the use of the Punycode algorithm, were an
+   adequate solution to the problems that were posed.  If either the
+   Punycode algorithm or fundamental aspects of character handling
+   were wrong, and had to be changed to something incompatible, it
+   would be possible to switch to a new prefix or adopt another model
+   entirely.  Only the part of the public DNS namespace that starts a
+   label with "xn--" would be polluted.
+
+   Today the algorithm is seen as being about as good as it can
+   realistically be, so moving to a different encoding (UTF-8 as
+   suggested in this document) that can be viewed as "native" would not
+   be as risky as it would have been in 2002.
+
+   In any case, the publication of Punycode [RFC3492] and the
+   dependencies on it in the IDNA Protocol document [RFC5891] and the
+   earlier IDNA specification [RFC3490] thus resulted in having to use
+   different encodings for different namespaces (where UTF-8 for private
+   namespaces was already deployed).  Hence, referring back to Figure 2,
+   a different encoding scheme may be in use on the Internet vs. a local
+   network.
+
+   In general, a host may be connected to zero or more networks using
+   private namespaces, plus potentially the public namespace.
+   Applications that convert a U-label form IDN to an A-label before
+   calling getaddrinfo() will incur name resolution failures if the name
+   is actually registered in a private namespace in some other encoding
+   (e.g., UTF-8).  Having libraries or protocols convert from A-labels
+   to the encoding used by a private namespace (e.g., UTF-8) would
+   require changes to APIs and/or servers, which IDNA was intended to
+   avoid.
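+
+   As an illustration of this failure, the following Python sketch
+   compares the octets produced by the two approaches.  It is a sketch
+   under stated assumptions, not a description of any particular
+   resolver: Python's built-in "idna" codec (which follows the original
+   IDNA specification [RFC3490]) stands in for an application's A-label
+   conversion, and private_zone, lookup(), and the name under
+   "corp.example" are hypothetical.
+
```python
# Hypothetical label (a Cyrillic word) written with escapes to stay ASCII.
LABEL = "\u043f\u0440\u0438\u043c\u0435\u0440"
NAME = LABEL + ".corp.example"

# Hypothetical private namespace: names registered as raw UTF-8 octets.
private_zone = {NAME.encode("utf-8")}


def lookup(zone, query_name):
    """Stand-in for a server that matches query octets as received."""
    return query_name in zone


# Application 1 passes the UTF-8 string through unchanged.
utf8_query = NAME.encode("utf-8")
# Application 2 converts to A-label form before calling the resolver.
ace_query = NAME.encode("idna")

print(ace_query)                         # the A-label form of NAME
print(lookup(private_zone, utf8_query))  # the UTF-8 query is found
print(lookup(private_zone, ace_query))   # the A-label query is not
```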
+
+   Also, a fully-qualified domain name (FQDN) to be resolved may be
+   obtained directly from an application, or it may be composed by the
+   DNS resolver itself from a single label obtained from an application
+   by using a configured suffix search list, and the resulting FQDN may
+   use multiple encodings in different labels.  For more information on
+   the suffix search list, see Section 6 of "Common DNS Implementation
+   Errors and Suggested Fixes" [RFC1536], the DHCP Domain Search Option
+   [RFC3397], and Section 4 of "DNS Configuration options for DHCPv6"
+   [RFC3646].
+
+   As noted in Section 6 of "Common DNS Implementation Errors and
+   Suggested Fixes" [RFC1536], the community has had bad experiences
+   (e.g., security problems [RFC1535]) with "searching" for domain names
+   by trying multiple variations or appending different suffixes.  Such
+   searching can yield inconsistent results depending on the order in
+   which alternatives are tried.  Nonetheless, the practice is
+   widespread and must be considered.
+
+   The practice of searching for names, whether by the use of a suffix
+   search list or by searching in different namespaces, can yield
+   inconsistent results.  For example, even when a suffix search list is
+   only used when an application provides a name containing no dots, two
+   clients with different configured suffix search lists can get
+   different answers, and the same client could get different answers at
+   different times if it changes its configuration (e.g., when moving to
+   another network).  A deeper discussion of this topic is outside the
+   scope of this document.
+
+3.1.  Examples
+
+   Some examples of cases that can happen in existing implementations
+   today (where {non-ASCII} below represents some user-entered non-ASCII
+   string) are:
+
+   o  User types in {non-ASCII}.{non-ASCII}.com, and the application
+      passes it, in the form of a UTF-8 string, to getaddrinfo() or
+      gethostbyname() or equivalent.
+
+      1.  The DNS resolver passes the (UTF-8) string unmodified to a DNS
+          server.
+
+   o  User types in {non-ASCII}.{non-ASCII}.com, and the application
+      passes it to a name resolution API that accepts strings in some
+      other encoding such as UTF-16, e.g., GetAddrInfoW() on Windows.
+
+      1.  The name resolution API decides to pass the string to DNS (and
+          possibly other protocols).
+
+      2.  The DNS resolver converts the name from UTF-16 to UTF-8 and
+          passes the query to a DNS server.
+
+   o  User types in {non-ASCII}.{non-ASCII}.com, but the application
+      first converts it to A-label form such that the name that is
+      passed to name resolution APIs is (say)
+      xn--e1afmkfd.xn--80akhbyknj4f.com.
+
+      1.  The name resolution API decides to pass the string to DNS (and
+          possibly other protocols).
+
+      2.  The DNS resolver passes the string unmodified to a DNS server.
+
+      3.  If the name is not found in DNS, the name resolution API
+          decides to try another protocol, say mDNS.
+
+      4.  The query goes out in mDNS, but since mDNS specified that
+          names are to be registered in UTF-8, the name isn't found
+          since it was encoded as an A-label in the query.
+
+   o  User types in {non-ASCII}, and the application passes it, in the
+      form of a UTF-8 string, to getaddrinfo() or equivalent.
+
+      1.  The name resolution API decides to pass the string to DNS (and
+          possibly other protocols).
+
+      2.  The DNS resolver will append suffixes in the suffix search
+          list, which may contain UTF-8 characters if the local network
+          uses a private namespace.
+
+      3.  Each FQDN in turn will then be sent in a query to a DNS
+          server, until one succeeds.
+
+   o  User types in {non-ASCII}, but the application first converts it
+      to an A-label, such that the name that is passed to getaddrinfo()
+      or equivalent is (say) xn--e1afmkfd.
+
+      1.  The name resolution API decides to pass the string to DNS (and
+          possibly other protocols).
+
+      2.  The DNS stub resolver will append suffixes in the suffix
+          search list, which may contain UTF-8 characters if the local
+          network uses a private namespace, resulting in (say)
+          xn--e1afmkfd.{non-ASCII}.com
+
+      3.  Each FQDN in turn will then be sent in a query to a DNS
+          server, until one succeeds.
+
+      4.  Since the private namespace in this case uses UTF-8, the above
+          queries fail, since the A-label version of the name was not
+          registered in that namespace.
+
+   o  User types in {non-ASCII1}.{non-ASCII2}.{non-ASCII3}.com, where
+      {non-ASCII3}.com is a public namespace using IDNA and A-labels,
+      but {non-ASCII2}.{non-ASCII3}.com is a private namespace using
+      UTF-8, which is accessible to the user.  The application passes
+      the name, in the form of a UTF-8 string, to getaddrinfo() or
+      equivalent.
+
+      1.  The name resolution API decides to pass the string to DNS (and
+          possibly other protocols).
+
+      2.  The DNS resolver tries to locate the authoritative server, but
+          fails the lookup because it cannot find a server for the UTF-8
+          encoding of {non-ASCII3}.com, even though it would have access
+          to the private namespace.  (To make this work, the private
+          namespace would need to include the UTF-8 encoding of
+          {non-ASCII3}.com.)
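+
+   The suffix search cases above can be sketched in Python as follows.
+   This is an illustration only: expand(), search_list, and registered
+   are hypothetical names, the suffix is an invented private namespace,
+   and Python's built-in "idna" codec stands in for an application's
+   A-label conversion.
+
```python
from typing import List


def expand(label: bytes, search_list: List[bytes]) -> List[bytes]:
    """Append each configured suffix to a single-label name, in order."""
    return [label + b"." + suffix for suffix in search_list]


# Hypothetical suffix for a private namespace that uses UTF-8.
suffix = "\u0438\u0441\u043f\u044b\u0442\u0430\u043d\u0438\u0435.example"
search_list = [suffix.encode("utf-8")]

# Hypothetical single-label host name typed by the user.
host = "\u043f\u0440\u0438\u043c\u0435\u0440"

# The private namespace registered the name entirely in UTF-8.
registered = {host.encode("utf-8") + b"." + search_list[0]}

# Case 1: the application passes UTF-8; every label is UTF-8.
utf8_candidates = expand(host.encode("utf-8"), search_list)
# Case 2: the application converted to an A-label first, so the FQDN
# mixes an A-label with a UTF-8 suffix.
mixed_candidates = expand(host.encode("idna"), search_list)

print(any(c in registered for c in utf8_candidates))   # query succeeds
print(any(c in registered for c in mixed_candidates))  # query fails
```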
+
+   When users use multiple applications, some of which do A-label
+   conversion prior to passing a name to name resolution APIs, and some
+   of which do not, odd behavior can result.  At best, this violates
+   the Principle of Least Surprise; at worst, it can result in security
+   vulnerabilities.
+
+   First consider two competing applications, such as web browsers, that
+   are designed to achieve the same task.  If the user types the same
+   name into each browser, one may successfully resolve the name (and
+   hence access the desired content) because the encoding scheme is
+   correct, while the other may fail name resolution because the
+   encoding scheme is incorrect.  Hence, the issue can give users an
+   incentive to switch to another application (which in some cases
+   means switching to an IDNA application, and in other cases means
+   switching away from an IDNA application).
+
+   Next consider two separate applications where one is designed to be
+   launched from the other, for example a web browser launching a media
+   player application when the link to a media file is clicked.  If both
+   types of content (web pages and media files in this example) are
+   hosted at the same IDN in a private namespace, but one application
+   converts to A-labels before calling name resolution APIs and the
+   other does not, the user may be able to access a web page and click
+   on the media file, causing the media player to launch and attempt
+   to retrieve the media file; that retrieval will then fail because
+   the IDN encoding scheme was incorrect.  Or even worse, if an
+   attacker is able to register the same name in the other encoding
+   scheme, the user may get the content from the attacker's machine.
+   This is similar to a normal phishing attack, except that the two
+   names represent exactly the same Unicode characters.
+
+4.  Recommendations
+
+   On many platforms, the name resolution library will automatically use
+   a variety of protocols to search a variety of namespaces, which might
+   be using UTF-8 or other encodings.  In addition, even when only the
+   DNS protocol is used, in many operational environments, a private DNS
+   namespace using UTF-8 is also deployed and is automatically searched
+   by the name resolution library.
+
+   As explained earlier, using multiple canonical formats and multiple
+   encodings, in different protocols or even in different places in the
+   same namespace, creates problems.  Because of this, and the fact that
+   both IDNA A-labels and UTF-8 are in use as encoding mechanisms for
+   domain names today, we make the recommendations described below.
+
+   It is inappropriate for an application that calls a general-purpose
+   name resolution library to convert a name to an A-label unless the
+   application is absolutely certain that, in all environments where the
+   application might be used, only the global DNS that uses IDNA
+   A-labels actually will be used to resolve the name.
+
+   Instead, conversion to A-label form, or any other special encoding
+   required by a particular name-lookup protocol, should be done only by
+   an entity that knows which protocol will be used (e.g., the DNS
+   resolver, or getaddrinfo() upon deciding to pass the name to DNS),
+   rather than by general applications that call protocol-independent
+   name resolution APIs.  (Of course, applications that store strings
+   internally in a format different from that required by those APIs
+   need to convert strings from their own internal format to the format
+   required by the API.)  Similarly, even if an application can know
+   that DNS is to be used, the conversion to A-labels should be done
+   only by an entity that knows which part of the DNS namespace will be
+   used.
+
+   That is, a more intelligent DNS resolver would be more liberal in
+   what it would accept from an application and be able to query for
+   both a name in A-label form (e.g., over the Internet) and a UTF-8
+   name (e.g., over a corporate network with a private namespace) in
+   case the server only recognizes one.  However, we might also take
+   into account that the various resolution behaviors discussed earlier
+   could also occur with record updates (e.g., with Dynamic Update
+   [RFC2136]), resulting in some names being registered in a local
+   network's private namespace by applications doing conversion to
+   A-labels, and other names being registered using UTF-8.  Hence, a
+   name might have to be queried with both encodings to be sure to
+   succeed without changes to DNS servers.
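+
+   One way to picture a resolver that queries with both encodings is
+   the Python sketch below.  It is not taken from any existing
+   resolver; candidate_queries() is a hypothetical helper, Python's
+   built-in "idna" codec is assumed for the A-label conversion, and
+   trying UTF-8 before A-labels is an arbitrary choice made only for
+   the example.
+
```python
from typing import List


def candidate_queries(name: str) -> List[bytes]:
    """Return the distinct encodings of a name that a resolver might try."""
    try:
        # If the caller supplied A-labels, recover the Unicode form first.
        unicode_name = name.encode("ascii").decode("idna")
    except UnicodeError:
        # The name is already non-ASCII (or is not valid ACE); use it as is.
        unicode_name = name

    candidates = [unicode_name.encode("utf-8")]
    try:
        ace = unicode_name.encode("idna")
    except UnicodeError:
        ace = None  # not expressible as A-labels; only UTF-8 is tried
    if ace is not None and ace not in candidates:
        candidates.append(ace)
    return candidates


# A plain ASCII name yields a single candidate; a name with a non-ASCII
# label yields a UTF-8 form and an A-label form.
print(candidate_queries("www.example.com"))
print(candidate_queries("\u043f\u0440\u0438\u043c\u0435\u0440.example"))
```
+
+   Because the helper first normalizes its input, an application that
+   passes A-labels and one that passes Unicode obtain the same list of
+   candidate queries.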
+
+   Similarly, a more intelligent stub resolver would also be more
+   liberal in what it would accept from a response as the value of a
+   record (e.g., PTR) in that it would accept either UTF-8 (U-labels in
+   the case of IDNA) or A-labels and convert them to whatever encoding
+   is used by the application APIs to return strings to applications.
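+
+   A minimal Python sketch of such liberal handling follows.  It
+   assumes the name has already been extracted from the response in
+   dotted presentation form, uses Python's built-in "idna" codec to
+   decode A-labels, and returns a Python string as a stand-in for
+   whatever encoding the application API uses.  The function name
+   name_from_response() is hypothetical.
+
```python
def name_from_response(raw: bytes) -> str:
    """Decode a dotted name whose labels may be A-labels or UTF-8."""
    labels = []
    for label in raw.split(b"."):
        if label[:4].lower() == b"xn--":
            try:
                # A-label: convert back to its Unicode (U-label) form.
                labels.append(label.decode("idna"))
            except UnicodeError:
                # Not valid ACE after all; keep the label text as received.
                labels.append(label.decode("ascii", "replace"))
        else:
            # Assumption: anything else is UTF-8 (ASCII is a subset).
            labels.append(label.decode("utf-8"))
    return ".".join(labels)


# Both encodings of the same name decode to the same string.
a_label_form = b"xn--e1afmkfd.example"
utf8_form = "\u043f\u0440\u0438\u043c\u0435\u0440.example".encode("utf-8")
print(name_from_response(a_label_form) == name_from_response(utf8_form))
```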
+
+   Indeed the choice of conversion within the resolver libraries is
+   consistent with the quote from Section 6.2 of the original IDNA
+   specification [RFC3490] stating that conversion using the Punycode
+   algorithm (i.e., to A-labels) "might be performed inside these new
+   versions of the resolver libraries".
+
+   That said, some application-layer protocols (e.g., EPP Domain Name
+   Mapping [RFC5731]) are defined to use A-labels rather than simply
+   using UTF-8 as recommended by the IETF character sets and languages
+   policy [RFC2277].  In this case, an application may receive a string
+   containing A-labels and want to pass it to name resolution APIs.
+   Again the recommendation that a resolver library be more liberal in
+   what it would accept from an application would mean that such a name
+   would be accepted and re-encoded as needed, rather than requiring the
+   application to do so.
+
+   It is important that any APIs used by applications to pass names
+   specify what encoding(s) the API uses.  For example, GetAddrInfoW()
+   on Windows specifies that it accepts UTF-16 and only UTF-16.  In
+   contrast, the original specification of getaddrinfo() [RFC3493] does
+   not, and hence platforms vary in what they use (e.g., Mac OS uses
+   UTF-8 whereas Windows uses Windows code pages).
+
+   Finally, the question remains about what, if anything, a DNS server
+   should do to handle cases where some existing applications or hosts
+   do IDNA queries using A-labels within the local network using a
+   private namespace, and other existing applications or hosts send
+   UTF-8 queries.  It is undesirable to store different records for
+   different encodings of the same name, since this introduces the
+   possibility for inconsistency between them.  Instead, a new DNS
+   server serving a private namespace using UTF-8 could potentially
+   treat encoding-conversion in the same way as case-insensitive
+   comparison, which a DNS server is already required to do, as long as
+   the DNS server has some way to know what the encoding is.  Two
+   encodings are, in this sense, two representations of the same name,
+   just as two case-different strings are.  However, whereas case
+   comparison of non-ASCII characters is complicated by ambiguities (as
+   explained in the IAB's Review and Recommendations for
+   Internationalized Domain Names [RFC4690]), encoding conversion
+   between A-labels and U-labels is unambiguous.
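+
+   As a sketch of this idea (and not a description of any existing
+   server), the Python fragment below maps both encodings of a name to
+   a single lookup key, so that only one set of records is stored.
+   The names canonical_key(), records, and find() are hypothetical, the
+   address is from a documentation range, and Python's built-in "idna"
+   codec is assumed for A-label decoding.  The sketch also assumes that
+   the "xn--" prefix is enough to tell the server which encoding a
+   label uses, and that every other label is UTF-8.
+
```python
def canonical_key(query_name: bytes) -> str:
    """Map a UTF-8 or A-label query name to one lookup key."""
    labels = []
    for label in query_name.split(b"."):
        if label[:4].lower() == b"xn--":
            text = label.decode("idna")   # A-label -> U-label
        else:
            text = label.decode("utf-8")  # assumed to be UTF-8
        # Fold ASCII letters only, mirroring existing DNS case rules.
        folded = "".join(c.lower() if c.isascii() else c for c in text)
        labels.append(folded)
    return ".".join(labels)


# Hypothetical zone data, stored once under the canonical key.
records = {canonical_key(b"xn--e1afmkfd.example"): "192.0.2.1"}


def find(query_name: bytes):
    """Look up a query regardless of which encoding the client used."""
    return records.get(canonical_key(query_name))


utf8_query = "\u043f\u0440\u0438\u043c\u0435\u0440.Example".encode("utf-8")
print(find(utf8_query))
print(find(b"xn--e1afmkfd.EXAMPLE"))
```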
+
+5.  Security Considerations
+
+   Having applications convert names to prefixed ACE format (A-labels)
+   before calling name resolution can result in security
+   vulnerabilities.  If the name is resolved by protocols or in zones
+   for which records are registered using other encoding schemes, an
+   attacker can claim the A-label version of the same name and hence
+   trick the victim into accessing a different destination.  This can be
+   done for any non-ASCII name, even when there is no possible confusion
+   due to case, language, or other issues.  Other types of confusion
+   beyond those resulting simply from the choice of encoding scheme are
+   discussed in "Review and Recommendations for IDNs" [RFC4690].
+
+   Designers and users of encodings that represent Unicode strings in
+   terms of ASCII should also consider whether trademark protection or
+   phishing are issues, e.g., if one name would be encoded in a way that
+   would be naturally associated with another organization or product.
+
+6.  Acknowledgements
+
+   The authors wish to thank Patrik Faltstrom, Martin Duerst, JFC
+   Morfin, Ran Atkinson, S. Moonesamy, Paul Hoffman, and Stephane
+   Bortzmeyer for their careful review and helpful suggestions.  It is
+   also interesting to note that none of the first three individuals'
+   names above can be spelled out and written correctly in ASCII text.
+   Furthermore, one of the IAB members' names below (Andrei Robachevsky)
+   cannot be written in the script as it appears on his birth
+   certificate.
+
+7.  IAB Members at the Time of Approval
+
+   Bernard Aboba
+   Marcelo Bagnulo
+   Ross Callon
+   Spencer Dawkins
+   Vijay Gill
+   Russ Housley
+   John Klensin
+   Olaf Kolkman
+   Danny McPherson
+   Jon Peterson
+   Andrei Robachevsky
+   Dave Thaler
+   Hannes Tschofenig
+
+8.  References
+
+8.1.  Normative References
+
+   [10646]          International Organization for Standardization,
+                    "Information Technology - Universal Multiple-octet
+                    coded Character Set (UCS)".
+
+                    ISO/IEC Standard 10646, comprised of ISO/IEC
+                    10646-1:2000, "Information technology -- Universal
+                    Multiple-Octet Coded Character Set (UCS) -- Part 1:
+                    Architecture and Basic Multilingual Plane", ISO/IEC
+                    10646-2:2001, "Information technology -- Universal
+                    Multiple-Octet Coded Character Set (UCS) -- Part 2:
+                    Supplementary Planes" and ISO/IEC
+                    10646-1:2000/Amd 1:2002, "Mathematical symbols and
+                    other characters".
+
+   [Unicode]        The Unicode Consortium.  The Unicode Standard,
+                    Version 5.1.0, defined by: "The Unicode Standard,
+                    Version 5.0", Boston, MA, Addison-Wesley, 2007, ISBN
+                    0-321-48091-0, as amended by Unicode 5.1.0
+                    (http://www.unicode.org/versions/Unicode5.1.0/).
+
+8.2.  Informative References
+
+   [DNS-MULTICAST]  Cheshire, S. and M. Krochmal, "Multicast DNS", Work
+                    in Progress, February 2011.
+
+   [IDN-PUNYCODE]   Costello, A., "Punycode version 0.3.3", Work
+                    in Progress, January 2002.
+
+   [ISO8859]        International Organization for Standardization,
+                    "Information technology -- 8-bit single-byte coded
+                    graphic character sets".
+
+                    ISO/IEC Standard 8859, comprised of ISO/IEC 8859-
+                    1:1998, Part 1: Latin alphabet No. 1 - ISO/IEC 8859-
+                    2:1999, Part 2: Latin alphabet No. 2 - ISO/IEC 8859-
+                    3:1999, Part 3: Latin alphabet No. 3 - ISO/IEC 8859-
+                    4:1998, Part 4: Latin alphabet No. 4 - ISO/IEC 8859-
+                    5:1999, Part 5: Latin/Cyrillic alphabet - ISO/IEC
+                    8859-6:1999, Part 6: Latin/Arabic alphabet - ISO/IEC
+                    8859-7:2003, Part 7: Latin/Greek alphabet - ISO/IEC
+                    8859-8:1999, Part 8: Latin/Hebrew alphabet - ISO/IEC
+                    8859-9:1999, Part 9: Latin alphabet No. 5 - ISO/IEC
+                    8859-10:1998, Part 10: Latin alphabet No. 6 - ISO/
+                    IEC 8859-11:2001, Part 11: Latin/Thai alphabet -
+                    ISO/IEC 8859-13:1998, Part 13: Latin alphabet No. 7
+
+                    - ISO/IEC 8859-14:1998, Part 14: Latin alphabet No.
+                    8 (Celtic) - ISO/IEC 8859-15:1999, Part 15: Latin
+                    alphabet No. 9 - ISO/IEC 8859-16:2001, Part 16:
+                    Latin alphabet No. 10.
+
+   [MJD]            Duerst, M., "The Properties and Promizes of UTF-8",
+                    11th International Unicode Conference, San Jose,
+                    September 1997, <http://www.ifi.unizh.ch/mml/
+                    mduerst/papers/PDF/IUC11-UTF-8.pdf>.
+
+   [NIS]            Sun Microsystems, "System and Network
+                    Administration", March 1990.
+
+   [RFC0821]        Postel, J., "Simple Mail Transfer Protocol", STD 10,
+                    RFC 821, August 1982.
+
+   [RFC0952]        Harrenstien, K., Stahl, M., and E. Feinler, "DoD
+                    Internet host table specification", RFC 952,
+                    October 1985.
+
+   [RFC1001]        NetBIOS Working Group, "Protocol standard for a
+                    NetBIOS service on a TCP/UDP transport: Concepts and
+                    methods", STD 19, RFC 1001, March 1987.
+
+   [RFC1002]        NetBIOS Working Group, "Protocol standard for a
+                    NetBIOS service on a TCP/UDP transport: Detailed
+                    specifications", STD 19, RFC 1002, March 1987.
+
+   [RFC1034]        Mockapetris, P., "Domain names - concepts and
+                    facilities", STD 13, RFC 1034, November 1987.
+
+   [RFC1123]        Braden, R., "Requirements for Internet Hosts -
+                    Application and Support", STD 3, RFC 1123,
+                    October 1989.
+
+   [RFC1468]        Murai, J., Crispin, M., and E. van der Poel,
+                    "Japanese Character Encoding for Internet Messages",
+                    RFC 1468, June 1993.
+
+   [RFC1535]        Gavron, E., "A Security Problem and Proposed
+                    Correction With Widely Deployed DNS Software",
+                    RFC 1535, October 1993.
+
+   [RFC1536]        Kumar, A., Postel, J., Neuman, C., Danzig, P., and
+                    S. Miller, "Common DNS Implementation Errors and
+                    Suggested Fixes", RFC 1536, October 1993.
+
+   [RFC2130]        Weider, C., Preston, C., Simonsen, K., Alvestrand,
+                    H., Atkinson, R., Crispin, M., and P. Svanberg, "The
+                    Report of the IAB Character Set Workshop held 29
+                    February - 1 March, 1996", RFC 2130, April 1997.
+
+   [RFC2136]        Vixie, P., Thomson, S., Rekhter, Y., and J. Bound,
+                    "Dynamic Updates in the Domain Name System (DNS
+                    UPDATE)", RFC 2136, April 1997.
+
+   [RFC2181]        Elz, R. and R. Bush, "Clarifications to the DNS
+                    Specification", RFC 2181, July 1997.
+
+   [RFC2277]        Alvestrand, H., "IETF Policy on Character Sets and
+                    Languages", BCP 18, RFC 2277, January 1998.
+
+   [RFC3397]        Aboba, B. and S. Cheshire, "Dynamic Host
+                    Configuration Protocol (DHCP) Domain Search Option",
+                    RFC 3397, November 2002.
+
+   [RFC3490]        Faltstrom, P., Hoffman, P., and A. Costello,
+                    "Internationalizing Domain Names in Applications
+                    (IDNA)", RFC 3490, March 2003.
+
+   [RFC3492]        Costello, A., "Punycode: A Bootstring encoding of
+                    Unicode for Internationalized Domain Names in
+                    Applications (IDNA)", RFC 3492, March 2003.
+
+   [RFC3493]        Gilligan, R., Thomson, S., Bound, J., McCann, J.,
+                    and W. Stevens, "Basic Socket Interface Extensions
+                    for IPv6", RFC 3493, February 2003.
+
+   [RFC3629]        Yergeau, F., "UTF-8, a transformation format of ISO
+                    10646", STD 63, RFC 3629, November 2003.
+
+   [RFC3646]        Droms, R., "DNS Configuration options for Dynamic
+                    Host Configuration Protocol for IPv6 (DHCPv6)",
+                    RFC 3646, December 2003.
+
+   [RFC4343]        Eastlake, D., "Domain Name System (DNS) Case
+                    Insensitivity Clarification", RFC 4343,
+                    January 2006.
+
+   [RFC4690]        Klensin, J., Faltstrom, P., Karp, C., and IAB,
+                    "Review and Recommendations for Internationalized
+                    Domain Names (IDNs)", RFC 4690, September 2006.
+
+   [RFC4795]        Aboba, B., Thaler, D., and L. Esibov, "Link-local
+                    Multicast Name Resolution (LLMNR)", RFC 4795,
+                    January 2007.
+
+   [RFC4952]        Klensin, J. and Y. Ko, "Overview and Framework for
+                    Internationalized Email", RFC 4952, July 2007.
+
+   [RFC5198]        Klensin, J. and M. Padlipsky, "Unicode Format for
+                    Network Interchange", RFC 5198, March 2008.
+
+   [RFC5321]        Klensin, J., "Simple Mail Transfer Protocol",
+                    RFC 5321, October 2008.
+
+   [RFC5731]        Hollenbeck, S., "Extensible Provisioning Protocol
+                    (EPP) Domain Name Mapping", STD 69, RFC 5731,
+                    August 2009.
+
+   [RFC5890]        Klensin, J., "Internationalized Domain Names for
+                    Applications (IDNA): Definitions and Document
+                    Framework", RFC 5890, August 2010.
+
+   [RFC5891]        Klensin, J., "Internationalized Domain Names in
+                    Applications (IDNA): Protocol", RFC 5891,
+                    August 2010.
+
+   [UTF8-DNS]       Kwan, S. and J. Gilroy, "Using the UTF-8 Character
+                    Set in the Domain Name System", Work in Progress,
+                    November 1997.
+
+Authors' Addresses
+
+   Dave Thaler
+   Microsoft Corporation
+   One Microsoft Way
+   Redmond, WA  98052
+   USA
+
+   Phone: +1 425 703 8835
+   EMail: dthaler@microsoft.com
+
+
+   John C Klensin
+   1770 Massachusetts Ave, Ste 322
+   Cambridge, MA  02140
+
+   Phone: +1 617 245 1457
+   EMail: john+ietf@jck.com
+
+
+   Stuart Cheshire
+   Apple Inc.
+   1 Infinite Loop
+   Cupertino, CA  95014
+
+   Phone: +1 408 974 3207
+   EMail: cheshire@apple.com