Diffstat (limited to 'doc/rfc/rfc9233.txt')
-rw-r--r-- doc/rfc/rfc9233.txt 1321
1 files changed, 1321 insertions, 0 deletions
diff --git a/doc/rfc/rfc9233.txt b/doc/rfc/rfc9233.txt
new file mode 100644
index 0000000..f36302d
--- /dev/null
+++ b/doc/rfc/rfc9233.txt
@@ -0,0 +1,1321 @@
+
+
+
+
+Internet Engineering Task Force (IETF) P. Fältström
+Request for Comments: 9233 Netnod
+Category: Standards Track March 2022
+ISSN: 2070-1721
+
+
+ Internationalized Domain Names for Applications 2008 (IDNA2008) and
+ Unicode 12.0.0
+
+Abstract
+
+ This document describes the changes between Unicode 6.0.0 and Unicode
+ 12.0.0 in the context of the current version of Internationalized
+ Domain Names for Applications 2008 (IDNA2008). Some additions and
+ changes have been made in the Unicode Standard that affect the values
+ produced by the algorithm IDNA2008 specifies. IDNA2008 allows adding
+ exceptions to the algorithm for backward compatibility; however, this
+ document does not add any such exceptions. This document provides
+ the necessary tables to IANA to make its database consistent with
+ Unicode 12.0.0.
+
+ To improve understanding, this document describes systems that are
+ being used as alternatives to those that conform to IDNA2008.
+
+Status of This Memo
+
+ This is an Internet Standards Track document.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Further information on
+ Internet Standards is available in Section 2 of RFC 7841.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ https://www.rfc-editor.org/info/rfc9233.
+
+Copyright Notice
+
+ Copyright (c) 2022 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (https://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Revised BSD License text as described in Section 4.e of the
+ Trust Legal Provisions and are provided without warranty as described
+ in the Revised BSD License.
+
+Table of Contents
+
+ 1. Introduction
+ 2. Background
+ 2.1. IDNA2008 Documents
+ 2.2. Additional Important IDNA2008-Related Documents
+ 2.3. Deployment
+ 3. Notable Changes between Unicode 6.0.0 and 12.0.0
+ 3.1. Changes between Unicode 6.0.0 and 7.0.0
+ 3.2. Changes between Unicode 7.0.0 and 10.0.0
+ 3.3. Changes between Unicode 10.0.0 and 11.0.0
+ 3.4. Changes between Unicode 11.0.0 and 12.0.0
+ 4. U+111C9 SHARADA SANDHI MARK
+ 5. Conclusion
+ 6. IANA Considerations
+ 7. Security Considerations
+ 8. References
+ 8.1. Normative References
+ 8.2. Informative References
+ Appendix A. Changes from Unicode 6.0.0 to Unicode 7.0.0
+ Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0
+ Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0
+ Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0
+ Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0
+ Appendix F. Changes from Unicode 11.0.0 to Unicode 12.0.0
+ Acknowledgments
+ Author's Address
+
+1. Introduction
+
+ The current version of Internationalized Domain Names for
+ Applications (IDNA) was initiated in 2008, and despite not being
+ completed until 2010, is widely known as "IDNA2008". It is specified
+ in the series of documents listed in Section 2.1. The IDNA2008
+ standard includes an algorithm by which a derived property value is
+ calculated based on the properties defined in the Unicode Standard.
+
+ The derived property values that can be calculated are defined in RFC
+ 5892 [RFC5892]. Below is a summary to aid in the reading of this
+ document. For definition of the terms, please see RFC 5892
+ [RFC5892].
+
+ PROTOCOL VALID: Those that are allowed to be used in IDNs. Code
+ points with this property value are permitted for general use in
+ IDNs. However, the fact that a label consists only of code points
+ with this property value does not imply that the label can be used
+ in DNS. The abbreviated term PVALID is used to refer to this
+ value.
+
+ CONTEXTUAL RULE REQUIRED: Some characteristics of the character,
+ such as it being invisible in certain contexts or problematic in
+ others, require that it not be used in labels unless specific
+ other characters or properties are present. The abbreviated term
+ CONTEXT is used to refer to this value. As explained in RFC 5892
+ [RFC5892], CONTEXT is in turn divided into CONTEXTJ and CONTEXTO.
+
+ DISALLOWED: Those that should clearly not be included in IDNs. Code
+ points with this property value are not permitted in IDNs.
+
+ UNASSIGNED: Those code points that are not designated (i.e., are
+ unassigned) in the Unicode Standard.
+
+ When the Unicode Standard is updated, new code points are assigned
+ and already assigned code points can have their property values
+ changed.
+
+ * Assigning code points can create problems if the newly assigned
+ code points are compositions of existing code points and the
+ normalization relationships associated with those code points
+ should have been changed because of that.
+
+ * Changing properties for already assigned code points can create
+ problems if the property change results in changes to the derived
+ property value. A previously allowed code point whose derived
+ property value is PVALID may now be prohibited if its derived
+ property value changes to DISALLOWED. The problem can also happen
+ the other way around: a code point that was not allowed (and thus
+ was prohibited) can suddenly be allowed.
+
+ * Problems can also be created if the properties assigned to those
+ code points are inconsistent with IDNA2008 assumptions about how
+ properties are assigned and/or about how code points with those
+ properties are used or behave.
+
+ There were three incompatible changes in the Unicode Standard between
+ Unicode 5.2.0 [Unicode-5.2.0] and Unicode 6.0.0 [Unicode-6.0.0]; they
+ are described in RFC 6452 [RFC6452]. The code points U+0CF1 and
+ U+0CF2 had a derived property value change from DISALLOWED to PVALID,
+ and the code point U+19DA had a change in derived property value from
+ PVALID to DISALLOWED. These changes were examined in great detail,
+ but the IETF concluded that these changes to the Unicode Standard did
+ not warrant an update to RFC 5892 [RFC5892].
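+
+ The distinction between ordinary assignments and incompatible changes
+ can be sketched in a few lines of Python. This is an illustrative
+ sketch only: the names Derived and classify_change, and the labels
+ returned, are assumptions made for this example and do not come from
+ RFC 5892 or RFC 6452.

```python
from enum import Enum


class Derived(Enum):
    """Derived property values summarized earlier in this section."""
    PVALID = "PVALID"
    CONTEXTJ = "CONTEXTJ"
    CONTEXTO = "CONTEXTO"
    DISALLOWED = "DISALLOWED"
    UNASSIGNED = "UNASSIGNED"


def classify_change(old: Derived, new: Derived) -> str:
    """Label a derived property value transition between two Unicode versions.

    The labels are hypothetical names chosen for this sketch.
    """
    if old is new:
        return "unchanged"
    if old is Derived.UNASSIGNED:
        # A newly assigned code point receives its first derived value.
        return "newly assigned"
    # Any other transition changes what existing labels may contain.
    return "incompatible"


# The three Unicode 5.2.0 -> 6.0.0 changes described in RFC 6452.
rfc6452_changes = {
    0x0CF1: (Derived.DISALLOWED, Derived.PVALID),
    0x0CF2: (Derived.DISALLOWED, Derived.PVALID),
    0x19DA: (Derived.PVALID, Derived.DISALLOWED),
}

for cp, (old, new) in rfc6452_changes.items():
    print(f"U+{cp:04X}: {old.value} -> {new.value}: {classify_change(old, new)}")
```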
+
+ As described in Section 3, more incompatible changes have been made
+ to code points between Unicode 6.0.0 and Unicode 12.0.0
+ [Unicode-12.0.0]; however, the changes in the derived property values
+ do not result in exceptions (as defined in Section 2.6 of RFC 5892
+ [RFC5892]) that would require an update to the "IDNA Contextual
+ Rules" registry (which would also be considered an update to RFC 5892
+ [RFC5892]).
+
+ Further, in 2015, the Internet Architecture Board (IAB) issued a
+ statement [IAB2005-1] that advised the community to avoid using any
+ of the potentially problematic code points and asked the IETF to
+ resolve the issues related to the code point ARABIC LETTER BEH WITH
+ HAMZA ABOVE (U+08A1) that was introduced in Unicode 7.0.0
+ [Unicode-7.0.0]. In February of that year, the statement was revised
+ [IAB2005-2] to focus on the latter request. More details about the
+ problem of code point sequences not normalizing as one might expect
+ appear in a draft that was part of the discussion [IDNA7].
+
+ The result of the work in the IETF was that no exception was added to
+ RFC 5892 [RFC5892]; however, it should be noted that the review of
+ the issues around U+08A1 indicated that this code point is not an
+ isolated case and that a number of long-standing PVALID code points
+ may have similar issues. While the affected code points remain
+ PVALID in this document, identification of the problem resulted in a
+ clarification of the review process for new Unicode versions. That
+ clarification, which reinforces the original review plan to capture
+ issues like these, was published as RFC 8753 [RFC8753]. Any review
+ of Unicode versions after 12.0.0 should be made according to RFC 8753
+ [RFC8753]; an objective of this document is to ensure that a proper
+ review of such versions after version 12.0.0 can be made.
+
+2. Background
+
+2.1. IDNA2008 Documents
+
+ IDNA2008 consists of the following documents. The documents in the
+ set have informal names.
+
+ * "Internationalized Domain Names for Applications (IDNA):
+ Definitions and Document Framework" [RFC5890], informally called
+ "Defs" or "Definitions", contains definitions and other material
+ that are needed for understanding other documents in the set.
+
+ * "Internationalized Domain Names in Applications (IDNA): Protocol"
+ [RFC5891], informally called "Protocol", describes the core
+ IDNA2008 protocol and its operations. It needs to be interpreted
+ in combination with the Bidi document (described below). RFC 5891
+ [RFC5891] obsoletes RFC 3491 [RFC3491] and, in particular, the use
+ of the tables to which RFC 3491 [RFC3491] refers.
+
+ * "The Unicode Code Points and Internationalized Domain Names for
+ Applications (IDNA)" [RFC5892], informally called "Tables", lists
+ the categories and rules that identify the code points allowed in
+ a label written in native character form (called a "U-label"), and
+ is based on Unicode 5.2.0 [Unicode-5.2.0] code point assignments
+ and additional rules unique to IDNA2008. The Unicode-based rules
+ in RFC 5892 are expected to be stable across Unicode updates and
+ hence independent of Unicode versions.
+
+ * "Right-to-Left Scripts for Internationalized Domain Names for
+ Applications (IDNA)" [RFC5893], informally called "Bidi",
+ specifies special rules for labels that contain characters that
+ are written from right to left.
+
+ * "Internationalized Domain Names for Applications (IDNA):
+ Background, Explanation, and Rationale" [RFC5894], informally
+ called "Rationale", provides an overview of the protocol and
+ associated tables, and gives explanatory material and some
+ rationale for the decisions that led to IDNA2008. It also
+ contains advice for DNS registry operators and others who use
+ Internationalized Domain Names (IDNs).
+
+ * "Mapping Characters for Internationalized Domain Names in
+ Applications (IDNA) 2008" [RFC5895], informally called "Mapping",
+ discusses the issue of mapping characters into other characters
+ and provides guidance for doing so when that is appropriate. RFC
+ 5895 provides advice only and is not a required part of IDNA.
+
+2.2. Additional Important IDNA2008-Related Documents
+
+ Other documents are also important for understanding IDNA2008 and how
+ it functions, for example, the following:
+
+ * "The Unicode Code Points and Internationalized Domain Names for
+ Applications (IDNA) - Unicode 6.0" [RFC6452] describes some
+ changes made to Unicode 6.0.0 [Unicode-6.0.0] that resulted in
+ derived property value changes for the code points U+0CF1, U+0CF2,
+ and U+19DA. U+0CF1 and U+0CF2 changed from DISALLOWED to PVALID,
+ while U+19DA changed from PVALID to DISALLOWED. The IETF
+ concluded that no update to RFC 5892 [RFC5892] was needed based on
+ the changes made in Unicode 6.0.0 [Unicode-6.0.0]. As a result,
+ the derived property value remained aligned with the Unicode
+ Standard. Specifically, no exception was added.
+
+2.3. Deployment
+
+ There are many variations on the general IDNA model in use in the
+ various parts of the community. The following lists some of the
+ strategies that implementations that claim to be IDNA compliant are
+ known to use, but it should be noted the list is not complete:
+
+ * IDNA2003 as specified in RFC 3490 [RFC3490] and RFC 3491
+ [RFC3491]. Those specifications are dependent on case folding,
+ Normalization Form KC (NFKC), and on tables that specify for each
+ code point whether it is allowed to be used or not, with a
+ distinction made between use for "stored strings" and "query
+ strings". The tables themselves are dependent on Unicode 3.2
+ [Unicode-3.2.0].
+
+ * A number of variations on IDNA2003, sometimes presented as
+ "updated IDNA2003" or the like, which follow the principles of
+ IDNA2003 as understood by the implementers but that use tables
+ that represent how the implementers believe Stringprep [RFC3454]
+ and Nameprep [RFC3491] would have evolved had the IETF not moved
+ in the direction of IDNA2008 instead.
+
+ * A mix between IDNA2003 and IDNA2008 where code points assigned to
+ Unicode after Unicode 3.2.0 [Unicode-3.2.0] have derived property
+ value calculated according to the algorithm specified in IDNA2008.
+
+ * A mix between IDNA2003 and IDNA2008 according to the Unicode
+ Technical Standard #46 [UTS-46]. Because that document specifies
+ different profiles, there are several variations that leave users
+ with no guarantee that two applications claiming conformance to
+ UTS#46 will interoperate well with each other much less with
+ conforming IDNA2008 implementations. UTS#46 is ultimately based
+ on a normative table very much like the one used by Stringprep
+ [RFC3454] but updated for each new version of Unicode.
+
+ * The (normative) IDNA2008 algorithm applied to whatever version of
+ Unicode Standard exists in the operating system and/or libraries
+ used, independent of whatever version of tables appears in the
+ (non-normative) IANA database.
+
+ In practice, the Unicode Consortium creates a maximum set of code
+ points by assigning code points in the Unicode Standard. The
+ IDNA2008 rules use the Unicode Standard to create a further subset of
+ code points and context that are permitted in DNS labels associated
+ with its PVALID and CONTEXT (CONTEXTJ or CONTEXTO) derived property
+ values. DNS registries and other organizations that deal with IDNs
+ are supposed to create their own subsets from IDNA2008 for use by
+ those registries and organizations.
+
+ This progressive subsetting and narrowing of the repertoire of code
+ points that can be used in labels is an implementation of the
+ principles of being conservative when deciding what code points to
+ include in such a subset. SAC-084 [SAC-084] and RFC 6912 [RFC6912]
+ recommend that DNS registries and other organizations be
+ conservative when creating their subsets and that they create those
+ subsets by inclusion.
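+
+ The progressive subsetting described above can be illustrated with
+ plain set operations. The sketch below is an assumption-laden
+ example, not a registry implementation: the table holds only five
+ entries copied from Appendix A, the function names are hypothetical,
+ and CONTEXTJ and CONTEXTO code points (which need contextual rule
+ checks) are deliberately left out.

```python
# A tiny, illustrative slice of derived property values (from Appendix A).
derived = {
    0x0528: "DISALLOWED",  # CYRILLIC CAPITAL LETTER EN WITH LEFT HOOK
    0x0529: "PVALID",      # CYRILLIC SMALL LETTER EN WITH LEFT HOOK
    0x052A: "DISALLOWED",  # CYRILLIC CAPITAL LETTER DZZHE
    0x052B: "PVALID",      # CYRILLIC SMALL LETTER DZZHE
    0x052D: "PVALID",      # CYRILLIC SMALL LETTER DCHE
}


def idna_permitted(table):
    """Code points permitted by IDNA2008 (PVALID only in this sketch)."""
    return {cp for cp, value in table.items() if value == "PVALID"}


def registry_repertoire(table, wanted):
    """Subset by inclusion: keep only listed code points that IDNA2008 permits."""
    return set(wanted) & idna_permitted(table)


def label_in_repertoire(label, repertoire):
    """True if every code point of the label is in the registry repertoire."""
    return all(ord(ch) in repertoire for ch in label)


# A hypothetical registry asks for three code points; U+0528 is dropped
# because it is DISALLOWED, and U+052D is absent because it was never listed.
repertoire = registry_repertoire(derived, [0x0528, 0x0529, 0x052B])
print(sorted(f"U+{cp:04X}" for cp in repertoire))
print(label_in_repertoire("\u0529\u052B", repertoire))
print(label_in_repertoire("\u0529\u052D", repertoire))
```

+ The second label is rejected even though U+052D is PVALID, which is
+ the point of inclusion-based subsets: a code point is usable only if
+ the registry explicitly listed it.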
+
+ See also Security Considerations (Section 7) in this document.
+
+3. Notable Changes between Unicode 6.0.0 and 12.0.0
+
+ Among the changes between the Unicode versions, most code points that
+ change derived property value change from UNASSIGNED to PVALID or
+ from UNASSIGNED to DISALLOWED. The interesting changes in derived
+ property values include other changes. All changes between the major
+ versions of Unicode can be found in Appendix A (6.0.0-7.0.0),
+ Appendix B (7.0.0-8.0.0), Appendix C (8.0.0-9.0.0), Appendix D
+ (9.0.0-10.0.0), Appendix E (10.0.0-11.0.0), and Appendix F
+ (11.0.0-12.0.0).
+
+3.1. Changes between Unicode 6.0.0 and 7.0.0
+
+ Change in number of characters in each category:
+
+ * PVALID changed from 97418 to 99867 (+2449)
+
+ * UNASSIGNED changed from 865081 to 861509 (-3572)
+
+ * CONTEXTJ did not change, at 2
+
+ * CONTEXTO did not change, at 25
+
+ * DISALLOWED changed from 151586 to 152709 (+1123)
+
+ * TOTAL did not change, at 1114112
+
+ There are no changes made to Unicode between version 6.0.0 and 7.0.0
+ that impact IDNA2008 calculation of the derived property values.
+
+ The code points U+17B4 KHMER VOWEL INHERENT AQ and U+17B5 KHMER VOWEL
+ INHERENT AA both changed the General Category from Cf (Format) to Mn
+ (Nonspacing_Mark), but that did not impact the calculation of the
+ derived property value which stayed at DISALLOWED.
+
+ The character ARABIC LETTER BEH WITH HAMZA ABOVE (U+08A1) was
+ introduced in Unicode 7.0.0. This was discussed extensively in the
+ IETF and also by the IAB in their statement [IAB2005-1] requesting
+ the IETF to investigate the issue. Specifically, the IAB stated:
+
+ | On the same precautionary principle, the IAB recommends that the
+ | Internationalized Domain Names for Applications (IDNA) Parameters
+ | registry <https://www.iana.org/assignments/idna-tables/> not be
+ | updated to Unicode 7.0.0 until the IETF has consensus on a
+ | solution to this problem.
+
+ The discussion in the IETF concluded that although it is possible to
+ create "the same" character in multiple ways, the issue with U+08A1
+ is not unique. The character U+08A1 (ARABIC LETTER BEH WITH HAMZA
+ ABOVE) can be represented with the sequence ARABIC LETTER BEH
+ (U+0628) and ARABIC HAMZA ABOVE (U+0654). This is analogous to LATIN
+ SMALL LETTER O WITH STROKE (U+00F8), which can be represented with
+ the sequence LATIN SMALL LETTER O (U+006F) followed by COMBINING
+ SHORT SOLIDUS OVERLAY (U+0337).
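+
+ The behavior can be observed with Python's standard unicodedata
+ module. This is a small illustration, not part of the IETF analysis:
+ in both cases above, normalization to NFC leaves the two-code-point
+ sequence as it is instead of composing it into the single code point.
+ The ARABIC LETTER ALEF WITH HAMZA ABOVE (U+0623) case is added here
+ only for contrast, because that character does have a canonical
+ decomposition.

```python
import unicodedata


def nfc(text: str) -> str:
    """Normalize text to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", text)


# Sequence versus single code point for the two cases in the text.
beh_hamza_sequence, beh_hamza_single = "\u0628\u0654", "\u08A1"
o_stroke_sequence, o_stroke_single = "o\u0337", "\u00F8"

# Contrast case: ALEF + HAMZA ABOVE is canonically equivalent to U+0623.
alef_hamza_sequence, alef_hamza_single = "\u0627\u0654", "\u0623"

for name, sequence, single in [
    ("BEH WITH HAMZA ABOVE", beh_hamza_sequence, beh_hamza_single),
    ("O WITH STROKE", o_stroke_sequence, o_stroke_single),
    ("ALEF WITH HAMZA ABOVE", alef_hamza_sequence, alef_hamza_single),
]:
    composes = nfc(sequence) == single
    print(f"{name}: NFC(sequence) == single code point? {composes}")
```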
+
+ Although the discussion about this specific code point resulted in
+ acceptance of the derived property value of PVALID, the underlying
+ problem with combining sequences is not understood fully. Therefore,
+ it cannot be claimed that this case can be extrapolated to other
+ situations and other code points.
+
+3.2. Changes between Unicode 7.0.0 and 10.0.0
+
+ Change in number of characters in each category:
+
+ * Code points that changed derived property value: 0
+
+ * PVALID changed from 99867 to 122411 (+22544)
+
+ * UNASSIGNED changed from 861509 to 837775 (-23734)
+
+ * CONTEXTJ did not change, at 2
+
+ * CONTEXTO did not change, at 25
+
+ * DISALLOWED changed from 152709 to 153899 (+1190)
+
+ * TOTAL did not change, at 1114112
+
+ There are no changes made to Unicode between version 7.0.0 and 10.0.0
+ that impact IDNA2008 calculation of the derived property values.
+
+3.3. Changes between Unicode 10.0.0 and 11.0.0
+
+ Change in number of characters in each category:
+
+ * Code points that changed derived property value: 1
+
+ * PVALID changed from 122411 to 122734 (+323)
+
+ * UNASSIGNED changed from 837775 to 837091 (-684)
+
+ * CONTEXTJ did not change, at 2
+
+ * CONTEXTO did not change, at 25
+
+ * DISALLOWED changed from 153899 to 154260 (+361)
+
+ * TOTAL did not change, at 1114112
+
+ * Georgian letters in the ranges U+10D0..U+10FA and U+10FD..U+10FF
+ had their General Category changed from Lo (Other_Letter) to Ll
+ (Lowercase_Letter) to reflect their status as the lowercase of new
+ Georgian case pairs. Case mappings were also added.
+
+ * SHARADA SANDHI MARK (U+111C9) General Category was changed from Po
+ (Other_Punctuation) to Mn (Nonspacing_Mark), and the Bidi property
+ was changed from L (Left to Right) to NSM (Nonspacing Mark).
+
+ * The properties for ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and
+ ZANABAZAR SQUARE VOWEL SIGN AU (U+11A08) were corrected from Mc to
+ Mn.
+
+ * SPHERICAL ANGLE OPENING UP (U+29A1) had its Bidi_Mirrored property
+ changed to No.
+
+ These changes to the Unicode Standard have the following implications
+ for these code points:
+
+ * The 684 newly assigned characters are assigned a derived property
+ value as a result of applying the IDNA2008 algorithm.
+
+ * The Georgian letters in the ranges U+10D0..U+10FA and
+ U+10FD..U+10FF existed before IDNA2008 was created. Applying the
+ IDNA2008 algorithm to these code points yields the derived
+ property value PVALID, and that value is unchanged even though the
+ underlying Unicode properties have changed. The newly encoded
+ Mtavruli letters have General Category Lu (Uppercase_Letter) and
+ are therefore DISALLOWED.
+
+ * The U+111C9 SHARADA SANDHI MARK was added to Unicode 8.0.0
+ [Unicode-8.0.0]. Applying the IDNA2008 algorithm to the code
+ point yielded the derived property value DISALLOWED. The changes
+ in the underlying properties in Unicode 11.0.0 [Unicode-11.0.0]
+ caused the derived property value to change to PVALID.
+
+ * The characters ZANABAZAR SQUARE VOWEL SIGN AI (U+11A07) and
+ ZANABAZAR SQUARE VOWEL SIGN AU (U+11A08) were added to Unicode
+ 10.0.0 [Unicode-10.0.0]. Applying the IDNA2008 algorithm to the
+ code points yields the derived property value PVALID, and that
+ value is unchanged even though the underlying Unicode properties
+ have changed.
+
+ * SPHERICAL ANGLE OPENING UP (U+29A1) existed before IDNA2008 was
+ created. Applying the IDNA2008 algorithm to the code point
+ yields the derived property value DISALLOWED, and that value is
+ unchanged even though the underlying Unicode properties have changed.
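+
+ Why some General Category changes alter the derived property value
+ and others do not can be seen in a heavily simplified sketch. It
+ models only two of the rules in RFC 5892: the LetterDigits category
+ (Section 2.1) and, as a caller-supplied flag, the Unstable category
+ that covers code points changed by NFKC and case folding. All other
+ rules (exceptions, contextual rules, ignorable properties and
+ blocks, and so on) are omitted, and the function name is an
+ assumption made for this example.

```python
# General_Category values listed by the LetterDigits rule of RFC 5892.
LETTER_DIGITS = frozenset({"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"})


def simplified_derived_value(general_category: str, unstable: bool = False) -> str:
    """Return PVALID or DISALLOWED using only two RFC 5892 rules.

    `unstable` stands in for the Unstable rule; it must be supplied by
    the caller because this sketch does not consult the Unicode database.
    """
    if unstable:
        return "DISALLOWED"
    if general_category in LETTER_DIGITS:
        return "PVALID"
    return "DISALLOWED"


# U+111C9 SHARADA SANDHI MARK: Po -> Mn moves it into LetterDigits.
print(simplified_derived_value("Po"), "->", simplified_derived_value("Mn"))

# Georgian U+10D0..U+10FA: Lo -> Ll, both inside LetterDigits.
print(simplified_derived_value("Lo"), "->", simplified_derived_value("Ll"))

# Zanabazar Square U+11A07 and U+11A08: Mc -> Mn, both inside LetterDigits.
print(simplified_derived_value("Mc"), "->", simplified_derived_value("Mn"))

# Mtavruli capitals: Lu, but case folding changes them, so they are unstable.
print(simplified_derived_value("Lu", unstable=True))
```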
+
+3.4. Changes between Unicode 11.0.0 and 12.0.0
+
+ Change in number of characters in each category:
+
+ * Code points that changed derived property value: 0
+
+ * PVALID changed from 122734 to 123006 (+272)
+
+ * UNASSIGNED changed from 837091 to 836537 (-554)
+
+ * CONTEXTJ did not change, at 2
+
+ * CONTEXTO did not change, at 25
+
+ * DISALLOWED changed from 154260 to 154542 (+282)
+
+ * TOTAL did not change, at 1114112
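+
+ The per-category counts in Sections 3.1 through 3.4 can be
+ cross-checked with simple arithmetic. The sketch below copies the
+ counts from those sections and checks two things: that each version
+ sums to the full code space of 1114112 code points (U+0000 through
+ U+10FFFF), and that the drop in UNASSIGNED equals the combined growth
+ of the other categories. The variable names are chosen for this
+ example only.

```python
TOTAL = 0x110000  # 1114112 code points, U+0000..U+10FFFF

counts = {
    "6.0.0": {"PVALID": 97418, "UNASSIGNED": 865081, "CONTEXTJ": 2,
              "CONTEXTO": 25, "DISALLOWED": 151586},
    "7.0.0": {"PVALID": 99867, "UNASSIGNED": 861509, "CONTEXTJ": 2,
              "CONTEXTO": 25, "DISALLOWED": 152709},
    "10.0.0": {"PVALID": 122411, "UNASSIGNED": 837775, "CONTEXTJ": 2,
               "CONTEXTO": 25, "DISALLOWED": 153899},
    "11.0.0": {"PVALID": 122734, "UNASSIGNED": 837091, "CONTEXTJ": 2,
               "CONTEXTO": 25, "DISALLOWED": 154260},
    "12.0.0": {"PVALID": 123006, "UNASSIGNED": 836537, "CONTEXTJ": 2,
               "CONTEXTO": 25, "DISALLOWED": 154542},
}

# Every version must account for the whole code space.
for version, by_category in counts.items():
    assert sum(by_category.values()) == TOTAL, version

# Code points leaving UNASSIGNED must show up in the other categories.
newly_assigned = {}
versions = list(counts)
for old, new in zip(versions, versions[1:]):
    dropped = counts[old]["UNASSIGNED"] - counts[new]["UNASSIGNED"]
    gained = sum(counts[new][k] - counts[old][k]
                 for k in ("PVALID", "CONTEXTJ", "CONTEXTO", "DISALLOWED"))
    assert dropped == gained, (old, new)
    newly_assigned[f"{old}->{new}"] = dropped

for step, number in newly_assigned.items():
    print(f"{step}: {number} newly assigned code points")
```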
+
+4. U+111C9 SHARADA SANDHI MARK
+
+ As one can see in Section 3, an incompatible property change was made
+ between Unicode 6.0.0 and 12.0.0, affecting the code point U+111C9.
+ Its derived property value thus changed from DISALLOWED to PVALID.
+ In situations like these, IDNA2008 allows rules to be added under
+ Section 2.7 of RFC 5892 [RFC5892]. If the code point is accepted, it
+ might still be rejected if validated by software based on versions of
+ Unicode older than 12.0.0. As the character is rarely used outside
+ the group of Sharada specialists but is used in some records for
+ indicating sandhi breaks, the conclusion was that it could either be
+ added as an exception or allowed to change its property value. As
+ including an exception would require implementation changes to
+ deployments of IDNA2008, the IETF has decided not to add a
+ BackwardCompatible rule to IDNA2008 (i.e., Section 2.7 of RFC 5892
+ [RFC5892]) for this code point. This also ensures all sandhi marks
+ are treated equally.
+
+5. Conclusion
+
+ As described in Sections 3 and 4, changes have been made to Unicode
+ between version 6.0.0 and 12.0.0. Some changes to specific
+ characters changed their derived property value, whereas other
+ changes did not. Given the deployment considerations described in
+ Section 2.3 and changes in the Unicode Standard described in Sections
+ 3 and 4, including implications to normalization, the conclusion is
+ not to add any exception rules to IDNA2008.
+
+ This document addresses only changes to Unicode between version 6.0.0
+ and version 12.0.0. Changes in future Unicode versions might result
+ in the conclusion that exception rules need to be added to IDNA2008
+ after the review process explained in RFC 8753 [RFC8753]. Separately
+ from any changes in Unicode, the IETF might conclude that updates to
+ RFC 5892 [RFC5892] or other IDNA2008 documents might become
+ necessary; such updates might include changes to the algorithm
+ specified in IDNA2008 as well as additional rules, categories, or
+ other forms of tuning, like the clarifications in RFC 8753 [RFC8753].
+
+6. IANA Considerations
+
+ IANA updated the "IDNA Rules and Derived Property Values" [IANA-IDNA]
+ registry after the expert reviewer validated that the derived
+ property values were calculated correctly.
+
+7. Security Considerations
+
+ This document makes recommendations regarding the use of the IDNA2008
+ algorithm for calculation of derived property values, based on
+ Unicode version 12.0.0. This recommendation does not say anything
+ about what recommendations to make for future versions of the Unicode
+ Standard.
+
+ Not following these recommendations can lead to various security
+ issues. Specifically, allowing confusable characters may lead to
+ various phishing attacks, as described in the Security Considerations
+ sections of the documents listed in Section 2.1.
+
+8. References
+
+8.1. Normative References
+
+ [RFC3491] Hoffman, P. and M. Blanchet, "Nameprep: A Stringprep
+ Profile for Internationalized Domain Names (IDN)",
+ RFC 3491, DOI 10.17487/RFC3491, March 2003,
+ <https://www.rfc-editor.org/info/rfc3491>.
+
+ [RFC5890] Klensin, J., "Internationalized Domain Names for
+ Applications (IDNA): Definitions and Document Framework",
+ RFC 5890, DOI 10.17487/RFC5890, August 2010,
+ <https://www.rfc-editor.org/info/rfc5890>.
+
+ [RFC5891] Klensin, J., "Internationalized Domain Names in
+ Applications (IDNA): Protocol", RFC 5891,
+ DOI 10.17487/RFC5891, August 2010,
+ <https://www.rfc-editor.org/info/rfc5891>.
+
+ [RFC5892] Faltstrom, P., Ed., "The Unicode Code Points and
+ Internationalized Domain Names for Applications (IDNA)",
+ RFC 5892, DOI 10.17487/RFC5892, August 2010,
+ <https://www.rfc-editor.org/info/rfc5892>.
+
+ [RFC5893] Alvestrand, H., Ed. and C. Karp, "Right-to-Left Scripts
+ for Internationalized Domain Names for Applications
+ (IDNA)", RFC 5893, DOI 10.17487/RFC5893, August 2010,
+ <https://www.rfc-editor.org/info/rfc5893>.
+
+ [RFC6452] Faltstrom, P., Ed. and P. Hoffman, Ed., "The Unicode Code
+ Points and Internationalized Domain Names for Applications
+ (IDNA) - Unicode 6.0", RFC 6452, DOI 10.17487/RFC6452,
+ November 2011, <https://www.rfc-editor.org/info/rfc6452>.
+
+8.2. Informative References
+
+ [IAB2005-1]
+ Internet Architecture Board, "IAB Statement on Identifiers
+ and Unicode 7.0.0", 27 January 2015,
+ <https://www.iab.org/documents/correspondence-reports-
+ documents/2015-2/iab-statement-on-identifiers-and-unicode-
+ 7-0-0/archive/>.
+
+ [IAB2005-2]
+ Internet Architecture Board, "IAB Statement on Identifiers
+ and Unicode 7.0.0", 11 February 2015,
+ <https://www.iab.org/documents/correspondence-reports-
+ documents/2015-2/iab-statement-on-identifiers-and-unicode-
+ 7-0-0/>.
+
+ [IANA-IDNA]
+ IANA, "IDNA Rules and Derived Property Values", February
+ 2022,
+ <https://www.iana.org/assignments/idna-tables-12.0.0/>.
+
+ [IDNA7] Klensin, J. C. and P. Faltstrom, "IDNA Update for Unicode
+ 7.0 and Later Versions", Work in Progress, Internet-Draft,
+ draft-klensin-idna-5892upd-unicode70-05, 8 October 2017,
+ <https://datatracker.ietf.org/doc/html/draft-klensin-idna-
+ 5892upd-unicode70-05>.
+
+ [RFC3454] Hoffman, P. and M. Blanchet, "Preparation of
+ Internationalized Strings ("stringprep")", RFC 3454,
+ DOI 10.17487/RFC3454, December 2002,
+ <https://www.rfc-editor.org/info/rfc3454>.
+
+ [RFC3490] Faltstrom, P., Hoffman, P., and A. Costello,
+ "Internationalizing Domain Names in Applications (IDNA)",
+ RFC 3490, DOI 10.17487/RFC3490, March 2003,
+ <https://www.rfc-editor.org/info/rfc3490>.
+
+ [RFC5894] Klensin, J., "Internationalized Domain Names for
+ Applications (IDNA): Background, Explanation, and
+ Rationale", RFC 5894, DOI 10.17487/RFC5894, August 2010,
+ <https://www.rfc-editor.org/info/rfc5894>.
+
+ [RFC5895] Resnick, P. and P. Hoffman, "Mapping Characters for
+ Internationalized Domain Names in Applications (IDNA)
+ 2008", RFC 5895, DOI 10.17487/RFC5895, September 2010,
+ <https://www.rfc-editor.org/info/rfc5895>.
+
+ [RFC6912] Sullivan, A., Thaler, D., Klensin, J., and O. Kolkman,
+ "Principles for Unicode Code Point Inclusion in Labels in
+ the DNS", RFC 6912, DOI 10.17487/RFC6912, April 2013,
+ <https://www.rfc-editor.org/info/rfc6912>.
+
+ [RFC8753] Klensin, J. and P. Fältström, "Internationalized Domain
+ Names for Applications (IDNA) Review for New Unicode
+ Versions", RFC 8753, DOI 10.17487/RFC8753, April 2020,
+ <https://www.rfc-editor.org/info/rfc8753>.
+
+ [SAC-084] The Security and Stability Advisory Committee, "SAC084",
+ SSAC Comments on Guidelines for the Extended Process
+ Similarity Review Panel for the IDN ccTLD Fast Track
+ Process, August 2016,
+ <https://www.icann.org/en/system/files/files/sac-
+ 084-en.pdf>.
+
+ [Unicode-3.2.0]
+ The Unicode Consortium, "The Unicode Standard, Version
+ 3.2.0", Mountain View: The Unicode Consortium,
+ ISBN 0-201-61633-5, March 2002,
+ <https://www.unicode.org/versions/Unicode3.2.0/>.
+
+ [Unicode-5.2.0]
+ The Unicode Consortium, "The Unicode Standard, Version
+ 5.2.0", Mountain View: The Unicode Consortium,
+ ISBN 978-1-936213-00-9, October 2009,
+ <https://www.unicode.org/versions/Unicode5.2.0/>.
+
+ [Unicode-6.0.0]
+ The Unicode Consortium, "The Unicode Standard, Version
+ 6.0.0", Mountain View: The Unicode Consortium,
+ ISBN 978-1-936213-01-6, October 2011,
+ <https://www.unicode.org/versions/Unicode6.0.0/>.
+
+ [Unicode-7.0.0]
+ The Unicode Consortium, "The Unicode Standard, Version
+ 7.0.0", Mountain View: The Unicode Consortium,
+ ISBN 978-1-936213-09-2, June 2014,
+ <https://www.unicode.org/versions/Unicode7.0.0/>.
+
+ [Unicode-8.0.0]
+ The Unicode Consortium, "The Unicode Standard, Version
+ 8.0.0", Mountain View: The Unicode Consortium,
+ ISBN 978-1-936213-10-8, June 2015,
+ <https://www.unicode.org/versions/Unicode8.0.0/>.
+
+ [Unicode-10.0.0]
+ The Unicode Consortium, "The Unicode Standard, Version
+ 10.0.0", Mountain View: The Unicode Consortium,
+ ISBN 978-1-936213-16-0, June 2017,
+ <https://www.unicode.org/versions/Unicode10.0.0/>.
+
+ [Unicode-11.0.0]
+ The Unicode Consortium, "The Unicode Standard, Version
+ 11.0.0", Mountain View: The Unicode Consortium,
+ ISBN 978-1-936213-19-1, June 2018,
+ <https://www.unicode.org/versions/Unicode11.0.0/>.
+
+ [Unicode-12.0.0]
+ The Unicode Consortium, "The Unicode Standard, Version
+ 12.0.0", Mountain View: The Unicode Consortium,
+ ISBN 978-1-936213-22-1, March 2019,
+ <https://www.unicode.org/versions/Unicode12.0.0/>.
+
+ [UTS-46] The Unicode Consortium, "Unicode Technical Standard #46,
+ Version 12.0.0", UNICODE IDNA COMPATIBILITY PROCESSING,
+ March 2019,
+ <https://www.unicode.org/reports/tr46/tr46-23.html>.
+
+Appendix A. Changes from Unicode 6.0.0 to Unicode 7.0.0
+
+ Changes from derived property value UNASSIGNED to either PVALID or
+ DISALLOWED.
+
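+ The entries that follow appear to share one layout: a code point or
+ a range written as FIRST..LAST, a semicolon, the derived property
+ value, and a "#" comment carrying the character name or names, which
+ may be truncated. A minimal parser for that layout, inferred from the
+ entries themselves and not defined by this document, might look like
+ this; Entry and parse_entry are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    first: int   # first code point of the range
    last: int    # last code point (equal to first for a single code point)
    value: str   # derived property value, e.g. "PVALID"
    name: str    # character name comment, possibly truncated


# Optional leading "+" allows lines copied straight from a diff.
LINE = re.compile(
    r"^\+?\s*([0-9A-F]{4,6})(?:\.\.([0-9A-F]{4,6}))?\s*;\s*(\w+)\s*#\s*(.*)$"
)


def parse_entry(line: str) -> Optional[Entry]:
    """Parse one table line; return None for lines that are not entries."""
    match = LINE.match(line)
    if match is None:
        return None
    first = int(match.group(1), 16)
    last = int(match.group(2), 16) if match.group(2) else first
    return Entry(first, last, match.group(3), match.group(4).strip())


single = parse_entry("037F ; DISALLOWED # GREEK CAPITAL LETTER YOT")
ranged = parse_entry("20BA..20BD ; DISALLOWED # TURKISH LIRA SIGN..RUBLE SIGN")
print(single)
print(ranged, "covers", ranged.last - ranged.first + 1, "code points")
```
+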
+037F ; DISALLOWED # GREEK CAPITAL LETTER YOT
+0528 ; DISALLOWED # CYRILLIC CAPITAL LETTER EN WITH LEFT HOOK
+0529 ; PVALID # CYRILLIC SMALL LETTER EN WITH LEFT HOOK
+052A ; DISALLOWED # CYRILLIC CAPITAL LETTER DZZHE
+052B ; PVALID # CYRILLIC SMALL LETTER DZZHE
+052C ; DISALLOWED # CYRILLIC CAPITAL LETTER DCHE
+052D ; PVALID # CYRILLIC SMALL LETTER DCHE
+052E ; DISALLOWED # CYRILLIC CAPITAL LETTER EL WITH DESCENDER
+052F ; PVALID # CYRILLIC SMALL LETTER EL WITH DESCENDER
+058D..058F ; DISALLOWED # RIGHT-FACING ARMENIAN ETERNITY SIGN..ARMENIAN
+0604..0605 ; DISALLOWED # ARABIC SIGN SAMVAT..ARABIC NUMBER MARK ABOVE
+061C ; DISALLOWED # ARABIC LETTER MARK
+08A0..08B2 ; PVALID # ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC
+08E4..08FF ; PVALID # ARABIC CURLY FATHA..ARABIC MARK SIDEWAYS NOON
+0978 ; PVALID # DEVANAGARI LETTER MARWARI DDA
+0980 ; PVALID # BENGALI ANJI
+0AF0 ; DISALLOWED # GUJARATI ABBREVIATION SIGN
+0C00 ; PVALID # TELUGU SIGN COMBINING CANDRABINDU ABOVE
+0C34 ; PVALID # TELUGU LETTER LLLA
+0C81 ; PVALID # KANNADA SIGN CANDRABINDU
+0D01 ; PVALID # MALAYALAM SIGN CANDRABINDU
+0DE6..0DEF ; PVALID # SINHALA LITH DIGIT ZERO..SINHALA LITH DIGIT N
+0EDE..0EDF ; PVALID # LAO LETTER KHMU GO..LAO LETTER KHMU NYO
+10C7 ; DISALLOWED # GEORGIAN CAPITAL LETTER YN
+10CD ; DISALLOWED # GEORGIAN CAPITAL LETTER AEN
+10FD..10FF ; PVALID # GEORGIAN LETTER AEN..GEORGIAN LETTER LABIAL S
+16F1..16F8 ; PVALID # RUNIC LETTER K..RUNIC LETTER FRANKS CASKET AE
+17B4..17B5 ; DISALLOWED # KHMER VOWEL INHERENT AQ..KHMER VOWEL INHERENT
+191D..191E ; PVALID # LIMBU LETTER GYAN..LIMBU LETTER TRA
+1AB0..1ABD ; PVALID # COMBINING DOUBLED CIRCUMFLEX ACCENT..COMBININ
+1ABE ; DISALLOWED # COMBINING PARENTHESES OVERLAY
+1BAB..1BAD ; PVALID # SUNDANESE SIGN VIRAMA..SUNDANESE CONSONANT SI
+1BBA..1BBF ; PVALID # SUNDANESE AVAGRAHA..SUNDANESE LETTER FINAL M
+1CC0..1CC7 ; DISALLOWED # SUNDANESE PUNCTUATION BINDU SURYA..SUNDANESE
+1CF3..1CF6 ; PVALID # VEDIC SIGN ROTATED ARDHAVISARGA..VEDIC SIGN U
+1CF8..1CF9 ; PVALID # VEDIC TONE RING ABOVE..VEDIC TONE DOUBLE RING
+1DE7..1DF5 ; PVALID # COMBINING LATIN SMALL LETTER ALPHA..COMBINING
+2066..2069 ; DISALLOWED # LEFT-TO-RIGHT ISOLATE..POP DIRECTIONAL ISOLAT
+20BA..20BD ; DISALLOWED # TURKISH LIRA SIGN..RUBLE SIGN
+23F4..23FA ; DISALLOWED # BLACK MEDIUM LEFT-POINTING TRIANGLE..BLACK CI
+2700 ; DISALLOWED # BLACK SAFETY SCISSORS
+27CB ; DISALLOWED # MATHEMATICAL RISING DIAGONAL
+27CD ; DISALLOWED # MATHEMATICAL FALLING DIAGONAL
+2B4D..2B4F ; DISALLOWED # DOWNWARDS TRIANGLE-HEADED ZIGZAG ARROW..SHORT
+2B5A..2B73 ; DISALLOWED # SLANTED NORTH ARROW WITH HOOKED HEAD..DOWNWAR
+2B76..2B95 ; DISALLOWED # NORTH WEST TRIANGLE-HEADED ARROW TO BAR..RIGH
+2B98..2BB9 ; DISALLOWED # THREE-D TOP-LIGHTED LEFTWARDS EQUILATERAL ARR
+2BBD..2BC8 ; DISALLOWED # BALLOT BOX WITH LIGHT X..BLACK MEDIUM RIGHT-P
+2BCA..2BD1 ; DISALLOWED # TOP HALF BLACK CIRCLE..UNCERTAINTY SIGN
+2CF2 ; DISALLOWED # COPTIC CAPITAL LETTER BOHAIRIC KHEI
+2CF3 ; PVALID # COPTIC SMALL LETTER BOHAIRIC KHEI
+2D27 ; PVALID # GEORGIAN SMALL LETTER YN
+2D2D ; PVALID # GEORGIAN SMALL LETTER AEN
+2D66..2D67 ; PVALID # TIFINAGH LETTER YE..TIFINAGH LETTER YO
+2E32..2E42 ; DISALLOWED # TURNED COMMA..DOUBLE LOW-REVERSED-9 QUOTATION
+9FCC ; PVALID # <CJK Ideograph>
+A674..A67B ; PVALID # COMBINING CYRILLIC LETTER UKRAINIAN IE..COMBI
+A698 ; DISALLOWED # CYRILLIC CAPITAL LETTER DOUBLE O
+A699 ; PVALID # CYRILLIC SMALL LETTER DOUBLE O
+A69A ; DISALLOWED # CYRILLIC CAPITAL LETTER CROSSED O
+A69B ; PVALID # CYRILLIC SMALL LETTER CROSSED O
+A69C..A69D ; DISALLOWED # MODIFIER LETTER CYRILLIC HARD SIGN..MODIFIER
+A69F ; PVALID # COMBINING CYRILLIC LETTER IOTIFIED E
+A792 ; DISALLOWED # LATIN CAPITAL LETTER C WITH BAR
+A793..A795 ; PVALID # LATIN SMALL LETTER C WITH BAR..LATIN SMALL LE
+A796 ; DISALLOWED # LATIN CAPITAL LETTER B WITH FLOURISH
+A797 ; PVALID # LATIN SMALL LETTER B WITH FLOURISH
+A798 ; DISALLOWED # LATIN CAPITAL LETTER F WITH STROKE
+A799 ; PVALID # LATIN SMALL LETTER F WITH STROKE
+A79A ; DISALLOWED # LATIN CAPITAL LETTER VOLAPUK AE
+A79B ; PVALID # LATIN SMALL LETTER VOLAPUK AE
+A79C ; DISALLOWED # LATIN CAPITAL LETTER VOLAPUK OE
+A79D ; PVALID # LATIN SMALL LETTER VOLAPUK OE
+A79E ; DISALLOWED # LATIN CAPITAL LETTER VOLAPUK UE
+A79F ; PVALID # LATIN SMALL LETTER VOLAPUK UE
+A7AA..A7AD ; DISALLOWED # LATIN CAPITAL LETTER H WITH HOOK..LATIN CAPIT
+A7B0..A7B1 ; DISALLOWED # LATIN CAPITAL LETTER TURNED K..LATIN CAPITAL
+A7F7 ; PVALID # LATIN EPIGRAPHIC LETTER SIDEWAYS I
+A7F8..A7F9 ; DISALLOWED # MODIFIER LETTER CAPITAL H WITH STROKE..MODIFI
+A9E0..A9FE ; PVALID # MYANMAR LETTER SHAN GHA..MYANMAR LETTER TAI L
+AA7C..AA7F ; PVALID # MYANMAR SIGN TAI LAING TONE-2..MYANMAR LETTER
+AAE0..AAEF ; PVALID # MEETEI MAYEK LETTER E..MEETEI MAYEK VOWEL SIG
+AAF0..AAF1 ; DISALLOWED # MEETEI MAYEK CHEIKHAN..MEETEI MAYEK AHANG KHU
+AAF2..AAF6 ; PVALID # MEETEI MAYEK ANJI..MEETEI MAYEK VIRAMA
+AB30..AB5A ; PVALID # LATIN SMALL LETTER BARRED ALPHA..LATIN SMALL
+AB5B..AB5F ; DISALLOWED # MODIFIER BREVE WITH INVERTED BREVE..MODIFIER
+AB64..AB65 ; PVALID # LATIN SMALL LETTER INVERTED ALPHA..GREEK LETT
+FA2E..FA2F ; DISALLOWED # CJK COMPATIBILITY IDEOGRAPH-FA2E..CJK COMPATI
+FE27..FE2D ; PVALID # COMBINING LIGATURE LEFT HALF BELOW..COMBINING
+1018B..1018C; DISALLOWED # GREEK ONE QUARTER SIGN..GREEK SINUSOID SIGN
+101A0 ; DISALLOWED # GREEK SYMBOL TAU RHO
+102E0 ; PVALID # COPTIC EPACT THOUSANDS MARK
+102E1..102FB; DISALLOWED # COPTIC EPACT DIGIT ONE..COPTIC EPACT NUMBER N
+1031F ; PVALID # OLD ITALIC LETTER ESS
+10350..1037A; PVALID # OLD PERMIC LETTER AN..COMBINING OLD PERMIC LE
+10500..10527; PVALID # ELBASAN LETTER A..ELBASAN LETTER KHE
+10530..10563; PVALID # CAUCASIAN ALBANIAN LETTER ALT..CAUCASIAN ALBA
+1056F ; DISALLOWED # CAUCASIAN ALBANIAN CITATION MARK
+10600..10736; PVALID # LINEAR A SIGN AB001..LINEAR A SIGN A664
+10740..10755; PVALID # LINEAR A SIGN A701 A..LINEAR A SIGN A732 JE
+10760..10767; PVALID # LINEAR A SIGN A800..LINEAR A SIGN A807
+10860..10876; PVALID # PALMYRENE LETTER ALEPH..PALMYRENE LETTER TAW
+10877..1087F; DISALLOWED # PALMYRENE LEFT-POINTING FLEURON..PALMYRENE NU
+10880..1089E; PVALID # NABATAEAN LETTER FINAL ALEPH..NABATAEAN LETTE
+108A7..108AF; DISALLOWED # NABATAEAN NUMBER ONE..NABATAEAN NUMBER ONE HU
+10980..109B7; PVALID # MEROITIC HIEROGLYPHIC LETTER A..MEROITIC CURS
+109BE..109BF; PVALID # MEROITIC CURSIVE LOGOGRAM RMT..MEROITIC CURSI
+10A80..10A9C; PVALID # OLD NORTH ARABIAN LETTER HEH..OLD NORTH ARABI
+10A9D..10A9F; DISALLOWED # OLD NORTH ARABIAN NUMBER ONE..OLD NORTH ARABI
+10AC0..10AC7; PVALID # MANICHAEAN LETTER ALEPH..MANICHAEAN LETTER WA
+10AC8 ; DISALLOWED # MANICHAEAN SIGN UD
+10AC9..10AE6; PVALID # MANICHAEAN LETTER ZAYIN..MANICHAEAN ABBREVIAT
+10AEB..10AF6; DISALLOWED # MANICHAEAN NUMBER ONE..MANICHAEAN PUNCTUATION
+10B80..10B91; PVALID # PSALTER PAHLAVI LETTER ALEPH..PSALTER PAHLAVI
+10B99..10B9C; DISALLOWED # PSALTER PAHLAVI SECTION MARK..PSALTER PAHLAVI
+10BA9..10BAF; DISALLOWED # PSALTER PAHLAVI NUMBER ONE..PSALTER PAHLAVI N
+1107F ; PVALID # BRAHMI NUMBER JOINER
+110D0..110E8; PVALID # SORA SOMPENG LETTER SAH..SORA SOMPENG LETTER
+110F0..110F9; PVALID # SORA SOMPENG DIGIT ZERO..SORA SOMPENG DIGIT N
+11100..11134; PVALID # CHAKMA SIGN CANDRABINDU..CHAKMA MAAYYAA
+11136..1113F; PVALID # CHAKMA DIGIT ZERO..CHAKMA DIGIT NINE
+11140..11143; DISALLOWED # CHAKMA SECTION MARK..CHAKMA QUESTION MARK
+11150..11173; PVALID # MAHAJANI LETTER A..MAHAJANI SIGN NUKTA
+11174..11175; DISALLOWED # MAHAJANI ABBREVIATION SIGN..MAHAJANI SECTION
+11176 ; PVALID # MAHAJANI LIGATURE SHRI
+11180..111C4; PVALID # SHARADA SIGN CANDRABINDU..SHARADA OM
+111C5..111C8; DISALLOWED # SHARADA DANDA..SHARADA SEPARATOR
+111CD ; DISALLOWED # SHARADA SUTRA MARK
+111D0..111DA; PVALID # SHARADA DIGIT ZERO..SHARADA EKAM
+111E1..111F4; DISALLOWED # SINHALA ARCHAIC DIGIT ONE..SINHALA ARCHAIC NU
+11200..11211; PVALID # KHOJKI LETTER A..KHOJKI LETTER JJA
+11213..11237; PVALID # KHOJKI LETTER NYA..KHOJKI SIGN SHADDA
+11238..1123D; DISALLOWED # KHOJKI DANDA..KHOJKI ABBREVIATION SIGN
+112B0..112EA; PVALID # KHUDAWADI LETTER A..KHUDAWADI SIGN VIRAMA
+112F0..112F9; PVALID # KHUDAWADI DIGIT ZERO..KHUDAWADI DIGIT NINE
+11301..11303; PVALID # GRANTHA SIGN CANDRABINDU..GRANTHA SIGN VISARG
+11305..1130C; PVALID # GRANTHA LETTER A..GRANTHA LETTER VOCALIC L
+1130F..11310; PVALID # GRANTHA LETTER EE..GRANTHA LETTER AI
+11313..11328; PVALID # GRANTHA LETTER OO..GRANTHA LETTER NA
+1132A..11330; PVALID # GRANTHA LETTER PA..GRANTHA LETTER RA
+11332..11333; PVALID # GRANTHA LETTER LA..GRANTHA LETTER LLA
+11335..11339; PVALID # GRANTHA LETTER VA..GRANTHA LETTER HA
+1133C..11344; PVALID # GRANTHA SIGN NUKTA..GRANTHA VOWEL SIGN VOCALI
+11347..11348; PVALID # GRANTHA VOWEL SIGN EE..GRANTHA VOWEL SIGN AI
+1134B..1134D; PVALID # GRANTHA VOWEL SIGN OO..GRANTHA SIGN VIRAMA
+11357 ; PVALID # GRANTHA AU LENGTH MARK
+1135D..11363; PVALID # GRANTHA SIGN PLUTA..GRANTHA VOWEL SIGN VOCALI
+11366..1136C; PVALID # COMBINING GRANTHA DIGIT ZERO..COMBINING GRANT
+11370..11374; PVALID # COMBINING GRANTHA LETTER A..COMBINING GRANTHA
+11480..114C5; PVALID # TIRHUTA ANJI..TIRHUTA GVANG
+114C6 ; DISALLOWED # TIRHUTA ABBREVIATION SIGN
+114C7 ; PVALID # TIRHUTA OM
+114D0..114D9; PVALID # TIRHUTA DIGIT ZERO..TIRHUTA DIGIT NINE
+11580..115B5; PVALID # SIDDHAM LETTER A..SIDDHAM VOWEL SIGN VOCALIC
+115B8..115C0; PVALID # SIDDHAM VOWEL SIGN E..SIDDHAM SIGN NUKTA
+115C1..115C9; DISALLOWED # SIDDHAM SIGN SIDDHAM..SIDDHAM END OF TEXT MAR
+11600..11640; PVALID # MODI LETTER A..MODI SIGN ARDHACANDRA
+11641..11643; DISALLOWED # MODI DANDA..MODI ABBREVIATION SIGN
+11644 ; PVALID # MODI SIGN HUVA
+11650..11659; PVALID # MODI DIGIT ZERO..MODI DIGIT NINE
+11680..116B7; PVALID # TAKRI LETTER A..TAKRI SIGN NUKTA
+116C0..116C9; PVALID # TAKRI DIGIT ZERO..TAKRI DIGIT NINE
+118A0..118BF; DISALLOWED # WARANG CITI CAPITAL LETTER NGAA..WARANG CITI
+118C0..118E9; PVALID # WARANG CITI SMALL LETTER NGAA..WARANG CITI DI
+118EA..118F2; DISALLOWED # WARANG CITI NUMBER TEN..WARANG CITI NUMBER NI
+118FF ; PVALID # WARANG CITI OM
+11AC0..11AF8; PVALID # PAU CIN HAU LETTER PA..PAU CIN HAU GLOTTAL ST
+1236F..12398; PVALID # CUNEIFORM SIGN KAP ELAMITE..CUNEIFORM SIGN UM
+12463..1246E; DISALLOWED # CUNEIFORM NUMERIC SIGN ONE QUARTER GUR..CUNEI
+12474 ; DISALLOWED # CUNEIFORM PUNCTUATION SIGN DIAGONAL QUADCOLON
+16A40..16A5E; PVALID # MRO LETTER TA..MRO LETTER TEK
+16A60..16A69; PVALID # MRO DIGIT ZERO..MRO DIGIT NINE
+16A6E..16A6F; DISALLOWED # MRO DANDA..MRO DOUBLE DANDA
+16AD0..16AED; PVALID # BASSA VAH LETTER ENNI..BASSA VAH LETTER I
+16AF0..16AF4; PVALID # BASSA VAH COMBINING HIGH TONE..BASSA VAH COMB
+16AF5 ; DISALLOWED # BASSA VAH FULL STOP
+16B00..16B36; PVALID # PAHAWH HMONG VOWEL KEEB..PAHAWH HMONG MARK CI
+16B37..16B3F; DISALLOWED # PAHAWH HMONG SIGN VOS THOM..PAHAWH HMONG SIGN
+16B40..16B43; PVALID # PAHAWH HMONG SIGN VOS SEEV..PAHAWH HMONG SIGN
+16B44..16B45; DISALLOWED # PAHAWH HMONG SIGN XAUS..PAHAWH HMONG SIGN CIM
+16B50..16B59; PVALID # PAHAWH HMONG DIGIT ZERO..PAHAWH HMONG DIGIT N
+16B5B..16B61; DISALLOWED # PAHAWH HMONG NUMBER TENS..PAHAWH HMONG NUMBER
+16B63..16B77; PVALID # PAHAWH HMONG SIGN VOS LUB..PAHAWH HMONG SIGN
+16B7D..16B8F; PVALID # PAHAWH HMONG CLAN SIGN TSHEEJ..PAHAWH HMONG C
+16F00..16F44; PVALID # MIAO LETTER PA..MIAO LETTER HHA
+16F50..16F7E; PVALID # MIAO LETTER NASALIZATION..MIAO VOWEL SIGN NG
+16F8F..16F9F; PVALID # MIAO TONE RIGHT..MIAO LETTER REFORMED TONE-8
+1BC00..1BC6A; PVALID # DUPLOYAN LETTER H..DUPLOYAN LETTER VOCALIC M
+1BC70..1BC7C; PVALID # DUPLOYAN AFFIX LEFT HORIZONTAL SECANT..DUPLOY
+1BC80..1BC88; PVALID # DUPLOYAN AFFIX HIGH ACUTE..DUPLOYAN AFFIX HIG
+1BC90..1BC99; PVALID # DUPLOYAN AFFIX LOW ACUTE..DUPLOYAN AFFIX LOW
+1BC9C ; DISALLOWED # DUPLOYAN SIGN O WITH CROSS
+1BC9D..1BC9E; PVALID # DUPLOYAN THICK LETTER SELECTOR..DUPLOYAN DOUB
+1BC9F..1BCA3; DISALLOWED # DUPLOYAN PUNCTUATION CHINOOK FULL STOP..SHORT
+1E800..1E8C4; PVALID # MENDE KIKAKUI SYLLABLE M001 KI..MENDE KIKAKUI
+1E8C7..1E8CF; DISALLOWED # MENDE KIKAKUI DIGIT ONE..MENDE KIKAKUI DIGIT
+1E8D0..1E8D6; PVALID # MENDE KIKAKUI COMBINING NUMBER TEENS..MENDE K
+1EE00..1EE03; DISALLOWED # ARABIC MATHEMATICAL ALEF..ARABIC MATHEMATICAL
+1EE05..1EE1F; DISALLOWED # ARABIC MATHEMATICAL WAW..ARABIC MATHEMATICAL
+1EE21..1EE22; DISALLOWED # ARABIC MATHEMATICAL INITIAL BEH..ARABIC MATHE
+1EE24 ; DISALLOWED # ARABIC MATHEMATICAL INITIAL HEH
+1EE27 ; DISALLOWED # ARABIC MATHEMATICAL INITIAL HAH
+1EE29..1EE32; DISALLOWED # ARABIC MATHEMATICAL INITIAL YEH..ARABIC MATHE
+1EE34..1EE37; DISALLOWED # ARABIC MATHEMATICAL INITIAL SHEEN..ARABIC MAT
+1EE39 ; DISALLOWED # ARABIC MATHEMATICAL INITIAL DAD
+1EE3B ; DISALLOWED # ARABIC MATHEMATICAL INITIAL GHAIN
+1EE42 ; DISALLOWED # ARABIC MATHEMATICAL TAILED JEEM
+1EE47 ; DISALLOWED # ARABIC MATHEMATICAL TAILED HAH
+1EE49 ; DISALLOWED # ARABIC MATHEMATICAL TAILED YEH
+1EE4B ; DISALLOWED # ARABIC MATHEMATICAL TAILED LAM
+1EE4D..1EE4F; DISALLOWED # ARABIC MATHEMATICAL TAILED NOON..ARABIC MATHE
+1EE51..1EE52; DISALLOWED # ARABIC MATHEMATICAL TAILED SAD..ARABIC MATHEM
+1EE54 ; DISALLOWED # ARABIC MATHEMATICAL TAILED SHEEN
+1EE57 ; DISALLOWED # ARABIC MATHEMATICAL TAILED KHAH
+1EE59 ; DISALLOWED # ARABIC MATHEMATICAL TAILED DAD
+1EE5B ; DISALLOWED # ARABIC MATHEMATICAL TAILED GHAIN
+1EE5D ; DISALLOWED # ARABIC MATHEMATICAL TAILED DOTLESS NOON
+1EE5F ; DISALLOWED # ARABIC MATHEMATICAL TAILED DOTLESS QAF
+1EE61..1EE62; DISALLOWED # ARABIC MATHEMATICAL STRETCHED BEH..ARABIC MAT
+1EE64 ; DISALLOWED # ARABIC MATHEMATICAL STRETCHED HEH
+1EE67..1EE6A; DISALLOWED # ARABIC MATHEMATICAL STRETCHED HAH..ARABIC MAT
+1EE6C..1EE72; DISALLOWED # ARABIC MATHEMATICAL STRETCHED MEEM..ARABIC MA
+1EE74..1EE77; DISALLOWED # ARABIC MATHEMATICAL STRETCHED SHEEN..ARABIC M
+1EE79..1EE7C; DISALLOWED # ARABIC MATHEMATICAL STRETCHED DAD..ARABIC MAT
+1EE7E ; DISALLOWED # ARABIC MATHEMATICAL STRETCHED DOTLESS FEH
+1EE80..1EE89; DISALLOWED # ARABIC MATHEMATICAL LOOPED ALEF..ARABIC MATHE
+1EE8B..1EE9B; DISALLOWED # ARABIC MATHEMATICAL LOOPED LAM..ARABIC MATHEM
+1EEA1..1EEA3; DISALLOWED # ARABIC MATHEMATICAL DOUBLE-STRUCK BEH..ARABIC
+1EEA5..1EEA9; DISALLOWED # ARABIC MATHEMATICAL DOUBLE-STRUCK WAW..ARABIC
+1EEAB..1EEBB; DISALLOWED # ARABIC MATHEMATICAL DOUBLE-STRUCK LAM..ARABIC
+1EEF0..1EEF1; DISALLOWED # ARABIC MATHEMATICAL OPERATOR MEEM WITH HAH WI
+1F0BF ; DISALLOWED # PLAYING CARD RED JOKER
+1F0E0..1F0F5; DISALLOWED # PLAYING CARD FOOL..PLAYING CARD TRUMP-21
+1F10B..1F10C; DISALLOWED # DINGBAT CIRCLED SANS-SERIF DIGIT ZERO..DINGBA
+1F16A..1F16B; DISALLOWED # RAISED MC SIGN..RAISED MD SIGN
+1F321..1F32C; DISALLOWED # THERMOMETER..WIND BLOWING FACE
+1F336 ; DISALLOWED # HOT PEPPER
+1F37D ; DISALLOWED # FORK AND KNIFE WITH PLATE
+1F394..1F39F; DISALLOWED # HEART WITH TIP ON THE LEFT..ADMISSION TICKETS
+1F3C5 ; DISALLOWED # SPORTS MEDAL
+1F3CB..1F3CE; DISALLOWED # WEIGHT LIFTER..RACING CAR
+1F3D4..1F3DF; DISALLOWED # SNOW CAPPED MOUNTAIN..STADIUM
+1F3F1..1F3F7; DISALLOWED # WHITE PENNANT..LABEL
+1F43F ; DISALLOWED # CHIPMUNK
+1F441 ; DISALLOWED # EYE
+1F4F8 ; DISALLOWED # CAMERA WITH FLASH
+1F4FD..1F4FE; DISALLOWED # FILM PROJECTOR..PORTABLE STEREO
+1F53E..1F54A; DISALLOWED # LOWER RIGHT SHADOWED WHITE CIRCLE..DOVE OF PE
+1F568..1F579; DISALLOWED # RIGHT SPEAKER..JOYSTICK
+1F57B..1F5A3; DISALLOWED # LEFT HAND TELEPHONE RECEIVER..BLACK DOWN POIN
+1F5A5..1F5FA; DISALLOWED # DESKTOP COMPUTER..WORLD MAP
+1F600 ; DISALLOWED # GRINNING FACE
+1F611 ; DISALLOWED # EXPRESSIONLESS FACE
+1F615 ; DISALLOWED # CONFUSED FACE
+1F617 ; DISALLOWED # KISSING FACE
+1F619 ; DISALLOWED # KISSING FACE WITH SMILING EYES
+1F61B ; DISALLOWED # FACE WITH STUCK-OUT TONGUE
+1F61F ; DISALLOWED # WORRIED FACE
+1F626..1F627; DISALLOWED # FROWNING FACE WITH OPEN MOUTH..ANGUISHED FACE
+1F62C ; DISALLOWED # GRIMACING FACE
+1F62E..1F62F; DISALLOWED # FACE WITH OPEN MOUTH..HUSHED FACE
+1F634 ; DISALLOWED # SLEEPING FACE
+1F641..1F642; DISALLOWED # SLIGHTLY FROWNING FACE..SLIGHTLY SMILING FACE
+1F650..1F67F; DISALLOWED # NORTH WEST POINTING LEAF..REVERSE CHECKER BOA
+1F6C6..1F6CF; DISALLOWED # TRIANGLE WITH ROUNDED CORNERS..BED
+1F6E0..1F6EC; DISALLOWED # HAMMER AND WRENCH..AIRPLANE ARRIVING
+1F6F0..1F6F3; DISALLOWED # SATELLITE..PASSENGER SHIP
+1F780..1F7D4; DISALLOWED # BLACK LEFT-POINTING ISOSCELES RIGHT TRIANGLE.
+1F800..1F80B; DISALLOWED # LEFTWARDS ARROW WITH SMALL TRIANGLE ARROWHEAD
+1F810..1F847; DISALLOWED # LEFTWARDS ARROW WITH SMALL EQUILATERAL ARROWH
+1F850..1F859; DISALLOWED # LEFTWARDS SANS-SERIF ARROW..UP DOWN SANS-SERI
+1F860..1F887; DISALLOWED # WIDE-HEADED LEFTWARDS LIGHT BARB ARROW..WIDE-
+1F890..1F8AD; DISALLOWED # LEFTWARDS TRIANGLE ARROWHEAD..WHITE ARROW SHA
+
+Appendix B. Changes from Unicode 7.0.0 to Unicode 8.0.0
+
+ Changes from derived property value UNASSIGNED to either PVALID or
+ DISALLOWED.
+
+08B3..08B4 ; PVALID # ARABIC LETTER AIN WITH THREE DOTS BELOW..ARAB
+08E3 ; PVALID # ARABIC TURNED DAMMA BELOW
+0AF9 ; PVALID # GUJARATI LETTER ZHA
+0C5A ; PVALID # TELUGU LETTER RRRA
+0D5F ; PVALID # MALAYALAM LETTER ARCHAIC II
+13F5 ; PVALID # CHEROKEE LETTER MV
+13F8..13FD ; DISALLOWED # CHEROKEE SMALL LETTER YE..CHEROKEE SMALL LETT
+20BE ; DISALLOWED # LARI SIGN
+218A..218B ; DISALLOWED # TURNED DIGIT TWO..TURNED DIGIT THREE
+2BEC..2BEF ; DISALLOWED # LEFTWARDS TWO-HEADED ARROW WITH TRIANGLE ARRO
+9FCD..9FD5 ; PVALID # <CJK Ideograph>..<CJK Ideograph>
+A69E ; PVALID # COMBINING CYRILLIC LETTER EF
+A78F ; PVALID # LATIN LETTER SINOLOGICAL DOT
+A7B2..A7B4 ; DISALLOWED # LATIN CAPITAL LETTER J WITH CROSSED-TAIL..LAT
+A7B5 ; PVALID # LATIN SMALL LETTER BETA
+A7B6 ; DISALLOWED # LATIN CAPITAL LETTER OMEGA
+A7B7 ; PVALID # LATIN SMALL LETTER OMEGA
+A8FC ; DISALLOWED # DEVANAGARI SIGN SIDDHAM
+A8FD ; PVALID # DEVANAGARI JAIN OM
+AB60..AB63 ; PVALID # LATIN SMALL LETTER SAKHA YAT..LATIN SMALL LET
+AB70..ABBF ; DISALLOWED # CHEROKEE SMALL LETTER A..CHEROKEE SMALL LETTE
+FE2E..FE2F ; PVALID # COMBINING CYRILLIC TITLO LEFT HALF..COMBINING
+108E0..108F2; PVALID # HATRAN LETTER ALEPH..HATRAN LETTER QOPH
+108F4..108F5; PVALID # HATRAN LETTER SHIN..HATRAN LETTER TAW
+108FB..108FF; DISALLOWED # HATRAN NUMBER ONE..HATRAN NUMBER ONE HUNDRED
+109BC..109BD; DISALLOWED # MEROITIC CURSIVE FRACTION ELEVEN TWELFTHS..ME
+109C0..109CF; DISALLOWED # MEROITIC CURSIVE NUMBER ONE..MEROITIC CURSIVE
+109D2..109FF; DISALLOWED # MEROITIC CURSIVE NUMBER ONE HUNDRED..MEROITIC
+10C80..10CB2; DISALLOWED # OLD HUNGARIAN CAPITAL LETTER A..OLD HUNGARIAN
+10CC0..10CF2; PVALID # OLD HUNGARIAN SMALL LETTER A..OLD HUNGARIAN S
+10CFA..10CFF; DISALLOWED # OLD HUNGARIAN NUMBER ONE..OLD HUNGARIAN NUMBE
+111C9 ; DISALLOWED # SHARADA SANDHI MARK
+111CA..111CC; PVALID # SHARADA SIGN NUKTA..SHARADA EXTRA SHORT VOWEL
+111DB ; DISALLOWED # SHARADA SIGN SIDDHAM
+111DC ; PVALID # SHARADA HEADSTROKE
+111DD..111DF; DISALLOWED # SHARADA CONTINUATION SIGN..SHARADA SECTION MA
+11280..11286; PVALID # MULTANI LETTER A..MULTANI LETTER GA
+11288 ; PVALID # MULTANI LETTER GHA
+1128A..1128D; PVALID # MULTANI LETTER CA..MULTANI LETTER JJA
+1128F..1129D; PVALID # MULTANI LETTER NYA..MULTANI LETTER BA
+1129F..112A8; PVALID # MULTANI LETTER BHA..MULTANI LETTER RHA
+112A9 ; DISALLOWED # MULTANI SECTION MARK
+11300 ; PVALID # GRANTHA SIGN COMBINING ANUSVARA ABOVE
+11350 ; PVALID # GRANTHA OM
+115CA..115D7; DISALLOWED # SIDDHAM SECTION MARK WITH TRIDENT AND U-SHAPE
+115D8..115DD; PVALID # SIDDHAM LETTER THREE-CIRCLE ALTERNATE I..SIDD
+11700..11719; PVALID # AHOM LETTER KA..AHOM LETTER JHA
+1171D..1172B; PVALID # AHOM CONSONANT SIGN MEDIAL LA..AHOM SIGN KILL
+11730..11739; PVALID # AHOM DIGIT ZERO..AHOM DIGIT NINE
+1173A..1173F; DISALLOWED # AHOM NUMBER TEN..AHOM SYMBOL VI
+12399 ; PVALID # CUNEIFORM SIGN U U
+12480..12543; PVALID # CUNEIFORM SIGN AB TIMES NUN TENU..CUNEIFORM S
+14400..14646; PVALID # ANATOLIAN HIEROGLYPH A001..ANATOLIAN HIEROGLY
+1D1DE..1D1E8; DISALLOWED # MUSICAL SYMBOL KIEVAN C CLEF..MUSICAL SYMBOL
+1D800..1D9FF; DISALLOWED # SIGNWRITING HAND-FIST INDEX..SIGNWRITING HEAD
+1DA00..1DA36; PVALID # SIGNWRITING HEAD RIM..SIGNWRITING AIR SUCKING
+1DA37..1DA3A; DISALLOWED # SIGNWRITING AIR BLOW SMALL ROTATIONS..SIGNWRI
+1DA3B..1DA6C; PVALID # SIGNWRITING MOUTH CLOSED NEUTRAL..SIGNWRITING
+1DA6D..1DA74; DISALLOWED # SIGNWRITING SHOULDER HIP SPINE..SIGNWRITING T
+1DA75 ; PVALID # SIGNWRITING UPPER BODY TILTING FROM HIP JOINT
+1DA76..1DA83; DISALLOWED # SIGNWRITING LIMB COMBINATION..SIGNWRITING LOC
+1DA84 ; PVALID # SIGNWRITING LOCATION HEAD NECK
+1DA85..1DA8B; DISALLOWED # SIGNWRITING LOCATION TORSO..SIGNWRITING PAREN
+1DA9B..1DA9F; PVALID # SIGNWRITING FILL MODIFIER-2..SIGNWRITING FILL
+1DAA1..1DAAF; PVALID # SIGNWRITING ROTATION MODIFIER-2..SIGNWRITING
+1F32D..1F32F; DISALLOWED # HOT DOG..BURRITO
+1F37E..1F37F; DISALLOWED # BOTTLE WITH POPPING CORK..POPCORN
+1F3CF..1F3D3; DISALLOWED # CRICKET BAT AND BALL..TABLE TENNIS PADDLE AND
+1F3F8..1F3FF; DISALLOWED # BADMINTON RACQUET AND SHUTTLECOCK..EMOJI MODI
+1F4FF ; DISALLOWED # PRAYER BEADS
+1F54B..1F54F; DISALLOWED # KAABA..BOWL OF HYGIEIA
+1F643..1F644; DISALLOWED # UPSIDE-DOWN FACE..FACE WITH ROLLING EYES
+1F6D0 ; DISALLOWED # PLACE OF WORSHIP
+1F910..1F918; DISALLOWED # ZIPPER-MOUTH FACE..SIGN OF THE HORNS
+1F980..1F984; DISALLOWED # CRAB..UNICORN FACE
+1F9C0 ; DISALLOWED # CHEESE WEDGE
+2B820..2CEA1; PVALID # <CJK Ideograph Extension E>..<CJK Ideograph E
+
+Appendix C. Changes from Unicode 8.0.0 to Unicode 9.0.0
+
+ Changes from derived property value UNASSIGNED to either PVALID or
+ DISALLOWED.
+
+08B6..08BD ; PVALID # ARABIC LETTER BEH WITH SMALL MEEM ABOVE..ARAB
+08D4..08E1 ; PVALID # ARABIC SMALL HIGH WORD AR-RUB..ARABIC SMALL H
+08E2 ; DISALLOWED # ARABIC DISPUTED END OF AYAH
+0C80 ; PVALID # KANNADA SIGN SPACING CANDRABINDU
+0D4F ; DISALLOWED # MALAYALAM SIGN PARA
+0D54..0D56 ; PVALID # MALAYALAM LETTER CHILLU M..MALAYALAM LETTER C
+0D58..0D5E ; DISALLOWED # MALAYALAM FRACTION ONE ONE-HUNDRED-AND-SIXTIE
+0D76..0D78 ; DISALLOWED # MALAYALAM FRACTION ONE SIXTEENTH..MALAYALAM F
+1C80..1C88 ; DISALLOWED # CYRILLIC SMALL LETTER ROUNDED VE..CYRILLIC SM
+1DFB ; PVALID # COMBINING DELETION MARK
+23FB..23FE ; DISALLOWED # POWER SYMBOL..POWER SLEEP SYMBOL
+2E43..2E44 ; DISALLOWED # DASH WITH LEFT UPTURN..DOUBLE SUSPENSION MARK
+A7AE ; DISALLOWED # LATIN CAPITAL LETTER SMALL CAPITAL I
+A8C5 ; PVALID # SAURASHTRA SIGN CANDRABINDU
+1018D..1018E; DISALLOWED # GREEK INDICTION SIGN..NOMISMA SIGN
+104B0..104D3; DISALLOWED # OSAGE CAPITAL LETTER A..OSAGE CAPITAL LETTER
+104D8..104FB; PVALID # OSAGE SMALL LETTER A..OSAGE SMALL LETTER ZHA
+1123E ; PVALID # KHOJKI SIGN SUKUN
+11400..1144A; PVALID # NEWA LETTER A..NEWA SIDDHI
+1144B..1144F; DISALLOWED # NEWA DANDA..NEWA ABBREVIATION SIGN
+11450..11459; PVALID # NEWA DIGIT ZERO..NEWA DIGIT NINE
+1145B ; DISALLOWED # NEWA PLACEHOLDER MARK
+1145D ; DISALLOWED # NEWA INSERTION SIGN
+11660..1166C; DISALLOWED # MONGOLIAN BIRGA WITH ORNAMENT..MONGOLIAN TURN
+11C00..11C08; PVALID # BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC
+11C0A..11C36; PVALID # BHAIKSUKI LETTER E..BHAIKSUKI VOWEL SIGN VOCA
+11C38..11C40; PVALID # BHAIKSUKI VOWEL SIGN E..BHAIKSUKI SIGN AVAGRA
+11C41..11C45; DISALLOWED # BHAIKSUKI DANDA..BHAIKSUKI GAP FILLER-2
+11C50..11C59; PVALID # BHAIKSUKI DIGIT ZERO..BHAIKSUKI DIGIT NINE
+11C5A..11C6C; DISALLOWED # BHAIKSUKI NUMBER ONE..BHAIKSUKI HUNDREDS UNIT
+11C70..11C71; DISALLOWED # MARCHEN HEAD MARK..MARCHEN MARK SHAD
+11C72..11C8F; PVALID # MARCHEN LETTER KA..MARCHEN LETTER A
+11C92..11CA7; PVALID # MARCHEN SUBJOINED LETTER KA..MARCHEN SUBJOINE
+11CA9..11CB6; PVALID # MARCHEN SUBJOINED LETTER YA..MARCHEN SIGN CAN
+16FE0 ; PVALID # TANGUT ITERATION MARK
+17000..187EC; PVALID # <Tangut Ideograph>..<Tangut Ideograph>
+18800..18AF2; PVALID # TANGUT COMPONENT-001..TANGUT COMPONENT-755
+1E000..1E006; PVALID # COMBINING GLAGOLITIC LETTER AZU..COMBINING GL
+1E008..1E018; PVALID # COMBINING GLAGOLITIC LETTER ZEMLJA..COMBINING
+1E01B..1E021; PVALID # COMBINING GLAGOLITIC LETTER SHTA..COMBINING G
+1E023..1E024; PVALID # COMBINING GLAGOLITIC LETTER YU..COMBINING GLA
+1E026..1E02A; PVALID # COMBINING GLAGOLITIC LETTER YO..COMBINING GLA
+1E900..1E921; DISALLOWED # ADLAM CAPITAL LETTER ALIF..ADLAM CAPITAL LETT
+1E922..1E94A; PVALID # ADLAM SMALL LETTER ALIF..ADLAM NUKTA
+1E950..1E959; PVALID # ADLAM DIGIT ZERO..ADLAM DIGIT NINE
+1E95E..1E95F; DISALLOWED # ADLAM INITIAL EXCLAMATION MARK..ADLAM INITIAL
+1F19B..1F1AC; DISALLOWED # SQUARED THREE D..SQUARED VOD
+1F23B ; DISALLOWED # SQUARED CJK UNIFIED IDEOGRAPH-914D
+1F57A ; DISALLOWED # MAN DANCING
+1F5A4 ; DISALLOWED # BLACK HEART
+1F6D1..1F6D2; DISALLOWED # OCTAGONAL SIGN..SHOPPING TROLLEY
+1F6F4..1F6F6; DISALLOWED # SCOOTER..CANOE
+1F919..1F91E; DISALLOWED # CALL ME HAND..HAND WITH INDEX AND MIDDLE FING
+1F920..1F927; DISALLOWED # FACE WITH COWBOY HAT..SNEEZING FACE
+1F930 ; DISALLOWED # PREGNANT WOMAN
+1F933..1F93E; DISALLOWED # SELFIE..HANDBALL
+1F940..1F94B; DISALLOWED # WILTED FLOWER..MARTIAL ARTS UNIFORM
+1F950..1F95E; DISALLOWED # CROISSANT..PANCAKES
+1F985..1F991; DISALLOWED # EAGLE..SQUID
+
+Appendix D. Changes from Unicode 9.0.0 to Unicode 10.0.0
+
+ Changes from derived property value UNASSIGNED to either PVALID or
+ DISALLOWED.
+
+0860..086A ; PVALID # SYRIAC LETTER MALAYALAM NGA..SYRIAC LETTER MA
+09FC ; PVALID # BENGALI LETTER VEDIC ANUSVARA
+09FD ; DISALLOWED # BENGALI ABBREVIATION SIGN
+0AFA..0AFF ; PVALID # GUJARATI SIGN SUKUN..GUJARATI SIGN TWO-CIRCLE
+0D00 ; PVALID # MALAYALAM SIGN COMBINING ANUSVARA ABOVE
+0D3B..0D3C ; PVALID # MALAYALAM SIGN VERTICAL BAR VIRAMA..MALAYALAM
+1CF7 ; PVALID # VEDIC SIGN ATIKRAMA
+1DF6..1DF9 ; PVALID # COMBINING KAVYKA ABOVE RIGHT..COMBINING WIDE
+20BF ; DISALLOWED # BITCOIN SIGN
+23FF ; DISALLOWED # OBSERVER EYE SYMBOL
+2BD2 ; DISALLOWED # GROUP MARK
+2E45..2E49 ; DISALLOWED # INVERTED LOW KAVYKA..DOUBLE STACKED COMMA
+312E ; PVALID # BOPOMOFO LETTER O WITH DOT ABOVE
+9FD6..9FEA ; PVALID # <CJK Ideograph>..<CJK Ideograph>
+1032D..1032F; PVALID # OLD ITALIC LETTER YE..OLD ITALIC LETTER SOUTH
+11A00..11A3E; PVALID # ZANABAZAR SQUARE LETTER A..ZANABAZAR SQUARE C
+11A3F..11A46; DISALLOWED # ZANABAZAR SQUARE INITIAL HEAD MARK..ZANABAZAR
+11A47 ; PVALID # ZANABAZAR SQUARE SUBJOINER
+11A50..11A83; PVALID # SOYOMBO LETTER A..SOYOMBO LETTER KSSA
+11A86..11A99; PVALID # SOYOMBO CLUSTER-INITIAL LETTER RA..SOYOMBO SU
+11A9A..11A9C; DISALLOWED # SOYOMBO MARK TSHEG..SOYOMBO MARK DOUBLE SHAD
+11A9E..11AA2; DISALLOWED # SOYOMBO HEAD MARK WITH MOON AND SUN AND TRIPL
+11D00..11D06; PVALID # MASARAM GONDI LETTER A..MASARAM GONDI LETTER
+11D08..11D09; PVALID # MASARAM GONDI LETTER AI..MASARAM GONDI LETTER
+11D0B..11D36; PVALID # MASARAM GONDI LETTER AU..MASARAM GONDI VOWEL
+11D3A ; PVALID # MASARAM GONDI VOWEL SIGN E
+11D3C..11D3D; PVALID # MASARAM GONDI VOWEL SIGN AI..MASARAM GONDI VO
+11D3F..11D47; PVALID # MASARAM GONDI VOWEL SIGN AU..MASARAM GONDI RA
+11D50..11D59; PVALID # MASARAM GONDI DIGIT ZERO..MASARAM GONDI DIGIT
+16FE1 ; PVALID # NUSHU ITERATION MARK
+1B002..1B11E; PVALID # HENTAIGANA LETTER A-1..HENTAIGANA LETTER N-MU
+1B170..1B2FB; PVALID # NUSHU CHARACTER-1B170..NUSHU CHARACTER-1B2FB
+1F260..1F265; DISALLOWED # ROUNDED SYMBOL FOR FU..ROUNDED SYMBOL FOR CAI
+1F6D3..1F6D4; DISALLOWED # STUPA..PAGODA
+1F6F7..1F6F8; DISALLOWED # SLED..FLYING SAUCER
+1F900..1F90B; DISALLOWED # CIRCLED CROSS FORMEE WITH FOUR DOTS..DOWNWARD
+1F91F ; DISALLOWED # I LOVE YOU HAND SIGN
+1F928..1F92F; DISALLOWED # FACE WITH ONE EYEBROW RAISED..SHOCKED FACE WI
+1F931..1F932; DISALLOWED # BREAST-FEEDING..PALMS UP TOGETHER
+1F94C ; DISALLOWED # CURLING STONE
+1F95F..1F96B; DISALLOWED # DUMPLING..CANNED FOOD
+1F992..1F997; DISALLOWED # GIRAFFE FACE..CRICKET
+1F9D0..1F9E6; DISALLOWED # FACE WITH MONOCLE..SOCKS
+2CEB0..2EBE0; PVALID # <CJK Ideograph Extension F>..<CJK Ideograph E
+
+Appendix E. Changes from Unicode 10.0.0 to Unicode 11.0.0
+
+ Change from derived property value DISALLOWED to PVALID.
+
+ 111C9 ; PVALID # SHARADA SANDHI MARK
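The entry above overrides a value assigned earlier: Appendix B lists 111C9 as DISALLOWED, and this appendix changes it to PVALID. A minimal sketch of how per-version deltas might be folded into one table, with later versions taking precedence, follows. The function name apply_deltas and the dictionary layout are illustrative assumptions and are not defined by this document or by IANA.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical representation: an inclusive code point range mapped to its value.
Range = Tuple[int, int]


def apply_deltas(deltas: Iterable[Dict[Range, str]]) -> Dict[int, str]:
    """Fold per-version deltas into one table; later deltas override earlier ones."""
    table: Dict[int, str] = {}
    for delta in deltas:
        for (first, last), value in delta.items():
            for cp in range(first, last + 1):
                table[cp] = value
    return table


# Entries copied from the appendices for Unicode 8.0.0 and Unicode 11.0.0.
unicode_8 = {(0x111C9, 0x111C9): "DISALLOWED", (0x111CA, 0x111CC): "PVALID"}
unicode_11 = {(0x111C9, 0x111C9): "PVALID"}

table = apply_deltas([unicode_8, unicode_11])
print(table[0x111C9])  # PVALID
```

Applying the deltas in version order is the assumption that matters here; reversing the order would restore the older DISALLOWED value.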
+
+ Changes from derived property value UNASSIGNED to either PVALID or
+ DISALLOWED.
+
+0560 ; PVALID # ARMENIAN SMALL LETTER TURNED AYB
+0588 ; PVALID # ARMENIAN SMALL LETTER YI WITH STROKE
+05EF ; PVALID # HEBREW YOD TRIANGLE
+07FD ; PVALID # NKO DANTAYALAN
+07FE..07FF ; DISALLOWED # NKO DOROME SIGN..NKO TAMAN SIGN
+08D3 ; PVALID # ARABIC SMALL LOW WAW
+09FE ; PVALID # BENGALI SANDHI MARK
+0A76 ; DISALLOWED # GURMUKHI ABBREVIATION SIGN
+0C04 ; PVALID # TELUGU SIGN COMBINING ANUSVARA ABOVE
+0C84 ; DISALLOWED # KANNADA SIGN SIDDHAM
+1878 ; PVALID # MONGOLIAN LETTER CHA WITH TWO DOTS
+1C90..1CBA ; DISALLOWED # GEORGIAN MTAVRULI CAPITAL LETTER AN..GEORGIAN
+1CBD..1CBF ; DISALLOWED # GEORGIAN MTAVRULI CAPITAL LETTER AEN..GEORGIA
+2BBA..2BBC ; DISALLOWED # OVERLAPPING WHITE SQUARES..OVERLAPPING BLACK
+2BD3..2BEB ; DISALLOWED # PLUTO FORM TWO..STAR WITH RIGHT HALF BLACK
+2BF0..2BFE ; DISALLOWED # ERIS FORM ONE..REVERSED RIGHT ANGLE
+2E4A..2E4E ; DISALLOWED # DOTTED SOLIDUS..PUNCTUS ELEVATUS MARK
+312F ; PVALID # BOPOMOFO LETTER NN
+9FEB..9FEF ; PVALID # <CJK Ideograph>..<CJK Ideograph>
+A7AF ; PVALID # LATIN LETTER SMALL CAPITAL Q
+A7B8 ; DISALLOWED # LATIN CAPITAL LETTER U WITH STROKE
+A7B9 ; PVALID # LATIN SMALL LETTER U WITH STROKE
+A8FE..A8FF ; PVALID # DEVANAGARI LETTER AY..DEVANAGARI VOWEL SIGN A
+10A34..10A35; PVALID # KHAROSHTHI LETTER TTTA..KHAROSHTHI LETTER VHA
+10A48 ; DISALLOWED # KHAROSHTHI FRACTION ONE HALF
+10D00..10D27; PVALID # HANIFI ROHINGYA LETTER A..HANIFI ROHINGYA SIG
+10D30..10D39; PVALID # HANIFI ROHINGYA DIGIT ZERO..HANIFI ROHINGYA D
+10F00..10F1C; PVALID # OLD SOGDIAN LETTER ALEPH..OLD SOGDIAN LETTER
+10F1D..10F26; DISALLOWED # OLD SOGDIAN NUMBER ONE..OLD SOGDIAN FRACTION
+10F27 ; PVALID # OLD SOGDIAN LIGATURE AYIN-DALETH
+10F30..10F50; PVALID # SOGDIAN LETTER ALEPH..SOGDIAN COMBINING STROK
+10F51..10F59; DISALLOWED # SOGDIAN NUMBER ONE..SOGDIAN PUNCTUATION HALF
+110CD ; DISALLOWED # KAITHI NUMBER SIGN ABOVE
+11144..11146; PVALID # CHAKMA LETTER LHAA..CHAKMA VOWEL SIGN EI
+1133B ; PVALID # COMBINING BINDU BELOW
+1145E ; PVALID # NEWA SANDHI MARK
+1171A ; PVALID # AHOM LETTER ALTERNATE BA
+11800..1183A; PVALID # DOGRA LETTER A..DOGRA SIGN NUKTA
+1183B ; DISALLOWED # DOGRA ABBREVIATION SIGN
+11A9D ; PVALID # SOYOMBO MARK PLUTA
+11D60..11D65; PVALID # GUNJALA GONDI LETTER A..GUNJALA GONDI LETTER
+11D67..11D68; PVALID # GUNJALA GONDI LETTER EE..GUNJALA GONDI LETTER
+11D6A..11D8E; PVALID # GUNJALA GONDI LETTER OO..GUNJALA GONDI VOWEL
+11D90..11D91; PVALID # GUNJALA GONDI VOWEL SIGN EE..GUNJALA GONDI VO
+11D93..11D98; PVALID # GUNJALA GONDI VOWEL SIGN OO..GUNJALA GONDI OM
+11DA0..11DA9; PVALID # GUNJALA GONDI DIGIT ZERO..GUNJALA GONDI DIGIT
+11EE0..11EF6; PVALID # MAKASAR LETTER KA..MAKASAR VOWEL SIGN O
+11EF7..11EF8; DISALLOWED # MAKASAR PASSIMBANG..MAKASAR END OF SECTION
+16E40..16E5F; DISALLOWED # MEDEFAIDRIN CAPITAL LETTER M..MEDEFAIDRIN CAP
+16E60..16E7F; PVALID # MEDEFAIDRIN SMALL LETTER M..MEDEFAIDRIN SMALL
+16E80..16E9A; DISALLOWED # MEDEFAIDRIN DIGIT ZERO..MEDEFAIDRIN EXCLAMATI
+187ED..187F1; PVALID # <Tangut Ideograph>..<Tangut Ideograph>
+1D2E0..1D2F3; DISALLOWED # MAYAN NUMERAL ZERO..MAYAN NUMERAL NINETEEN
+1D372..1D378; DISALLOWED # IDEOGRAPHIC TALLY MARK ONE..TALLY MARK FIVE
+1EC71..1ECB4; DISALLOWED # INDIC SIYAQ NUMBER ONE..INDIC SIYAQ ALTERNATE
+1F12F ; DISALLOWED # COPYLEFT SYMBOL
+1F6F9 ; DISALLOWED # SKATEBOARD
+1F7D5..1F7D8; DISALLOWED # CIRCLED TRIANGLE..NEGATIVE CIRCLED SQUARE
+1F94D..1F94F; DISALLOWED # LACROSSE STICK AND BALL..FLYING DISC
+1F96C..1F970; DISALLOWED # LEAFY GREEN..SMILING FACE WITH SMILING EYES A
+1F973..1F976; DISALLOWED # FACE WITH PARTY HORN AND PARTY HAT..FREEZING
+1F97A ; DISALLOWED # FACE WITH PLEADING EYES
+1F97C..1F97F; DISALLOWED # LAB COAT..FLAT SHOE
+1F998..1F9A2; DISALLOWED # KANGAROO..SWAN
+1F9B0..1F9B9; DISALLOWED # EMOJI COMPONENT RED HAIR..SUPERVILLAIN
+1F9C1..1F9C2; DISALLOWED # CUPCAKE..SALT SHAKER
+1F9E7..1F9FF; DISALLOWED # RED GIFT ENVELOPE..NAZAR AMULET
+1FA60..1FA6D; DISALLOWED # XIANGQI RED GENERAL..XIANGQI BLACK SOLDIER
+
+Appendix F. Changes from Unicode 11.0.0 to Unicode 12.0.0
+
+ Changes from derived property value UNASSIGNED to either PVALID or
+ DISALLOWED.
+
+0C77 ; DISALLOWED # TELUGU SIGN SIDDHAM
+0E86 ; PVALID # LAO LETTER PALI GHA
+0E89 ; PVALID # LAO LETTER PALI CHA
+0E8C ; PVALID # LAO LETTER PALI JHA
+0E8E..0E93 ; PVALID # LAO LETTER PALI NYA..LAO LETTER PALI NNA
+0E98 ; PVALID # LAO LETTER PALI DHA
+0EA0 ; PVALID # LAO LETTER PALI BHA
+0EA8..0EA9 ; PVALID # LAO LETTER SANSKRIT SHA..LAO LETTER SANSKRIT
+0EAC ; PVALID # LAO LETTER PALI LLA
+0EBA ; PVALID # LAO SIGN PALI VIRAMA
+1CFA ; PVALID # VEDIC SIGN DOUBLE ANUSVARA ANTARGOMUKHA
+2BC9 ; DISALLOWED # NEPTUNE FORM TWO
+2BFF ; DISALLOWED # HELLSCHREIBER PAUSE SYMBOL
+2E4F ; DISALLOWED # CORNISH VERSE DIVIDER
+A7BA ; DISALLOWED # LATIN CAPITAL LETTER GLOTTAL A
+A7BB ; PVALID # LATIN SMALL LETTER GLOTTAL A
+A7BC ; DISALLOWED # LATIN CAPITAL LETTER GLOTTAL I
+A7BD ; PVALID # LATIN SMALL LETTER GLOTTAL I
+A7BE ; DISALLOWED # LATIN CAPITAL LETTER GLOTTAL U
+A7BF ; PVALID # LATIN SMALL LETTER GLOTTAL U
+A7C2 ; DISALLOWED # LATIN CAPITAL LETTER ANGLICANA W
+A7C3 ; PVALID # LATIN SMALL LETTER ANGLICANA W
+A7C4..A7C6 ; DISALLOWED # LATIN CAPITAL LETTER C WITH PALATAL HOOK..LAT
+AB66..AB67 ; PVALID # LATIN SMALL LETTER DZ DIGRAPH WITH RETROFLEX
+10FE0..10FF6; PVALID # ELYMAIC LETTER ALEPH..ELYMAIC LIGATURE ZAYIN-
+1145F ; PVALID # NEWA LETTER VEDIC ANUSVARA
+116B8 ; PVALID # TAKRI LETTER ARCHAIC KHA
+119A0..119A7; PVALID # NANDINAGARI LETTER A..NANDINAGARI LETTER VOCA
+119AA..119D7; PVALID # NANDINAGARI LETTER E..NANDINAGARI VOWEL SIGN
+119DA..119E1; PVALID # NANDINAGARI VOWEL SIGN E..NANDINAGARI SIGN AV
+119E2 ; DISALLOWED # NANDINAGARI SIGN SIDDHAM
+119E3..119E4; PVALID # NANDINAGARI HEADSTROKE..NANDINAGARI VOWEL SIG
+11A84..11A85; PVALID # SOYOMBO SIGN JIHVAMULIYA..SOYOMBO SIGN UPADHM
+11FC0..11FF1; DISALLOWED # TAMIL FRACTION ONE THREE-HUNDRED-AND-TWENTIET
+11FFF ; DISALLOWED # TAMIL PUNCTUATION END OF TEXT
+13430..13438; DISALLOWED # EGYPTIAN HIEROGLYPH VERTICAL JOINER..EGYPTIAN
+16F45..16F4A; PVALID # MIAO LETTER BRI..MIAO LETTER RTE
+16F4F ; PVALID # MIAO SIGN CONSONANT MODIFIER BAR
+16F7F..16F87; PVALID # MIAO VOWEL SIGN UOG..MIAO VOWEL SIGN UI
+16FE2 ; DISALLOWED # OLD CHINESE HOOK MARK
+16FE3 ; PVALID # OLD CHINESE ITERATION MARK
+187F2..187F7; PVALID # <Tangut Ideograph>..<Tangut Ideograph>
+1B150..1B152; PVALID # HIRAGANA LETTER SMALL WI..HIRAGANA LETTER SMA
+1B164..1B167; PVALID # KATAKANA LETTER SMALL WI..KATAKANA LETTER SMA
+1E100..1E12C; PVALID # NYIAKENG PUACHUE HMONG LETTER MA..NYIAKENG PU
+1E130..1E13D; PVALID # NYIAKENG PUACHUE HMONG TONE-B..NYIAKENG PUACH
+1E140..1E149; PVALID # NYIAKENG PUACHUE HMONG DIGIT ZERO..NYIAKENG P
+1E14E ; PVALID # NYIAKENG PUACHUE HMONG LOGOGRAM NYAJ
+1E14F ; DISALLOWED # NYIAKENG PUACHUE HMONG CIRCLED CA
+1E2C0..1E2F9; PVALID # WANCHO LETTER AA..WANCHO DIGIT NINE
+1E2FF ; DISALLOWED # WANCHO NGUN SIGN
+1E94B ; PVALID # ADLAM NASALIZATION MARK
+1ED01..1ED3D; DISALLOWED # OTTOMAN SIYAQ NUMBER ONE..OTTOMAN SIYAQ FRACT
+1F16C ; DISALLOWED # RAISED MR SIGN
+1F6D5 ; DISALLOWED # HINDU TEMPLE
+1F6FA ; DISALLOWED # AUTO RICKSHAW
+1F7E0..1F7EB; DISALLOWED # LARGE ORANGE CIRCLE..LARGE BROWN SQUARE
+1F90D..1F90F; DISALLOWED # WHITE HEART..PINCHING HAND
+1F93F ; DISALLOWED # DIVING MASK
+1F971 ; DISALLOWED # YAWNING FACE
+1F97B ; DISALLOWED # SARI
+1F9A5..1F9AA; DISALLOWED # SLOTH..OYSTER
+1F9AE..1F9AF; DISALLOWED # GUIDE DOG..PROBING CANE
+1F9BA..1F9BF; DISALLOWED # SAFETY VEST..MECHANICAL LEG
+1F9C3..1F9CA; DISALLOWED # BEVERAGE BOX..ICE CUBE
+1F9CD..1F9CF; DISALLOWED # STANDING PERSON..DEAF PERSON
+1FA00..1FA53; DISALLOWED # NEUTRAL CHESS KING..BLACK CHESS KNIGHT-BISHOP
+1FA70..1FA73; DISALLOWED # BALLET SHOES..SHORTS
+1FA78..1FA7A; DISALLOWED # DROP OF BLOOD..STETHOSCOPE
+1FA80..1FA82; DISALLOWED # YO-YO..PARACHUTE
+1FA90..1FA95; DISALLOWED # RINGED PLANET..BANJO
+
+Acknowledgments
+
+ Thanks to Harald Alvestrand, Marc Blanchet, Martin Dürst, Asmus
+ Freytag, Ted Hardie, John Klensin, Erik Nordmark, Pete Resnick, Peter
+ Saint-Andre, Michel Suignard, Andrew Sullivan, and Suzanne Woolf for
+ input to this document.
+
+Author's Address
+
+ Patrik Fältström
+ Netnod
+ Email: paf@netnod.se