doc: Add RFC documents

author: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committer: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit: 4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree: e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc6497.txt
parent: ea76e11061bda059ae9f9ad130a9895cc85607db (diff)
1 files changed, 843 insertions, 0 deletions
diff --git a/doc/rfc/rfc6497.txt b/doc/rfc/rfc6497.txt
new file mode 100644
index 0000000..a73c143
--- /dev/null
+++ b/doc/rfc/rfc6497.txt
@@ -0,0 +1,843 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF)                          M. Davis
+Request for Comments: 6497                                        Google
+Category: Informational                                      A. Phillips
+ISSN: 2070-1721                                                   Lab126
+                                                               Y. Umaoka
+                                                                     IBM
+                                                                 C. Falk
+                                                       Infinite Automata
+                                                           February 2012
+
+
+                BCP 47 Extension T - Transformed Content
+
+Abstract
+
+   This document specifies an Extension to BCP 47 that provides subtags
+   for specifying the source language or script of transformed content,
+   including content that has been transliterated, transcribed, or
+   translated, or in some other way influenced by the source.  It also
+   provides for additional information used for identification.
+
+Status of This Memo
+
+   This document is not an Internet Standards Track specification; it is
+   published for informational purposes.
+
+   This document is a product of the Internet Engineering Task Force
+   (IETF).  It represents the consensus of the IETF community.  It has
+   received public review and has been approved for publication by the
+   Internet Engineering Steering Group (IESG).  Not all documents
+   approved by the IESG are a candidate for any level of Internet
+   Standard; see Section 2 of RFC 5741.
+
+   Information about the current status of this document, any errata,
+   and how to provide feedback on it may be obtained at
+   http://www.rfc-editor.org/info/rfc6497.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Davis, et al.                 Informational                     [Page 1]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+Copyright Notice
+
+   Copyright (c) 2012 IETF Trust and the persons identified as the
+   document authors.  All rights reserved.
+
+   This document is subject to BCP 78 and the IETF Trust's Legal
+   Provisions Relating to IETF Documents
+   (http://trustee.ietf.org/license-info) in effect on the date of
+   publication of this document.  Please review these documents
+   carefully, as they describe your rights and restrictions with respect
+   to this document.  Code Components extracted from this document must
+   include Simplified BSD License text as described in Section 4.e of
+   the Trust Legal Provisions and are provided without warranty as
+   described in the Simplified BSD License.
+
+Table of Contents
+
+   1. Introduction ....................................................2
+      1.1. Requirements Language ......................................4
+   2. BCP 47 Required Information .....................................4
+      2.1. Overview ...................................................4
+      2.2. Structure ..................................................6
+      2.3. Canonicalization ...........................................7
+      2.4. BCP 47 Registration Form ...................................8
+      2.5. Field Definitions ..........................................8
+      2.6. Registration of Field Subtags .............................10
+      2.7. Registration of Additional Fields .........................11
+      2.8. Committee Responses to Registration Proposals .............11
+      2.9. Machine-Readable Data .....................................11
+   3. Acknowledgements ...............................................14
+   4. IANA Considerations ............................................14
+   5. Security Considerations ........................................14
+   6. References .....................................................14
+      6.1. Normative References ......................................14
+      6.2. Informative References ....................................15
+
+1.  Introduction
+
+   [BCP47] permits the definition and registration of language tag
+   extensions "that contain a language component and are compatible with
+   applications that understand language tags".  This document defines
+   an extension for specifying the source of content that has been
+   transformed, including text that has been transliterated,
+   transcribed, or translated, or in some other way influenced by the
+   source.  It may be used in queries to request content that has been
+   transformed.  The "singleton" identifier for this extension is 't'.
+
+
+
+
+
+Davis, et al.                 Informational                     [Page 2]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+   Language tags, as defined by [BCP47], are useful for identifying the
+   language of content.  There are mechanisms for specifying variant
+   subtags for special purposes.  However, these variants are
+   insufficient for specifying content that has undergone
+   transformations, including content that has been transliterated,
+   transcribed, or translated.  The correct interpretation of the
+   content may depend upon knowledge of the conventions used for the
+   transformation.
+
+   Suppose that Italian or Russian cities on a map are transcribed for
+   Japanese users.  Each name needs to be transliterated into katakana
+   using rules appropriate for the specific source and target language.
+   When tagging such data, it is important to be able to indicate not
+   only the resulting content language ("ja" in this case), but also the
+   source language.
+
+   Transforms such as transliterations may vary, depending not only on
+   the basis of the source and target script, but also on the source and
+   target language.  Thus, the Russian <U+041F U+0443 U+0442 U+0438
+   U+043D> (which corresponds to the Cyrillic <PE, U, TE, I, EN>)
+   transliterates into "Putin" in English but "Poutine" in French.  The
+   identifier could be used to indicate a desired mechanical
+   transformation in an API, or could be used to tag data that has been
+   converted (mechanically or by hand) according to a transliteration
+   method.
+
+   In addition, many different conventions have arisen for how to
+   transform text, even between the same languages and scripts.  For
+   example, "Gaddafi" is commonly transliterated from Arabic to English
+   as any of (G/Q/K/Kh)a(d/dh/dd/dhdh/th/zz)af(i/y).  Some examples of
+   standardized conventions used for transcribing or transliterating
+   text include:
+
+   a.  United Nations Group of Experts on Geographical Names (UNGEGN)
+
+   b.  US Library of Congress (LOC)
+
+   c.  US Board on Geographic Names (BGN)
+
+   d.  Korean Ministry of Culture, Sports and Tourism (MCST)
+
+   e.  International Organization for Standardization (ISO)
+
+   The usage of this extension is not limited to formal transformations,
+   and may include other instances where the content is in some other
+   way influenced by the source.  For example, this extension could be
+   used to designate a request for a speech recognizer that is tailored
+
+
+
+
+Davis, et al.                 Informational                     [Page 3]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+   specifically for second-language speakers who are first-language
+   speakers of a particular language (e.g., a recognizer for "English
+   spoken with a Chinese accent").
+
+1.1.  Requirements Language
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+   document are to be interpreted as described in RFC 2119 [RFC2119].
+
+2.  BCP 47 Required Information
+
+2.1.  Overview
+
+   Identification of transformed content can be done using the 't'
+   extension defined in this document.  This extension is formed by the
+   't' singleton followed by a sequence of subtags that would form a
+   language tag as defined by [BCP47].  This allows the source language
+   or script to be specified to the degree of precision required.  There
+   are restrictions on the sequence of subtags.  They MUST form a
+   regular, valid, canonical language tag, and MUST neither include
+   extensions nor private use sequences introduced by the singleton 'x'.
+   Where only the script is relevant (such as identifying a script-
+   script transliteration), then 'und' is used for the primary language
+   subtag.
+
+   For example:
+
+   +---------------------+---------------------------------------------+
+   | Language Tag        | Description                                 |
+   +---------------------+---------------------------------------------+
+   | ja-t-it             | The content is Japanese, transformed from   |
+   |                     | Italian.                                    |
+   | ja-Kana-t-it        | The content is Japanese Katakana,           |
+   |                     | transformed from Italian.                   |
+   | und-Latn-t-und-cyrl | The content is in the Latin script,         |
+   |                     | transformed from the Cyrillic script.       |
+   +---------------------+---------------------------------------------+
+
+   Note that the sequence of subtags governed by 't' cannot contain a
+   singleton (a single-character subtag), because that would start a new
+   extension.  For example, the tag "ja-t-i-ami" does not indicate that
+   the source is in "i-ami", because "i-ami" is not a regular language
+   tag in [BCP47].  That tag would express an empty 't' extension
+   followed by an 'i' extension.
+
+
+
+
+
+
+Davis, et al.                 Informational                     [Page 4]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+   The 't' extension is not intended for use in structured data that
+   already provides separate source and target language identifiers.
+   For example, this is the case in localization interchange formats
+   such as XLIFF.  In such cases, it would be inappropriate to use
+   "ja-t-it" for the target language tag because the source language tag
+   "it" would already be present in the data.  Instead, one would use
+   the language tag "ja".
+
+   As noted earlier, it is sometimes necessary to indicate additional
+   information about a transformation.  This additional information is
+   optionally supplied after the source in a series of one or more
+   fields, where each field consists of a field separator subtag
+   followed by one or more non-separator subtags.  Each field separator
+   subtag consists of a single letter followed by a single digit.
+
+   A transformation mechanism is an optional field that indicates the
+   specification used for the transformation, such as "UNGEGN" for the
+   United Nations Group of Experts on Geographical Names
+   transliterations and transcriptions.  It uses the 'm0' field
+   separator followed by certain subtags.
+
+   For example:
+
+   +------------------------------------+------------------------------+
+   | Language Tag                       | Description                  |
+   +------------------------------------+------------------------------+
+   | und-Cyrl-t-und-latn-m0-ungegn-2007 | The content is in Cyrillic,  |
+   |                                    | transformed from Latin,      |
+   |                                    | according to a UNGEGN        |
+   |                                    | specification dated 2007.    |
+   +------------------------------------+------------------------------+
+
+   The field separator subtags, such as 'm0', were chosen because they
+   are short, visually distinctive, and cannot occur in a language
+   subtag (outside of an extension and after 'x'), thus eliminating the
+   potential for collision or confusion with the source language tag.
+
+   The field subtags are defined by Section 3 of Unicode Technical
+   Standard #35: Unicode Locale Data Markup Language (LDML) [UTS35], the
+   main specification for the Unicode Common Locale Data Repository
+   (CLDR) project.  That section also defines the parallel 'u' extension
+   [RFC6067], for which the Unicode Consortium is also the maintaining
+   authority.  As required by BCP 47, subtags follow the language tag
+   ABNF and other rules for the formation of language tags and subtags,
+   are restricted to the ASCII letters and digits, are not case
+   sensitive, and do not exceed eight characters in length.
+
+
+
+
+
+Davis, et al.                 Informational                     [Page 5]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+   The LDML specification is available over the Internet and at no cost,
+   and is available via a royalty-free license at
+   http://unicode.org/copyright.html.  LDML is versioned, and each
+   version of LDML is numbered, dated, and stable.  Extension subtags,
+   once defined by LDML, are never retracted or substantially changed in
+   meaning.
+
+   The maintaining authority for the 't' extension is the Unicode
+   Consortium:
+
+   +---------------+---------------------------------------------------+
+   | Item          | Value                                             |
+   +---------------+---------------------------------------------------+
+   | Name          | Unicode Consortium                                |
+   | Contact Email | cldr-contact@unicode.org                          |
+   | Discussion    | cldr-users@unicode.org                            |
+   | List Email    |                                                   |
+   | URL Location  | cldr.unicode.org                                  |
+   | Specification | Unicode Technical Standard #35 Unicode Locale     |
+   |               | Data Markup Language (LDML),                      |
+   |               | http://unicode.org/reports/tr35/                  |
+   | Section       | Section 3 Unicode Language and Locale Identifiers |
+   +---------------+---------------------------------------------------+
+
+2.2.  Structure
+
+   The subtags in the 't' extension are of the following form:
+
+   t-ext     = "t"                      ; Extension
+             (("-" lang *("-" field))   ; Source + optional field(s)
+             / 1*("-" field))           ; Field(s) only (no source)
+
+   lang      = language                 ; BCP 47, with restrictions
+             ["-" script]
+             ["-" region]
+             *("-" variant)
+
+   field     = fsep 1*("-" 3*8alphanum) ; With restrictions
+
+   fsep      = ALPHA DIGIT              ; Subtag separators
+   alphanum  = ALPHA / DIGIT
+
+   where <language>, <script>, <region>, and <variant> rules are
+   specified in [BCP47], and <ALPHA> and <DIGIT> rules in [RFC5234].
+
+
+
+
+
+
+
+Davis, et al.                 Informational                     [Page 6]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+   Description and restrictions:
+
+   a.  The 't' extension MUST have at least one subtag.
+
+   b.  The 't' extension normally starts with a source language tag,
+       which MUST be a regular, canonical language tag as specified by
+       [BCP47].  Tags described by the 'irregular' production in BCP 47
+       MUST NOT be used to form the language tag.  The source language
+       tag MAY be omitted: some field values do not require it.
+
+   c.  There is optionally a sequence of fields, where each field has a
+       separator followed by a sequence of one or more subtags.  Two
+       identical field separators MUST NOT be present in the language
+       tag.
+
+   d.  The order of the fields in a 't' extension is not significant.
+       The order of subtags within a field is significant.  See
+       Section 2.3 ("Canonicalization").
+
+   e.  The 't' subtag fields are defined by Section 3 of Unicode
+       Technical Standard #35: Unicode Locale Data Markup Language
+       [UTS35].
+
+2.3.  Canonicalization
+
+   As required by [BCP47], the use of uppercase or lowercase letters is
+   not significant in the subtags used in this extension.  The canonical
+   form for all subtags in the extension is lowercase, with the fields
+   ordered by the separators, alphabetically.  The order of subtags
+   within a field is significant, and MUST NOT be changed in the process
+   of canonicalizing.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Davis, et al.                 Informational                     [Page 7]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+2.4.  BCP 47 Registration Form
+
+   Per RFC 5646, Section 3.7 [BCP47]:
+
+   %%
+   Identifier: t
+   Description: Specifying Transformed Content
+   Comments: Subtags for the identification of content that has been
+      transformed, including but not limited to:
+      transliteration, transcription, and translation.
+   Added: 2011-12-16
+   RFC: RFC 6497
+   Authority: Unicode Consortium
+   Contact_Email: cldr-contact@unicode.org
+   Mailing_List: cldr-users@unicode.org
+   URL: http://www.unicode.org/Public/cldr/latest/core.zip
+   %%
+
+2.5.  Field Definitions
+
+   Assignment of 't' field subtags is determined by the Unicode CLDR
+   Technical Committee, in accordance with the policies and procedures
+   in http://www.unicode.org/consortium/tc-procedures.html, and subject
+   to the Unicode Consortium Policies on
+   http://www.unicode.org/policies/policies.html.
+
+   Assignments that can be made by successive versions of LDML [UTS35]
+   by the Unicode Consortium without requiring a new RFC include:
+
+   o  The allocation of new field separator subtags for use after the
+      't' extension.
+
+   o  The allocation of subtags valid after a field separator subtag.
+
+   o  The addition of subtag aliases and descriptions.
+
+   o  The modification of subtag descriptions.
+
+   Changes to the syntax or meaning of the 't' extension would require a
+   new RFC that obsoletes this document; such an RFC would break
+   stability, and would thus be contrary to the policies of the Unicode
+   Consortium.
+
+
+
+
+
+
+
+
+
+Davis, et al.                 Informational                     [Page 8]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+   At the time this document was published, one field separator subtag
+   was specified in [UTS35]: the transform mechanism.  That field is
+   summarized here:
+
+   a.  The transform mechanism consists of a sequence of subtags
+       starting with the 'm0' separator followed by one or more
+       mechanism subtags.  Each mechanism subtag has a length of 3 to 8
+       alphanumeric characters.  The sequence as a whole provides an
+       identification of the specification for the transform, such as
+       the mechanism subtag 'ungegn' in "und-Cyrl-t-und-latn-m0-ungegn".
+       In many cases, only one mechanism subtag is necessary, but
+       multiple subtags MAY be defined in [UTS35] where necessary.
+
+   b.  Any purely numeric subtag is a representation of a date in the
+       Gregorian calendar.  It MAY occur in any mechanism field, but it
+       SHOULD only be used where necessary.  If it does occur:
+
+       *  it MUST occur as the final subtag in the field
+
+       *  it MUST NOT be the only subtag in the field
+
+       *  it MUST only consist of a sequence of digits of the form YYYY,
+          YYYYMM, or YYYYMMDD
+
+       *  it SHOULD be as short as possible
+
+       Note: The format is related to that of [RFC3339], but is not the
+       same.  The RFC 3339 full-date won't work because it uses hyphens.
+       The offset ("Z") is not used because the date is a publication
+       date (aka 'floating date').  For more information, see
+       Section 3.3 ("Floating Time") of [W3C-TimeZones].
+
+   c.  Examples:
+
+       *  20110623 represents June 23, 2011.
+
+       *  There are three dated versions of the UNGEGN transliteration
+          specification for Hebrew to Latin.  They can be represented by
+          the following language tags:
+
+          +  und-Hebr-t-und-latn-m0-ungegn-1972
+
+          +  und-Hebr-t-und-latn-m0-ungegn-1977
+
+          +  und-Hebr-t-und-latn-m0-ungegn-2007
+
+
+
+
+
+
+Davis, et al.                 Informational                     [Page 9]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+       *  Suppose that the BGN transliteration specification for
+          Cyrillic to Latin had three versions, dated June 11, 1999;
+          Dec 30, 1999; and May 1, 2011.  In that case, the
+          corresponding first two DATE subtags would require the months
+          to be distinctive (199906 and 199912), but the last subtag
+          would only require the year (2011).
+
+   d.  Some mechanisms may use a versioning system that is not
+       distinguished by date, or not by date alone.  In the latter case,
+       the version will be of a form specified by [UTS35] for that
+       mechanism.  For example, if the mechanism xxx uses versions of
+       the form v21a, then a tag could look like "ja-t-it-m0-xxx-v21a".
+       If there are multiple sub-versions distinguished by date, then a
+       tag could look like "ja-t-it-m0-xxx-v21a-2007".
+
+   A language tag with the 't' extension MAY be used to request a
+   specific transform of content.  In such a case, the recipient SHOULD
+   return content that corresponds as closely as feasible to the
+   requested transform, including the specification of the mechanism.
+   For example, if the request is ja-t-it-m0-xxx-v21a-2007, and the
+   recipient has content corresponding to both ja-t-it-m0-xxx-v21a and
+   ja-t-it-m0-xxx-v21b-2009, then the v21a version would be preferred.
+   As is the case for language matching as discussed in [BCP47],
+   different implementations MAY have different measures of "closeness".
+
+2.6.  Registration of Field Subtags
+
+   Registration of transform mechanisms is requested by filing a ticket
+   at http://cldr.unicode.org/.  The proposal in the ticket MUST contain
+   the following information:
+
+   +-------------+-----------------------------------------------------+
+   | Item        | Description                                         |
+   +-------------+-----------------------------------------------------+
+   | Subtag      | The proposed mechanism subtag (or subtag sequence). |
+   | Description | A description of the proposed mechanism; that       |
+   |             | description MUST be sufficient to distinguish it    |
+   |             | from other mechanisms in use.                       |
+   | Version     | If versioning for the mechanism is not done         |
+   |             | according to date, then a description of the        |
+   |             | versioning conventions used for the mechanism.      |
+   +-------------+-----------------------------------------------------+
+
+   Proposals for clarifications of descriptions or additional aliases
+   may also be requested by filing a ticket.
+
+
+
+
+
+
+Davis, et al.                 Informational                    [Page 10]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+   The committee MAY define a template for submissions that requests
+   more information, if it is found that such information would be
+   useful in evaluating proposals.
+
+2.7.  Registration of Additional Fields
+
+   In the event that it proves necessary to add an additional field
+   (such as 'm2'), it can be requested by filing a ticket at
+   http://cldr.unicode.org/.  The proposal in the ticket MUST contain a
+   full description of the proposed field semantics and subtag syntax,
+   and MUST conform to the ABNF syntax for "field" presented in
+   Section 2.2.
+
+2.8.  Committee Responses to Registration Proposals
+
+   The committee MUST post each proposal publicly within 2 weeks after
+   reception, to allow for comments.  The committee must respond
+   publicly to each proposal within 4 weeks after reception.
+
+   The response MAY:
+
+   o  request more information or clarification
+
+   o  accept the proposal, optionally with modifications to the subtag
+      or description
+
+   o  reject the proposal, because of significant objections raised on
+      the mailing list or due to problems with constraints in this
+      document or in [UTS35]
+
+   Accepted tickets result in a new entry in the machine-readable CLDR
+   BCP 47 data or, in the case of a clarified description, modifications
+   to the description attribute value for an existing entry.
+
+2.9.  Machine-Readable Data
+
+   Beginning with CLDR version 1.7.2, machine-readable files are
+   available listing the data defined for BCP 47 extensions for each
+   successive version of [UTS35].  The data in these files is used for
+   testing the validity of subtags for the 't' extension and for the 'u'
+   extension [RFC6067], for which the Unicode Consortium is also the
+   maintaining authority.  These releases are listed on
+   http://cldr.unicode.org/index/downloads.  Each release has an
+   associated data directory of the form
+   "http://unicode.org/Public/cldr/<version>", where "<version>" is
+   replaced by the release number.  For example, for version 1.7.2, the
+
+
+
+
+
+Davis, et al.                 Informational                    [Page 11]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+   "core.zip" file is located at
+   http://unicode.org/Public/cldr/1.7.2/core.zip.  The most recent
+   version is always identified by the version "latest" and can be
+   accessed by the URL in Section 2.4.
+
+   Inside the "core.zip" file, the directory "common/bcp47" contains the
+   data files listing the valid attributes, keys, and types for each
+   successive version of [UTS35].  Each data file lists the keys and
+   types relevant to that topic.
+
+   The XML structure lists the keys, such as <key extension="t"
+   name="m0" description="Transliteration extension mechanism">, with
+   subelements for the types, such as <type name="ungegn"
+   description="United Nations Group of Experts on Geographical
+   Names"/>.  The currently defined attributes for the mechanisms
+   include:
+
+   +-------------+-------------------------------+---------------------+
+   | Attribute   | Description                   | Examples            |
+   +-------------+-------------------------------+---------------------+
+   | name        | The name of the mechanism,    | UNGEGN, ALALC       |
+   |             | limited to 3-8 characters (or |                     |
+   |             | sequences of them).           |                     |
+   | description | A description of the name,    | United Nations      |
+   |             | with all and only that        | Group of Experts on |
+   |             | information necessary to      | Geographical Names; |
+   |             | distinguish one name from     | American Library    |
+   |             | others with which it might be | Association-Library |
+   |             | confused.  Descriptions are   | of Congress         |
+   |             | not intended to provide       |                     |
+   |             | general background            |                     |
+   |             | information.                  |                     |
+   | since       | Indicates the first version   | 1.9, 2.0.1          |
+   |             | of CLDR where the name        |                     |
+   |             | appears.  (Required for new   |                     |
+   |             | items.)                       |                     |
+   | alias       | Alternative name of the key   |                     |
+   |             | or type, not limited in       |                     |
+   |             | number of characters.         |                     |
+   |             | Aliases are intended for      |                     |
+   |             | backwards compatibility, not  |                     |
+   |             | to provide all possible       |                     |
+   |             | alternate names or            |                     |
+   |             | designations.  (Optional.)    |                     |
+   +-------------+-------------------------------+---------------------+
+
+   The file for the transform extension is "transform.xml".  The initial
+   version of that file contains the following information.
+
+
+
+Davis, et al.                 Informational                    [Page 12]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+   <keyword>
+     <key extension="t" name="m0" description=
+         "Transliteration extension mechanism">
+       <type name="ungegn" description=
+           "United Nations Group of Experts on Geographical Names"
+           since="21"/>
+       <type name="alaloc" description=
+           "American Library Association-Library of Congress"
+           since="21"/>
+       <type name="bgn" description=
+           "US Board on Geographic Names"
+           since="21"/>
+       <type name="mcst" description=
+           "Korean Ministry of Culture, Sports and Tourism"
+           since="21"/>
+       <type name="iso" description=
+           "International Organization for Standardization"
+           since="21"/>
+       <type name="din" description=
+           "Deutsches Institut fuer Normung"
+           since="21"/>
+       <type name="gost" description=
+           "Euro-Asian Council for Standardization, Metrology
+            and Certification"
+           since="21"/>
+     </key>
+   </keyword>
+
+   To get the version information in XML when working with the data
+   files, the XML parser must be validating.  When the 'core.zip' file
+   is unzipped, the 'dtd' directory will be at the same level as the
+   'bcp47' directory; this is required for correct validation.  For each
+   release after CLDR 1.8, types introduced in that release are also
+   marked in the data files by the XML attribute "since", such as in the
+   following example:
+   <type name="adp" since="1.9"/>
+
+   The data is also currently maintained in a source code repository,
+   with each release tagged, for viewing directly without unzipping.
+   For example, see:
+
+   o  http://unicode.org/repos/cldr/tags/release-1-7-2/common/bcp47/
+
+   o  http://unicode.org/repos/cldr/tags/release-1-8/common/bcp47/
+
+   For more information, see
+   http://cldr.unicode.org/index/bcp47-extension.
+
+
+
+
+Davis, et al.                 Informational                    [Page 13]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+3.  Acknowledgements
+
+   Thanks to John Emmons and the rest of the Unicode CLDR Technical
+   Committee for their work in developing the BCP 47 subtags for LDML.
+
+4.  IANA Considerations
+
+   IANA has inserted the record of Section 2.4 into the Language
+   Extensions Registry, according to Section 3.7 ("Extensions and the
+   Extensions Registry") of "Tags for Identifying Languages" [BCP47].
+   Per Section 5.2 of [BCP47], there might be occasional (rare) requests
+   by the Unicode Consortium (the "Authority" listed in the record) for
+   maintenance of this record.  Changes that can be submitted to IANA
+   without the publication of a new RFC are limited to modification of
+   the Comments, Contact_Email, Mailing_List, and URL fields.  Any such
+   requested changes MUST use the domain 'unicode.org' in any new
+   addresses or URIs, MUST explicitly cite this document (so that IANA
+   can reference these requirements), and MUST originate from the
+   'unicode.org' domain.  The domain or authority can only be changed
+   via a new RFC.
+
+5.  Security Considerations
+
+   The security considerations for this extension are the same as those
+   for [BCP47].  See RFC 5646, Section 6, Security Considerations
+   [BCP47].
+
+6.  References
+
+6.1.  Normative References
+
+   [BCP47]    Phillips, A., Ed., and M. Davis, Ed., "Tags for
+              Identifying Languages", BCP 47, RFC 5646, September 2009.
+
+   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
+              Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+   [RFC5234]  Crocker, D., Ed., and P. Overell, "Augmented BNF for
+              Syntax Specifications: ABNF", STD 68, RFC 5234,
+              January 2008.
+
+   [UTS35]    Davis, M., "Unicode Technical Standard #35: Locale Data
+              Markup Language (LDML)", February 2012,
+              <http://www.unicode.org/reports/tr35/>.
+
+
+
+
+
+
+
+Davis, et al.                 Informational                    [Page 14]
+
+RFC 6497                   BCP 47 Extension T              February 2012
+
+
+6.2.  Informative References
+
+   [RFC3339]  Klyne, G. and C. Newman, "Date and Time on the Internet:
+              Timestamps", RFC 3339, July 2002.
+
+   [RFC6067]  Davis, M., Phillips, A., and Y. Umaoka, "BCP 47
+              Extension U", RFC 6067, December 2010.
+
+   [W3C-TimeZones]
+              Phillips, Ed., "W3C Working Group Note: Working with Time
+              Zones", July 2011,
+              <http://www.w3.org/TR/2011/NOTE-timezone-20110705/>.
+
+Authors' Addresses
+
+   Mark Davis
+   Google
+
+   EMail: mark@macchiato.com
+
+
+   Addison Phillips
+   Lab126
+
+   EMail: addison@lab126.com
+
+
+   Yoshito Umaoka
+   IBM
+
+   EMail: yoshito_umaoka@us.ibm.com
+
+
+   Courtney Falk
+   Infinite Automata
+
+   EMail: court@infiauto.com
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Davis, et al.                 Informational                    [Page 15]
+
author	Thomas Voss <mail@thomasvoss.com>	2024-11-27 20:54:24 +0100
committer	Thomas Voss <mail@thomasvoss.com>	2024-11-27 20:54:24 +0100
commit	4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree	e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc6497.txt
parent	ea76e11061bda059ae9f9ad130a9895cc85607db (diff)