doc: Add RFC documents

author: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committer: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit: 4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree: e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc1766.txt
parent: ea76e11061bda059ae9f9ad130a9895cc85607db (diff)
1 files changed, 507 insertions, 0 deletions
diff --git a/doc/rfc/rfc1766.txt b/doc/rfc/rfc1766.txt
new file mode 100644
index 0000000..901c50e
--- /dev/null
+++ b/doc/rfc/rfc1766.txt
@@ -0,0 +1,507 @@
+
+
+
+
+
+
+Network Working Group                                      H. Alvestrand
+Request for Comments: 1766                                       UNINETT
+Category: Standards Track                                     March 1995
+
+
+                Tags for the Identification of Languages
+
+Status of this Memo
+
+   This document specifies an Internet standards track protocol for the
+   Internet community, and requests discussion and suggestions for
+   improvements.  Please refer to the current edition of the "Internet
+   Official Protocol Standards" (STD 1) for the standardization state
+   and status of this protocol.  Distribution of this memo is unlimited.
+
+Abstract
+
+   This document describes a language tag for use in cases where it is
+   desired to indicate the language used in an information object.
+
+   It also defines a Content-language: header, for use in the case where
+   one desires to indicate the language of something that has RFC-822-
+   like headers, like MIME body parts or Web documents, and a new
+   parameter to the Multipart/Alternative type, to aid in the usage of
+   the Content-Language: header.
+
+1.  Introduction
+
+   There are a number of languages spoken by human beings in this world.
+
+   A great number of these people would prefer to have information
+   presented in a language that they understand.
+
+   In some contexts, it is possible to have information in more than one
+   language, or it might be possible to provide tools for assisting in
+   the understanding of a language (like dictionaries).
+
+   A prerequisite for any such function is a means of labelling the
+   information content with an identifier for the language in which it
+   is written.
+
+   In the tradition of solving only problems that we think we
+   understand, this document specifies an identifier mechanism, and one
+   possible use for it.
+
+
+
+
+
+
+
+Alvestrand                                                      [Page 1]
+
+RFC 1766                      Language Tag                    March 1995
+
+
+2.  The Language tag
+
+   The language tag is composed of 1 or more parts: A primary language
+   tag and a (possibly empty) series of subtags.
+
+   The syntax of this tag in RFC-822 EBNF is:
+
+    Language-Tag = Primary-tag *( "-" Subtag )
+    Primary-tag = 1*8ALPHA
+    Subtag = 1*8ALPHA
+
+   Whitespace is not allowed within the tag.
+
+   All tags are to be treated as case insensitive; there exist
+   conventions for capitalization of some of them, but these should not
+   be taken to carry meaning.
+
+   The namespace of language tags is administered by the IANA according
+   to the rules in section 5 of this document.
+
+   The following registrations are predefined:
+
+   In the primary language tag:
+
+    -    All 2-letter tags are interpreted according to ISO standard
+         639, "Code for the representation of names of languages" [ISO
+         639].
+
+    -    The value "i" is reserved for IANA-defined registrations
+
+    -    The value "x" is reserved for private use. Subtags of "x"
+         will not be registered by the IANA.
+
+    -    Other values cannot be assigned except by updating this
+         standard.
+
+   The reason for reserving all other tags is to be open towards new
+   revisions of ISO 639; the use of "i" and "x" is the minimum we can do
+   here to be able to extend the mechanism to meet our requirements.
+
+   In the first subtag:
+
+    -    All 2-letter codes are interpreted as ISO 3166 alpha-2
+         country codes denoting the area in which the language is
+         used.
+
+    -    Codes of 3 to 8 letters may be registered with the IANA by
+         anyone who feels a need for it, according to the rules in
+
+
+
+Alvestrand                                                      [Page 2]
+
+RFC 1766                      Language Tag                    March 1995
+
+
+         section 5 of this document.
+
+   The information in the subtag may for instance be:
+
+    -    Country identification, such as en-US (this usage is
+         described in ISO 639)
+
+    -    Dialect or variant information, such as no-nynorsk or en-
+         cockney
+
+    -    Languages not listed in ISO 639 that are not variants of
+         any listed language, which can be registered with the i-
+         prefix, such as i-cherokee
+
+    -    Script variations, such as az-arabic and az-cyrillic
+
+   In the second and subsequent subtag, any value can be registered.
+
+   NOTE: The ISO 639/ISO 3166 convention is that language names are
+   written in lower case, while country codes are written in upper case.
+   This convention is recommended, but not enforced; the tags are case
+   insensitive.
+
+   NOTE: ISO 639 defines a registration authority for additions to and
+   changes in the list of languages in ISO 639. This authority is:
+
+         International Information Centre for Terminology (Infoterm)
+         P.O. Box 130
+         A-1021 Wien
+         Austria
+         Phone: +43 1  26 75 35 Ext. 312
+         Fax:   +43 1 216 32 72
+
+   The following codes have been added in 1989 (nothing later): ug
+   (Uigur), iu (Inuktitut, also called Eskimo), za (Zhuang), he (Hebrew,
+   replacing iw), yi (Yiddish, replacing ji), and id (Indonesian,
+   replacing in).
+
+   NOTE: The registration agency for ISO 3166 (country codes) is:
+
+         ISO 3166 Maintenance Agency Secretariat
+         c/o DIN Deutsches Institut fuer Normung
+         Burggrafenstrasse 6
+         Postfach 1107
+         D-10787 Berlin
+         Germany
+         Phone: +49 30 26 01 320
+         Fax:   +49 30 26 01 231
+
+
+
+Alvestrand                                                      [Page 3]
+
+RFC 1766                      Language Tag                    March 1995
+
+
+   The country codes AA, QM-QZ, XA-XZ and ZZ are reserved by ISO 3166 as
+   user-assigned codes.
+
+2.1.  Meaning of the language tag
+
+   The language tag always defines a language as spoken (or written) by
+   human beings for communication of information to other human beings.
+   Computer languages are explicitly excluded.
+
+   There is no guaranteed relationship between languages whose tags
+   start out with the same series of subtags; especially, they are NOT
+   guaranteed to be mutually comprehensible, although this will
+   sometimes be the case.
+
+   Applications should always treat language tags as a single token; the
+   division into main tag and subtags is an administrative mechanism,
+   not a navigation aid.
+
+   The relationship between the tag and the information it relates to is
+   defined by the standard describing the context in which it appears.
+   So, this section can only give possible examples of its usage.
+
+    -    For a single information object, it should be taken as the
+         set of languages that is required for a complete
+         comprehension of the complete object. Example: Simple text.
+
+    -    For an aggregation of information objects, it should be taken
+         as the set of languages used inside components of that
+         aggregation.  Examples: Document stores and libraries.
+
+    -    For information objects whose purpose in life is providing
+         alternatives, it should be regarded as a hint that the
+         material inside is provided in several languages, and that
+         one has to inspect each of the alternatives in order to find
+         its language or languages.  In this case, multiple languages
+         need not mean that one needs to be multilingual to get
+         complete understanding of the document. Example: MIME
+         multipart/alternative.
+
+    -    It would be possible to define (for instance) an SGML DTD
+         that defines a <LANG xx> tag for indicating that following or
+         contained text is written in this language, such that one
+         could write "<LANG FR>C'est la vie</LANG>"; the Norwegian-
+         speaking user could then access a French-Norwegian dictionary
+         to find out what the quote meant.
+
+
+
+
+
+
+Alvestrand                                                      [Page 4]
+
+RFC 1766                      Language Tag                    March 1995
+
+
+3.  The Content-language header
+
+   The Language header is intended for use in the case where one desires
+   to indicate the language(s) of something that has RFC-822-like
+   headers, like MIME body parts or Web documents.
+
+   The RFC-822 EBNF of the Language header is:
+
+    Language-Header = "Content-Language" ":" 1#Language-tag
+
+   Note that the Language-Header is allowed to list several languages in
+   a comma-separated list.
+
+   Whitespace is allowed, which means also that one can place
+   parenthesized comments anywhere in the language sequence.
+
+3.1.  Examples of Content-language values
+
+   NOTE: NONE of the subtags shown in this document have actually been
+   assigned; they are used for illustration purposes only.
+
+   Norwegian official document, with parallel text in both official
+   versions of Norwegian. (Both versions are readable by all
+   Norwegians).
+
+      Content-Type: multipart/alternative;
+             differences=content-language
+      Content-Language: no-nynorsk, no-bokmaal
+
+   Voice recording from the London docks
+
+      Content-type: audio/basic
+      Content-Language: en-cockney
+
+   Document in Sami, which does not have an ISO 639 code, and is spoken
+   in several countries, but with about half the speakers in Norway,
+   with six different, mutually incomprehensible dialects:
+
+      Content-type: text/plain; charset=iso-8859-10
+      Content-Language: i-sami-no (North Sami)
+
+   An English-French dictionary
+
+      Content-type: application/dictionary
+      Content-Language: en, fr (This is a dictionary)
+
+   An official EC document (in a few of its official languages)
+
+
+
+
+Alvestrand                                                      [Page 5]
+
+RFC 1766                      Language Tag                    March 1995
+
+
+      Content-type: multipart/alternative
+      Content-Language: en, fr, de, da, el, it
+
+   An excerpt from Star Trek
+
+      Content-type: video/mpeg
+      Content-Language: x-klingon
+
+4.  Use of Content-Language with Multipart/Alternative
+
+   When using the Multipart/Alternative body part of MIME, it is
+   possible to have the body parts giving the same information content
+   in different languages. In this case, one should put a Content-
+   Language header on each of the body parts, and a summary Content-
+   Language header onto the Multipart/Alternative itself.
+
+4.1.  The differences parameter to multipart/alternative
+
+   As defined in RFC 1521, Multipart/Alternative only has one parameter:
+   boundary.
+
+   The common usage of Multipart/Alternative is to have more than one
+   format of the same message (e.g. PostScript and ASCII).
+
+   The use of language tags to differentiate between different
+   alternatives will certainly not lead all MIME UAs to present the most
+   sensible body part as default.
+
+   Therefore, a new parameter is defined, to allow the configuration of
+   MIME readers to handle language differences in a sensible manner.
+
+    Name: Differences
+    Value: One or more of
+         Content-Type
+         Content-Language
+
+   Further values can be registered with IANA; it must be the name of a
+   header for which a definition exists in a published RFC.  If not
+   present, Differences=Content-Type is assumed.
+
+   The intent is that the MIME reader can look at these headers of the
+   message component to make an intelligent choice of what to present to
+   the user, based on knowledge about the user preferences and
+   capabilities.
+
+   (The intent of having registration with IANA of the fields used in
+   this context is to maintain a list of usages that a mail UA may
+   expect to see, not to reject usages.)
+
+
+
+Alvestrand                                                      [Page 6]
+
+RFC 1766                      Language Tag                    March 1995
+
+
+   (NOTE: The MIME specification [RFC 1521], section 7.2, states that
+   headers not beginning with "Content-" are generally to be ignored in
+   body parts. People defining a header for use with "differences="
+   should take note of this.)
+
+   The mechanism for deciding which body part to present is outside the
+   scope of this document.
+
+    MIME EXAMPLE:
+
+    Content-Type: multipart/alternative; differences=Content-Language;
+              boundary="limit"
+    Content-Language: en, fr, de
+
+    --limit
+    Content-Language: fr
+
+    Le renard brun et agile saute par dessus le chien paresseux
+    --limit
+    Content-Language: de
+    Content-Type: text/plain; charset=iso-8859-1
+    Content-Transfer-encoding: quoted-printable
+
+    Der schnelle braune Fuchs h=FCpft =FCber den faulen Hund
+    --limit
+    Content-Language: en
+
+    The quick brown fox jumps over the lazy dog
+    --limit--
+
+   When composing a message, the choice of sequence may be somewhat
+   arbitrary. However, non-MIME mail readers will show the first body
+   part first, meaning that this should most likely be the language
+   understood by most of the recipients.
+
+5.  IANA registration procedure for language tags
+
+   Any language tag must start with an existing tag, and extend it.
+
+   This registration form should be used by anyone who wants to use a
+   language tag not defined by ISO or IANA.
+
+
+
+
+
+
+
+
+
+
+Alvestrand                                                      [Page 7]
+
+RFC 1766                      Language Tag                    March 1995
+
+
+----------------------------------------------------------------------
+LANGUAGE TAG REGISTRATION FORM
+
+Name of requester          :
+E-mail address of requester:
+Tag to be registered       :
+
+English name of language   :
+
+Native name of language (transcribed into ASCII):
+
+Reference to published description of the language (book or article):
+----------------------------------------------------------------------
+
+   The language form must be sent to <ietf-types@uninett.no> for a 2-
+   week review period before submitting it to IANA.  (This is an open
+   list. Requests to be added should be sent to <ietf-types-
+   request@uninett.no>.)
+
+   When the two week period has passed, the language tag reviewer, who
+   is appointed by the IETF Applications Area Director, either forwards
+   the request to IANA@ISI.EDU, or rejects it because of significant
+   objections raised on the list.
+
+   Decisions made by the reviewer may be appealed to the IESG.
+
+   All registered forms are available online in the directory
+   ftp://ftp.isi.edu/in-notes/iana/assignments/languages/
+
+6.  Security Considerations
+
+   Security issues are not discussed in this memo.
+
+7.  Character set considerations
+
+   Codes may always be expressed using the US-ASCII character repertoire
+   (a-z), which is present in most character sets.
+
+   The issue of deciding upon the rendering of a character set based on
+   the language tag is not addressed in this memo; however, it is
+   thought impossible to make such a decision correctly for all cases
+   unless means of switching language in the middle of a text are
+   defined (for example, a rendering engine that decides font based on
+   Japanese or Chinese language will fail to work when a mixed
+   Japanese-Chinese text is encountered).
+
+
+
+
+
+
+Alvestrand                                                      [Page 8]
+
+RFC 1766                      Language Tag                    March 1995
+
+
+8.  Acknowledgements
+
+   This document has benefited from innumerable rounds of review and
+   comments in various fora of the IETF and the Internet working groups.
+   As such, any list of contributors is bound to be incomplete; please
+   regard the following as only a selection from the group of people who
+   have contributed to make this document what it is today.
+
+   In alphabetical order:
+
+   Tim Berners-Lee, Nathaniel Borenstein, Jim Conklin, Dave Crocker,
+   Ned Freed, Tim Goodwin, Olle Jarnefors, John Klensin, Keith Moore,
+   Masataka Ohta, Keld Jorn Simonsen, Rhys Weatherley, and many, many
+   others.
+
+9.  Author's Address
+
+   Harald Tveit Alvestrand
+   UNINETT
+   Pb. 6883 Elgeseter
+   N-7002 TRONDHEIM
+   NORWAY
+
+   EMail: Harald.T.Alvestrand@uninett.no
+   Phone: +47 73 59 70 94
+
+10.  References
+
+    [ISO 639]
+         ISO 639:1988 (E/F) - Code for the representation of names of
+         languages - The International Organization for
+         Standardization, 1st edition, 1988 17 pages Prepared by
+         ISO/TC 37 - Terminology (principles and coordination).
+
+    [ISO 3166]
+         ISO 3166:1988 (E/F) - Codes for the representation of names
+         of countries - The International Organization for
+         Standardization, 3rd edition, 1988-08-15.
+
+    [RFC 1521]
+         Borenstein, N., and N. Freed, "MIME Part One: Mechanisms for
+         Specifying and Describing the Format of Internet Message
+         Bodies", RFC 1521, Bellcore, Innosoft, September 1993.
+
+    [RFC 1327]
+         Kille, S., "Mapping between X.400(1988) / ISO 10021 and RFC
+         822", RFC 1327, University College London, May 1992.
+
+
+
+
+Alvestrand                                                      [Page 9]
+
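The Language-Tag grammar in section 2 and the Content-Language header in section 3 of the RFC text added above can be illustrated with a small Python sketch. It is not part of this commit or of the RFC; the function names (is_valid_language_tag, same_language_tag, strip_comments, parse_content_language) are hypothetical and chosen only for illustration. It assumes ALPHA means the ASCII letters A-Z and a-z, and it simplifies RFC 822 comment handling by ignoring quoted-pair escapes inside comments.

```python
import re

# Section 2 grammar: Language-Tag = Primary-tag *( "-" Subtag ),
# where Primary-tag and Subtag are each 1*8ALPHA (assumed ASCII letters).
_TAG_RE = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def is_valid_language_tag(tag: str) -> bool:
    """Return True if `tag` matches the section 2 grammar (no whitespace)."""
    return _TAG_RE.fullmatch(tag) is not None


def same_language_tag(a: str, b: str) -> bool:
    """Compare two tags as whole tokens, ignoring case (section 2)."""
    return a.lower() == b.lower()


def strip_comments(value: str) -> str:
    """Drop parenthesized comments, which may nest.

    Simplification (assumption): backslash escapes inside comments
    are not interpreted.
    """
    out = []
    depth = 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
    return "".join(out)


def parse_content_language(value: str) -> list[str]:
    """Parse the value of a Content-Language header into lowercase tags.

    The header value is a comma-separated list (1#Language-tag) that may
    contain whitespace and comments around the tags.
    """
    tags = []
    for item in strip_comments(value).split(","):
        tag = item.strip()
        if not tag:
            continue  # skip empty list elements
        if not is_valid_language_tag(tag):
            raise ValueError(f"invalid language tag: {tag!r}")
        tags.append(tag.lower())
    return tags


if __name__ == "__main__":
    print(parse_content_language("en, fr (This is a dictionary)"))
    print(parse_content_language("i-sami-no (North Sami)"))
    print(same_language_tag("en-US", "EN-us"))
```

Lowercasing the parsed tags is a design choice of this sketch, made so that later comparisons are trivially case insensitive; the RFC only requires that case carry no meaning. In line with section 2.1, the sketch compares tags as whole tokens and does not infer any relationship from shared prefixes.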