doc: Add RFC documents

author: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committer: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit: 4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree: e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc5242.txt
parent: ea76e11061bda059ae9f9ad130a9895cc85607db (diff)
1 files changed, 787 insertions, 0 deletions
diff --git a/doc/rfc/rfc5242.txt b/doc/rfc/rfc5242.txt
new file mode 100644
index 0000000..4f4d08e
--- /dev/null
+++ b/doc/rfc/rfc5242.txt
@@ -0,0 +1,787 @@
+
+
+
+
+
+
+Network Working Group                                         J. Klensin
+Request for Comments: 5242
+Category: Informational                                    H. Alvestrand
+                                                                  Google
+                                                            1 April 2008
+
+
+A Generalized Unified Character Code: Western European and CJK Sections
+
+Status of This Memo
+
+   This memo provides information for the Internet community.  It does
+   not specify an Internet standard of any kind.  Distribution of this
+   memo is unlimited.
+
+IESG Note
+
+   This is not an IETF document.  Readers should be aware of RFC 4690,
+   "Review and Recommendations for Internationalized Domain Names
+   (IDNs)", and its references.
+
+   This document is not a candidate for any level of Internet Standard.
+   The IETF disclaims any knowledge of the fitness of this document for
+   any purpose, and in particular notes that it has not had IETF review
+   for such things as security, congestion control, or inappropriate
+   interaction with deployed protocols.  The RFC Editor has chosen to
+   publish this document at its discretion.  Readers of this document
+   should exercise caution in evaluating its value for implementation
+   and deployment.
+
+Abstract
+
+   Many issues have been identified with the use of general-purpose
+   character sets for internationalized domain names and similar
+   purposes.  This memo describes a fully unified coded character set
+   for scripts based on Latin, Greek, Cyrillic, and Chinese (CJK)
+   characters.  It is not a complete specification of that character
+   set.
+
+
+
+
+
+
+
+
+
+
+
+
+
+Klensin & Alvestrand         Informational                      [Page 1]
+
+RFC 5242                      Unified CCS                     April 2008
+
+
+Table of Contents
+
+   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
+     1.1.  Terminology  . . . . . . . . . . . . . . . . . . . . . . .  3
+     1.2.  Discussion . . . . . . . . . . . . . . . . . . . . . . . .  4
+   2.  Types of Characters  . . . . . . . . . . . . . . . . . . . . .  4
+     2.1.  Base Character . . . . . . . . . . . . . . . . . . . . . .  4
+     2.2.  Nonspacing Marks . . . . . . . . . . . . . . . . . . . . .  4
+     2.3.  Case Indicators  . . . . . . . . . . . . . . . . . . . . .  4
+     2.4.  Joining Indicators . . . . . . . . . . . . . . . . . . . .  5
+     2.5.  Character-Matrix Positioning Indicators  . . . . . . . . .  5
+     2.6.  Position Shaping Controls  . . . . . . . . . . . . . . . .  6
+     2.7.  Repetition Indicators  . . . . . . . . . . . . . . . . . .  6
+     2.8.  Control Characters . . . . . . . . . . . . . . . . . . . .  7
+   3.  Code Assignment Groupings  . . . . . . . . . . . . . . . . . .  7
+   4.  Canonical Form . . . . . . . . . . . . . . . . . . . . . . . .  7
+   5.  Examples of Graphic Element Codes  . . . . . . . . . . . . . .  8
+   6.  Composite Characters and Unicode Equivalences  . . . . . . . . 10
+   7.  Ideographic Characters . . . . . . . . . . . . . . . . . . . . 11
+   8.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 11
+   9.  Security Considerations  . . . . . . . . . . . . . . . . . . . 12
+   10. Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . 12
+   11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 13
+     11.1. Normative References . . . . . . . . . . . . . . . . . . . 13
+     11.2. Informative References . . . . . . . . . . . . . . . . . . 13
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Klensin & Alvestrand         Informational                      [Page 2]
+
+RFC 5242                      Unified CCS                     April 2008
+
+
+1.  Introduction
+
+   Many issues have been identified with the use of general-purpose
+   character sets for internationalized domain names and similar
+   purposes.  This memo specifies a fully unified coded character set
+   for scripts based on Latin, Greek, Cyrillic, and Chinese characters.
+
+   There are four important principles in this work:
+
+   1.  If it looks alike, it is alike.  The number of base characters
+       and marks should be minimized.  Glyphs are more important than
+       character abstractions.
+
+   2.  If it is the same thing, it is the same thing.  Two symbols that
+       have the same semantic meaning in all contexts should be encoded
+       in a way that allows their identity to be discovered by removing
+       modifiers, rather than having to resort to external equivalence
+       tables.
+
+   3.  For simplicity, when a character form can be evaluated on the
+       basis of either serif or sanserif fonts, the sanserif font is
+       always preferred.
+
+   4.  The use of combining characters and modifiers is preferred to
+       adding more base characters.
+
+   Based on these principles, it becomes obvious that:
+
+   o  Ligatures, digraphs, and final forms are constructed with special
+      modifiers so that relationships to basic forms are obvious.
+
+   o  Symbols consisting of multiple marks are always constructed from
+      combining characters and positional modifiers; thus, the "i"
+      character is constructed from the vertical line symbol followed by
+      a combining dot above.  Similarly "f" is composed of a centered
+      vertical line, a right hook in the top position, and an
+      appropriately-positioned composing hyphen.
+
+   This document draws strongly from the design and terminology of
+   Unicode [Unicode] but represents a radically different approach.
+
+1.1.  Terminology
+
+   All special-use terms in this document, including descriptions of
+   behaviors and related relationships, are used with their common-sense
+   meanings.
+
+
+
+
+
+Klensin & Alvestrand         Informational                      [Page 3]
+
+RFC 5242                      Unified CCS                     April 2008
+
+
+1.2.  Discussion
+
+   Questions to, and contributions for, this coding system should be
+   addressed to the mailing list
+   unified-ccs@xn--iwem3b1f.xn--90ase1a.bogus.domain.name.
+
+2.  Types of Characters
+
+   This document defines several types of characters.  Note that these
+   definitions are not the same as the Unicode definitions for similar
+   or identical terms.
+
+2.1.  Base Character
+
+   Any character that is used as an atomic shape, rather than being
+   assembled from such a character in combination with combining
+   (overstriking) marks, symbols, or specially-designed base characters.
+   When used alone, base characters always take up space.  For example,
+   a, c, l,...
+
+2.2.  Nonspacing Marks
+
+   Marks, symbols, and character components that are used to form
+   characters when used in combination with base characters.  They do
+   not occupy separate character positions when displayed.
+
+   For example, the special combining symbols LeftUpperHook and
+   RightLowerHook, described in Section 5, are nonspacing marks.
+
+2.3.  Case Indicators
+
+   In scripts with case, only the lower-case characters are base
+   characters.  Upper-case forms are represented by using the UC
+   modifier.  So the traditional "A" character is represented by
+   "a<UC>".  Note that this means that case-independent comparisons are
+   made simply by ignoring the <UC> modifiers rather than by complicated
+   mapping operations.
+
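As an illustration outside the RFC text, the comparison rule in
Section 2.3 can be sketched in a few lines of Python.  This is a
minimal sketch, assuming a code sequence is held as a list of integers;
the names strip_case and equal_ignoring_case are hypothetical.  Only
the code values 1 (UC), 40A (a), and 40C (c) come from the memo.

```python
UC = 0x001  # upper-case modifier, code value 1 (Section 2.3)


def strip_case(seq):
    """Return the code sequence with every <UC> modifier removed."""
    return [code for code in seq if code != UC]


def equal_ignoring_case(left, right):
    """Case-independent comparison: ignore <UC>, then compare as-is."""
    return strip_case(left) == strip_case(right)


small_a = [0x40A]        # LATIN SMALL LETTER A (Section 5)
capital_a = [0x40A, UC]  # "a<UC>", LATIN CAPITAL LETTER A (Section 6)
small_c = [0x40C]        # LATIN SMALL LETTER C (Section 5)

print(equal_ignoring_case(small_a, capital_a))
print(equal_ignoring_case(capital_a, small_c))
```

No mapping table is consulted: the comparison is a filter followed by
list equality, which is the point the section makes.
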
+   The initial set of case modifiers consists exclusively of:
+
+   UC Upper-case, code value 1 (hexadecimal)
+
+   The code values two through four are reserved for the impending
+   encoding of scripts with more than two cases; five is reserved for
+   expansion in case a script with more than four cases is identified.
+
+
+
+
+
+
+Klensin & Alvestrand         Informational                      [Page 4]
+
+RFC 5242                      Unified CCS                     April 2008
+
+
+2.4.  Joining Indicators
+
+   Zero-width joiners are used to build characters, not only to separate
+   or join words.  As compared to Unicode, a richer set of joiners is
+   used to distinguish between the inter-word and ligature-forming
+   (including half-character forming) cases.  Unicode ZWJ and ZWNJ are
+   supplemented by ZWCJ, OJ, and ONJ.  ZWCJ is used to modify a spacing
+   basic character into a nonspacing role.  For example, there is no "w"
+   character, but only "u<ZWCJ>u".  Upper-case "W" is coded as
+   u<ZWCJ>u<UC> -- the ZWCJ binds more tightly than the UC modifier.
+
+   The initial set of joining indicators consists exclusively of:
+
+   ZWCJ  Character joiner (also known as "ligature joiner"), code value
+      6 (hexadecimal).
+
+   OJ Overlay joiner (permits use of a subsequent character that would
+      normally be spacing as nonspacing), code value 7 (hexadecimal).
+
+   ONJ  Overlay non-joiner (turns a nonspacing mark into a standalone
+      character), code value 8 (hexadecimal).  This joiner should not be
+      necessary, and is normally prohibited by the "shortest string"
+      rule.  But there may be unanticipated cases.
+
+   ZWJ  Zero-width joiner for words or word-like constructions, code
+      value 9 (hexadecimal).
+
+   ZWNJ  Zero-width non-joiner for words or word-like constructions,
+      code value A (hexadecimal).
+
+2.5.  Character-Matrix Positioning Indicators
+
+   Many characters are defined by constructed glyphs using nonspacing
+   marks.  For example, the characters "b" and "d" are coded as
+   o<VerticalLine><PositionLeft> and o<VerticalLine><PositionRight>,
+   respectively.  The Catalan ligature that has caused some difficulties
+   in Internationalizing Domain Names in Applications (IDNA) [RFC3490]
+   is coded as l<ZWCJ><.><PositionVMiddle><ZWCJ>l
+
+
+
+
+
+
+
+
+
+
+
+
+
+Klensin & Alvestrand         Informational                      [Page 5]
+
+RFC 5242                      Unified CCS                     April 2008
+
+
+   The initial table of positioning indicators is:
+
+                     +-------------------+-----------+
+                     | Name              | Hex value |
+                     +-------------------+-----------+
+                     | PositionLeft      |        20 |
+                     | PositionCenter    |        21 |
+                     | PositionRight     |        22 |
+                     | PositionTop       |        30 |
+                     | PositionVMiddle   |        31 |
+                     | PositionBottom    |        32 |
+                     | PositionDescender |        33 |
+                     +-------------------+-----------+
+
+2.6.  Position Shaping Controls
+
+   These controls designate character form changes for initial or final-
+   form characters.  Where the distinction is important, medial-form
+   characters are the default when no qualification occurs.  As with
+   case comparisons, comparisons are performed by ignoring these control
+   functions.
+
+                        +-------------+-----------+
+                        | Name        | Hex value |
+                        +-------------+-----------+
+                        | InitialForm |        71 |
+                        | FinalForm   |        72 |
+                        +-------------+-----------+
+
+2.7.  Repetition Indicators
+
+   For compactness of coding, two repetition indicators are introduced
+   for double (Repeat2) and triple (Repeat3) characters that may be
+   treated as ligatures or special cases.  Two consecutive uses of a
+   character compare equal to the character followed by <Repeat2>.  The
+   interpretation of u<ZWCJ>u<Repeat3> is left as an exercise for the
+   reader.
+
+              The initial table of repetition indicators is:
+
+                          +---------+-----------+
+                          | Name    | Hex value |
+                          +---------+-----------+
+                          | Repeat2 |        50 |
+                          | Repeat3 |        51 |
+                          | Repeat1 |        52 |
+                          +---------+-----------+
+
+
+
+
+Klensin & Alvestrand         Informational                      [Page 6]
+
+RFC 5242                      Unified CCS                     April 2008
+
+
+   For larger repeats, these repeats can be combined; the sequence
+   <Repeat2><Repeat3> represents six repeats, while the sequence
+   <Repeat3><Repeat2> represents five repeats.  Following the "shortest
+   string" principle (see Section 4), Repeat1 must not ever appear
+   except in combination with Repeat2 and/or Repeat3.  The generation of
+   other numbers is left as an exercise for the reader.
+
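As an illustration outside the RFC text, the equality rule for a single
repetition indicator can be sketched in Python.  This is a minimal
sketch under stated assumptions: sequences are lists of integers, each
repeated character is a single code value (no composites), and the
names expand_repeats and equal_with_repeats are hypothetical.  Combined
indicators and Repeat1 are rejected rather than interpreted, since
their arithmetic is left to the reader by the memo.

```python
REPEAT2 = 0x050  # Section 2.7 table
REPEAT3 = 0x051
REPEAT1 = 0x052
REPEAT_COUNTS = {REPEAT2: 2, REPEAT3: 3}


def expand_repeats(seq):
    """Replace a lone <Repeat2>/<Repeat3> by explicit repetitions."""
    expanded = []
    previous_was_indicator = False
    for code in seq:
        if code == REPEAT1:
            raise ValueError("Repeat1 is not handled by this sketch")
        if code in REPEAT_COUNTS:
            # An indicator needs a preceding character, and this sketch
            # does not interpret two indicators in a row.
            if not expanded or previous_was_indicator:
                raise ValueError("unsupported use of a repetition indicator")
            expanded.extend([expanded[-1]] * (REPEAT_COUNTS[code] - 1))
            previous_was_indicator = True
        else:
            expanded.append(code)
            previous_was_indicator = False
    return expanded


def equal_with_repeats(left, right):
    """Compare two sequences after expanding repetition indicators."""
    return expand_repeats(left) == expand_repeats(right)


print(equal_with_repeats([0x40A, 0x40A], [0x40A, REPEAT2]))
```
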
+2.8.  Control Characters
+
+   Because it is intended primarily for domain names, this specification
+   has no provision for control or spacing characters.
+
+3.  Code Assignment Groupings
+
+   Following the reasoning used in Unicode [Unicode], every character
+   occupies exactly 23 bits (conventionally stored as three octets, with
+   the leading bit always zero).  This value is chosen because both 3
+   and 23 are prime numbers, unlike 42.
+
+   The code point value zero is permanently reserved and will not be
+   used unless it is necessary to expand the code space.
+
+   Code values between 1 and 255 (decimal) are reserved for the special
+   character formation codes described in Section 2.3 through
+   Section 2.7.
+
+   Code values between 256 and 511 (decimal) are reserved for character
+   formation marks for non-ideographic characters.  Most, but not all,
+   of these are nonspacing (combining) characters.
+
+   Code values between 512 and 1023 are reserved on general principles
+   and in case it is necessary to invent new rules and make them
+   retroactive.
+
+   Code values of 1024 and above are to be allocated for characters,
+   glyphs, and other character elements.
+
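As an illustration outside the RFC text, the groupings in Section 3 can
be sketched in Python.  This is a minimal sketch: the memo does not
state an octet order, so big-endian storage is an assumption, and the
names encode_code and classify, along with the group labels they
return, are hypothetical.

```python
CODE_BITS = 23  # Section 3: every character occupies exactly 23 bits


def _check_range(value):
    if not 0 <= value < 2 ** CODE_BITS:
        raise ValueError(f"code value out of 23-bit range: {value:#x}")


def encode_code(value):
    """Pack one code value into three octets; the leading bit stays zero."""
    _check_range(value)
    return value.to_bytes(3, "big")  # octet order is an assumption


def classify(value):
    """Name the Section 3 grouping that a code value falls into."""
    _check_range(value)
    if value == 0:
        return "reserved (zero)"
    if value <= 255:
        return "formation code"
    if value <= 511:
        return "formation mark"
    if value <= 1023:
        return "reserved (general principles)"
    return "character element"


for code in (0x001, 0x102, 0x400):
    print(f"{code:03X}", encode_code(code).hex(), classify(code))
```

The three sample values are UC (1), VERTICAL LINE (102), and DIGIT ZERO
(400) from the tables in Sections 2.3 and 5.
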
+4.  Canonical Form
+
+   When glyphs are constructed using the mechanisms described here,
+   there is a single canonical form for representing any given glyph.
+   There are no exceptions to that form, and any sequence of characters
+   and qualifiers that is not consistent with the form is invalid.  If
+   there are two possible ways to represent a given character, the
+   shorter one (in octet count) is the only permitted form.  If there
+   are two possible ways that are of the same length, the only permitted
+   form is the one that has the smaller value when the numeric values of
+   all of the octets in each are summed.
+
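As an illustration outside the RFC text, the two tie-breaking rules
above can be sketched in Python.  This is a minimal sketch, assuming
three big-endian octets per code value (the octet order is not stated
in the memo).  The names to_octets and pick_canonical are hypothetical,
and the candidate spellings are made up for the example: deciding
whether two sequences really draw the same glyph is outside the sketch.

```python
def to_octets(seq):
    """Serialize a list of code values, three octets per code."""
    return b"".join(code.to_bytes(3, "big") for code in seq)


def pick_canonical(candidates):
    """Pick the permitted form among spellings assumed to be equivalent."""
    def rank(seq):
        data = to_octets(seq)
        # Shorter octet count wins; on a tie, the smaller octet sum wins.
        return (len(data), sum(data))
    return min(candidates, key=rank)


# Hypothetical candidates, assumed here to render the same glyph.
spelling_x = [0x418, 0x007, 0x102, 0x020]
spelling_y = [0x102, 0x007, 0x418, 0x022]
spelling_z = [0x418, 0x007, 0x102, 0x020, 0x052]

chosen = pick_canonical([spelling_z, spelling_y, spelling_x])
print([f"{code:03X}" for code in chosen])
```

Here spelling_z loses on length, and spelling_x beats spelling_y
because its octets sum to 70 rather than 72.
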
+
+
+Klensin & Alvestrand         Informational                      [Page 7]
+
+RFC 5242                      Unified CCS                     April 2008
+
+
+   The ordering rules are as follows:
+
+   1.  A base character or composite character (see below) must come
+       first.
+
+   2.  The base character may be followed by ZWCJ or OJ, but not both,
+       followed by a base or nonspacing character or mark.
+
+   3.  If ZWCJ appears, the next character must be a base character or
+       nonspacing mark.
+
+   4.  If OJ appears, the next character must be a base character, since
+       the function of OJ is to make a spacing base character into a
+       nonspacing (overlay) character.
+
+   5.  That character can be followed by positional qualifiers that
+       apply to it.  Vertical positional qualifiers precede horizontal
+       positional qualifiers.
+
+   6.  That sequence of characters may be followed by a case qualifier.
+
+   7.  That entire sequence of characters forms a composite character.
+       When the composite character is non-trivial, the rules may be
+       applied to it recursively.  If grouping is needed to distinguish
+       between one composite character and the next, ZWNCJ may be used
+       at the beginning of a composite character to identify a group
+       boundary.
+
+5.  Examples of Graphic Element Codes
+
+   The initial lists of positioning and combining controls appear above.
+   This section shows codes for some base characters.  Names in upper
+   case are the Unicode names for the characters.  These are followed,
+   for information, by the Unicode code point designations.  The code
+   point list is informative, not normative, and may not be complete
+   (especially since additional matching code points may be added to
+   Unicode over time).  Note that several Unicode characters that are
+   considered different by Unicode are assigned the same code sequence
+   in the system specified here.
+
+
+
+
+
+
+
+
+
+
+
+
+Klensin & Alvestrand         Informational                      [Page 8]
+
+RFC 5242                      Unified CCS                     April 2008
+
+
+   +------------------------+-------+----------------------------------+
+   | Name                   |   Hex | Comment                          |
+   |                        | value |                                  |
+   +------------------------+-------+----------------------------------+
+   | FULL STOP (U+002E)     |   110 | Used as both base character (in  |
+   |                        |       | bottom center position) and as   |
+   |                        |       | movable dot with OJ and          |
+   |                        |       | positional qualifiers.           |
+   | HYPHEN-MINUS (U+002D)  |   108 | Used as a spacing base character |
+   |                        |       | (in horizontally and vertically  |
+   |                        |       | centered position) and as a      |
+   |                        |       | movable half-width horizontal    |
+   |                        |       | line with OJ and positional      |
+   |                        |       | qualifiers.  In the context of   |
+   |                        |       | this specification, should be    |
+   |                        |       | known as Half Horizontal Line.   |
+   | LOW LINE (U+005F)      |   109 | Used as a spacing base character |
+   |                        |       | (in bottom position) and as a    |
+   |                        |       | movable full-width horizontal    |
+   |                        |       | line with OJ and positional      |
+   |                        |       | qualifiers.  In the context of   |
+   |                        |       | this specification, should be    |
+   |                        |       | known as Horizontal Line.        |
+   | VERTICAL LINE (U+007C) |   102 | As with the horizontal lines,    |
+   |                        |       | normally a spacing base          |
+   |                        |       | character (in the middle         |
+   |                        |       | position between left and        |
+   |                        |       | right), but can be used as a     |
+   |                        |       | right to left movable            |
+   |                        |       | full-height vertical line with   |
+   |                        |       | OJ and/or positional qualifiers. |
+   | HalfHeightVerticalLine |   105 | Similar to VERTICAL LINE, but    |
+   |                        |       | only half height.                |
+   | SOLIDUS (U+002F)       |   103 | Used only for character          |
+   |                        |       | formation; forward slash         |
+   | REVERSE SOLIDUS        |   104 | Used only for character          |
+   | (U+005C)               |       | formation; reverse slash         |
+   | RightUpperHook         |   131 | Used only for character          |
+   |                        |       | formation; nonspacing mark.      |
+   | LeftUpperHook          |   132 | Used only for character          |
+   |                        |       | formation; nonspacing mark.      |
+   | LeftLowerHook          |   133 | Used only for character          |
+   |                        |       | formation; nonspacing mark.      |
+   | RightLowerHook         |   134 | Used only for character          |
+   |                        |       | formation; nonspacing mark.      |
+   | HalfHeightHoop         |   140 | Used only for character          |
+   |                        |       | formation; nonspacing mark.      |
+
+
+
+
+Klensin & Alvestrand         Informational                      [Page 9]
+
+RFC 5242                      Unified CCS                     April 2008
+
+
+   | HalfHeightInvertedHoop |   141 | Used only for character          |
+   |                        |       | formation; nonspacing mark.      |
+   | DIGIT ZERO (U+0030)    |   400 |                                  |
+   | DIGIT ONE (U+0031)     |   401 |                                  |
+   | DIGIT TWO (U+0032)     |   402 |                                  |
+   | DIGIT NINE (U+0039)    |   409 |                                  |
+   | LATIN SMALL LETTER A   |   40A |                                  |
+   | (U+0061)               |       |                                  |
+   | LATIN SMALL LETTER O   |   418 | Unify with Greek Omicron         |
+   | (U+006F, U+03BF)       |       |                                  |
+   | LATIN SMALL LETTER C   |   40C | Unifying C with Cyrillic ES      |
+   | (U+0063, U+0441)       |       |                                  |
+   | GREEK SMALL LETTER     |   491 |                                  |
+   | SIGMA (U+03C3)         |       |                                  |
+   +------------------------+-------+----------------------------------+
+
+6.  Composite Characters and Unicode Equivalences
+
+   This section provides examples of characters that are derived from or
+   based on others, known as "composite characters".
+
+   +------------------+--------------+---------------------------------+
+   | Name             |    Hex value | Comment                         |
+   +------------------+--------------+---------------------------------+
+   | LATIN SMALL      |  418 007 102 |                                 |
+   | LETTER B         |          020 |                                 |
+   | (U+0062)         |              |                                 |
+   | LATIN SMALL      |  418 007 102 |                                 |
+   | LETTER D         |          022 |                                 |
+   | (U+0064)         |              |                                 |
+   | LATIN SMALL      |  40C 007 108 |                                 |
+   | LETTER E         |          031 |                                 |
+   | (U+0065)         |              |                                 |
+   | LATIN SMALL      |  40A 006 40C |                                 |
+   | LETTER AE        |  007 108 031 |                                 |
+   | (U+00E6)         |              |                                 |
+   | LATIN SMALL      |  102 131 030 | Note that 007 is not needed     |
+   | LETTER F         |      007 108 | before 131 because hooks are    |
+   | (U+0066)         |              | exclusively nonspacing          |
+   |                  |              | (combining).                    |
+   | LATIN SMALL      |  102 020 141 |                                 |
+   | LETTER H         |      021 032 |                                 |
+   | (U+0068)         |              |                                 |
+   | LATIN SMALL      |  105 007 110 |                                 |
+   | LETTER I         |      021 030 |                                 |
+   | (U+0069)         |              |                                 |
+
+
+
+
+
+Klensin & Alvestrand         Informational                     [Page 10]
+
+RFC 5242                      Unified CCS                     April 2008
+
+
+   | LATIN SMALL      |  105 020 141 |                                 |
+   | LETTER N         |      021 032 |                                 |
+   | (U+006E)         |              |                                 |
+   | LATIN SMALL      |  418 007 102 | Unified P, Greek Rho, Cyrillic  |
+   | LETTER P         |  033 020 033 | ER                              |
+   | (U+0070, U+03C1, |              |                                 |
+   | U+0440)          |              |                                 |
+   | LATIN CAPITAL    |      40A 001 |                                 |
+   | LETTER A         |              |                                 |
+   | (U+0041)         |              |                                 |
+   | LATIN CAPITAL    |  418 007 102 |                                 |
+   | LETTER B         |      020 001 |                                 |
+   | (U+0042)         |              |                                 |
+   | LATIN CAPITAL    |      40C 001 |                                 |
+   | LETTER C         |              |                                 |
+   | (U+0043)         |              |                                 |
+   | LATIN CAPITAL    |  418 007 102 |                                 |
+   | LETTER D         |      022 001 |                                 |
+   | (U+0044)         |              |                                 |
+   | GREEK SMALL      |      491 072 |                                 |
+   | LETTER FINAL     |              |                                 |
+   | SIGMA (U+03C2)   |              |                                 |
+   +------------------+--------------+---------------------------------+
+
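As an illustration outside the RFC text, the hex sequences in the table
above can be turned back into the symbolic notation used in Section 2.
This is a minimal sketch: the code values are copied from the tables in
Sections 2, 5, and 6, while the function name describe, the two lookup
tables, and the fallback rendering of unknown codes as bracketed hex
are hypothetical choices.

```python
# Letters shown bare in the memo's notation (Section 5).
LETTER_NAMES = {0x40A: "a", 0x40C: "c", 0x418: "o", 0x491: "sigma"}

# Modifiers and graphic elements shown in angle brackets.
ELEMENT_NAMES = {
    0x001: "UC", 0x006: "ZWCJ", 0x007: "OJ",
    0x020: "PositionLeft", 0x021: "PositionCenter",
    0x022: "PositionRight", 0x030: "PositionTop",
    0x031: "PositionVMiddle", 0x032: "PositionBottom",
    0x033: "PositionDescender", 0x072: "FinalForm",
    0x102: "VerticalLine", 0x105: "HalfHeightVerticalLine",
    0x108: "HalfHorizontalLine", 0x110: ".",
    0x131: "RightUpperHook", 0x141: "HalfHeightInvertedHoop",
}


def describe(seq):
    """Render a list of code values in the memo's symbolic notation."""
    parts = []
    for code in seq:
        if code in LETTER_NAMES:
            parts.append(LETTER_NAMES[code])
        elif code in ELEMENT_NAMES:
            parts.append(f"<{ELEMENT_NAMES[code]}>")
        else:
            parts.append(f"<{code:03X}>")  # unknown code: show raw hex
    return "".join(parts)


print(describe([0x418, 0x007, 0x102, 0x020]))  # LATIN SMALL LETTER B
print(describe([0x40A, 0x001]))                # LATIN CAPITAL LETTER A
```
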
+7.  Ideographic Characters
+
+   Because of the traditional model of forming characters using selected
+   radicals and strokes in combination, Han-derived ("CJK") characters
+   are even more naturally represented, with less ambiguity, in the
+   system specified here than European ones.  The mechanisms used in
+   this specification and represented in the tables (see Section 8) are
+   similar to those described as "Radicals" and "Strokes" in Section 5.1
+   and in Section 5.2 ("Ideographic Description Characters") of The
+   Unicode Standard [Unicode].  Of course, following the same principles
+   outlined above for European characters, only radicals, strokes, and
+   description controls would be treated as base characters; no distinct
+   compound precomposed ideographic characters are registered.
+
+8.  IANA Considerations
+
+   IANA is requested to keep the actual registry of characters and code
+   tables.  The registry entries consist of a character name (preferably
+   matching the Unicode character name when one is available), the code
+   sequence used to represent the character and optional descriptive
+   information.  The characters and codes identified in Section 2,
+   Section 5, and Section 6 above should be used to initialize the
+   table.  Since the coding system is user-extensible, registrations
+   should be accepted for new characters as long as they don't look like
+
+
+
+Klensin & Alvestrand         Informational                     [Page 11]
+
+RFC 5242                      Unified CCS                     April 2008
+
+
+   old ones.  A designated expert with a background in calligraphy or
+   abstract art, and considerable experience in evaluating claims about
+   the count of angels on heads of pins, should be selected to advise
+   IANA on "looks like".
+
+9.  Security Considerations
+
+   The representation of characters in this format should be a
+   significant boon for security.  It eliminates many possibilities of
+   phishing attacks, since Principle 1 prevents the existence of two
+   characters that look alike but are different.
+
+   By detaching the encoding of characters for domain names from the
+   encoding of characters for other purposes, it also guarantees that
+   reasonable-looking names will have been encoded by competent
+   entities, thereby providing a significant degree of safety by
+   obscurity.
+
+   Because of the method by which upper-case forms are encoded and
+   because similarity is sometimes in the mind of the beholder, this
+   specification will not completely eliminate opportunities for visual
+   confusion.  For example, because the lower-case characters are quite
+   different, LATIN CAPITAL LETTER A and GREEK CAPITAL LETTER ALPHA will
+   never compare equal, even though they look alike.
+
+10.  Acknowledgments
+
+   The authors would like to acknowledge the many contributions of
+   J.F.C. Morphin for pointing out the inadequacies of trying to address
+   the challenges of internationalization within the context of existing
+   engineering principles.  His comments and related ones, in
+   combination with issues encountered in trying to internationalize
+   domain names based on Unicode, have contributed greatly to the frame
+   of mind underlying large parts of the proposal documented here.  The
+   theoretical framework for this coding system is based, in part, on
+   Unicode and its collection of names and sample glyphs but represents
+   a very different approach to the coding system itself.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Klensin & Alvestrand         Informational                     [Page 12]
+
+RFC 5242                      Unified CCS                     April 2008
+
+
+11.  References
+
+11.1.  Normative References
+
+   [Unicode]  The Unicode Consortium, "The Unicode Standard, Version
+              5.0", 2007.
+              Boston, MA, USA: Addison-Wesley.  ISBN 0-321-48091-0
+
+11.2.  Informative References
+
+   [RFC3490]  Faltstrom, P., Hoffman, P., and A. Costello,
+              "Internationalizing Domain Names in Applications (IDNA)",
+              RFC 3490, March 2003.
+
+Authors' Addresses
+
+   John C Klensin
+   1770 Massachusetts Ave, #322
+   Cambridge, MA  02140
+   USA
+
+   Phone: +1 617 491 5735
+   EMail: john+ietf@jck.com
+
+
+   Harald Tveit Alvestrand
+   Google
+   Beddingen 10
+   Trondheim,   7014
+   Norway
+
+   EMail: harald@alvestrand.no
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Klensin & Alvestrand         Informational                     [Page 13]
+
+RFC 5242                      Unified CCS                     April 2008
+
+
+Full Copyright Statement
+
+   Copyright (C) The IETF Trust (2008).
+
+   This document is subject to the rights, licenses and restrictions
+   contained in BCP 78 and at http://www.rfc-editor.org/copyright.html,
+   and except as set forth therein, the authors retain all their rights.
+
+   This document and the information contained herein are provided on an
+   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
+   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
+   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
+   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
+   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
+   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Intellectual Property
+
+   The IETF takes no position regarding the validity or scope of any
+   Intellectual Property Rights or other rights that might be claimed to
+   pertain to the implementation or use of the technology described in
+   this document or the extent to which any license under such rights
+   might or might not be available; nor does it represent that it has
+   made any independent effort to identify any such rights.  Information
+   on the procedures with respect to rights in RFC documents can be
+   found in BCP 78 and BCP 79.
+
+   Copies of IPR disclosures made to the IETF Secretariat and any
+   assurances of licenses to be made available, or the result of an
+   attempt made to obtain a general license or permission for the use of
+   such proprietary rights by implementers or users of this
+   specification can be obtained from the IETF on-line IPR repository at
+   http://www.ietf.org/ipr.
+
+   The IETF invites any interested party to bring to its attention any
+   copyrights, patents or patent applications, or other proprietary
+   rights that may cover technology that may be required to implement
+   this standard.  Please address the information to the IETF at
+   ietf-ipr@ietf.org.
+
+
+
+
+
+
+
+
+
+
+
+
+Klensin & Alvestrand         Informational                     [Page 14]
+