diff options
| author | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 | 
|---|---|---|
| committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 | 
| commit | 4bfd864f10b68b71482b35c818559068ef8d5797 (patch) | |
| tree | e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc2237.txt | |
| parent | ea76e11061bda059ae9f9ad130a9895cc85607db (diff) | |
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc2237.txt')
| -rw-r--r-- | doc/rfc/rfc2237.txt | 339 | 
1 files changed, 339 insertions, 0 deletions
| diff --git a/doc/rfc/rfc2237.txt b/doc/rfc/rfc2237.txt new file mode 100644 index 0000000..21dac0a --- /dev/null +++ b/doc/rfc/rfc2237.txt @@ -0,0 +1,339 @@ + + + + + + +Network Working Group                                          K. Tamaru +Request for Comments: 2237                         Microsoft Corporation +Category: Informational                                    November 1997 + + + +           Japanese Character Encoding for Internet Messages + + +Status of this Memo + +   This memo provides information for the Internet community.  It does +   not specify an Internet standard of any kind.  Distribution of this +   memo is unlimited. + +Copyright Notice + +   Copyright (C) The Internet Society (1997).  All Rights Reserved. + +1. Abstract + +   This memo defines an encoding scheme for the Japanese Characters, +   describes "ISO-2022-JP-1", which is used in electronic mail [RFC- +   822], and network news [RFC 1036]. Also this memo provides a listing +   of the Japanese Character Set that can be used in this encoding +   scheme. + +2. Requirements Notation + +   This document uses terms that appear in capital letters to indicate +   particular requirements of this specification. Those terms are +   "MUST", "SHOULD", "MUST NOT", "SHOULD NOT", and "MAY". The meaning of +   each term are found in [RFC-2119] + +3. Introduction + +   RFC 1468 defines the way Japanese Characters are encoded, likewise +   what this memo defines. It defines the use of JIS X 0208 as the +   double-byte character set in ISO-2022-JP text. + +   Today, many operating systems support proprietary extended Japanese +   characters or JIS X 0212, This includes the Unicode character set, +   which does not conform to JIS X 0201 nor JIS X 0208. Therefore, this +   limits the ability to communicate and correspond precise information +   because of the limited availability of Kanji characters. Fortunately +   JIS (Japanese Industry Standard) defines JIS X 0212 as "code of the + + + + + +Tamaru                       Informational                      [Page 1] + +RFC 2237              Japanese Character Encoding          November 1997 + + +   supplementary Japanese graphic character set for information +   interchange". Most Japanese characters which are used in regular +   electronic mail in most cases can be accommodated in JIS X 0201, JIS +   X 0208 and JIS X 0212. + +   Also it is recognized that there is a tendency to use Unicode, +   however, Unicode is not yet widely used and there is a certain +   limitation with old electronic mail system. Furthermore, the purpose +   of this comment is to add the capability of writing out JIS X 0212. + +   This comment does not describe any representation of iso-2022-jp-1 +   version information in addition to JIS X 0212 support. + +4. Description + +   In "ISO-2022-JP-1" text, the initial character code of the message is +   in ASCII. The "double-byte-seq"(see "Format Syntax" section) (ESC "$" +   "B" / ESC "$" "@" / ESC "$" "(" "D") is the only designator that +   indicates that the following character is double-byte, and it is +   valid until another escape sequence appears.  It is very discouraged +   to use (ESC "$" "@") for double byte character encoding, new +   implementation SHOULD use only (ESC "$" "B") for double byte encoding +   instead. + +   The end of "ISO-2022-JP-1" text MUST be in ASCII. Also it is strongly +   recommended to back up to the ASCII at the end of each line rather +   than JIS X 0201-Roman if there is any none ASCII character in middle +   of a line. + +   Since "ISO-2022-JP-1" is designed to add the capability of writing +   out JIS X 0212, if the message does not contain none of JIS X 0212 +   characters. "ISO-2022-JP" text MUST BE used. + +   JIS X 0201-Roman is not identical to the ASCII with two different +   characters. + +   The following list are the escape sequences and character sets that +   can be used in "ISO-2022-JP-1" text. The registered number in the ISO +   2375 Register which allow double-byte ideographic scripts to be +   encoded within ISO/IEC 2022 code structure is indicated as reg# +   below. + +   reg# character set     ESC sequence                  designated to +   6    ASCII             ESC 2/8 4/2                   ESC ( B    G0 +   42   JIS X 0208-1978   ESC 2/4 4/0                   ESC $ @    G0 +   87   JIS X 0208-1983   ESC 2/4 4/2                   ESC $ B    G0 +   14   JIS X 0201-Roman  ESC 2/8 4/10                  ESC ( J    G0 +   159  JIS X 0212-1990   ESC 2/4 2/8 4/4               ESC $ ( D  G0 + + + +Tamaru                       Informational                      [Page 2] + +RFC 2237              Japanese Character Encoding          November 1997 + + +   Other restrictions are given in the Formal Syntax below. + +5. Formal Syntax + +   The notational conventions used here are identical to those used in +   STD 11, RFC 822 [RFC822]. + +   The * (asterisk) convention is as follows: +          l*m something +   meaning at least l and at most m something, with l and m taking +   default values of 0 and infinity, respectively. + +   iso-2022-jp-1-text  = *( line CRLF ) [line] + +   line                = (*single-byte-char *segment +                        single-byte-seq *single-byte-char) / +                        *single-byte-char + +   segment             = single-byte-segment / double-byte-segment + +   single-byte-segment = single-byte-seq *single-byte-char +   double-byte-segment = double-byte-seq *(one-of-94 one-of-94) + +   reset-seq           = ESC "(" ( "B" / "J" ) +   single-byte-seq     = ESC "(" ( "B" / "J" ) +   double-byte-seq     = (ESC "$" ( "@" / "B" )) / +                              (ESC "$" "(" "D" ) + +   CRLF             = CR LF;( Octal, Decimal.) +   ESC              = <ISO 2022 ESC, escape>;( 33,27.) +   SI               = <ISO 2022 SI, shift-in>;( 17,15.) +   SO               = <ISO 2022 SO, shift-out>;( 16,14.) +   CR               = <ASCII CR, carriage return>;( 15,13.) +   LF               = <ASCII LF, linefeed>;( 12,10.) +   one-of-94        = <any one of 94 values>;(41-176,33.-126.) +   one-of-96        = <any one of 96 values>;(40-177,32.-127.) +   7BIT             = <any 7-bit value>;(0-177,0.-127.) +   single-byte-char = <any 7BIT, including bare CR & bare LF, +                        but NOT including CRLF, and not including +                        ESC, SI, SO> + +6. Security Considerations + +   This memo raises no known security issues. + + + + + + + +Tamaru                       Informational                      [Page 3] + +RFC 2237              Japanese Character Encoding          November 1997 + + +7. MIME Considerations + +   The name to be used for the Japanese encoding scheme in content is +   "ISO-2022-JP-1". When this name is used in the MIME message form, it +   would be: + +              Content-Type: text/plain; charset=iso-2022-jp-1 + +   Since the "ISO-2022-JP-1" is 7bit encoding, it will be unnecessary to +   encode in another format by specifying the "Content-Transfer- +   Encoding" header. Also applying Based64 or Quoted-Printable encoding +   MAY cause today's software to fail to decode the message. + +   "ISO-2022-JP-1" can be used in MIME headers. Also "ISO-2022-JP-1" +   text can be used with Base64 or Quoted-Printable encoding. + +8. Additional Information + +   As long as mail systems are capable of writing out Unicode, it is +   recommended to also write out Unicode text in addition to "ISO- +   2022-JP-1" text. Also writing out "ISO-2022-JP" text in addition to +   "ISO-2022-JP-1" is strongly encouraged for backward compatibility +   reasons. + +   Some mail systems write out 8bits characters in 'parameter' and +   'value' defined in [RFC 822] and [RFC 1521]. All 8bit characters MUST +   NOT be used in those fields. The implementation of future mail +   systems SHOULD support those only for interoperability reasons. + +9. References + +   [ISO2022] +             International Organization for Standardization (ISO), +             "Information processing -- ISO 7-bit and 8-bit coded +             character sets -- Code extension techniques", +             International Standard, Ref. No. ISO 2022-1986 (E). + +   [ISOREG] +             International Organization for Standardization (ISO), +             "International Register of Coded Character Sets To Be Used +             With Escape Sequences". + +   [RFC-822] +             Crocker, D., "Standard for the Format of ARPA Internet +             Text Messages", STD 11, RFC 822, August 1982. + + + + + + +Tamaru                       Informational                      [Page 4] + +RFC 2237              Japanese Character Encoding          November 1997 + + +   [RFC-1468] +             Murai, J., Crispin, M., and E. van der Poel, "Japanese +             Character Encoding for Internet Messages", RFC 1468, June +             1993. + +   [RFC-1766] +             Alvestrand, H., "Tags for the Identification of +             Languages", RFC 1766, March 1995. + +   [RFC-2045] +             Freed, N., and N. Borenstein, "Multipurpose Internet Mail +             Extensions (MIME) Part One: Format of Internet Message +             Bodies", RFC 2045, December 1996. + +   [RFC-2046] +             Freed, N., and N. Borenstein, "Multipurpose Internet Mail +             Extensions (MIME) Part Two: Media Types", RFC 2046, +             December 1996. + +   [RFC-2047] +             Moore, K., "Multipurpose Internet Mail Extensions (MIME) +             Part Three: Representation of Non-ASCII Text in Internet +             Message Headers", RFC 2047, December 1996. + +   [RFC-2048] +             Freed, N., Klensin, J. and J. Postel, "Multipurpose +             Internet Mail Extensions (MIME) Part Four: MIME +             Registration Procedures", RFC 2048, December 1996. + +   [RFC-2049] +             Freed, N., and N. Borenstein, "Multipurpose Internet Mail +             Extensions (MIME) Part Five: Conformance Criteria and +             Examples", RFC 2049, December 1996. + +   [RFC-2119] +             Bradner, S., "Key words for use in RFCs to Indicate +             Requirement Levels", RFC 2119, March 1997. + +Author's Address + +   Kenzaburo Tamaru +   Microsoft Corporation +   One Microsoft Way +   Redmond, WA 98052-6399 + +   EMail: kenzat@microsoft.com + + + + + +Tamaru                       Informational                      [Page 5] + +RFC 2237              Japanese Character Encoding          November 1997 + + +Full Copyright Statement + +   Copyright (C) The Internet Society (1997).  All Rights Reserved. + +   This document and translations of it may be copied and furnished to +   others, and derivative works that comment on or otherwise explain it +   or assist in its implementation may be prepared, copied, published +   and distributed, in whole or in part, without restriction of any +   kind, provided that the above copyright notice and this paragraph are +   included on all such copies and derivative works.  However, this +   document itself may not be modified in any way, such as by removing +   the copyright notice or references to the Internet Society or other +   Internet organizations, except as needed for the purpose of +   developing Internet standards in which case the procedures for +   copyrights defined in the Internet Standards process must be +   followed, or as required to translate it into languages other than +   English. + +   The limited permissions granted above are perpetual and will not be +   revoked by the Internet Society or its successors or assigns. + +   This document and the information contained herein is provided on an +   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING +   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING +   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION +   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF +   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + + + + + + + + + + + + + + + + + + + + + + + + +Tamaru                       Informational                      [Page 6] + |