summaryrefslogtreecommitdiff
path: root/doc/rfc/rfc1468.txt
diff options
context:
space:
mode:
authorThomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committerThomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
treee3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc1468.txt
parentea76e11061bda059ae9f9ad130a9895cc85607db (diff)
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc1468.txt')
-rw-r--r--doc/rfc/rfc1468.txt339
1 files changed, 339 insertions, 0 deletions
diff --git a/doc/rfc/rfc1468.txt b/doc/rfc/rfc1468.txt
new file mode 100644
index 0000000..e8cb5b8
--- /dev/null
+++ b/doc/rfc/rfc1468.txt
@@ -0,0 +1,339 @@
+
+
+
+
+
+
+Network Working Group J. Murai
+Request for Comments: 1468 Keio University
+ M. Crispin
+ Panda Programming
+ E. van der Poel
+ June 1993
+
+
+ Japanese Character Encoding for Internet Messages
+
+Status of this Memo
+
+ This memo provides information for the Internet community. It does
+ not specify an Internet standard. Distribution of this memo is
+ unlimited.
+
+Introduction
+
+ This document describes the encoding used in electronic mail [RFC822]
+ and network news [RFC1036] messages in several Japanese networks. It
+ was first specified by and used in JUNET [JUNET]. The encoding is now
+ also widely used in Japanese IP communities.
+
+ The name given to this encoding is "ISO-2022-JP", which is intended
+ to be used in the "charset" parameter field of MIME headers (see
+ [MIME1] and [MIME2]).
+
+Description
+
+ The text starts in ASCII [ASCII], and switches to Japanese characters
+ through an escape sequence. For example, the escape sequence ESC $ B
+ (three bytes, hexadecimal values: 1B 24 42) indicates that the bytes
+ following this escape sequence are Japanese characters, which are
+ encoded in two bytes each. To switch back to ASCII, the escape
+ sequence ESC ( B is used.
+
+ The following table gives the escape sequences and the character sets
+ used in ISO-2022-JP messages. The ISOREG number is the registration
+ number in ISO's registry [ISOREG].
+
+ Esc Seq Character Set ISOREG
+
+ ESC ( B ASCII 6
+ ESC ( J JIS X 0201-1976 ("Roman" set) 14
+ ESC $ @ JIS X 0208-1978 42
+ ESC $ B JIS X 0208-1983 87
+
+ Note that JIS X 0208 was called JIS C 6226 until the name was changed
+
+
+
+Murai, Crispin & van der Poel [Page 1]
+
+RFC 1468 Japanese Character Encoding for Internet Messages June 1993
+
+
+ on March 1st, 1987. Likewise, JIS C 6220 was renamed JIS X 0201.
+
+ The "Roman" character set of JIS X 0201 [JISX0201] is identical to
+ ASCII except for backslash () and tilde (~). The backslash is
+ replaced by the Yen sign, and the tilde is replaced by overline. This
+ set is Japan's national variant of ISO 646 [ISO646].
+
+ The JIS X 0208 [JISX0208] character sets consist of Kanji, Hiragana,
+ Katakana and some other symbols and characters. Each character takes
+ up two bytes.
+
+ For further details about the JIS Japanese national character set
+ standards, refer to [JISX0201] and [JISX0208]. For further
+ information about the escape sequences, see [ISO2022] and [ISOREG].
+
+ If there are JIS X 0208 characters on a line, there must be a switch
+ to ASCII or to the "Roman" set of JIS X 0201 before the end of the
+ line (i.e., before the CRLF). This means that the next line starts in
+ the character set that was switched to before the end of the previous
+ line.
+
+ Also, the text must end in ASCII.
+
+ Other restrictions are given in the Formal Syntax below.
+
+Formal Syntax
+
+ The notational conventions used here are identical to those used in
+ RFC 822 [RFC822].
+
+ The * (asterisk) convention is as follows:
+
+ l*m something
+
+ meaning at least l and at most m somethings, with l and m taking
+ default values of 0 and infinity, respectively.
+
+
+ message = headers 1*( CRLF *single-byte-char *segment
+ single-byte-seq *single-byte-char )
+ ; see also [MIME1] "body-part"
+ ; note: must end in ASCII
+
+ headers = <see [RFC822] "fields" and [MIME1] "body-part">
+
+ segment = single-byte-segment / double-byte-segment
+
+ single-byte-segment = single-byte-seq 1*single-byte-char
+
+
+
+Murai, Crispin & van der Poel [Page 2]
+
+RFC 1468 Japanese Character Encoding for Internet Messages June 1993
+
+
+ double-byte-segment = double-byte-seq 1*( one-of-94 one-of-94 )
+
+ single-byte-seq = ESC "(" ( "B" / "J" )
+
+ double-byte-seq = ESC "$" ( "@" / "B" )
+
+ CRLF = CR LF
+
+ ; ( Octal, Decimal.)
+
+ ESC = <ISO 2022 ESC, escape> ; ( 33, 27.)
+
+ SI = <ISO 2022 SI, shift-in> ; ( 17, 15.)
+
+ SO = <ISO 2022 SO, shift-out> ; ( 16, 14.)
+
+ CR = <ASCII CR, carriage return>; ( 15, 13.)
+
+ LF = <ASCII LF, linefeed> ; ( 12, 10.)
+
+ one-of-94 = <any one of 94 values> ; (41-176, 33.-126.)
+
+ 7BIT = <any 7-bit value> ; ( 0-177, 0.-127.)
+
+ single-byte-char = <any 7BIT, including bare CR & bare LF, but NOT
+ including CRLF, and not including ESC, SI, SO>
+
+MIME Considerations
+
+ The name given to the JUNET character encoding is "ISO-2022-JP". This
+ name is intended to be used in MIME messages as follows:
+
+ Content-Type: text/plain; charset=iso-2022-jp
+
+ The ISO-2022-JP encoding is already in 7-bit form, so it is not
+ necessary to use a Content-Transfer-Encoding header. It should be
+ noted that applying the Base64 or Quoted-Printable encoding will
+ render the message unreadable in current JUNET software.
+
+ ISO-2022-JP may also be used in MIME Part 2 headers. The "B"
+ encoding should be used with ISO-2022-JP text.
+
+Background Information
+
+ The JUNET encoding was described in the JUNET User's Guide [JUNET]
+ (JUNET Riyou No Tebiki Dai Ippan).
+
+ The encoding is based on the particular usage of ISO 2022 announced
+
+
+
+Murai, Crispin & van der Poel [Page 3]
+
+RFC 1468 Japanese Character Encoding for Internet Messages June 1993
+
+
+ by 4/1 (see [ISO2022] for details). However, the escape sequence
+ normally used for this announcement is not included in ISO-2022-JP
+ messages.
+
+ The Kana set of JIS X 0201 is not used in ISO-2022-JP messages.
+
+ In the past, some systems erroneously used the escape sequence ESC (
+ H in JUNET messages. This escape sequence is officially registered
+ for a Swedish character set [ISOREG], and should not be used in ISO-
+ 2022-JP messages.
+
+ Some systems do not distinguish between ESC ( B and ESC ( J or
+ between ESC $ @ and ESC $ B for display. However, when relaying a
+ message to another system, the escape sequences must not be altered
+ in any way.
+
+ The human user (not implementor) should try to keep lines within 80
+ display columns, or, preferably, within 75 (or so) columns, to allow
+ insertion of ">" at the beginning of each line in excerpts. Each JIS
+ X 0208 character takes up two columns, and the escape sequences do
+ not take up any columns. The implementor is reminded that JIS X 0208
+ characters take up two bytes and should not be split in the middle to
+ break lines for displaying, etc.
+
+ The JIS X 0208 standard was revised in 1990, to add two characters at
+ the end of the table. Although ISO 2022 specifies special additional
+ escape sequences to indicate the use of revised character sets, it is
+ suggested here not to make use of this special escape sequence in
+ ISO-2022-JP text, even if the two characters added to JIS X 0208 in
+ 1990 are used.
+
+ For further information about Japanese character encodings such as PC
+ codes, FTP locations of implementations, etc, see "Electronic
+ Handling of Japanese Text" [JPN.INF].
+
+References
+
+ [ASCII] American National Standards Institute, "Coded character set
+ -- 7-bit American national standard code for information
+ interchange", ANSI X3.4-1986.
+
+ [ISO646] International Organization for Standardization (ISO),
+ "Information technology -- ISO 7-bit coded character set for
+ information interchange", International Standard, Ref. No. ISO/IEC
+ 646:1991.
+
+ [ISO2022] International Organization for Standardization (ISO),
+ "Information processing -- ISO 7-bit and 8-bit coded character sets
+
+
+
+Murai, Crispin & van der Poel [Page 4]
+
+RFC 1468 Japanese Character Encoding for Internet Messages June 1993
+
+
+ -- Code extension techniques", International Standard, Ref. No. ISO
+ 2022-1986 (E).
+
+ [ISOREG] International Organization for Standardization (ISO),
+ "International Register of Coded Character Sets To Be Used With
+ Escape Sequences".
+
+ [JISX0201] Japanese Standards Association, "Code for Information
+ Interchange", JIS X 0201-1976.
+
+ [JISX0208] Japanese Standards Association, "Code of the Japanese
+ graphic character set for information interchange", JIS X 0208-1978,
+ -1983 and -1990.
+
+ [JPN.INF] Ken R. Lunde <lunde@adobe.com>, "Electronic Handling of
+ Japanese Text", March 1992,
+ msi.umn.edu(128.101.24.1):pub/lunde/japan[123].inf
+
+ [JUNET] JUNET Riyou No Tebiki Sakusei Iin Kai (JUNET User's Guide
+ Drafting Committee), "JUNET Riyou No Tebiki (Dai Ippan)" ("JUNET
+ User's Guide (First Edition)"), February 1988.
+
+ [MIME1] Borenstein N., and N. Freed, "MIME (Multipurpose
+ Internet Mail Extensions): Mechanisms for Specifying and
+ Describing the Format of Internet Message Bodies", RFC 1341,
+ Bellcore, Innosoft, June 1992.
+
+ [MIME2] Moore, K., "Representation of Non-ASCII Text in Internet
+ Message Headers", RFC 1342, University of Tennessee, June 1992.
+
+ [RFC822] Crocker, D., "Standard for the Format of ARPA Internet
+ Text Messages", STD 11, RFC 822, UDEL, August 1982.
+
+ [RFC1036] Horton M., and R. Adams, "Standard for Interchange of USENET
+ Messages", RFC 1036, AT&T Bell Laboratories, Center for Seismic
+ Studies, December 1987.
+
+Acknowledgements
+
+ Many people assisted in drafting this document. The authors wish to
+ thank in particular Akira Kato, Masahiro Sekiguchi and Ken'ichi
+ Handa.
+
+Security Considerations
+
+ Security issues are not discussed in this memo.
+
+
+
+
+
+Murai, Crispin & van der Poel [Page 5]
+
+RFC 1468 Japanese Character Encoding for Internet Messages June 1993
+
+
+Authors' Addresses
+
+ Jun Murai
+ Keio University
+ 5322 Endo, Fujisawa
+ Kanagawa 252 Japan
+
+ Fax: +81 466 49 1101
+ EMail: jun@wide.ad.jp
+
+
+ Mark Crispin
+ Panda Programming
+ 6158 Lariat Loop NE
+ Bainbridge Island, WA 98110-2098
+ USA
+
+ Phone: +1 206 842 2385
+ EMail: MRC@PANDA.COM
+
+
+ Erik M. van der Poel
+ A-105 Park Avenue
+ 4-4-10 Ohta, Kisarazu
+ Chiba 292 Japan
+
+ Phone: +81 438 22 5836
+ Fax: +81 438 22 5837
+ EMail: erik@poel.juice.or.jp
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Murai, Crispin & van der Poel [Page 6]
+ \ No newline at end of file