1 files changed, 339 insertions, 0 deletions
diff --git a/doc/rfc/rfc1554.txt b/doc/rfc/rfc1554.txt
new file mode 100644
index 0000000..544b34e
--- /dev/null
+++ b/doc/rfc/rfc1554.txt
@@ -0,0 +1,339 @@
+
+
+
+
+
+
+Network Working Group                                            M. Ohta
+Request for Comments: 1554                 Tokyo Institute of Technology
+Category: Informational                                         K. Handa
+                                                                     ETL
+                                                           December 1993
+
+
+          ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP
+
+Status of this Memo
+
+   This memo provides information for the Internet community.  This memo
+   does not specify an Internet standard of any kind.  Distribution of
+   this memo is unlimited.
+
+Introduction
+
+   This memo describes a text encoding scheme: "ISO-2022-JP-2", which is
+   used experimentally for electronic mail [RFC822] and network news
+   [RFC1036] messages in several Japanese networks.  The encoding is a
+   multilingual extension of "ISO-2022-JP", the existing encoding for
+   Japanese [2022JP].  The encoding is supported by an Emacs-based
+   multilingual text editor: MULE [MULE].
+
+   The name, "ISO-2022-JP-2", is intended to be used in the "charset"
+   parameter field of MIME headers (see [MIME1] and [MIME2]).
+
+Description
+
+   Text encoded in "ISO-2022-JP-2" starts in ASCII [ASCII] and
+   switches to other character sets of ISO 2022 [ISO2022] through a
+   limited set of escape sequences.  All characters are encoded with
+   7 bits only.
+
+   At the beginning of text, the announcer sequence "ESC 2/0 4/1 ESC 2/0
+   4/6 ESC 2/0 5/10" is assumed to be present, although it is omitted.
+   Thus, characters of 94 character sets are designated to G0 and
+   invoked as GL.  C1 control characters are represented with 7 bits.
+   Characters of 96 character sets are designated to G2 and invoked with
+   SS2 (single shift two, "ESC 4/14" or "ESC N").
+
+   For example, the escape sequence "ESC 2/4 2/8 4/3" or "ESC $ ( C"
+   indicates that the bytes following the escape sequence are Korean KSC
+   characters, which are encoded in two bytes each.  The escape sequence
+   "ESC 2/14 4/1" or "ESC . A" indicates that ISO 8859-1 is designated
+   to G2. After the designation, the single shifted sequence "ESC 4/14
+   4/1" or "ESC N A" is interpreted to represent a character "A with
+   acute".
+
+
+
+Ohta & Handa                                                    [Page 1]
+
+RFC 1554         Multilingual Extension of ISO-2022-JP     December 1993
+
+
+   The following table gives the escape sequences and the character sets
+   used in "ISO-2022-JP-2" messages. The reg# is the registration number
+   in ISO's registry [ISOREG].
+
+                              94 character sets
+      reg#  character set      ESC sequence                designated to
+      ------------------------------------------------------------------
+      6     ASCII              ESC 2/8 4/2      ESC ( B    G0
+      42    JIS X 0208-1978    ESC 2/4 4/0      ESC $ @    G0
+      87    JIS X 0208-1983    ESC 2/4 4/2      ESC $ B    G0
+      14    JIS X 0201-Roman   ESC 2/8 4/10     ESC ( J    G0
+      58    GB2312-1980        ESC 2/4 4/1      ESC $ A    G0
+      149   KSC5601-1987       ESC 2/4 2/8 4/3  ESC $ ( C  G0
+      159   JIS X 0212-1990    ESC 2/4 2/8 4/4  ESC $ ( D  G0
+
+                              96 character sets
+      reg#  character set      ESC sequence                designated to
+      ------------------------------------------------------------------
+      100   ISO8859-1          ESC 2/14 4/1     ESC . A    G2
+      126   ISO8859-7(Greek)   ESC 2/14 4/6     ESC . F    G2
+
+   For further information about the character sets and the escape
+   sequences, see [ISO2022] and [ISOREG].
+
+   If there is any G0 designation in text, there must be a switch to
+   ASCII or to JIS X 0201-Roman before a space character (but not
+   necessarily before "ESC 4/14 2/0" or "ESC N ' '") or control
+   characters such as tab or CRLF.  This means that the next line starts
+   in the character set that was switched to before the end of the
+   previous line.  Though the designation to JIS X 0201-Roman is allowed
+   for backward compatibility to "ISO-2022-JP", its use is discouraged.
+   Applications such as pagers and editors which randomly seek within a
+   text file encoded with "ISO-2022-JP-2" may assume that all the lines
+   begin with ASCII, not with JIS X 0201-Roman.
+
+   At the beginning of a line, information on G2 designation of the
+   previous line is cleared.  New designation must be given before a
+   character in 96 character sets is used in the line.
+
+   The text must end in ASCII designated to G0.
+
+   Because "ISO-2022-JP", and thus "ISO-2022-JP-2", is designed to
+   represent English and modern Japanese, left-to-right directionality
+   is assumed if the text is displayed horizontally.
+
+   Users of "ISO-2022-JP-2" must be aware that some common transports,
+   such as old Bnews, cannot relay the 7-bit value 7/15 (decimal 127),
+   which is used to encode, say, "y with diaeresis" of ISO 8859-1.
+
+
+
+Ohta & Handa                                                    [Page 2]
+
+RFC 1554         Multilingual Extension of ISO-2022-JP     December 1993
+
+
+   Other restrictions are given in the Formal Syntax section below.
+
+Formal Syntax
+
+   The notational conventions used here are identical to those used in
+   STD 11, RFC 822 [RFC822].
+
+   The * (asterisk) convention is as follows:
+
+      l*m something
+
+   meaning at least l and at most m somethings, with l and m taking
+   default values of 0 and infinity, respectively.
+
+   message             = headers 1*(CRLF text)
+                                          ; see also [MIME1] "body-part"
+                                          ; note: must end in ASCII
+
+   text                = *(single-byte-char /
+                           g2-desig-seq /
+                           single-shift-char)
+                          [*segment
+                           reset-seq
+                           *(single-byte-char /
+                             g2-desig-seq /
+                             single-shift-char ) ]
+                                          ; note: g2-desig-seq must
+                                          ; precede single-shift-char
+
+   headers             = <see [RFC822] "fields" and [MIME1] "body-part">
+
+   segment             = single-byte-segment / double-byte-segment
+
+   single-byte-segment = single-byte-seq
+                         *(single-byte-char /
+                           g2-desig-seq /
+                           single-shift-char )
+
+   double-byte-segment = double-byte-seq
+                         *((one-of-94 one-of-94) /
+                           g2-desig-seq /
+                           single-shift-char )
+
+   reset-seq           = ESC "(" ( "B" / "J" )
+
+   single-byte-seq     = ESC "(" ( "B" / "J" )
+
+   double-byte-seq     = (ESC "$" ( "@" / "A" / "B" )) /
+
+
+
+Ohta & Handa                                                    [Page 3]
+
+RFC 1554         Multilingual Extension of ISO-2022-JP     December 1993
+
+
+                         (ESC "$" "(" ( "C" / "D" ))
+
+   g2-desig-seq        = ESC "." ( "A" / "F" )
+
+   single-shift-seq    = ESC "N"
+
+   single-shift-char   = single-shift-seq one-of-96
+
+   CRLF                = CR LF
+
+                                                    ; ( Octal, Decimal.)
+
+   ESC                 = <ISO 2022 ESC, escape>     ; (    33,      27.)
+
+   SI                  = <ISO 2022 SI, shift-in>    ; (    17,      15.)
+
+   SO                  = <ISO 2022 SO, shift-out>   ; (    16,      14.)
+
+   CR                  = <ASCII CR, carriage return>; (    15,      13.)
+
+   LF                  = <ASCII LF, linefeed>       ; (    12,      10.)
+
+   one-of-94           = <any one of 94 values>     ; (41-176, 33.-126.)
+
+   one-of-96           = <any one of 96 values>     ; (40-177, 32.-127.)
+
+   7BIT                = <any 7-bit value>          ; ( 0-177,  0.-127.)
+
+   single-byte-char    = <any 7BIT, including bare CR & bare LF, but NOT
+                          including CRLF, and not including ESC, SI, SO>
+
+MIME Considerations
+
+   The name given to the character encoding is "ISO-2022-JP-2". This
+   name is intended to be used in MIME messages as follows:
+
+      Content-Type: text/plain; charset=iso-2022-jp-2
+
+   The "ISO-2022-JP-2" encoding is already in 7-bit form, so it is not
+   necessary to use a Content-Transfer-Encoding header. It should be
+   noted that applying the Base64 or Quoted-Printable encoding will
+   render the message unreadable in non-MIME-compliant software.
+
+   "ISO-2022-JP-2" may also be used in MIME headers.  Both "B" and "Q"
+   encoding could be useful with "ISO-2022-JP-2" text.
+
+
+
+
+
+
+Ohta & Handa                                                    [Page 4]
+
+RFC 1554         Multilingual Extension of ISO-2022-JP     December 1993
+
+
+References
+
+   [ASCII] American National Standards Institute, "Coded character set
+           -- 7-bit American national standard code for information
+           interchange", ANSI X3.4-1986.
+
+
+   [ISO2022] International Organization for Standardization (ISO),
+             "Information processing -- ISO 7-bit and 8-bit coded
+             character sets -- Code extension techniques",
+             International Standard, Ref. No. ISO 2022-1986 (E).
+
+   [ISOREG] International Organization for Standardization (ISO),
+            "International Register of Coded Character Sets To Be Used
+            With Escape Sequences".
+
+   [MIME1] Borenstein, N., and N. Freed, "MIME (Multipurpose Internet
+           Mail Extensions) Part One: Mechanisms for Specifying and
+           Describing the Format of Internet Message Bodies", RFC 1521,
+           September 1993.
+
+   [MIME2] Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part
+           Two: Message Header Extensions for Non-ASCII Text", RFC 1522,
+           September 1993.
+
+   [RFC822] Crocker, D., "Standard for the Format of ARPA Internet Text
+            Messages", STD 11, RFC 822, UDEL, August 1982.
+
+   [RFC1036] Horton, M., and R. Adams, "Standard for Interchange of
+             USENET Messages", RFC 1036, AT&T Bell Laboratories, Center
+             for Seismic Studies, December 1987.
+
+   [2022JP] Murai, J., Crispin, M., and E. van der Poel, "Japanese
+            Character Encoding for Internet Messages", RFC 1468, June
+            1993.
+
+   [MULE] Nishikimi, M., Handa, K., and S. Tomura, "Mule: MULtilingual
+          Enhancement to GNU Emacs", Proc. of INET'93, August, 1993.
+
+Acknowledgements
+
+   This memo is the result of discussion among various people in the
+   news group fj.kanji and was reviewed on the mailing list
+   jp-msg@iij.ad.jp.  The authors wish to thank in particular Prof.
+   Eiichi Wada for his suggestions based on profound knowledge of ISO
+   2022 and related standards.
+
+
+
+
+
+Ohta & Handa                                                    [Page 5]
+
+RFC 1554         Multilingual Extension of ISO-2022-JP     December 1993
+
+
+Security Considerations
+
+   Security issues are not discussed in this memo.
+
+Authors' Addresses
+
+   Masataka Ohta
+   Tokyo Institute of Technology
+   2-12-1, O-okayama, Meguro-ku,
+   Tokyo 152, JAPAN
+
+   Phone: +81-3-5499-7084
+   Fax: +81-3-3729-1940
+   EMail: mohta@cc.titech.ac.jp
+
+
+   Ken'ichi Handa
+   Electrotechnical Laboratory
+   Umezono 1-1-4, Tsukuba,
+   Ibaraki 305, JAPAN
+
+   Phone: +81-298-58-5916
+   Fax: +81-298-58-5918
+   EMail: handa@etl.go.jp
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Ohta & Handa                                                    [Page 6]
+
+\ No newline at end of file
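
The designation and single-shift rules in the RFC text added above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions, not a reference implementation: the names `G0_SETS`, `G2_SETS`, and `tokenize_line` are hypothetical, the function handles one line at a time (without its CRLF), it assumes the line starts in ASCII rather than JIS X 0201-Roman, and it does not reject SI or SO. It splits a line into runs of (character set, raw 7-bit bytes) and checks three of the rules: double-byte characters use two one-of-94 bytes, a single shift needs a G2 designation earlier on the same line, and the line must return to a single-byte set before it ends.

```python
ESC = 0x1B

# G0 designations from the "94 character sets" table:
# escape-sequence tail -> (character set, bytes per character)
G0_SETS = {
    b"(B": ("ASCII", 1),
    b"(J": ("JIS X 0201-Roman", 1),
    b"$@": ("JIS X 0208-1978", 2),
    b"$B": ("JIS X 0208-1983", 2),
    b"$A": ("GB2312-1980", 2),
    b"$(C": ("KSC5601-1987", 2),
    b"$(D": ("JIS X 0212-1990", 2),
}

# G2 designations from the "96 character sets" table.
G2_SETS = {b".A": "ISO8859-1", b".F": "ISO8859-7(Greek)"}


def tokenize_line(line: bytes):
    """Split one line (without CRLF) into (character set, raw bytes) runs."""
    g0, width = G0_SETS[b"(B"]  # assumption: the line starts in ASCII
    g2 = None                   # G2 designation is cleared at each line start
    runs = []

    def emit(name, data):
        # Merge adjacent characters of the same set into one run.
        if runs and runs[-1][0] == name:
            runs[-1] = (name, runs[-1][1] + data)
        else:
            runs.append((name, data))

    i = 0
    while i < len(line):
        if line[i] != ESC:
            chunk = line[i:i + width]
            if width == 2 and not (
                len(chunk) == 2 and all(0x21 <= b <= 0x7E for b in chunk)
            ):
                raise ValueError(f"bad double-byte character at offset {i}")
            emit(g0, chunk)
            i += width
            continue
        tail = line[i + 1:i + 4]
        # Try the three-byte tails ("$(C", "$(D") before the two-byte ones.
        key = next((tail[:n] for n in (3, 2) if tail[:n] in G0_SETS), None)
        if key is not None:
            g0, width = G0_SETS[key]
            i += 1 + len(key)
        elif tail[:2] in G2_SETS:
            g2 = G2_SETS[tail[:2]]
            i += 3
        elif tail[:1] == b"N" and len(tail) >= 2 and 0x20 <= tail[1] <= 0x7F:
            # SS2: the next byte is one character of the G2 set.
            if g2 is None:
                raise ValueError("single shift without a G2 designation")
            emit(g2, tail[1:2])
            i += 3
        else:
            raise ValueError(f"unsupported escape sequence at offset {i}")
    if width != 1:
        raise ValueError("line must return to a single-byte set before CRLF")
    return runs


# The RFC's own examples: "ESC $ ( C" for KSC, then "ESC . A" and "ESC N A".
example = b"A\x1b$(C01\x1b(B \x1b.A\x1bNA"
for charset, data in tokenize_line(example):
    print(charset, data)
```

Because a 96-character set occupies the right half of an 8-bit code, a run from G2 can be mapped to text by setting the high bit of each byte; for the ISO8859-1 run above, `bytes(b | 0x80 for b in data).decode("latin-1")` gives "A with acute", matching the RFC's example.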