diff options
| author | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 | 
|---|---|---|
| committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 | 
| commit | 4bfd864f10b68b71482b35c818559068ef8d5797 (patch) | |
| tree | e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc1896.txt | |
| parent | ea76e11061bda059ae9f9ad130a9895cc85607db (diff) | |
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc1896.txt')
| -rw-r--r-- | doc/rfc/rfc1896.txt | 1179 | 
1 files changed, 1179 insertions, 0 deletions
diff --git a/doc/rfc/rfc1896.txt b/doc/rfc/rfc1896.txt new file mode 100644 index 0000000..889794d --- /dev/null +++ b/doc/rfc/rfc1896.txt @@ -0,0 +1,1179 @@ + + + + + + +Network Working Group                                         P. Resnick +Request for Comments: 1896                                      QUALCOMM +Obsoletes: 1523, 1563                                          A. Walker +Category: Informational                                         InterCon +                                                           February 1996 + + +                  The text/enriched MIME Content-type + +Status of this Memo + +   This memo provides information for the Internet community.  This memo +   does not specify an Internet standard of any kind.  Distribution of +   this memo is unlimited. + +Abstract + +   MIME [RFC-1521] defines a format and general framework for the +   representation of a wide variety of data types in Internet mail. This +   document defines one particular type of MIME data, the text/enriched +   MIME type. The text/enriched MIME type is intended to facilitate the +   wider interoperation of simple enriched text across a wide variety of +   hardware and software platforms. This document is only a minor +   revision to the text/enriched MIME type that was first described in +   [RFC-1523] and [RFC-1563], and is only intended to be used in the +   short term until other MIME types for text formatting in Internet +   mail are developed and deployed. + +The text/enriched MIME type + +   In order to promote the wider interoperability of simple formatted +   text, this document defines an extremely simple subtype of the MIME +   content-type "text", the "text/enriched" subtype. The content-type +   line for this type may have one optional parameter, the "charset" +   parameter, with the same values permitted for the "text/plain" MIME +   content-type. + +   The text/enriched subtype was designed to meet the following +   criteria: + +   1. The syntax must be extremely simple to parse, so that even +      teletype-oriented mail systems can easily strip away the +      formatting information and leave only the readable text. + +   2. The syntax must be extensible to allow for new formatting +      commands that are deemed essential for some application. + + + + + +Resnick & Walker             Informational                      [Page 1] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +   3. If the character set in use is ASCII or an 8-bit ASCII superset, +      then the raw form of the data must be readable enough to be +      largely unobjectionable in the event that it is displayed on the +      screen of the user of a non-MIME-conformant mail reader. + +   4. The capabilities must be extremely limited, to ensure that it can +      represent no more than is likely to be representable by the +      user's primary word processor. While this limits what can be +      sent, it increases the likelihood that what is sent can be +      properly displayed. + +   There are other text formatting standards which meet some of these +   criteria. In particular, HTML and SGML have come into widespread use +   on the Internet. However, there are two important reasons that this +   document further promotes the use of text/enriched in Internet mail +   over other such standards: + +   1. Most MIME-aware Internet mail applications are already able to +      either properly format text/enriched mail or, at the very least, +      are able to strip out the formatting commands and display the +      readable text. The same is not true for HTML or SGML. + +   2. The current RFC on HTML [RFC-1866] and Internet Drafts on SGML +      have many features which are not necessary for Internet mail, and +      are missing a few capabilities that text/enriched already has. + +   For these reasons, this document is promoting the use of +   text/enriched until other Internet standards come into more +   widespread use. For those who will want to use HTML, Appendix B of +   this document contains a very simple C program that converts +   text/enriched to HTML 2.0 described in [RFC-1866]. + +Syntax + +   The syntax of "text/enriched" is very simple. It represents text in a +   single character set--US-ASCII by default, although a different +   character set can be specified by the use of the "charset" parameter. +   (The semantics of text/enriched in non-ASCII character sets are +   discussed later in this document.) All characters represent +   themselves, with the exception of the "<" character (ASCII 60), which +   is used to mark the beginning of a formatting command. A literal +   less-than sign ("<") can be represented by a sequence of two such +   characters, "<<". + +   Formatting instructions consist of formatting commands surrounded by +   angle brackets ("<>", ASCII 60 and 62). Each formatting command may +   be no more than 60 characters in length, all in US-ASCII, restricted +   to the alphanumeric and hyphen ("-") characters. Formatting commands + + + +Resnick & Walker             Informational                      [Page 2] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +   may be preceded by a solidus ("/", ASCII 47), making them negations, +   and such negations must always exist to balance the initial opening +   commands.  Thus, if the formatting command "<bold>" appears at some +   point, there must later be a "</bold>" to balance it. (NOTE: The 60 +   character limit on formatting commands does NOT include the "<", ">", +   or "/" characters that might be attached to such commands.) +   Formatting commands are always case-insensitive. That is, "bold" and +   "BoLd" are equivalent in effect, if not in good taste. + +Line break rules + +   Line breaks (CRLF pairs in standard network representation) are +   handled specially. In particular, isolated CRLF pairs are translated +   into a single SPACE character. Sequences of N consecutive CRLF pairs, +   however, are translated into N-1 actual line breaks. This permits +   long lines of data to be represented in a natural looking manner +   despite the frequency of line-wrapping in Internet mailers. When +   preparing the data for mail transport, isolated line breaks should be +   inserted wherever necessary to keep each line shorter than 80 +   characters. When preparing such data for presentation to the user, +   isolated line breaks should be replaced by a single SPACE character, +   and N consecutive CRLF pairs should be presented to the user as N-1 +   line breaks. + +   Thus text/enriched data that looks like this: + +     This is +     a single +     line + +     This is the +     next line. + + +     This is the +     next section. + +   should be displayed by a text/enriched interpreter as follows: + +     This is a single line +     This is the next line. + +     This is the next section. + +   The formatting commands, not all of which will be implemented by all +   implementations, are described in the following sections. + + + + + +Resnick & Walker             Informational                      [Page 3] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +Formatting Commands + +   The text/enriched formatting commands all begin with <commandname> +   and end with </commandname>, affecting the formatting of the text +   between those two tokens. The commands are described here, grouped +   according to type. + +Parameter Command + +   Some of the formatting commands may require one or more associated +   parameters. The "param" command is a special formatting command used +   to include these parameters. + +     Param +          Marks the affected text as command parameters, to be +          interpreted or ignored by the text/enriched interpreter, +          but not to be shown to the reader. The "param" command +          always immediately follows some other formatting command, +          and the parameter data indicates some additional +          information about the formatting that is to be done. The +          syntax of the parameter data (whatever appears between +          the initial "<param>" and the terminating "</param>") is +          defined for each command that uses it. However, it is +          always required that the format of such data must not +          contain nested "param" commands, and either must not use +          the "<" character or must use it in a way that is +          compatible with text/enriched parsing. That is, the end +          of the parameter data should be recognizable with either +          of two algorithms: simply searching for the first +          occurrence of "</param>" or parsing until a balanced +          "</param>" command is found. In either case, however, the +          parameter data should not be shown to the human reader. + +Font-Alteration Commands + +   The following formatting commands are intended to alter the font in +   which text is displayed, but not to alter the indentation or +   justification state of the text: + +     Bold +          causes the affected text to be in a bold font. Nested +          bold commands have the same effect as a single bold +          command. + +     Italic +          causes the affected text to be in an italic font. Nested +          italic commands have the same effect as a single italic +          command. + + + +Resnick & Walker             Informational                      [Page 4] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +     Underline +          causes the affected text to be underlined. Nested +          underline commands have the same effect as a single +          underline command. + +     Fixed +          causes the affected text to be in a fixed width font. +          Nested fixed commands have the same effect as a single +          fixed command. + +     FontFamily +          causes the affected text to be displayed in a specified +          typeface. The "fontfamily" command requires a parameter +          that is specified by using the "param" command. The +          parameter data is a case-insensitive string containing +          the name of a font family. Any currently available font +          family name (e.g. Times, Palatino, Courier, etc.) may be +          used. This includes font families defined by commercial +          type foundries such as Adobe, BitStream, or any other +          such foundry. Note that implementations should only use +          the general font family name, not the specific font name +          (e.g. use "Times", not "TimesRoman" nor +          "TimesBoldItalic"). When nested, the inner "fontfamily" +          command takes precedence. Also note that the "fontfamily" +          command is advisory only; it should not be expected that +          other implementations will honor the typeface information +          in this command since the font capabilities of systems +          vary drastically. + +     Color +          causes the affected text to be displayed in a specified +          color. The "color" command requires a parameter that is +          specified by using the "param" command. The parameter +          data can be one of the following: + +               red +               blue +               green +               yellow +               cyan +               magenta +               black +               white + +          or an RGB color value in the form: + +               ####,####,#### + + + + +Resnick & Walker             Informational                      [Page 5] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +          where '#' is a hexadecimal digit '0' through '9', 'A' +          through 'F', or 'a' through 'f'. The three 4-digit +          hexadecimal values are the RGB values for red, green, and +          blue respectively, where each component is expressed as +          an unsigned value between 0 (0000) and 65535 (FFFF). The +          default color for the message is unspecified, though +          black is a common choice in many environments. When +          nested, the inner "color" command takes precedence. + +     Smaller +          causes the affected text to be in a smaller font. It is +          recommended that the font size be changed by two points, +          but other amounts may be more appropriate in some +          environments. Nested smaller commands produce ever +          smaller fonts, to the limits of the implementation's +          capacity to reasonably display them, after which further +          smaller commands have no incremental effect. + +     Bigger +          causes the affected text to be in a bigger font. It is +          recommended that the font size be changed by two points, +          but other amounts may be more appropriate in some +          environments. Nested bigger commands produce ever bigger +          fonts, to the limits of the implementation's capacity to +          reasonably display them, after which further bigger +          commands have no incremental effect. + +   While the "bigger" and "smaller" operators are effectively inverses, +   it is not recommended, for example, that "<smaller>" be used to end +   the effect of "<bigger>". This is properly done with "</bigger>". + +   Since the capabilities of implementations will vary, it is to be +   expected that some implementations will not be able to act on some of +   the font-alteration commands. However, an implementation should still +   display the text to the user in a reasonable fashion. In particular, +   the lack of capability to display a particular font family, color, or +   other text attribute does not mean that an implementation should fail +   to display text. + +Fill/Justification/Indentation Commands + +   Initially, text/enriched text is intended to be displayed fully +   filled (that is, using the rules specified for replacing CRLF pairs +   with spaces or removing them as appropriate) with appropriate kerning +   and letter-tracking, and using the maximum available margins as suits +   the capabilities of the receiving user agent software. + + + + + +Resnick & Walker             Informational                      [Page 6] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +   The following commands alter that state. Each of these commands force +   a line break before and after the formatting environment if there is +   not otherwise a line break. For example, if one of these commands +   occurs anywhere other than the beginning of a line of text as +   presented, a new line is begun. + +     Center +          causes the affected text to be centered. + +     FlushLeft +          causes the affected text to be left-justified with a +          ragged right margin. + +     FlushRight +          causes the affected text to be right-justified with a +          ragged left margin. + +     FlushBoth +          causes the affected text to be filled and padded so as to +          create smooth left and right margins, i.e., to be fully +          justified. + +     ParaIndent +          causes the running margins of the affected text to be +          moved in. The recommended indentation change is the width +          of four characters, but this may differ among +          implementations. The "paraindent" command requires a +          parameter that is specified by using the "param" command. +          The parameter data is a comma-seperated list of one or +          more of the following: + +          Left +               causes the running left margin to be moved to the +               right. + +          Right +               causes the running right margin to be moved to the +               left. + +          In +               causes the first line of the affected paragraph to +               be indented in addition to the running margin. The +               remaining lines remain flush to the running margin. + +          Out +               causes all lines except for the first line of the +               affected paragraph to be indented in addition to the +               running margin. The first line remains flush to the + + + +Resnick & Walker             Informational                      [Page 7] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +               running margin. + +     Nofill +          causes the affected text to be displayed without filling. +          That is, the text is displayed without using the rules +          for replacing CRLF pairs with spaces or removing +          consecutive sequences of CRLF pairs. However, the current +          state of the margins and justification is honored; any +          indentation or justification commands are still applied +          to the text within the scope of the "nofill". + +   The "center", "flushleft", "flushright", and "flushboth" commands are +   mutually exclusive, and, when nested, the inner command takes +   precedence. + +   The "nofill" command is mutually exclusive with the "in" and "out" +   parameters of the "paraindent" command; when they occur in the same +   scope, their behavior is undefined. + +   The parameter data for the "paraindent" command may contain multiple +   occurances of the same parameter (i.e. "left", "right", "in", or +   "out").  Each occurance causes the text to be further indented in the +   manner indicated by that parameter. Nested "paraindent" commands +   cause the affected text to be further indented according to the +   parameters. Note that the "in" and "out" parameters for "paraindent" +   are mutually exclusive; when they appear together or when nested +   "paraindent" commands contain both of them, their behavior is +   undefined. + +   For purposes of the "in" and "out" parameters, a paragraph is defined +   as text that is delimited by line breaks after applying the rules for +   replacing CRLF pairs with spaces or removing consecutive sequences of +   CRLF pairs. For example, within the scope of an "out", the line +   following each CRLF is made flush with the running margin, and +   subsequent lines are indented. Within the scope of an "in", the first +   line following each CRLF is indented, and subsequent lines remain +   flush to the running margin. + +   Whether or not text is justified by default (that is, whether the +   default environment is "flushleft", "flushright", or "flushboth") is +   unspecified, and depends on the preferences of the user, the +   capabilities of the local software and hardware, and the nature of +   the character set in use. On systems where full justification is +   considered undesirable, the "flushboth" environment may be identical +   to the default environment. Note that full justification should never +   be performed inside of "center", "flushleft", "flushright", or +   "nofill" environments.  Note also that for some non-ASCII character +   sets, full justification may be fundamentally inappropriate. + + + +Resnick & Walker             Informational                      [Page 8] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +   Note that [RFC-1563] defined two additional indentation commands, +   "Indent" and "IndentRight". These commands did not force a line +   break, and therefore their behavior was unpredictable since they +   depended on the margins and character sizes that a particular +   implementation used.  Therefore, their use is deprecated and they +   should be ignored just as other unrecognized commands. + +Markup Commands + +   Commands in this section, unlike the other text/enriched commands are +   declarative markup commands. Text/enriched is not intended as a full +   markup language, but instead as a simple way to represent common +   formatting commands. Therefore, markup commands are purposely kept to +   a minimum. It is only because each was deemed so prevalent or +   necessary in an e-mail environment that these particular commands +   have been included at all. + +     Excerpt +          causes the affected text to be interpreted as a textual +          excerpt from another source, probably a message being +          responded to. Typically this will be displayed using +          indentation and an alternate font, or by indenting lines +          and preceding them with "> ", but such decisions are up +          to the implementation. Note that as with the +          justification commands, the excerpt command implicitly +          begins and ends with a line break if one is not already +          there. Nested "excerpt" commands are acceptable and +          should be interpreted as meaning that the excerpted text +          was excerpted from yet another source. Again, this can be +          displayed using additional indentation, different colors, +          etc. + +          Optionally, the "excerpt" command can take a parameter by +          using the "param" command. The format of the data is +          unspecified, but it is intended to uniquely identify the +          text from which the excerpt is taken. With this +          information, an implementation should be able to uniquely +          identify the source of any particular excerpt, especially +          if two or more excerpts in the message are from the same +          source, and display it in some way that makes this +          apparent to the user. + +     Lang +          causes the affected text to be interpreted as belonging +          to a particular language. This is most useful when two +          different languages use the same character set, but may +          require a different font or formatting depending on the +          language. For instance, Chinese and Japanese share + + + +Resnick & Walker             Informational                      [Page 9] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +          similar character glyphs, and in some character sets like +          UNICODE share common code points, but it is considered +          very important that different fonts be used for the two +          languages, especially if they appear together, so that +          meaning is not lost. Also, language information can be +          used to allow for fancier text handling, like spell +          checking or hyphenation. + +          The "lang" command requires a parameter using the "param" +          command. The parameter data can be any of the language +          tags specified in [RFC-1766], "Tags for the +          Identification of Languages". These tags are the two +          letter language codes taken from [ISO-639] or can be +          other language codes that are registered according to the +          instructions in the Langauge Tags RFC. Consult that memo +          for further information. + +Balancing and Nesting of Formatting Commands + +   Pairs of formatting commands must be properly balanced and nested. +   Thus, a proper way to describe text in bold italics is: + +     <bold><italic>the-text</italic></bold> + +   or, alternately, + +     <italic><bold>the-text</bold></italic> + +   but, in particular, the following is illegal text/enriched: + +     <bold><italic>the-text</bold></italic> + +   The nesting requirement for formatting commands imposes a slightly +   higher burden upon the composers of text/enriched bodies, but +   potentially simplifies text/enriched displayers by allowing them to +   be stack-based. The main goal of text/enriched is to be simple enough +   to make multifont, formatted email widely readable, so that those +   with the capability of sending it will be able to do so with +   confidence. Thus slightly increased complexity in the composing +   software was deemed a reasonable tradeoff for simplified reading +   software. Nonetheless, implementors of text/enriched readers are +   encouraged to follow the general Internet guidelines of being +   conservative in what you send and liberal in what you accept. Those +   implementations that can do so are encouraged to deal reasonably with +   improperly nested text/enriched data. + + + + + + +Resnick & Walker             Informational                     [Page 10] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +Unrecognized formatting commands + +   Implementations must regard any unrecognized formatting command as +   "no-op" commands, that is, as commands having no effect, thus +   facilitating future extensions to "text/enriched". Private extensions +   may be defined using formatting commands that begin with "X-", by +   analogy to Internet mail header field names. + +   In order to formally define extended commands, a new Internet +   document should be published. + +White Space in Text/enriched Data + +   No special behavior is required for the SPACE or TAB (HT) character. +   It is recommended, however, that, at least when fixed-width fonts are +   in use, the common semantics of the TAB (HT) character should be +   observed, namely that it moves to the next column position that is a +   multiple of 8. (In other words, if a TAB (HT) occurs in column n, +   where the leftmost column is column 0, then that TAB (HT) should be +   replaced by 8-(n mod 8) SPACE characters.) It should also be noted +   that some mail gateways are notorious for losing (or, less commonly, +   adding) white space at the end of lines, so reliance on SPACE or TAB +   characters at the end of a line is not recommended. + +Initial State of a text/enriched interpreter + +   Text/enriched is assumed to begin with filled text in a variable- +   width font in a normal typeface and a size that is average for the +   current display and user. The left and right margins are assumed to +   be maximal, that is, at the leftmost and rightmost acceptable +   positions. + +Non-ASCII character sets + +   One of the great benefits of MIME is the ability to use different +   varieties of non-ASCII text in messages. To use non-ASCII text in a +   message, normally a charset parameter is specified in the Content- +   type line that indicates the character set being used. For purposes +   of this RFC, any legal MIME charset parameter can be used with the +   text/enriched Content-type. However, there are two difficulties that +   arise with regard to the text/enriched Content-type when non-ASCII +   text is desired. The first problem involves difficulties that occur +   when the user wishes to create text which would normally require +   multiple non-ASCII character sets in the same text/enriched message. +   The second problem is an ambiguity that arises because of the +   text/enriched use of the "<" character in formatting commands. + + + + + +Resnick & Walker             Informational                     [Page 11] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +Using multiple non-ASCII character sets + +   Normally, if a user wishes to produce text which contains characters +   from entirely different character sets within the same MIME message +   (for example, using Russian Cyrillic characters from ISO 8859-5 and +   Hebrew characters from ISO 8859-8), a multipart message is used. +   Every time a new character set is desired, a new MIME body part is +   started with different character sets specified in the charset +   parameter of the Content-type line. However, using multiple character +   sets this way in text/enriched messages introduces problems. Since a +   change in the charset parameter requires a new part, text/enriched +   formatting commands used in the first part would not be able to apply +   to text that occurs in subsequent parts. It is not possible for +   text/enriched formatting commands to apply across MIME body part +   boundaries. + +   [RFC-1341] attempted to get around this problem in the now obsolete +   text/richtext format by introducing different character set +   formatting commands like "iso-8859-5" and "us-ascii". But this, or +   even a more general solution along the same lines, is still +   undesirable: It is common for a MIME application to decide, for +   example, what character font resources or character lookup tables it +   will require based on the information provided by the charset +   parameter of the Content-type line, before it even begins to +   interpret or display the data in that body part. By allowing the +   text/enriched interpreter to subsequently change the character set, +   perhaps to one completely different from the charset specified in the +   Content-type line (with potentially much different resource +   requirements), too much burden would be placed on the text/enriched +   interpreter itself. + +   Therefore, if multiple types of non-ASCII characters are desired in a +   text/enriched document, one of the following two methods must be +   used: + +   1. For cases where the different types of non-ASCII text can be +      limited to their own paragraphs with distinct formatting, a +      multipart message can be used with each part having a +      Content-Type of text/enriched and a different charset parameter. +      The one caveat to using this method is that each new part must +      start in the initial state for a text/enriched document. That +      means that all of the text/enriched commands in the preceding +      part must be properly balanced with ending commands before the +      next text/enriched part begins. Also, each text/enriched part +      must begin a new paragraph. + + + + + + +Resnick & Walker             Informational                     [Page 12] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +   2. If different types of non-ASCII text are to appear in the same +      line or paragraph, or if text/enriched formatting (e.g. margins, +      typeface, justification) is required across several different +      types of non-ASCII text, a single text/enriched body part should +      be used with a character set specified that contains all of the +      required characters. For example, a charset parameter of +      "UNICODE-1-1-UTF-7" as specified in [RFC-1642] could be used for +      such purposes. Not only does UNICODE contain all of the +      characters that can be represented in all of the other registered +      ISO 8859 MIME character sets, but UTF-7 is fully compatible with +      other aspects of the text/enriched standard, including the use of +      the "<" character referred to below. Any other character sets +      that are specified for use in MIME which contain different types +      of non-ASCII text can also be used in these instances. + +Use of the "<" character in formatting commands + +   If the character set specified by the charset parameter on the +   Content-type line is anything other than "US-ASCII", this means that +   the text being described by text/enriched formatting commands is in a +   non-ASCII character set. However, the commands themselves are still +   the same ASCII commands that are defined in this document. This +   creates an ambiguity only with reference to the "<" character, the +   octet with numeric value 60. In single byte character sets, such as +   the ISO-8859 family, this is not a problem; the octet 60 can be +   quoted by including it twice, just as for ASCII. The problem is more +   complicated, however, in the case of multi-byte character sets, where +   the octet 60 might appear at any point in the byte sequence for any +   of several characters. + +   In practice, however, most multi-byte character sets address this +   problem internally. For example, the UNICODE character sets can use +   the UTF-7 encoding which preserves all of the important ASCII +   characters in their single byte form. The ISO-2022 family of +   character sets can use certain character sequences to switch back +   into ASCII at any moment.  Therefore it is specified that, before +   text/enriched formatting commands, the prevailing character set +   should be "switched back" into ASCII, and that only those characters +   which would be interpreted as "<" in plain text should be interpreted +   as token delimiters in text/enriched. + +   The question of what to do for hypothetical future character sets +   that do not subsume ASCII is not addressed in this memo. + + + + + + + + +Resnick & Walker             Informational                     [Page 13] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +Minimal text/enriched conformance + +   A minimal text/enriched implementation is one that converts "<<" to +   "<", removes everything between a <param> command and the next +   balancing </param> command, removes all other formatting commands +   (all text enclosed in angle brackets), and, outside of <nofill> +   environments, converts any series of n CRLFs to n-1 CRLFs, and +   converts any lone CRLF pairs to SPACE. + +Notes for Implementors + +   It is recognized that implementors of future mail systems will want +   rich text functionality far beyond that currently defined for +   text/enriched.  The intent of text/enriched is to provide a common +   format for expressing that functionality in a form in which much of +   it, at least, will be understood by interoperating software. Thus, in +   particular, software with a richer notion of formatted text than +   text/enriched can still use text/enriched as its basic +   representation, but can extend it with new formatting commands and by +   hiding information specific to that software system in text/enriched +   <param> constructs. As such systems evolve, it is expected that the +   definition of text/enriched will be further refined by future +   published specifications, but text/enriched as defined here provides +   a platform on which evolutionary refinements can be based. + +   An expected common way that sophisticated mail programs will generate +   text/enriched data is as part of a multipart/alternative construct. +   For example, a mail agent that can generate enriched mail in ODA +   format can generate that mail in a more widely interoperable form by +   generating both text/enriched and ODA versions of the same data, +   e.g.: + +     Content-type: multipart/alternative; boundary=foo + +     --foo +     Content-type: text/enriched + +     [text/enriched version of data] +     --foo Content-type: application/oda + +     [ODA version of data] +     --foo-- + +   If such a message is read using a MIME-conformant mail reader that +   understands ODA, the ODA version will be displayed; otherwise, the +   text/enriched version will be shown. + + + + + +Resnick & Walker             Informational                     [Page 14] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +   In some environments, it might be impossible to combine certain +   text/enriched formatting commands, whereas in others they might be +   combined easily. For example, the combination of <bold> and <italic> +   might produce bold italics on systems that support such fonts, but +   there exist systems that can make text bold or italicized, but not +   both. In such cases, the most recently issued (innermost) recognized +   formatting command should be preferred. + +   One of the major goals in the design of text/enriched was to make it +   so simple that even text-only mailers will implement enriched-to- +   plain-text translators, thus increasing the likelihood that enriched +   text will become "safe" to use very widely. To demonstrate this +   simplicity, an extremely simple C program that converts text/enriched +   input into plain text output is included in Appendix A. + +Extensions to text/enriched + +   It is expected that various mail system authors will desire +   extensions to text/enriched. The simple syntax of text/enriched, and +   the specification that unrecognized formatting commands should simply +   be ignored, are intended to promote such extensions. + +An Example + +   Putting all this together, the following "text/enriched" body +   fragment: + +     From: Nathaniel Borenstein <nsb@bellcore.com> +     To: Ned Freed <ned@innosoft.com> +     Content-type: text/enriched + +     <bold>Now</bold> is the time for <italic>all</italic> +     good men +     <smaller>(and <<women>)</smaller> to +     <ignoreme>come</ignoreme> + +     to the aid of their + + +     <color><param>red</param>beloved</color> +     country. + +     By the way, +     I think that <paraindent><param>left</param><<smaller> + +     </paraindent>should REALLY be called + +     <paraindent><param>left</param><<tinier></paraindent> + + + +Resnick & Walker             Informational                     [Page 15] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +     and that I am always right. + +     -- the end + +   represents the following formatted text (which will, no doubt, look +   somewhat cryptic in the text-only version of this document): + +     Now is the time for all good men (and <women>) to come +     to the aid of their + +     beloved country. +     By the way, I think that +          <smaller> +     should REALLY be called +          <tinier> +     and that I am always right. +     -- the end + +   where the word "beloved" would be in red on a color display. + +   ti 0 Security Considerations + +   Security issues are not discussed in this memo, as the mechanism +   raises no security issues. + +Authors' Addresses + +   For more information, the authors of this document may be contacted +   via Internet mail: + +   Peter W. Resnick +   QUALCOMM Incorporated +   6455 Lusk Boulevard +   San Diego, CA 92121-2779 + +   Phone: +1 619 587 1121 +   Fax: +1 619 658 2230 +   EMail: presnick@qualcomm.com + + +   Amanda Walker +   InterCon Systems Corporation +   950 Herndon Parkway +   Herndon, VA 22070 + +   Phone: +1 703 709 5500 +   Fax: +1 703 709 5555 +   EMail: amanda@intercon.com + + + +Resnick & Walker             Informational                     [Page 16] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +Acknowledgements + +   The authors gratefully acknowledge the input of many contributors, +   readers, and implementors of the specification in this document. +   Particular thanks are due to Nathaniel Borenstein, the original +   author of RFC 1563. + +References + +   [RFC-1341] +        Borenstein, N., and N. Freed, "MIME (Multipurpose Internet Mail +        Extensions): Mechanisms for Specifying and Describing the Format +        of Internet Message Bodies", 06/11/1992. + +   [RFC-1521] +        Borenstein, N., and N. Freed, "MIME (Multipurpose Internet Mail +        Extensions) Part One: Mechanisms for Specifying and Describing +        the Format of Internet Message Bodies", 09/23/1993. + +   [RFC-1523] +        Borenstein, N., "The text/enriched MIME Content-type", +        09/23/1993. + +   [RFC-1563] +        Borenstein, N., "The text/enriched MIME Content-type", +        01/10/1994. + +   [RFC-1642] +        Goldsmith, D., Davis, M., "UTF-7 - A Mail-Safe Transformation +        Format of Unicode", 07/13/1994. + +   [RFC-1766] +        Alvestrand, H., "Tags for the Identification of Languages", +        03/02/1995. + +   [RFC-1866] +        Berners-Lee, T., and D. Connolly, D., "Hypertext Markup Language +        - 2.0", 11/03/1995. + + + + + + + + + + + + + +Resnick & Walker             Informational                     [Page 17] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +Appendix A--A Simple enriched-to-plain Translator in C + +   One of the major goals in the design of the text/enriched subtype of +   the text Content-Type is to make formatted text so simple that even +   text-only mailers will implement enriched-to-plain-text translators, +   thus increasing the likelihood that multifont text will become "safe" +   to use very widely. To demonstrate this simplicity, what follows is a +   simple C program that converts text/enriched input into plain text +   output. Note that the local newline convention (the single character +   represented by "\n") is assumed by this program, but that special +   CRLF handling might be necessary on some systems. + +#include <ctype.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +main() { +    int c, i, paramct=0, newlinect=0, nofill=0; +    char token[62], *p; + +    while ((c=getc(stdin)) != EOF) { +        if (c == '<') { +            if (newlinect == 1) putc(' ', stdout); +            newlinect = 0; +            c = getc(stdin); +            if (c == '<') { +                if (paramct <= 0) putc(c, stdout); +            } else { +                ungetc(c, stdin); +                for (i=0, p=token; +                    (c=getc(stdin)) != EOF && c != '>'; i++) { +                    if (i < sizeof(token)-1) +                        *p++ = isupper(c) ? tolower(c) : c; +                } +                *p = '\0'; +                if (c == EOF) break; +                if (strcmp(token, "param") == 0) +                    paramct++; +                else if (strcmp(token, "nofill") == 0) +                    nofill++; +                else if (strcmp(token, "/param") == 0) +                    paramct--; +                else if (strcmp(token, "/nofill") == 0) +                    nofill--; +            } +        } else { +            if (paramct > 0) + + + +Resnick & Walker             Informational                     [Page 18] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +                ; /* ignore params */ +            else if (c == '\n' && nofill <= 0) { +                if (++newlinect > 1) putc(c, stdout); +            } else { +                if (newlinect == 1) putc(' ', stdout); +                newlinect = 0; +                putc(c, stdout); +            } +        } +    } +    /* The following line is only needed with line-buffering */ +    putc('\n', stdout); +    exit(0); +} + +   It should be noted that one can do considerably better than this in +   displaying text/enriched data on a dumb terminal. In particular, one +   can replace font information such as "bold" with textual emphasis +   (like *this* or _T_H_I_S_). One can also properly handle the +   text/enriched formatting commands regarding indentation, +   justification, and others.  However, the above program is all that is +   necessary in order to present text/enriched on a dumb terminal +   without showing the user any formatting artifacts. + +Appendix B--A Simple enriched-to-HTML Translator in C + +   It is fully expected that other text formatting standards like HTML +   and SGML will supplant text/enriched in Internet mail. It is also +   likely that as this happens, recipients of text/enriched mail will +   wish to view such mail with an HTML viewer. To this end, the +   following is a simple example of a C program to convert text/enriched +   to HTML. Since the current version of HTML at the time of this +   document's publication is HTML 2.0 defined in [RFC-1866], this +   program converts to that standard.  There are several text/enriched +   commands that have no HTML 2.0 equivalent. In those cases, this +   program simply puts those commands into processing instructions; that +   is, surrounded by "<?" and ">". As in Appendix A, the local newline +   convention (the single character represented by "\n") is assumed by +   this program, but special CRLF handling might be necessary on some +   systems. + +#include <ctype.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +main() { +    int c, i, paramct=0, nofill=0; + + + +Resnick & Walker             Informational                     [Page 19] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +    char token[62], *p; + +    while((c=getc(stdin)) != EOF) { +        if(c == '<') { +            c = getc(stdin); +            if(c == '<') { +                fputs("<", stdout); +            } else { +                ungetc(c, stdin); +                for (i=0, p=token; +                    (c=getc(stdin)) != EOF && c != '>'; i++) { +                    if (i < sizeof(token)-1) +                        *p++ = isupper(c) ? tolower(c) : c; +                } +                *p = '\0'; +                if(c == EOF) break; +                if(strcmp(token, "/param") == 0) { +                    paramct--; +                    putc('>', stdout); +                } else if(paramct > 0) { +                    fputs("<", stdout); +                    fputs(token, stdout); +                    fputs(">", stdout); +                } else { +                    putc('<', stdout); +                    if(strcmp(token, "nofill") == 0) { +                        nofill++; +                        fputs("pre", stdout); +                    } else if(strcmp(token, "/nofill") == 0) { +                        nofill--; +                        fputs("/pre", stdout); +                    } else if(strcmp(token, "bold") == 0) { +                        fputs("b", stdout); +                    } else if(strcmp(token, "/bold") == 0) { +                        fputs("/b", stdout); +                    } else if(strcmp(token, "italic") == 0) { +                        fputs("i", stdout); +                    } else if(strcmp(token, "/italic") == 0) { +                        fputs("/i", stdout); +                    } else if(strcmp(token, "fixed") == 0) { +                        fputs("tt", stdout); +                    } else if(strcmp(token, "/fixed") == 0) { +                        fputs("/tt", stdout); +                    } else if(strcmp(token, "excerpt") == 0) { +                        fputs("blockquote", stdout); +                    } else if(strcmp(token, "/excerpt") == 0) { +                        fputs("/blockquote", stdout); +                    } else { + + + +Resnick & Walker             Informational                     [Page 20] + +RFC 1896            text/enriched MIME Content-type        February 1996 + + +                        putc('?', stdout); +                        fputs(token, stdout); +                        if(strcmp(token, "param") == 0) { +                            paramct++; +                            putc(' ', stdout); +                            continue; +                        } +                    } +                    putc('>', stdout); +                } +            } +        } else if(c == '>') { +            fputs(">", stdout); +        } else if (c == '&') { +            fputs("&", stdout); +        } else { +            if(c == '\n' && nofill <= 0 && paramct <= 0) { +                while((i=getc(stdin)) == '\n') fputs("<br>", stdout); +                ungetc(i, stdin); +            } +            putc(c, stdout); +        } +    } +    /* The following line is only needed with line-buffering */ +    putc('\n', stdout); +    exit(0); +} + + + + + + + + + + + + + + + + + + + + + + + + +Resnick & Walker             Informational                     [Page 21] +  |