author | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
---|---|---|
committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
commit | 4bfd864f10b68b71482b35c818559068ef8d5797 (patch) | |
tree | e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc1896.txt | |
parent | ea76e11061bda059ae9f9ad130a9895cc85607db (diff) |
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc1896.txt')
-rw-r--r-- | doc/rfc/rfc1896.txt | 1179 |
1 files changed, 1179 insertions, 0 deletions
diff --git a/doc/rfc/rfc1896.txt b/doc/rfc/rfc1896.txt new file mode 100644 index 0000000..889794d --- /dev/null +++ b/doc/rfc/rfc1896.txt @@ -0,0 +1,1179 @@ + + + + + + +Network Working Group P. Resnick +Request for Comments: 1896 QUALCOMM +Obsoletes: 1523, 1563 A. Walker +Category: Informational InterCon + February 1996 + + + The text/enriched MIME Content-type + +Status of this Memo + + This memo provides information for the Internet community. This memo + does not specify an Internet standard of any kind. Distribution of + this memo is unlimited. + +Abstract + + MIME [RFC-1521] defines a format and general framework for the + representation of a wide variety of data types in Internet mail. This + document defines one particular type of MIME data, the text/enriched + MIME type. The text/enriched MIME type is intended to facilitate the + wider interoperation of simple enriched text across a wide variety of + hardware and software platforms. This document is only a minor + revision to the text/enriched MIME type that was first described in + [RFC-1523] and [RFC-1563], and is only intended to be used in the + short term until other MIME types for text formatting in Internet + mail are developed and deployed. + +The text/enriched MIME type + + In order to promote the wider interoperability of simple formatted + text, this document defines an extremely simple subtype of the MIME + content-type "text", the "text/enriched" subtype. The content-type + line for this type may have one optional parameter, the "charset" + parameter, with the same values permitted for the "text/plain" MIME + content-type. + + The text/enriched subtype was designed to meet the following + criteria: + + 1. The syntax must be extremely simple to parse, so that even + teletype-oriented mail systems can easily strip away the + formatting information and leave only the readable text. + + 2. 
The syntax must be extensible to allow for new formatting + commands that are deemed essential for some application. + + + + + +Resnick & Walker Informational [Page 1] + +RFC 1896 text/enriched MIME Content-type February 1996 + + + 3. If the character set in use is ASCII or an 8-bit ASCII superset, + then the raw form of the data must be readable enough to be + largely unobjectionable in the event that it is displayed on the + screen of the user of a non-MIME-conformant mail reader. + + 4. The capabilities must be extremely limited, to ensure that it can + represent no more than is likely to be representable by the + user's primary word processor. While this limits what can be + sent, it increases the likelihood that what is sent can be + properly displayed. + + There are other text formatting standards which meet some of these + criteria. In particular, HTML and SGML have come into widespread use + on the Internet. However, there are two important reasons that this + document further promotes the use of text/enriched in Internet mail + over other such standards: + + 1. Most MIME-aware Internet mail applications are already able to + either properly format text/enriched mail or, at the very least, + are able to strip out the formatting commands and display the + readable text. The same is not true for HTML or SGML. + + 2. The current RFC on HTML [RFC-1866] and Internet Drafts on SGML + have many features which are not necessary for Internet mail, and + are missing a few capabilities that text/enriched already has. + + For these reasons, this document is promoting the use of + text/enriched until other Internet standards come into more + widespread use. For those who will want to use HTML, Appendix B of + this document contains a very simple C program that converts + text/enriched to HTML 2.0 described in [RFC-1866]. + +Syntax + + The syntax of "text/enriched" is very simple. 
It represents text in a + single character set--US-ASCII by default, although a different + character set can be specified by the use of the "charset" parameter. + (The semantics of text/enriched in non-ASCII character sets are + discussed later in this document.) All characters represent + themselves, with the exception of the "<" character (ASCII 60), which + is used to mark the beginning of a formatting command. A literal + less-than sign ("<") can be represented by a sequence of two such + characters, "<<". + + Formatting instructions consist of formatting commands surrounded by + angle brackets ("<>", ASCII 60 and 62). Each formatting command may + be no more than 60 characters in length, all in US-ASCII, restricted + to the alphanumeric and hyphen ("-") characters. Formatting commands + + + +Resnick & Walker Informational [Page 2] + +RFC 1896 text/enriched MIME Content-type February 1996 + + + may be preceded by a solidus ("/", ASCII 47), making them negations, + and such negations must always exist to balance the initial opening + commands. Thus, if the formatting command "<bold>" appears at some + point, there must later be a "</bold>" to balance it. (NOTE: The 60 + character limit on formatting commands does NOT include the "<", ">", + or "/" characters that might be attached to such commands.) + Formatting commands are always case-insensitive. That is, "bold" and + "BoLd" are equivalent in effect, if not in good taste. + +Line break rules + + Line breaks (CRLF pairs in standard network representation) are + handled specially. In particular, isolated CRLF pairs are translated + into a single SPACE character. Sequences of N consecutive CRLF pairs, + however, are translated into N-1 actual line breaks. This permits + long lines of data to be represented in a natural looking manner + despite the frequency of line-wrapping in Internet mailers. 
When + preparing the data for mail transport, isolated line breaks should be + inserted wherever necessary to keep each line shorter than 80 + characters. When preparing such data for presentation to the user, + isolated line breaks should be replaced by a single SPACE character, + and N consecutive CRLF pairs should be presented to the user as N-1 + line breaks. + + Thus text/enriched data that looks like this: + + This is + a single + line + + This is the + next line. + + + This is the + next section. + + should be displayed by a text/enriched interpreter as follows: + + This is a single line + This is the next line. + + This is the next section. + + The formatting commands, not all of which will be implemented by all + implementations, are described in the following sections. + + + + + +Resnick & Walker Informational [Page 3] + +RFC 1896 text/enriched MIME Content-type February 1996 + + +Formatting Commands + + The text/enriched formatting commands all begin with <commandname> + and end with </commandname>, affecting the formatting of the text + between those two tokens. The commands are described here, grouped + according to type. + +Parameter Command + + Some of the formatting commands may require one or more associated + parameters. The "param" command is a special formatting command used + to include these parameters. + + Param + Marks the affected text as command parameters, to be + interpreted or ignored by the text/enriched interpreter, + but not to be shown to the reader. The "param" command + always immediately follows some other formatting command, + and the parameter data indicates some additional + information about the formatting that is to be done. The + syntax of the parameter data (whatever appears between + the initial "<param>" and the terminating "</param>") is + defined for each command that uses it. 
However, it is + always required that the format of such data must not + contain nested "param" commands, and either must not use + the "<" character or must use it in a way that is + compatible with text/enriched parsing. That is, the end + of the parameter data should be recognizable with either + of two algorithms: simply searching for the first + occurrence of "</param>" or parsing until a balanced + "</param>" command is found. In either case, however, the + parameter data should not be shown to the human reader. + +Font-Alteration Commands + + The following formatting commands are intended to alter the font in + which text is displayed, but not to alter the indentation or + justification state of the text: + + Bold + causes the affected text to be in a bold font. Nested + bold commands have the same effect as a single bold + command. + + Italic + causes the affected text to be in an italic font. Nested + italic commands have the same effect as a single italic + command. + + + +Resnick & Walker Informational [Page 4] + +RFC 1896 text/enriched MIME Content-type February 1996 + + + Underline + causes the affected text to be underlined. Nested + underline commands have the same effect as a single + underline command. + + Fixed + causes the affected text to be in a fixed width font. + Nested fixed commands have the same effect as a single + fixed command. + + FontFamily + causes the affected text to be displayed in a specified + typeface. The "fontfamily" command requires a parameter + that is specified by using the "param" command. The + parameter data is a case-insensitive string containing + the name of a font family. Any currently available font + family name (e.g. Times, Palatino, Courier, etc.) may be + used. This includes font families defined by commercial + type foundries such as Adobe, BitStream, or any other + such foundry. Note that implementations should only use + the general font family name, not the specific font name + (e.g. 
use "Times", not "TimesRoman" nor + "TimesBoldItalic"). When nested, the inner "fontfamily" + command takes precedence. Also note that the "fontfamily" + command is advisory only; it should not be expected that + other implementations will honor the typeface information + in this command since the font capabilities of systems + vary drastically. + + Color + causes the affected text to be displayed in a specified + color. The "color" command requires a parameter that is + specified by using the "param" command. The parameter + data can be one of the following: + + red + blue + green + yellow + cyan + magenta + black + white + + or an RGB color value in the form: + + ####,####,#### + + + + +Resnick & Walker Informational [Page 5] + +RFC 1896 text/enriched MIME Content-type February 1996 + + + where '#' is a hexadecimal digit '0' through '9', 'A' + through 'F', or 'a' through 'f'. The three 4-digit + hexadecimal values are the RGB values for red, green, and + blue respectively, where each component is expressed as + an unsigned value between 0 (0000) and 65535 (FFFF). The + default color for the message is unspecified, though + black is a common choice in many environments. When + nested, the inner "color" command takes precedence. + + Smaller + causes the affected text to be in a smaller font. It is + recommended that the font size be changed by two points, + but other amounts may be more appropriate in some + environments. Nested smaller commands produce ever + smaller fonts, to the limits of the implementation's + capacity to reasonably display them, after which further + smaller commands have no incremental effect. + + Bigger + causes the affected text to be in a bigger font. It is + recommended that the font size be changed by two points, + but other amounts may be more appropriate in some + environments. 
Nested bigger commands produce ever bigger + fonts, to the limits of the implementation's capacity to + reasonably display them, after which further bigger + commands have no incremental effect. + + While the "bigger" and "smaller" operators are effectively inverses, + it is not recommended, for example, that "<smaller>" be used to end + the effect of "<bigger>". This is properly done with "</bigger>". + + Since the capabilities of implementations will vary, it is to be + expected that some implementations will not be able to act on some of + the font-alteration commands. However, an implementation should still + display the text to the user in a reasonable fashion. In particular, + the lack of capability to display a particular font family, color, or + other text attribute does not mean that an implementation should fail + to display text. + +Fill/Justification/Indentation Commands + + Initially, text/enriched text is intended to be displayed fully + filled (that is, using the rules specified for replacing CRLF pairs + with spaces or removing them as appropriate) with appropriate kerning + and letter-tracking, and using the maximum available margins as suits + the capabilities of the receiving user agent software. + + + + + +Resnick & Walker Informational [Page 6] + +RFC 1896 text/enriched MIME Content-type February 1996 + + + The following commands alter that state. Each of these commands forces + a line break before and after the formatting environment if there is + not otherwise a line break. For example, if one of these commands + occurs anywhere other than the beginning of a line of text as + presented, a new line is begun. + + Center + causes the affected text to be centered. + + FlushLeft + causes the affected text to be left-justified with a + ragged right margin. + + FlushRight + causes the affected text to be right-justified with a + ragged left margin. 
+ + FlushBoth + causes the affected text to be filled and padded so as to + create smooth left and right margins, i.e., to be fully + justified. + + ParaIndent + causes the running margins of the affected text to be + moved in. The recommended indentation change is the width + of four characters, but this may differ among + implementations. The "paraindent" command requires a + parameter that is specified by using the "param" command. + The parameter data is a comma-separated list of one or + more of the following: + + Left + causes the running left margin to be moved to the + right. + + Right + causes the running right margin to be moved to the + left. + + In + causes the first line of the affected paragraph to + be indented in addition to the running margin. The + remaining lines remain flush to the running margin. + + Out + causes all lines except for the first line of the + affected paragraph to be indented in addition to the + running margin. The first line remains flush to the + + + +Resnick & Walker Informational [Page 7] + +RFC 1896 text/enriched MIME Content-type February 1996 + + + running margin. + + Nofill + causes the affected text to be displayed without filling. + That is, the text is displayed without using the rules + for replacing CRLF pairs with spaces or removing + consecutive sequences of CRLF pairs. However, the current + state of the margins and justification is honored; any + indentation or justification commands are still applied + to the text within the scope of the "nofill". + + The "center", "flushleft", "flushright", and "flushboth" commands are + mutually exclusive, and, when nested, the inner command takes + precedence. + + The "nofill" command is mutually exclusive with the "in" and "out" + parameters of the "paraindent" command; when they occur in the same + scope, their behavior is undefined. + + The parameter data for the "paraindent" command may contain multiple + occurrences of the same parameter (i.e. "left", "right", "in", or + "out"). Each occurrence causes the text to be further indented in the + manner indicated by that parameter. Nested "paraindent" commands + cause the affected text to be further indented according to the + parameters. Note that the "in" and "out" parameters for "paraindent" + are mutually exclusive; when they appear together or when nested + "paraindent" commands contain both of them, their behavior is + undefined. + + For purposes of the "in" and "out" parameters, a paragraph is defined + as text that is delimited by line breaks after applying the rules for + replacing CRLF pairs with spaces or removing consecutive sequences of + CRLF pairs. For example, within the scope of an "out", the line + following each CRLF is made flush with the running margin, and + subsequent lines are indented. Within the scope of an "in", the first + line following each CRLF is indented, and subsequent lines remain + flush to the running margin. + + Whether or not text is justified by default (that is, whether the + default environment is "flushleft", "flushright", or "flushboth") is + unspecified, and depends on the preferences of the user, the + capabilities of the local software and hardware, and the nature of + the character set in use. On systems where full justification is + considered undesirable, the "flushboth" environment may be identical + to the default environment. Note that full justification should never + be performed inside of "center", "flushleft", "flushright", or + "nofill" environments. Note also that for some non-ASCII character + sets, full justification may be fundamentally inappropriate. + + + +Resnick & Walker Informational [Page 8] + +RFC 1896 text/enriched MIME Content-type February 1996 + + + Note that [RFC-1563] defined two additional indentation commands, + "Indent" and "IndentRight". 
These commands did not force a line + break, and therefore their behavior was unpredictable since they + depended on the margins and character sizes that a particular + implementation used. Therefore, their use is deprecated and they + should be ignored just as other unrecognized commands. + +Markup Commands + + Commands in this section, unlike the other text/enriched commands are + declarative markup commands. Text/enriched is not intended as a full + markup language, but instead as a simple way to represent common + formatting commands. Therefore, markup commands are purposely kept to + a minimum. It is only because each was deemed so prevalent or + necessary in an e-mail environment that these particular commands + have been included at all. + + Excerpt + causes the affected text to be interpreted as a textual + excerpt from another source, probably a message being + responded to. Typically this will be displayed using + indentation and an alternate font, or by indenting lines + and preceding them with "> ", but such decisions are up + to the implementation. Note that as with the + justification commands, the excerpt command implicitly + begins and ends with a line break if one is not already + there. Nested "excerpt" commands are acceptable and + should be interpreted as meaning that the excerpted text + was excerpted from yet another source. Again, this can be + displayed using additional indentation, different colors, + etc. + + Optionally, the "excerpt" command can take a parameter by + using the "param" command. The format of the data is + unspecified, but it is intended to uniquely identify the + text from which the excerpt is taken. With this + information, an implementation should be able to uniquely + identify the source of any particular excerpt, especially + if two or more excerpts in the message are from the same + source, and display it in some way that makes this + apparent to the user. 
+ + Lang + causes the affected text to be interpreted as belonging + to a particular language. This is most useful when two + different languages use the same character set, but may + require a different font or formatting depending on the + language. For instance, Chinese and Japanese share + + + +Resnick & Walker Informational [Page 9] + +RFC 1896 text/enriched MIME Content-type February 1996 + + + similar character glyphs, and in some character sets like + UNICODE share common code points, but it is considered + very important that different fonts be used for the two + languages, especially if they appear together, so that + meaning is not lost. Also, language information can be + used to allow for fancier text handling, like spell + checking or hyphenation. + + The "lang" command requires a parameter using the "param" + command. The parameter data can be any of the language + tags specified in [RFC-1766], "Tags for the + Identification of Languages". These tags are the two + letter language codes taken from [ISO-639] or can be + other language codes that are registered according to the + instructions in the Language Tags RFC. Consult that memo + for further information. + +Balancing and Nesting of Formatting Commands + + Pairs of formatting commands must be properly balanced and nested. + Thus, a proper way to describe text in bold italics is: + + <bold><italic>the-text</italic></bold> + + or, alternately, + + <italic><bold>the-text</bold></italic> + + but, in particular, the following is illegal text/enriched: + + <bold><italic>the-text</bold></italic> + + The nesting requirement for formatting commands imposes a slightly + higher burden upon the composers of text/enriched bodies, but + potentially simplifies text/enriched displayers by allowing them to + be stack-based. 
The main goal of text/enriched is to be simple enough + to make multifont, formatted email widely readable, so that those + with the capability of sending it will be able to do so with + confidence. Thus slightly increased complexity in the composing + software was deemed a reasonable tradeoff for simplified reading + software. Nonetheless, implementors of text/enriched readers are + encouraged to follow the general Internet guidelines of being + conservative in what you send and liberal in what you accept. Those + implementations that can do so are encouraged to deal reasonably with + improperly nested text/enriched data. + + + + + + +Resnick & Walker Informational [Page 10] + +RFC 1896 text/enriched MIME Content-type February 1996 + + +Unrecognized formatting commands + + Implementations must regard any unrecognized formatting command as + "no-op" commands, that is, as commands having no effect, thus + facilitating future extensions to "text/enriched". Private extensions + may be defined using formatting commands that begin with "X-", by + analogy to Internet mail header field names. + + In order to formally define extended commands, a new Internet + document should be published. + +White Space in Text/enriched Data + + No special behavior is required for the SPACE or TAB (HT) character. + It is recommended, however, that, at least when fixed-width fonts are + in use, the common semantics of the TAB (HT) character should be + observed, namely that it moves to the next column position that is a + multiple of 8. (In other words, if a TAB (HT) occurs in column n, + where the leftmost column is column 0, then that TAB (HT) should be + replaced by 8-(n mod 8) SPACE characters.) It should also be noted + that some mail gateways are notorious for losing (or, less commonly, + adding) white space at the end of lines, so reliance on SPACE or TAB + characters at the end of a line is not recommended. 
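The TAB rule in the white space section above (a TAB in column n, counting from 0, becomes 8-(n mod 8) SPACE characters) can be illustrated with a short Python sketch. This is only an illustration and is not part of RFC 1896; the function name `expand_tabs` and its `tab_width` parameter are assumptions introduced here.

```python
def expand_tabs(line: str, tab_width: int = 8) -> str:
    """Expand TAB (HT) characters in a single line of text.

    A TAB found in column n (leftmost column is 0) is replaced by
    tab_width - (n mod tab_width) spaces, so the text that follows
    starts at the next column that is a multiple of tab_width.
    """
    out = []
    column = 0  # current output column, 0-based
    for ch in line:
        if ch == "\t":
            pad = tab_width - (column % tab_width)
            out.append(" " * pad)
            column += pad
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```

For a single line without embedded line breaks, this should agree with Python's built-in `str.expandtabs(8)`; the explicit loop is shown only to make the 8-(n mod 8) arithmetic visible.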
+ +Initial State of a text/enriched interpreter + + Text/enriched is assumed to begin with filled text in a variable- + width font in a normal typeface and a size that is average for the + current display and user. The left and right margins are assumed to + be maximal, that is, at the leftmost and rightmost acceptable + positions. + +Non-ASCII character sets + + One of the great benefits of MIME is the ability to use different + varieties of non-ASCII text in messages. To use non-ASCII text in a + message, normally a charset parameter is specified in the Content- + type line that indicates the character set being used. For purposes + of this RFC, any legal MIME charset parameter can be used with the + text/enriched Content-type. However, there are two difficulties that + arise with regard to the text/enriched Content-type when non-ASCII + text is desired. The first problem involves difficulties that occur + when the user wishes to create text which would normally require + multiple non-ASCII character sets in the same text/enriched message. + The second problem is an ambiguity that arises because of the + text/enriched use of the "<" character in formatting commands. + + + + + +Resnick & Walker Informational [Page 11] + +RFC 1896 text/enriched MIME Content-type February 1996 + + +Using multiple non-ASCII character sets + + Normally, if a user wishes to produce text which contains characters + from entirely different character sets within the same MIME message + (for example, using Russian Cyrillic characters from ISO 8859-5 and + Hebrew characters from ISO 8859-8), a multipart message is used. + Every time a new character set is desired, a new MIME body part is + started with different character sets specified in the charset + parameter of the Content-type line. However, using multiple character + sets this way in text/enriched messages introduces problems. 
Since a + change in the charset parameter requires a new part, text/enriched + formatting commands used in the first part would not be able to apply + to text that occurs in subsequent parts. It is not possible for + text/enriched formatting commands to apply across MIME body part + boundaries. + + [RFC-1341] attempted to get around this problem in the now obsolete + text/richtext format by introducing different character set + formatting commands like "iso-8859-5" and "us-ascii". But this, or + even a more general solution along the same lines, is still + undesirable: It is common for a MIME application to decide, for + example, what character font resources or character lookup tables it + will require based on the information provided by the charset + parameter of the Content-type line, before it even begins to + interpret or display the data in that body part. By allowing the + text/enriched interpreter to subsequently change the character set, + perhaps to one completely different from the charset specified in the + Content-type line (with potentially much different resource + requirements), too much burden would be placed on the text/enriched + interpreter itself. + + Therefore, if multiple types of non-ASCII characters are desired in a + text/enriched document, one of the following two methods must be + used: + + 1. For cases where the different types of non-ASCII text can be + limited to their own paragraphs with distinct formatting, a + multipart message can be used with each part having a + Content-Type of text/enriched and a different charset parameter. + The one caveat to using this method is that each new part must + start in the initial state for a text/enriched document. That + means that all of the text/enriched commands in the preceding + part must be properly balanced with ending commands before the + next text/enriched part begins. Also, each text/enriched part + must begin a new paragraph. 
+ + + + + + +Resnick & Walker Informational [Page 12] + +RFC 1896 text/enriched MIME Content-type February 1996 + + + 2. If different types of non-ASCII text are to appear in the same + line or paragraph, or if text/enriched formatting (e.g. margins, + typeface, justification) is required across several different + types of non-ASCII text, a single text/enriched body part should + be used with a character set specified that contains all of the + required characters. For example, a charset parameter of + "UNICODE-1-1-UTF-7" as specified in [RFC-1642] could be used for + such purposes. Not only does UNICODE contain all of the + characters that can be represented in all of the other registered + ISO 8859 MIME character sets, but UTF-7 is fully compatible with + other aspects of the text/enriched standard, including the use of + the "<" character referred to below. Any other character sets + that are specified for use in MIME which contain different types + of non-ASCII text can also be used in these instances. + +Use of the "<" character in formatting commands + + If the character set specified by the charset parameter on the + Content-type line is anything other than "US-ASCII", this means that + the text being described by text/enriched formatting commands is in a + non-ASCII character set. However, the commands themselves are still + the same ASCII commands that are defined in this document. This + creates an ambiguity only with reference to the "<" character, the + octet with numeric value 60. In single byte character sets, such as + the ISO-8859 family, this is not a problem; the octet 60 can be + quoted by including it twice, just as for ASCII. The problem is more + complicated, however, in the case of multi-byte character sets, where + the octet 60 might appear at any point in the byte sequence for any + of several characters. + + In practice, however, most multi-byte character sets address this + problem internally. 
For example, the UNICODE character sets can use + the UTF-7 encoding which preserves all of the important ASCII + characters in their single byte form. The ISO-2022 family of + character sets can use certain character sequences to switch back + into ASCII at any moment. Therefore it is specified that, before + text/enriched formatting commands, the prevailing character set + should be "switched back" into ASCII, and that only those characters + which would be interpreted as "<" in plain text should be interpreted + as token delimiters in text/enriched. + + The question of what to do for hypothetical future character sets + that do not subsume ASCII is not addressed in this memo. + + + + + + + + +Resnick & Walker Informational [Page 13] + +RFC 1896 text/enriched MIME Content-type February 1996 + + +Minimal text/enriched conformance + + A minimal text/enriched implementation is one that converts "<<" to + "<", removes everything between a <param> command and the next + balancing </param> command, removes all other formatting commands + (all text enclosed in angle brackets), and, outside of <nofill> + environments, converts any series of n CRLFs to n-1 CRLFs, and + converts any lone CRLF pairs to SPACE. + +Notes for Implementors + + It is recognized that implementors of future mail systems will want + rich text functionality far beyond that currently defined for + text/enriched. The intent of text/enriched is to provide a common + format for expressing that functionality in a form in which much of + it, at least, will be understood by interoperating software. Thus, in + particular, software with a richer notion of formatted text than + text/enriched can still use text/enriched as its basic + representation, but can extend it with new formatting commands and by + hiding information specific to that software system in text/enriched + <param> constructs. 
As such systems evolve, it is expected that the + definition of text/enriched will be further refined by future + published specifications, but text/enriched as defined here provides + a platform on which evolutionary refinements can be based. + + An expected common way that sophisticated mail programs will generate + text/enriched data is as part of a multipart/alternative construct. + For example, a mail agent that can generate enriched mail in ODA + format can generate that mail in a more widely interoperable form by + generating both text/enriched and ODA versions of the same data, + e.g.: + + Content-type: multipart/alternative; boundary=foo + + --foo + Content-type: text/enriched + + [text/enriched version of data] + --foo + Content-type: application/oda + + [ODA version of data] + --foo-- + + If such a message is read using a MIME-conformant mail reader that + understands ODA, the ODA version will be displayed; otherwise, the + text/enriched version will be shown. + + + + + +Resnick & Walker Informational [Page 14] + +RFC 1896 text/enriched MIME Content-type February 1996 + + + In some environments, it might be impossible to combine certain + text/enriched formatting commands, whereas in others they might be + combined easily. For example, the combination of <bold> and <italic> + might produce bold italics on systems that support such fonts, but + there exist systems that can make text bold or italicized, but not + both. In such cases, the most recently issued (innermost) recognized + formatting command should be preferred. + + One of the major goals in the design of text/enriched was to make it + so simple that even text-only mailers will implement enriched-to- + plain-text translators, thus increasing the likelihood that enriched + text will become "safe" to use very widely. To demonstrate this + simplicity, an extremely simple C program that converts text/enriched + input into plain text output is included in Appendix A. + 
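The minimal conformance rules stated earlier ("<<" becomes "<", parameter data is dropped, all other formatting commands are removed, and outside <nofill> a run of n CRLFs becomes n-1 line breaks while a lone CRLF becomes a SPACE) can be sketched in a few lines of Python. This is a hedged illustration only: it is not the Appendix A program, the names `TOKEN` and `enriched_to_plain` are hypothetical, and it assumes that output line breaks may be written as "\n" and that bare LF input should be treated like CRLF.

```python
import re

# One token is: a quoted "<<", a formatting command such as <bold> or
# </bold> (1 to 60 alphanumeric or hyphen characters), or a run of line breaks.
TOKEN = re.compile(r"<<|<(/?)([A-Za-z0-9-]{1,60})>|(?:\r?\n)+")


def enriched_to_plain(text: str) -> str:
    """Reduce text/enriched data to plain text (minimal conformance sketch)."""
    out = []
    pos = 0           # end of the previous token
    param_depth = 0   # > 0 while inside <param> ... </param>
    nofill_depth = 0  # > 0 while inside <nofill> ... </nofill>
    for m in TOKEN.finditer(text):
        if param_depth == 0:
            out.append(text[pos:m.start()])  # literal text before the token
        pos = m.end()
        tok = m.group(0)
        if tok == "<<":
            if param_depth == 0:
                out.append("<")
        elif tok.startswith("<"):
            closing = bool(m.group(1))
            name = m.group(2).lower()  # commands are case-insensitive
            step = -1 if closing else 1
            if name == "param":
                param_depth = max(0, param_depth + step)
            elif name == "nofill" and param_depth == 0:
                nofill_depth = max(0, nofill_depth + step)
            # every other command, recognized or not, is simply dropped
        elif param_depth == 0:
            n = tok.count("\n")  # number of consecutive line breaks
            if nofill_depth:
                out.append("\n" * n)
            elif n == 1:
                out.append(" ")
            else:
                out.append("\n" * (n - 1))
    if param_depth == 0:
        out.append(text[pos:])
    return "".join(out)
```

The sketch does not add the forced line breaks around commands such as <paraindent> or <excerpt>, since the minimal conformance rules do not ask for them, and it tolerates unbalanced commands rather than rejecting them.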
Extensions to text/enriched

   It is expected that various mail system authors will desire
   extensions to text/enriched. The simple syntax of text/enriched, and
   the specification that unrecognized formatting commands should simply
   be ignored, are intended to promote such extensions.

An Example

   Putting all this together, the following "text/enriched" body
   fragment:

     From: Nathaniel Borenstein <nsb@bellcore.com>
     To: Ned Freed <ned@innosoft.com>
     Content-type: text/enriched

     <bold>Now</bold> is the time for <italic>all</italic>
     good men
     <smaller>(and <<women>)</smaller> to
     <ignoreme>come</ignoreme>

     to the aid of their


     <color><param>red</param>beloved</color>
     country.

     By the way,
     I think that <paraindent><param>left</param><<smaller>

     </paraindent>should REALLY be called

     <paraindent><param>left</param><<tinier></paraindent>

     and that I am always right.

     -- the end

   represents the following formatted text (which will, no doubt, look
   somewhat cryptic in the text-only version of this document):

     Now is the time for all good men (and <women>) to come
     to the aid of their

     beloved country.
     By the way, I think that
          <smaller>
     should REALLY be called
          <tinier>
     and that I am always right.
     -- the end

   where the word "beloved" would be in red on a color display.

Security Considerations

   Security issues are not discussed in this memo, as the mechanism
   raises no security issues.

Authors' Addresses

   For more information, the authors of this document may be contacted
   via Internet mail:

   Peter W. Resnick
   QUALCOMM Incorporated
   6455 Lusk Boulevard
   San Diego, CA 92121-2779

   Phone: +1 619 587 1121
   Fax: +1 619 658 2230
   EMail: presnick@qualcomm.com


   Amanda Walker
   InterCon Systems Corporation
   950 Herndon Parkway
   Herndon, VA 22070

   Phone: +1 703 709 5500
   Fax: +1 703 709 5555
   EMail: amanda@intercon.com

Acknowledgements

   The authors gratefully acknowledge the input of many contributors,
   readers, and implementors of the specification in this document.
   Particular thanks are due to Nathaniel Borenstein, the original
   author of RFC 1563.

References

   [RFC-1341]
        Borenstein, N., and N. Freed, "MIME (Multipurpose Internet Mail
        Extensions): Mechanisms for Specifying and Describing the Format
        of Internet Message Bodies", 06/11/1992.

   [RFC-1521]
        Borenstein, N., and N. Freed, "MIME (Multipurpose Internet Mail
        Extensions) Part One: Mechanisms for Specifying and Describing
        the Format of Internet Message Bodies", 09/23/1993.

   [RFC-1523]
        Borenstein, N., "The text/enriched MIME Content-type",
        09/23/1993.

   [RFC-1563]
        Borenstein, N., "The text/enriched MIME Content-type",
        01/10/1994.

   [RFC-1642]
        Goldsmith, D., and M. Davis, "UTF-7 - A Mail-Safe Transformation
        Format of Unicode", 07/13/1994.

   [RFC-1766]
        Alvestrand, H., "Tags for the Identification of Languages",
        03/02/1995.

   [RFC-1866]
        Berners-Lee, T., and D. Connolly, "Hypertext Markup Language
        - 2.0", 11/03/1995.

Appendix A--A Simple enriched-to-plain Translator in C

   One of the major goals in the design of the text/enriched subtype of
   the text Content-Type is to make formatted text so simple that even
   text-only mailers will implement enriched-to-plain-text translators,
   thus increasing the likelihood that multifont text will become "safe"
   to use very widely. To demonstrate this simplicity, what follows is a
   simple C program that converts text/enriched input into plain text
   output. Note that the local newline convention (the single character
   represented by "\n") is assumed by this program, but that special
   CRLF handling might be necessary on some systems.

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    int c, i, paramct=0, newlinect=0, nofill=0;
    char token[62], *p;

    while ((c=getc(stdin)) != EOF) {
        if (c == '<') {
            /* a pending lone newline becomes a space */
            if (newlinect == 1) putc(' ', stdout);
            newlinect = 0;
            c = getc(stdin);
            if (c == '<') {
                /* "<<" is a literal "<" */
                if (paramct <= 0) putc(c, stdout);
            } else {
                ungetc(c, stdin);
                /* read the command name, lowercased, up to '>' */
                for (i=0, p=token;
                    (c=getc(stdin)) != EOF && c != '>'; i++) {
                    if (i < (int)sizeof(token)-1)
                        *p++ = isupper(c) ? tolower(c) : c;
                }
                *p = '\0';
                if (c == EOF) break;
                if (strcmp(token, "param") == 0)
                    paramct++;
                else if (strcmp(token, "nofill") == 0)
                    nofill++;
                else if (strcmp(token, "/param") == 0)
                    paramct--;
                else if (strcmp(token, "/nofill") == 0)
                    nofill--;
                /* all other commands are simply dropped */
            }
        } else {
            if (paramct > 0)
                ; /* ignore params */
            else if (c == '\n' && nofill <= 0) {
                /* n newlines become n-1 newlines */
                if (++newlinect > 1) putc(c, stdout);
            } else {
                if (newlinect == 1) putc(' ', stdout);
                newlinect = 0;
                putc(c, stdout);
            }
        }
    }
    /* The following line is only needed with line-buffering */
    putc('\n', stdout);
    exit(0);
}

   It should be noted that one can do considerably better than this in
   displaying text/enriched data on a dumb terminal. In particular, one
   can replace font information such as "bold" with textual emphasis
   (like *this* or _T_H_I_S_). One can also properly handle the
   text/enriched formatting commands regarding indentation,
   justification, and others. However, the above program is all that is
   necessary in order to present text/enriched on a dumb terminal
   without showing the user any formatting artifacts.

Appendix B--A Simple enriched-to-HTML Translator in C

   It is fully expected that other text formatting standards like HTML
   and SGML will supplant text/enriched in Internet mail. It is also
   likely that as this happens, recipients of text/enriched mail will
   wish to view such mail with an HTML viewer. To this end, the
   following is a simple example of a C program to convert text/enriched
   to HTML. Since the current version of HTML at the time of this
   document's publication is HTML 2.0 defined in [RFC-1866], this
   program converts to that standard. There are several text/enriched
   commands that have no HTML 2.0 equivalent. In those cases, this
   program simply puts those commands into processing instructions; that
   is, surrounded by "<?"
   and ">". As in Appendix A, the local newline
   convention (the single character represented by "\n") is assumed by
   this program, but special CRLF handling might be necessary on some
   systems.

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    int c, i, paramct=0, nofill=0;
    char token[62], *p;

    while((c=getc(stdin)) != EOF) {
        if(c == '<') {
            c = getc(stdin);
            if(c == '<') {
                /* "<<" is a literal "<", which HTML needs escaped */
                fputs("&lt;", stdout);
            } else {
                ungetc(c, stdin);
                /* read the command name, lowercased, up to '>' */
                for (i=0, p=token;
                    (c=getc(stdin)) != EOF && c != '>'; i++) {
                    if (i < (int)sizeof(token)-1)
                        *p++ = isupper(c) ? tolower(c) : c;
                }
                *p = '\0';
                if(c == EOF) break;
                if(strcmp(token, "/param") == 0) {
                    /* close the "<?param ..." processing instruction */
                    paramct--;
                    putc('>', stdout);
                } else if(paramct > 0) {
                    /* commands inside a param are emitted as text */
                    fputs("&lt;", stdout);
                    fputs(token, stdout);
                    fputs("&gt;", stdout);
                } else {
                    putc('<', stdout);
                    if(strcmp(token, "nofill") == 0) {
                        nofill++;
                        fputs("pre", stdout);
                    } else if(strcmp(token, "/nofill") == 0) {
                        nofill--;
                        fputs("/pre", stdout);
                    } else if(strcmp(token, "bold") == 0) {
                        fputs("b", stdout);
                    } else if(strcmp(token, "/bold") == 0) {
                        fputs("/b", stdout);
                    } else if(strcmp(token, "italic") == 0) {
                        fputs("i", stdout);
                    } else if(strcmp(token, "/italic") == 0) {
                        fputs("/i", stdout);
                    } else if(strcmp(token, "fixed") == 0) {
                        fputs("tt", stdout);
                    } else if(strcmp(token, "/fixed") == 0) {
                        fputs("/tt", stdout);
                    } else if(strcmp(token, "excerpt") == 0) {
                        fputs("blockquote", stdout);
                    } else if(strcmp(token, "/excerpt") == 0) {
                        fputs("/blockquote", stdout);
                    } else {
                        /* no HTML 2.0 equivalent: processing instruction */
                        putc('?', stdout);
                        fputs(token, stdout);
                        if(strcmp(token, "param") == 0) {
                            paramct++;
                            putc(' ', stdout);
                            continue;
                        }
                    }
                    putc('>', stdout);
                }
            }
        } else if(c == '>') {
            fputs("&gt;", stdout);
        } else if (c == '&') {
            fputs("&amp;", stdout);
        } else {
            if(c == '\n' && nofill <= 0 && paramct <= 0) {
                /* each additional newline becomes a line break */
                while((i=getc(stdin)) == '\n') fputs("<br>", stdout);
                ungetc(i, stdin);
            }
            putc(c, stdout);
        }
    }
    /* The following line is only needed with line-buffering */
    putc('\n', stdout);
    exit(0);
}
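
   The character escaping that the program above applies to text
   content ("<", ">" and "&" become HTML entities) can be isolated in a
   small helper. The following is a sketch under assumptions: the names
   html_entity and html_escape, and the bounded-buffer calling
   convention, are hypothetical and are not part of the original
   program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the HTML entity for c, or NULL if c
 * can be copied through unchanged. */
static const char *html_entity(char c)
{
    switch (c) {
    case '<': return "&lt;";
    case '>': return "&gt;";
    case '&': return "&amp;";
    default:  return NULL;
    }
}

/*
 * Hypothetical helper: copies `in` to `out`, replacing "<", ">" and
 * "&" with entities. At most outsize - 1 bytes are written, followed
 * by a terminating NUL. A character that does not fit completely is
 * not written at all, so no partial entity is ever produced.
 * Returns the number of bytes written, excluding the NUL.
 */
size_t html_escape(const char *in, char *out, size_t outsize)
{
    size_t n = 0;

    assert(in != NULL && out != NULL && outsize > 0);
    for (; *in != '\0'; in++) {
        const char *ent = html_entity(*in);
        char single[2] = { *in, '\0' };
        const char *piece = (ent != NULL) ? ent : single;
        size_t len = strlen(piece);

        if (n + len >= outsize) break;  /* stop before overflowing */
        memcpy(out + n, piece, len);
        n += len;
    }
    out[n] = '\0';
    return n;
}
```

   Stopping before a partial entity is a design choice of this sketch;
   a caller that must not lose data would instead size the buffer at
   five times the input length plus one.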