diff options
author | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
---|---|---|
committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
commit | 4bfd864f10b68b71482b35c818559068ef8d5797 (patch) | |
tree | e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc1563.txt | |
parent | ea76e11061bda059ae9f9ad130a9895cc85607db (diff) |
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc1563.txt')
-rw-r--r-- | doc/rfc/rfc1563.txt | 899 |
1 files changed, 899 insertions, 0 deletions
diff --git a/doc/rfc/rfc1563.txt b/doc/rfc/rfc1563.txt new file mode 100644 index 0000000..86730fd --- /dev/null +++ b/doc/rfc/rfc1563.txt @@ -0,0 +1,899 @@ + + + + + + +Network Working Group N. Borenstein +Request for Comments: 1563 Bellcore +Obsoletes: 1523 January 1994 +Category: Informational + + + The text/enriched MIME Content-type + +Status of this Memo + + This memo provides information for the Internet community. This memo + does not specify an Internet standard of any kind. Distribution of + this memo is unlimited. + +Abstract + + MIME [RFC-1341, RFC-1521] defines a format and general framework for + the representation of a wide variety of data types in Internet mail. + This document defines one particular type of MIME data, the + text/enriched type, a refinement of the "text/richtext" type defined + in RFC 1341. The text/enriched MIME type is intended to facilitate + the wider interoperation of simple enriched text across a wide + variety of hardware and software platforms. + +Table of Contents + + The Text/enriched MIME type.............................. 2 + Formatting Commands...................................... 4 + Font-Alteration Commands........................... 4 + Fill/Justification Commands........................ 5 + Indentation Commands............................... 6 + Miscellaneous Commands............................. 6 + Balancing and Nesting of Formatting Commands....... 7 + Unrecognized formatting commands................... 8 + White Space in Text/enriched Data........................ 8 + Initial State of a text/enriched interpreter............. 8 + Non-ASCII character sets................................. 8 + Minimal text/enriched conformance........................ 9 + Notes for Implementors................................... 9 + Extensions to text/enriched.............................. 10 + An Example............................................... 11 + Security Considerations.................................. 12 + Author's Address......................................... 12 + Acknowledgements......................................... 12 + References............................................... 12 + Appendix A -- A Simple enriched-to-plain Translator in C. 13 + Appendix B -- Differences from RFC 1341 text/richtext.... 15 + + + + +Borenstein [Page 1] + +RFC 1563 A text/enriched type for MIME January 1994 + + +The Text/enriched MIME type + + In order to promote the wider interoperability of simple formatted + text, this document defines an extremely simple subtype of the MIME + content-type "text", the "text/enriched" subtype. This subtype was + designed to meet the following criteria: + + 1. The syntax must be extremely simple to parse, + so that even teletype-oriented mail systems can + easily strip away the formatting information and + leave only the readable text. + + 2. The syntax must be extensible to allow for new + formatting commands that are deemed essential for + some application. + + 3. If the character set in use is ASCII or an 8- + bit ASCII superset, then the raw form of the data + must be readable enough to be largely + unobjectionable in the event that it is displayed + on the screen of the user of a non-MIME-conformant + mail reader. + + 4. The capabilities must be extremely limited, to + ensure that it can represent no more than is + likely to be representable by the user's primary + word processor. While this limits what can be + sent, it increases the likelihood that what is + sent can be properly displayed. + + This document defines a new MIME content-type, "text/enriched". The + content-type line for this type may have one optional parameter, the + "charset" parameter, with the same values permitted for the + "text/plain" MIME content-type. + + The syntax of "text/enriched" is very simple. It represents text in + a single character set -- US-ASCII by default, although a different + character set can be specified by the use of the "charset" parameter. + (The semantics of text/enriched in non-ASCII character sets are + discussed later in this document.) All characters represent + themselves, with the exception of the "<" character (ASCII 60), which + is used to mark the beginning of a formatting command. Formatting + instructions consist of formatting commands surrounded by angle + brackets ("<>", ASCII 60 and 62). Each formatting command may be no + more than 60 characters in length, all in US-ASCII, restricted to the + alphanumeric and hyphen ("-") characters. Formatting commands may be + preceded by a solidus ("/", ASCII 47), making them negations, and + such negations must always exist to balance the initial opening + + + +Borenstein [Page 2] + +RFC 1563 A text/enriched type for MIME January 1994 + + + commands. Thus, if the formatting command "<bold>" appears at some + point, there must later be a "</bold>" to balance it. (NOTE: The 60 + character limit on formatting commands does NOT include the "<", ">", + or "/" characters that might be attached to such commands.) + + Formatting commands are always case-insensitive. That is, "bold" and + "BoLd" are equivalent in effect, if not in good taste. + + Beyond tokens delimited by "<" and ">", there are two other special + processing rules. First, a literal less-than sign ("<") can be + represented by a sequence of two such characters, "<<". Second, line + breaks (CRLF pairs in standard network representation) are handled + specially. In particular, isolated CRLF pairs are translated into a + single SPACE character. Sequences of N consecutive CRLF pairs, + however, are translated into N-1 actual line breaks. This permits + long lines of data to be represented in a natural- looking manner + despite the frequency of line-wrapping in Internet mailers. When + preparing the data for mail transport, isolated line breaks should be + inserted wherever necessary to keep each line shorter than 80 + characters. When preparing such data for presentation to the user, + isolated line breaks should be replaced by a single SPACE character, + and N consecutive CRLF pairs should be presented to the user as N-1 + line breaks. + + Thus text/enriched data that looks like this: + + This is + a single + line + + This is the + next line. + + + This is the + next paragraph. + + should be displayed by a text/enriched interpreter as follows: + + + This is a single line + This is the next line. + + This is the next paragraph. + + The formatting commands, not all of which will be implemented by all + implementations, are described in the following sections. + + + + +Borenstein [Page 3] + +RFC 1563 A text/enriched type for MIME January 1994 + + +Formatting Commands + + The text/enriched formatting commands all begin with <commandname> + and end with </commandname>, affecting the formatting of the text + between those two tokens. The commands are described here, grouped + according to type. + +Font-Alteration Commands + + The following formatting commands are intended to alter the font in + which text is displayed, but not to alter the indentation or + justification state of the text: + + Bold -- causes the affected text to be in a bold font. Nested + bold commands have the same effect as a single bold + command. + + Italic -- causes the affected text to be in an italic font. + Nested italic commands have the same effect as a single + italic command. + + Fixed -- causes the affected text to be in a fixed width font. + Nested fixed commands have the same effect as a single + fixed command. + + Smaller -- causes the affected text to be in a smaller font. + It is recommended that the font size be changed by two + points, but other amounts may be more appropriate in some + environments. Nested smaller commands produce ever- + smaller fonts, to the limits of the implementation's + capacity to reasonably display them, after which further + smaller commands have no incremental effect. + + Bigger -- causes the affected text to be in a bigger font. It + is recommended that the font size be changed by two + points, but other amounts may be more appropriate in some + environments. Nested bigger commands produce ever-bigger + fonts, to the limits of the implementation's capacity to + reasonably display them, after which further bigger + commands have no incremental effect. + + Underline -- causes the affected text to be underlined. Nested + underline commands have the same effect as a single + underline command. + + While the "bigger" and "smaller" operators are effectively inverses, + it is not recommended, for example, that "<smaller>" be used to end + the effect of "<bigger>". This is properly done with "</bigger>". + + + +Borenstein [Page 4] + +RFC 1563 A text/enriched type for MIME January 1994 + + +Fill/Justification Commands + + Initially, text/enriched text is intended to be displayed fully + filled with appropriate kerning and letter-tracking as suits the + capabilities of the receiving user agent software. Actual line width + is left to the discretion of the receiver, which is expected to fold + lines intelligently (preferring soft line breaks) to the best of its + ability. + + The following commands alter that state. Each of these commands + force a line break before and after the formatting environment if + there is not otherwise a line break. For example, if one of these + commands occurs anywhere other than the beginning of a line of text + as presented, a new line is begun. + + Center -- causes the affected text to be centered. + + FlushLeft -- causes the affected text to be left-justified with a + ragged right margin. + + FlushRight -- causes the affected text to be right-justified with a + ragged left margin. + + FlushBoth -- causes the affected text to be filled and padded so + as to create smooth left and right margins, i.e., to be + fully justified. + + Nofill -- causes the affected text to be displayed without filling + or justification. + + The center, flushleft, flushright, and flushboth commands are + mutually exclusive, and, when nested, the inner command takes + precedence. + + Whether or not text is justified by default (that is, whether the + default environment is flushleft, flushright, or flushboth) is + unspecified, and depends on the preferences of the user, the + capabilities of the local software and hardware, and the nature of + the character set in use. On systems where justification is + considered undesirable, the flushboth environment may be identical to + the default environment. Note that justification should never be + performed inside of center, flushleft, flushright, or nofill + environments. Note also that for some non-ASCII character sets, full + justification may be fundamentally inappropriate. + + + + + + + +Borenstein [Page 5] + +RFC 1563 A text/enriched type for MIME January 1994 + + +Indentation Commands + + Initially, text/enriched text is displayed using the maximum + available margins. Two formatting commands may be used to affect the + margins. + + Indent -- causes the running left margin to be moved to the + right. The recommended indentation change is the width of + four characters, but this may differ among + implementations. + + IndentRight -- causes the running right margin to be moved to + the left. The recommended indentation change is the width + of four characters, but this may differ among + implementations. + + A line break is NOT forced by a change of the margin, to permit the + description of "hanging" text. Thus for example the following text: + + Now <indent> is the time for all good horses to come to the aid of + their stable, assuming that </indent> any stable is really stable. + + would be displayed in a 40-character-wide window as follows: + + Now is the time for all good horses to + come to the aid of their stable, + assuming that any stable is + really stable. + +Miscellaneous Commands + + Excerpt -- causes the affected text to be interpreted as a + textual excerpt from another source, probably a message + being responded to. Typically this will be displayed + using indentation and an alternate font, or by indenting + lines and preceding them with "> ", but such decisions are + up to the implementation. (Note that this is the only + truly declarative markup construct in text/enriched, and + as such doesn't fit very well with the other facilities, + but it describes a type of markup that is very commonly + used in email and has no procedural analogue.) Note that + as with the justification commands, the excerpt command + implicitly begins and ends with a line break if one is not + already there. + + + + + + + +Borenstein [Page 6] + +RFC 1563 A text/enriched type for MIME January 1994 + + + Param -- Marks the affected text as command parameters, to be + interpreted or ignored by the text/enriched interpreter, + but NOT to be shown to the reader. The syntax of the + parameter data (whatever appears between the initial + "<param>" and the terminating "</param>") is left + undefined by this memo, to be defined by text/enriched + extensions in the future. However, the format of such + data must NOT contain nested <param> commands, and either + must NOT use the "<" character or must use it in a way + that is compatible with text/enriched parsing. That is, + the end of the parameter data should be recognizable with + EITHER of two algorithms: simply searching for the first + occurence of "</param>" or parsing until a balanced + "</param>" command is found. In either case, however, the + parameter data should NOT be shown to the human reader. + +Balancing and Nesting of Formatting Commands + + Pairs of formatting commands must be properly balanced and nested. + Thus, a proper way to describe text in bold italics is: + + <bold><italic>the-text</italic></bold> + + or, alternately, + + <italic><bold>the-text</bold></italic> + + but, in particular, the following is illegal + text/enriched: + + <bold><italic>the-text</bold></italic> + + The nesting requirement for formatting commands imposes a slightly + higher burden upon the composers of text/enriched bodies, but + potentially simplifies text/enriched displayers by allowing them to + be stack-based. The main goal of text/enriched is to be simple + enough to make multifont, formatted email widely readable, so that + those with the capability of sending it will be able to do so with + confidence. Thus slightly increased complexity in the composing + software was deemed a reasonable tradeoff for simplified reading + software. Nonetheless, implementors of text/enriched readers are + encouraged to follow the general Internet guidelines of being + conservative in what you send and liberal in what you accept. Those + implementations that can do so are encouraged to deal reasonably with + improperly nested text/enriched data. + + + + + + +Borenstein [Page 7] + +RFC 1563 A text/enriched type for MIME January 1994 + + +Unrecognized formatting commands + + Implementations must regard any unrecognized formatting command as + "no-op" commands, that is, as commands having no effect, thus + facilitating future extensions to "text/enriched". Private + extensions may be defined using formatting commands that begin with + "X-", by analogy to Internet mail header field names. + + In order to formally define extended commands, a new Internet + document should be published. + +White Space in Text/enriched Data + + No special behavior is required for the SPACE or TAB (HT) character. + It is recommended, however, that, at least when fixed-width fonts are + in use, the common semantics of the TAB (HT) character should be + observed, namely that it moves to the next column position that is a + multiple of 8. (In other words, if a TAB (HT) occurs in column n, + where the leftmost column is column 0, then that TAB (HT) should be + replaced by 8-(n mod 8) SPACE characters.) It should also be noted + that some mail gateways are notorious for losing (or, less commonly, + adding) white space at the end of lines, so reliance on SPACE or TAB + characters at the end of a line is not recommended. + +Initial State of a text/enriched interpreter + + Text/enriched is assumed to begin with filled text in a variable- + width font in a normal typeface and a size that is average for the + current display and user. The left and right margins are assumed to + be maximal, that is, at the leftmost and rightmost acceptable + positions. + +Non-ASCII character sets + + If the character set specified by the charset parameter on the + Content-type line is anything other than "US-ASCII", this means that + the text being described by text/enriched formatting commands is in a + non-ASCII character set. However, the commands themselves are still + the same ASCII commands that are defined in this document. This + creates an ambiguity only with reference to the "<" character, the + octet with numeric value 60. In single byte character sets, such as + the ISO-8859 family, this is not a problem; the octet 60 can be + quoted by including it twice, just as for ASCII. The problem is more + complicated, however, in the case of multi-byte character sets, where + the octet 60 might appear at any point in the byte sequence for any + of several characters. + + + + + +Borenstein [Page 8] + +RFC 1563 A text/enriched type for MIME January 1994 + + + In practice, however, most multibyte character sets address this + problem internally. For example, the ISO-2022 family of character + sets can switch back into ASCII at any moment. Therefore it is + specified that, before text/enriched formatting commands, the + prevailing character set should be "switched back" into ASCII, and + that only those characters which would be interpreted as "<" in plain + text should be interpreted as token delimiters in text/enriched. + + The question of what to do for hypothetical future character sets + that do NOT subsume ASCII is not addressed in this memo. + +Minimal text/enriched conformance + + A minimal text/enriched implementation is one that converts "<<" to + "<", removes everything between a <param> command and the next + balancing </param> command, removes all other formatting commands + (all text enclosed in angle brackets), and, outside of <nofill> + environments, converts any series of n CRLFs to n-1 CRLFs, and + converts any lone CRLF pairs to SPACE. + +Notes for Implementors + + It is recognized that implementors of future mail systems will want + rich text functionality far beyond that currently defined for + text/enriched. The intent of text/enriched is to provide a common + format for expressing that functionality in a form in which much of + it, at least, will be understood by interoperating software. Thus, + in particular, software with a richer notion of formatted text than + text/enriched can still use text/enriched as its basic + representation, but can extend it with new formatting commands and by + hiding information specific to that software system in text/enriched + <param> constructs. As such systems evolve, it is expected that the + definition of text/enriched will be further refined by future + published specifications, but text/enriched as defined here provides + a platform on which evolutionary refinements can be based. + + An expected common way that sophisticated mail programs will generate + text/enriched data is as part of a multipart/alternative construct. + For example, a mail agent that can generate enriched mail in ODA + format can generate that mail in a more widely interoperable form by + generating both text/enriched and ODA versions of the same data, + e.g.: + + + + + + + + + +Borenstein [Page 9] + +RFC 1563 A text/enriched type for MIME January 1994 + + + Content-type: multipart/alternative; boundary=foo + + --foo + Content-type: text/enriched + + [text/enriched version of data] + --foo + Content-type: application/oda + + [ODA version of data] + --foo-- + + If such a message is read using a MIME-conformant mail reader that + understands ODA, the ODA version will be displayed; otherwise, the + text/enriched version will be shown. + + In some environments, it might be impossible to combine certain + text/enriched formatting commands, whereas in others they might be + combined easily. For example, the combination of <bold> and <italic> + might produce bold italics on systems that support such fonts, but + there exist systems that can make text bold or italicized, but not + both. In such cases, the most recently issued (innermost) recognized + formatting command should be preferred. + + One of the major goals in the design of text/enriched was to make it + so simple that even text-only mailers will implement enriched-to- + plain-text translators, thus increasing the likelihood that enriched + text will become "safe" to use very widely. To demonstrate this + simplicity, an extremely simple C program that converts text/enriched + input into plain text output is included in Appendix A. + +Extensions to text/enriched + + It is expected that various mail system authors will desire + extensions to text/enriched. The simple syntax of text/enriched, and + the specification that unrecognized formatting commands should simply + be ignored, are intend to promote such extensions. + + Beyond simply defining new formatting commands, however, it may + sometimes be necessary to define formatting commands that can take + arguments. This is the intended use of the <param> construct. In + particular, software that wished to extend text/enriched to include + colored text might define an "x-color" environment which always began + with a color name parameter, to indicate the desired color for the + affected text. + + + + + + +Borenstein [Page 10] + +RFC 1563 A text/enriched type for MIME January 1994 + + +An Example + + Putting all this together, the following "text/enriched" body + fragment: + + From: Nathaniel Borenstein <nsb@bellcore.com> + To: Ned Freed <ned@innosoft.com> + Content-type: text/enriched + + <bold>Now</bold> is the time for + <italic>all</italic> good men + <smaller>(and <<women>)</smaller> to + <ignoreme>come</ignoreme> + + to the aid of their + + + <x-color><param>red</param>beloved</x-color> + country. + + By the way, I think that <<smaller> + + should + + REALLY be called + + <<tinier> + and that I am always right. + + -- the end + + represents the following formatted text (which will, no doubt, look + somewhat cryptic in the text-only version of this document): + + Now is the time for all good men (and <women>) to + come + to the aid of their + + beloved country. + By the way, I think that <smaller> + should + REALLY be called + <tinier> + and that I am always right. + -- the end + + where the word "beloved" would be in red on a color display if the + receiving software implemented the "x-color" extension. + + + +Borenstein [Page 11] + +RFC 1563 A text/enriched type for MIME January 1994 + + +Security Considerations + + Security issues are not discussed in this memo, as the mechanism + raises no security issues. + +Author's Address + + For more information, the author of this document may be contacted + via Internet mail: + + Nathaniel S. Borenstein + MRE 2D-296, Bellcore + 445 South St. + Morristown, NJ 07962-1910 + + Phone: +1 201 829 4270 + Fax: +1 201 829 5963 + EMail: nsb@bellcore.com + +Acknowledgements + + This document reflects the input of many contributors, readers, and + implementors of the original MIME specification, RFC 1341. It also + reflects particular contributions and comments from Terry Crowley, + Rhys Weatherley, and John LoVerso. + +References + + [RFC-1341] Borenstein, N., and N. Freed, "MIME (Multipurpose + Internet Mail Extensions): Mechanisms for Specifying + and Describing the Format of Internet Message Bodies", + RFC 1341, Bellcore, Innosoft, June 1992. + + [RFC-1521] Borenstein, N., and N. Freed, "MIME (Multipurpose + Internet Mail Extensions) Part One: Mechanisms for + Specifying and Describing the Format of Internet + Message Bodies", RFC 1521, Bellcore, Innosoft, + September 1993. + + + + + + + + + + + + + +Borenstein [Page 12] + +RFC 1563 A text/enriched type for MIME January 1994 + + +Appendix A -- A Simple enriched-to-plain Translator in C + + One of the major goals in the design of the text/enriched subtype of + the text Content-Type is to make formatted text so simple that even + text-only mailers will implement enriched-to-plain-text translators, + thus increasing the likelihood that multifont text will become "safe" + to use very widely. To demonstrate this simplicity, what follows is + a simple C program that converts text/enriched input into plain text + output. Note that the local newline convention (the single character + represented by "\n") is assumed by this program, but that special + CRLF handling might be necessary on some systems. + +#include <stdio.h> +#include <ctype.h> + +main() { + int c, i, paramct=0, newlinect=0, nofill=0; + char token[62], *p; + + while ((c=getc(stdin)) != EOF) { + if (c == '<') { + if (newlinect == 1) putc(' ', stdout); + newlinect = 0; + c = getc(stdin); + if (c == '<') { + if (paramct <= 0) putc(c, stdout); + } else { + ungetc(c, stdin); + for (i=0, p=token; (c=getc(stdin)) != EOF && c != '>'; + i++) + { if (i < sizeof(token)-1) + *p++ = isupper(c) ? tolower(c) : c; + } + *p = '\0'; + if (c == EOF) break; + if (strcmp(token, "param") == 0) + paramct++; + else if (strcmp(token, "nofill") == 0) + nofill++; + else if (strcmp(token, "/param") == 0) + paramct--; + else if (strcmp(token, "/nofill") == 0) + nofill--; + } + } else { + if (paramct > 0) + ; /* ignore params */ + else if (c == '\n' && nofill <= 0) { + + + +Borenstein [Page 13] + +RFC 1563 A text/enriched type for MIME January 1994 + + + if (++newlinect > 1) putc(c, stdout); + } else { + if (newlinect == 1) putc(' ', stdout); + newlinect = 0; + putc(c, stdout); + } + } + } + /* The following line is only needed with line-buffering */ + putc('\n', stdout); + exit(0); +} + + It should be noted that one can do considerably better than this in + displaying text/enriched data on a dumb terminal. In particular, one + can replace font information such as "bold" with textual emphasis + (like *this* or _T_H_I_S_). One can also properly handle the + text/enriched formatting commands regarding indentation, + justification, and others. However, the above program is all that is + necessary in order to present text/enriched on a dumb terminal + without showing the user any formatting artifacts. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Borenstein [Page 14] + +RFC 1563 A text/enriched type for MIME January 1994 + + +Appendix B -- Differences from RFC 1341 text/richtext + + Text/enriched is a clarification, simplification, and refinement of + the type defined as text/richtext in RFC 1341. For the benefit of + those who are already familiar with text/richtext, or for those who + want to exploit the similarities to be able to display text/richtext + data with their text/enriched software, the differences between the + two are summarized here. Note, however, that text/enriched is + intended to make text/richtext obsolete, so it is not recommended + that new software generate text/richtext. + + 0. The name "richtext" was changed to "enriched", both to + differentiate the two versions and because "richtext" + created widespread confusion with Microsoft's Rich Text + Format (RTF). + + 1. Clarifications. Many things were ambiguous or + unspecified in the text/richtext definition, particularly + the initial state and the semantics of richtext with + multibyte character sets. However, such differences are + OPERATIONALLY irrelevant, since the clarifications offered + in this document are at least reasonable interpretations of + the text/richtext specification. + + 2. Newline semantics have changed. In text/richtext, all + CRLFs were mapped to spaces, and line breaks were indicated + by "<nl>". This has been replaced by the "n-1" rule for + CRLFs. + + 3. The representation of a literal "<" character was "<lt>" + in text/richtext, but is "<<" in text/enriched. + + 4. The "nofill" command did not exist in text/richtext. + + 5. The "param" command did not exist in text/richtext. + + 6. The following commands from text/richtext have been + REMOVED from text/enriched: <COMMENT>, <OUTDENT>, + <OUTDENTRIGHT>, <SAMEPAGE>, <SUBSCRIPT>, <SUPERSCRIPT>, + <HEADING>, <FOOTING>, <ISO-8859-[1-9]>, <US-ASCII>, + <PARAGRAPH>, <SIGNATURE>, <NO-OP>, <LT>, <NL>, and <NP>. + + 7. All claims of SGML compatibility have been dropped. + However, with the possible exceptions of the new semantics + for CRLF and "<<" can be implemented, text/enriched should + be no less SGML-friendly than text/richtext was. + + + + + +Borenstein [Page 15] + +RFC 1563 A text/enriched type for MIME January 1994 + + + 8. In text/richtext, there were three commands (<NL>, <NP>, + and <LT>) that did not use balanced closing delimiters. + Since all of these have been eliminated, there are NO + exceptions to the nesting/balancing rules in text/enriched. + + 9. The limit on the size of formatting tokens has been + increased from 40 to 60 characters. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Borenstein [Page 16] +
\ No newline at end of file |