diff options
Diffstat (limited to 'doc/rfc/rfc2046.txt')
-rw-r--r-- | doc/rfc/rfc2046.txt | 2467 |
1 files changed, 2467 insertions, 0 deletions
diff --git a/doc/rfc/rfc2046.txt b/doc/rfc/rfc2046.txt new file mode 100644 index 0000000..84d90c1 --- /dev/null +++ b/doc/rfc/rfc2046.txt @@ -0,0 +1,2467 @@ + + + + + + +Network Working Group N. Freed +Request for Comments: 2046 Innosoft +Obsoletes: 1521, 1522, 1590 N. Borenstein +Category: Standards Track First Virtual + November 1996 + + + Multipurpose Internet Mail Extensions + (MIME) Part Two: + Media Types + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Abstract + + STD 11, RFC 822 defines a message representation protocol specifying + considerable detail about US-ASCII message headers, but which leaves + the message content, or message body, as flat US-ASCII text. This + set of documents, collectively called the Multipurpose Internet Mail + Extensions, or MIME, redefines the format of messages to allow for + + (1) textual message bodies in character sets other than + US-ASCII, + + (2) an extensible set of different formats for non-textual + message bodies, + + (3) multi-part message bodies, and + + (4) textual header information in character sets other than + US-ASCII. + + These documents are based on earlier work documented in RFC 934, STD + 11, and RFC 1049, but extends and revises them. Because RFC 822 said + so little about message bodies, these documents are largely + orthogonal to (rather than a revision of) RFC 822. + + The initial document in this set, RFC 2045, specifies the various + headers used to describe the structure of MIME messages. This second + document defines the general structure of the MIME media typing + system and defines an initial set of media types. The third document, + RFC 2047, describes extensions to RFC 822 to allow non-US-ASCII text + + + +Freed & Borenstein Standards Track [Page 1] + +RFC 2046 Media Types November 1996 + + + data in Internet mail header fields. The fourth document, RFC 2048, + specifies various IANA registration procedures for MIME-related + facilities. The fifth and final document, RFC 2049, describes MIME + conformance criteria as well as providing some illustrative examples + of MIME message formats, acknowledgements, and the bibliography. + + These documents are revisions of RFCs 1521 and 1522, which themselves + were revisions of RFCs 1341 and 1342. An appendix in RFC 2049 + describes differences and changes from previous versions. + +Table of Contents + + 1. Introduction ......................................... 3 + 2. Definition of a Top-Level Media Type ................. 4 + 3. Overview Of The Initial Top-Level Media Types ........ 4 + 4. Discrete Media Type Values ........................... 6 + 4.1 Text Media Type ..................................... 6 + 4.1.1 Representation of Line Breaks ..................... 7 + 4.1.2 Charset Parameter ................................. 7 + 4.1.3 Plain Subtype ..................................... 11 + 4.1.4 Unrecognized Subtypes ............................. 11 + 4.2 Image Media Type .................................... 11 + 4.3 Audio Media Type .................................... 11 + 4.4 Video Media Type .................................... 12 + 4.5 Application Media Type .............................. 12 + 4.5.1 Octet-Stream Subtype .............................. 13 + 4.5.2 PostScript Subtype ................................ 14 + 4.5.3 Other Application Subtypes ........................ 17 + 5. Composite Media Type Values .......................... 17 + 5.1 Multipart Media Type ................................ 17 + 5.1.1 Common Syntax ..................................... 19 + 5.1.2 Handling Nested Messages and Multiparts ........... 24 + 5.1.3 Mixed Subtype ..................................... 24 + 5.1.4 Alternative Subtype ............................... 24 + 5.1.5 Digest Subtype .................................... 26 + 5.1.6 Parallel Subtype .................................. 27 + 5.1.7 Other Multipart Subtypes .......................... 28 + 5.2 Message Media Type .................................. 28 + 5.2.1 RFC822 Subtype .................................... 28 + 5.2.2 Partial Subtype ................................... 29 + 5.2.2.1 Message Fragmentation and Reassembly ............ 30 + 5.2.2.2 Fragmentation and Reassembly Example ............ 31 + 5.2.3 External-Body Subtype ............................. 33 + 5.2.4 Other Message Subtypes ............................ 40 + 6. Experimental Media Type Values ....................... 40 + 7. Summary .............................................. 41 + 8. Security Considerations .............................. 41 + 9. Authors' Addresses ................................... 42 + + + +Freed & Borenstein Standards Track [Page 2] + +RFC 2046 Media Types November 1996 + + + A. Collected Grammar .................................... 43 + +1. Introduction + + The first document in this set, RFC 2045, defines a number of header + fields, including Content-Type. The Content-Type field is used to + specify the nature of the data in the body of a MIME entity, by + giving media type and subtype identifiers, and by providing auxiliary + information that may be required for certain media types. After the + type and subtype names, the remainder of the header field is simply a + set of parameters, specified in an attribute/value notation. The + ordering of parameters is not significant. + + In general, the top-level media type is used to declare the general + type of data, while the subtype specifies a specific format for that + type of data. Thus, a media type of "image/xyz" is enough to tell a + user agent that the data is an image, even if the user agent has no + knowledge of the specific image format "xyz". Such information can + be used, for example, to decide whether or not to show a user the raw + data from an unrecognized subtype -- such an action might be + reasonable for unrecognized subtypes of "text", but not for + unrecognized subtypes of "image" or "audio". For this reason, + registered subtypes of "text", "image", "audio", and "video" should + not contain embedded information that is really of a different type. + Such compound formats should be represented using the "multipart" or + "application" types. + + Parameters are modifiers of the media subtype, and as such do not + fundamentally affect the nature of the content. The set of + meaningful parameters depends on the media type and subtype. Most + parameters are associated with a single specific subtype. However, a + given top-level media type may define parameters which are applicable + to any subtype of that type. Parameters may be required by their + defining media type or subtype or they may be optional. MIME + implementations must also ignore any parameters whose names they do + not recognize. + + MIME's Content-Type header field and media type mechanism has been + carefully designed to be extensible, and it is expected that the set + of media type/subtype pairs and their associated parameters will grow + significantly over time. Several other MIME facilities, such as + transfer encodings and "message/external-body" access types, are + likely to have new values defined over time. In order to ensure that + the set of such values is developed in an orderly, well-specified, + and public manner, MIME sets up a registration process which uses the + Internet Assigned Numbers Authority (IANA) as a central registry for + MIME's various areas of extensibility. The registration process for + these areas is described in a companion document, RFC 2048. + + + +Freed & Borenstein Standards Track [Page 3] + +RFC 2046 Media Types November 1996 + + + The initial seven standard top-level media type are defined and + described in the remainder of this document. + +2. Definition of a Top-Level Media Type + + The definition of a top-level media type consists of: + + (1) a name and a description of the type, including + criteria for whether a particular type would qualify + under that type, + + (2) the names and definitions of parameters, if any, which + are defined for all subtypes of that type (including + whether such parameters are required or optional), + + (3) how a user agent and/or gateway should handle unknown + subtypes of this type, + + (4) general considerations on gatewaying entities of this + top-level type, if any, and + + (5) any restrictions on content-transfer-encodings for + entities of this top-level type. + +3. Overview Of The Initial Top-Level Media Types + + The five discrete top-level media types are: + + (1) text -- textual information. The subtype "plain" in + particular indicates plain text containing no + formatting commands or directives of any sort. Plain + text is intended to be displayed "as-is". No special + software is required to get the full meaning of the + text, aside from support for the indicated character + set. Other subtypes are to be used for enriched text in + forms where application software may enhance the + appearance of the text, but such software must not be + required in order to get the general idea of the + content. Possible subtypes of "text" thus include any + word processor format that can be read without + resorting to software that understands the format. In + particular, formats that employ embeddded binary + formatting information are not considered directly + readable. A very simple and portable subtype, + "richtext", was defined in RFC 1341, with a further + revision in RFC 1896 under the name "enriched". + + + + + +Freed & Borenstein Standards Track [Page 4] + +RFC 2046 Media Types November 1996 + + + (2) image -- image data. "Image" requires a display device + (such as a graphical display, a graphics printer, or a + FAX machine) to view the information. An initial + subtype is defined for the widely-used image format + JPEG. . subtypes are defined for two widely-used image + formats, jpeg and gif. + + (3) audio -- audio data. "Audio" requires an audio output + device (such as a speaker or a telephone) to "display" + the contents. An initial subtype "basic" is defined in + this document. + + (4) video -- video data. "Video" requires the capability + to display moving images, typically including + specialized hardware and software. An initial subtype + "mpeg" is defined in this document. + + (5) application -- some other kind of data, typically + either uninterpreted binary data or information to be + processed by an application. The subtype "octet- + stream" is to be used in the case of uninterpreted + binary data, in which case the simplest recommended + action is to offer to write the information into a file + for the user. The "PostScript" subtype is also defined + for the transport of PostScript material. Other + expected uses for "application" include spreadsheets, + data for mail-based scheduling systems, and languages + for "active" (computational) messaging, and word + processing formats that are not directly readable. + Note that security considerations may exist for some + types of application data, most notably + "application/PostScript" and any form of active + messaging. These issues are discussed later in this + document. + + The two composite top-level media types are: + + (1) multipart -- data consisting of multiple entities of + independent data types. Four subtypes are initially + defined, including the basic "mixed" subtype specifying + a generic mixed set of parts, "alternative" for + representing the same data in multiple formats, + "parallel" for parts intended to be viewed + simultaneously, and "digest" for multipart entities in + which each part has a default type of "message/rfc822". + + + + + + +Freed & Borenstein Standards Track [Page 5] + +RFC 2046 Media Types November 1996 + + + (2) message -- an encapsulated message. A body of media + type "message" is itself all or a portion of some kind + of message object. Such objects may or may not in turn + contain other entities. The "rfc822" subtype is used + when the encapsulated content is itself an RFC 822 + message. The "partial" subtype is defined for partial + RFC 822 messages, to permit the fragmented transmission + of bodies that are thought to be too large to be passed + through transport facilities in one piece. Another + subtype, "external-body", is defined for specifying + large bodies by reference to an external data source. + + It should be noted that the list of media type values given here may + be augmented in time, via the mechanisms described above, and that + the set of subtypes is expected to grow substantially. + +4. Discrete Media Type Values + + Five of the seven initial media type values refer to discrete bodies. + The content of these types must be handled by non-MIME mechanisms; + they are opaque to MIME processors. + +4.1. Text Media Type + + The "text" media type is intended for sending material which is + principally textual in form. A "charset" parameter may be used to + indicate the character set of the body text for "text" subtypes, + notably including the subtype "text/plain", which is a generic + subtype for plain text. Plain text does not provide for or allow + formatting commands, font attribute specifications, processing + instructions, interpretation directives, or content markup. Plain + text is seen simply as a linear sequence of characters, possibly + interrupted by line breaks or page breaks. Plain text may allow the + stacking of several characters in the same position in the text. + Plain text in scripts like Arabic and Hebrew may also include + facilitites that allow the arbitrary mixing of text segments with + opposite writing directions. + + Beyond plain text, there are many formats for representing what might + be known as "rich text". An interesting characteristic of many such + representations is that they are to some extent readable even without + the software that interprets them. It is useful, then, to + distinguish them, at the highest level, from such unreadable data as + images, audio, or text represented in an unreadable form. In the + absence of appropriate interpretation software, it is reasonable to + show subtypes of "text" to the user, while it is not reasonable to do + so with most nontextual data. Such formatted textual data should be + represented using subtypes of "text". + + + +Freed & Borenstein Standards Track [Page 6] + +RFC 2046 Media Types November 1996 + + +4.1.1. Representation of Line Breaks + + The canonical form of any MIME "text" subtype MUST always represent a + line break as a CRLF sequence. Similarly, any occurrence of CRLF in + MIME "text" MUST represent a line break. Use of CR and LF outside of + line break sequences is also forbidden. + + This rule applies regardless of format or character set or sets + involved. + + NOTE: The proper interpretation of line breaks when a body is + displayed depends on the media type. In particular, while it is + appropriate to treat a line break as a transition to a new line when + displaying a "text/plain" body, this treatment is actually incorrect + for other subtypes of "text" like "text/enriched" [RFC-1896]. + Similarly, whether or not line breaks should be added during display + operations is also a function of the media type. It should not be + necessary to add any line breaks to display "text/plain" correctly, + whereas proper display of "text/enriched" requires the appropriate + addition of line breaks. + + NOTE: Some protocols defines a maximum line length. E.g. SMTP [RFC- + 821] allows a maximum of 998 octets before the next CRLF sequence. + To be transported by such protocols, data which includes too long + segments without CRLF sequences must be encoded with a suitable + content-transfer-encoding. + +4.1.2. Charset Parameter + + A critical parameter that may be specified in the Content-Type field + for "text/plain" data is the character set. This is specified with a + "charset" parameter, as in: + + Content-type: text/plain; charset=iso-8859-1 + + Unlike some other parameter values, the values of the charset + parameter are NOT case sensitive. The default character set, which + must be assumed in the absence of a charset parameter, is US-ASCII. + + The specification for any future subtypes of "text" must specify + whether or not they will also utilize a "charset" parameter, and may + possibly restrict its values as well. For other subtypes of "text" + than "text/plain", the semantics of the "charset" parameter should be + defined to be identical to those specified here for "text/plain", + i.e., the body consists entirely of characters in the given charset. + In particular, definers of future "text" subtypes should pay close + attention to the implications of multioctet character sets for their + subtype definitions. + + + +Freed & Borenstein Standards Track [Page 7] + +RFC 2046 Media Types November 1996 + + + The charset parameter for subtypes of "text" gives a name of a + character set, as "character set" is defined in RFC 2045. The rules + regarding line breaks detailed in the previous section must also be + observed -- a character set whose definition does not conform to + these rules cannot be used in a MIME "text" subtype. + + An initial list of predefined character set names can be found at the + end of this section. Additional character sets may be registered + with IANA. + + Other media types than subtypes of "text" might choose to employ the + charset parameter as defined here, but with the CRLF/line break + restriction removed. Therefore, all character sets that conform to + the general definition of "character set" in RFC 2045 can be + registered for MIME use. + + Note that if the specified character set includes 8-bit characters + and such characters are used in the body, a Content-Transfer-Encoding + header field and a corresponding encoding on the data are required in + order to transmit the body via some mail transfer protocols, such as + SMTP [RFC-821]. + + The default character set, US-ASCII, has been the subject of some + confusion and ambiguity in the past. Not only were there some + ambiguities in the definition, there have been wide variations in + practice. In order to eliminate such ambiguity and variations in the + future, it is strongly recommended that new user agents explicitly + specify a character set as a media type parameter in the Content-Type + header field. "US-ASCII" does not indicate an arbitrary 7-bit + character set, but specifies that all octets in the body must be + interpreted as characters according to the US-ASCII character set. + National and application-oriented versions of ISO 646 [ISO-646] are + usually NOT identical to US-ASCII, and in that case their use in + Internet mail is explicitly discouraged. The omission of the ISO 646 + character set from this document is deliberate in this regard. The + character set name of "US-ASCII" explicitly refers to the character + set defined in ANSI X3.4-1986 [US- ASCII]. The new international + reference version (IRV) of the 1991 edition of ISO 646 is identical + to US-ASCII. The character set name "ASCII" is reserved and must not + be used for any purpose. + + NOTE: RFC 821 explicitly specifies "ASCII", and references an earlier + version of the American Standard. Insofar as one of the purposes of + specifying a media type and character set is to permit the receiver + to unambiguously determine how the sender intended the coded message + to be interpreted, assuming anything other than "strict ASCII" as the + default would risk unintentional and incompatible changes to the + semantics of messages now being transmitted. This also implies that + + + +Freed & Borenstein Standards Track [Page 8] + +RFC 2046 Media Types November 1996 + + + messages containing characters coded according to other versions of + ISO 646 than US-ASCII and the 1991 IRV, or using code-switching + procedures (e.g., those of ISO 2022), as well as 8bit or multiple + octet character encodings MUST use an appropriate character set + specification to be consistent with MIME. + + The complete US-ASCII character set is listed in ANSI X3.4- 1986. + Note that the control characters including DEL (0-31, 127) have no + defined meaning in apart from the combination CRLF (US-ASCII values + 13 and 10) indicating a new line. Two of the characters have de + facto meanings in wide use: FF (12) often means "start subsequent + text on the beginning of a new page"; and TAB or HT (9) often (though + not always) means "move the cursor to the next available column after + the current position where the column number is a multiple of 8 + (counting the first column as column 0)." Aside from these + conventions, any use of the control characters or DEL in a body must + either occur + + (1) because a subtype of text other than "plain" + specifically assigns some additional meaning, or + + (2) within the context of a private agreement between the + sender and recipient. Such private agreements are + discouraged and should be replaced by the other + capabilities of this document. + + NOTE: An enormous proliferation of character sets exist beyond US- + ASCII. A large number of partially or totally overlapping character + sets is NOT a good thing. A SINGLE character set that can be used + universally for representing all of the world's languages in Internet + mail would be preferrable. Unfortunately, existing practice in + several communities seems to point to the continued use of multiple + character sets in the near future. A small number of standard + character sets are, therefore, defined for Internet use in this + document. + + The defined charset values are: + + (1) US-ASCII -- as defined in ANSI X3.4-1986 [US-ASCII]. + + (2) ISO-8859-X -- where "X" is to be replaced, as + necessary, for the parts of ISO-8859 [ISO-8859]. Note + that the ISO 646 character sets have deliberately been + omitted in favor of their 8859 replacements, which are + the designated character sets for Internet mail. As of + the publication of this document, the legitimate values + for "X" are the digits 1 through 10. + + + + +Freed & Borenstein Standards Track [Page 9] + +RFC 2046 Media Types November 1996 + + + Characters in the range 128-159 has no assigned meaning in ISO-8859- + X. Characters with values below 128 in ISO-8859-X have the same + assigned meaning as they do in US-ASCII. + + Part 6 of ISO 8859 (Latin/Arabic alphabet) and part 8 (Latin/Hebrew + alphabet) includes both characters for which the normal writing + direction is right to left and characters for which it is left to + right, but do not define a canonical ordering method for representing + bi-directional text. The charset values "ISO-8859-6" and "ISO-8859- + 8", however, specify that the visual method is used [RFC-1556]. + + All of these character sets are used as pure 7bit or 8bit sets + without any shift or escape functions. The meaning of shift and + escape sequences in these character sets is not defined. + + The character sets specified above are the ones that were relatively + uncontroversial during the drafting of MIME. This document does not + endorse the use of any particular character set other than US-ASCII, + and recognizes that the future evolution of world character sets + remains unclear. + + Note that the character set used, if anything other than US- ASCII, + must always be explicitly specified in the Content-Type field. + + No character set name other than those defined above may be used in + Internet mail without the publication of a formal specification and + its registration with IANA, or by private agreement, in which case + the character set name must begin with "X-". + + Implementors are discouraged from defining new character sets unless + absolutely necessary. + + The "charset" parameter has been defined primarily for the purpose of + textual data, and is described in this section for that reason. + However, it is conceivable that non-textual data might also wish to + specify a charset value for some purpose, in which case the same + syntax and values should be used. + + In general, composition software should always use the "lowest common + denominator" character set possible. For example, if a body contains + only US-ASCII characters, it SHOULD be marked as being in the US- + ASCII character set, not ISO-8859-1, which, like all the ISO-8859 + family of character sets, is a superset of US-ASCII. More generally, + if a widely-used character set is a subset of another character set, + and a body contains only characters in the widely-used subset, it + should be labelled as being in that subset. This will increase the + chances that the recipient will be able to view the resulting entity + correctly. + + + +Freed & Borenstein Standards Track [Page 10] + +RFC 2046 Media Types November 1996 + + +4.1.3. Plain Subtype + + The simplest and most important subtype of "text" is "plain". This + indicates plain text that does not contain any formatting commands or + directives. Plain text is intended to be displayed "as-is", that is, + no interpretation of embedded formatting commands, font attribute + specifications, processing instructions, interpretation directives, + or content markup should be necessary for proper display. The + default media type of "text/plain; charset=us-ascii" for Internet + mail describes existing Internet practice. That is, it is the type + of body defined by RFC 822. + + No other "text" subtype is defined by this document. + +4.1.4. Unrecognized Subtypes + + Unrecognized subtypes of "text" should be treated as subtype "plain" + as long as the MIME implementation knows how to handle the charset. + Unrecognized subtypes which also specify an unrecognized charset + should be treated as "application/octet- stream". + +4.2. Image Media Type + + A media type of "image" indicates that the body contains an image. + The subtype names the specific image format. These names are not + case sensitive. An initial subtype is "jpeg" for the JPEG format + using JFIF encoding [JPEG]. + + The list of "image" subtypes given here is neither exclusive nor + exhaustive, and is expected to grow as more types are registered with + IANA, as described in RFC 2048. + + Unrecognized subtypes of "image" should at a miniumum be treated as + "application/octet-stream". Implementations may optionally elect to + pass subtypes of "image" that they do not specifically recognize to a + secure and robust general-purpose image viewing application, if such + an application is available. + + NOTE: Using of a generic-purpose image viewing application this way + inherits the security problems of the most dangerous type supported + by the application. + +4.3. Audio Media Type + + A media type of "audio" indicates that the body contains audio data. + Although there is not yet a consensus on an "ideal" audio format for + use with computers, there is a pressing need for a format capable of + providing interoperable behavior. + + + +Freed & Borenstein Standards Track [Page 11] + +RFC 2046 Media Types November 1996 + + + The initial subtype of "basic" is specified to meet this requirement + by providing an absolutely minimal lowest common denominator audio + format. It is expected that richer formats for higher quality and/or + lower bandwidth audio will be defined by a later document. + + The content of the "audio/basic" subtype is single channel audio + encoded using 8bit ISDN mu-law [PCM] at a sample rate of 8000 Hz. + + Unrecognized subtypes of "audio" should at a miniumum be treated as + "application/octet-stream". Implementations may optionally elect to + pass subtypes of "audio" that they do not specifically recognize to a + robust general-purpose audio playing application, if such an + application is available. + +4.4. Video Media Type + + A media type of "video" indicates that the body contains a time- + varying-picture image, possibly with color and coordinated sound. + The term 'video' is used in its most generic sense, rather than with + reference to any particular technology or format, and is not meant to + preclude subtypes such as animated drawings encoded compactly. The + subtype "mpeg" refers to video coded according to the MPEG standard + [MPEG]. + + Note that although in general this document strongly discourages the + mixing of multiple media in a single body, it is recognized that many + so-called video formats include a representation for synchronized + audio, and this is explicitly permitted for subtypes of "video". + + Unrecognized subtypes of "video" should at a minumum be treated as + "application/octet-stream". Implementations may optionally elect to + pass subtypes of "video" that they do not specifically recognize to a + robust general-purpose video display application, if such an + application is available. + +4.5. Application Media Type + + The "application" media type is to be used for discrete data which do + not fit in any of the other categories, and particularly for data to + be processed by some type of application program. This is + information which must be processed by an application before it is + viewable or usable by a user. Expected uses for the "application" + media type include file transfer, spreadsheets, data for mail-based + scheduling systems, and languages for "active" (computational) + material. (The latter, in particular, can pose security problems + which must be understood by implementors, and are considered in + detail in the discussion of the "application/PostScript" media type.) + + + + +Freed & Borenstein Standards Track [Page 12] + +RFC 2046 Media Types November 1996 + + + For example, a meeting scheduler might define a standard + representation for information about proposed meeting dates. An + intelligent user agent would use this information to conduct a dialog + with the user, and might then send additional material based on that + dialog. More generally, there have been several "active" messaging + languages developed in which programs in a suitably specialized + language are transported to a remote location and automatically run + in the recipient's environment. + + Such applications may be defined as subtypes of the "application" + media type. This document defines two subtypes: + + octet-stream, and PostScript. + + The subtype of "application" will often be either the name or include + part of the name of the application for which the data are intended. + This does not mean, however, that any application program name may be + used freely as a subtype of "application". + +4.5.1. Octet-Stream Subtype + + The "octet-stream" subtype is used to indicate that a body contains + arbitrary binary data. The set of currently defined parameters is: + + (1) TYPE -- the general type or category of binary data. + This is intended as information for the human recipient + rather than for any automatic processing. + + (2) PADDING -- the number of bits of padding that were + appended to the bit-stream comprising the actual + contents to produce the enclosed 8bit byte-oriented + data. This is useful for enclosing a bit-stream in a + body when the total number of bits is not a multiple of + 8. + + Both of these parameters are optional. + + An additional parameter, "CONVERSIONS", was defined in RFC 1341 but + has since been removed. RFC 1341 also defined the use of a "NAME" + parameter which gave a suggested file name to be used if the data + were to be written to a file. This has been deprecated in + anticipation of a separate Content-Disposition header field, to be + defined in a subsequent RFC. + + The recommended action for an implementation that receives an + "application/octet-stream" entity is to simply offer to put the data + in a file, with any Content-Transfer-Encoding undone, or perhaps to + use it as input to a user-specified process. + + + +Freed & Borenstein Standards Track [Page 13] + +RFC 2046 Media Types November 1996 + + + To reduce the danger of transmitting rogue programs, it is strongly + recommended that implementations NOT implement a path-search + mechanism whereby an arbitrary program named in the Content-Type + parameter (e.g., an "interpreter=" parameter) is found and executed + using the message body as input. + +4.5.2. PostScript Subtype + + A media type of "application/postscript" indicates a PostScript + program. Currently two variants of the PostScript language are + allowed; the original level 1 variant is described in [POSTSCRIPT] + and the more recent level 2 variant is described in [POSTSCRIPT2]. + + PostScript is a registered trademark of Adobe Systems, Inc. Use of + the MIME media type "application/postscript" implies recognition of + that trademark and all the rights it entails. + + The PostScript language definition provides facilities for internal + labelling of the specific language features a given program uses. + This labelling, called the PostScript document structuring + conventions, or DSC, is very general and provides substantially more + information than just the language level. The use of document + structuring conventions, while not required, is strongly recommended + as an aid to interoperability. Documents which lack proper + structuring conventions cannot be tested to see whether or not they + will work in a given environment. As such, some systems may assume + the worst and refuse to process unstructured documents. + + The execution of general-purpose PostScript interpreters entails + serious security risks, and implementors are discouraged from simply + sending PostScript bodies to "off- the-shelf" interpreters. While it + is usually safe to send PostScript to a printer, where the potential + for harm is greatly constrained by typical printer environments, + implementors should consider all of the following before they add + interactive display of PostScript bodies to their MIME readers. + + The remainder of this section outlines some, though probably not all, + of the possible problems with the transport of PostScript entities. + + (1) Dangerous operations in the PostScript language + include, but may not be limited to, the PostScript + operators "deletefile", "renamefile", "filenameforall", + and "file". "File" is only dangerous when applied to + something other than standard input or output. + Implementations may also define additional nonstandard + file operators; these may also pose a threat to + security. "Filenameforall", the wildcard file search + operator, may appear at first glance to be harmless. + + + +Freed & Borenstein Standards Track [Page 14] + +RFC 2046 Media Types November 1996 + + + Note, however, that this operator has the potential to + reveal information about what files the recipient has + access to, and this information may itself be + sensitive. Message senders should avoid the use of + potentially dangerous file operators, since these + operators are quite likely to be unavailable in secure + PostScript implementations. Message receiving and + displaying software should either completely disable + all potentially dangerous file operators or take + special care not to delegate any special authority to + their operation. These operators should be viewed as + being done by an outside agency when interpreting + PostScript documents. Such disabling and/or checking + should be done completely outside of the reach of the + PostScript language itself; care should be taken to + insure that no method exists for re-enabling full- + function versions of these operators. + + (2) The PostScript language provides facilities for exiting + the normal interpreter, or server, loop. Changes made + in this "outer" environment are customarily retained + across documents, and may in some cases be retained + semipermanently in nonvolatile memory. The operators + associated with exiting the interpreter loop have the + potential to interfere with subsequent document + processing. As such, their unrestrained use + constitutes a threat of service denial. PostScript + operators that exit the interpreter loop include, but + may not be limited to, the exitserver and startjob + operators. Message sending software should not + generate PostScript that depends on exiting the + interpreter loop to operate, since the ability to exit + will probably be unavailable in secure PostScript + implementations. Message receiving and displaying + software should completely disable the ability to make + retained changes to the PostScript environment by + eliminating or disabling the "startjob" and + "exitserver" operations. If these operations cannot be + eliminated or completely disabled the password + associated with them should at least be set to a hard- + to-guess value. + + (3) PostScript provides operators for setting system-wide + and device-specific parameters. These parameter + settings may be retained across jobs and may + potentially pose a threat to the correct operation of + the interpreter. The PostScript operators that set + system and device parameters include, but may not be + + + +Freed & Borenstein Standards Track [Page 15] + +RFC 2046 Media Types November 1996 + + + limited to, the "setsystemparams" and "setdevparams" + operators. Message sending software should not + generate PostScript that depends on the setting of + system or device parameters to operate correctly. The + ability to set these parameters will probably be + unavailable in secure PostScript implementations. + Message receiving and displaying software should + disable the ability to change system and device + parameters. If these operators cannot be completely + disabled the password associated with them should at + least be set to a hard-to-guess value. + + (4) Some PostScript implementations provide nonstandard + facilities for the direct loading and execution of + machine code. Such facilities are quite obviously open + to substantial abuse. Message sending software should + not make use of such features. Besides being totally + hardware-specific, they are also likely to be + unavailable in secure implementations of PostScript. + Message receiving and displaying software should not + allow such operators to be used if they exist. + + (5) PostScript is an extensible language, and many, if not + most, implementations of it provide a number of their + own extensions. This document does not deal with such + extensions explicitly since they constitute an unknown + factor. Message sending software should not make use + of nonstandard extensions; they are likely to be + missing from some implementations. Message receiving + and displaying software should make sure that any + nonstandard PostScript operators are secure and don't + present any kind of threat. + + (6) It is possible to write PostScript that consumes huge + amounts of various system resources. It is also + possible to write PostScript programs that loop + indefinitely. Both types of programs have the + potential to cause damage if sent to unsuspecting + recipients. Message-sending software should avoid the + construction and dissemination of such programs, which + is antisocial. Message receiving and displaying + software should provide appropriate mechanisms to abort + processing after a reasonable amount of time has + elapsed. In addition, PostScript interpreters should be + limited to the consumption of only a reasonable amount + of any given system resource. + + + + + +Freed & Borenstein Standards Track [Page 16] + +RFC 2046 Media Types November 1996 + + + (7) It is possible to include raw binary information inside + PostScript in various forms. This is not recommended + for use in Internet mail, both because it is not + supported by all PostScript interpreters and because it + significantly complicates the use of a MIME Content- + Transfer-Encoding. (Without such binary, PostScript + may typically be viewed as line-oriented data. The + treatment of CRLF sequences becomes extremely + problematic if binary and line-oriented data are mixed + in a single Postscript data stream.) + + (8) Finally, bugs may exist in some PostScript interpreters + which could possibly be exploited to gain unauthorized + access to a recipient's system. Apart from noting this + possibility, there is no specific action to take to + prevent this, apart from the timely correction of such + bugs if any are found. + +4.5.3. Other Application Subtypes + + It is expected that many other subtypes of "application" will be + defined in the future. MIME implementations must at a minimum treat + any unrecognized subtypes as being equivalent to "application/octet- + stream". + +5. Composite Media Type Values + + The remaining two of the seven initial Content-Type values refer to + composite entities. Composite entities are handled using MIME + mechanisms -- a MIME processor typically handles the body directly. + +5.1. Multipart Media Type + + In the case of multipart entities, in which one or more different + sets of data are combined in a single body, a "multipart" media type + field must appear in the entity's header. The body must then contain + one or more body parts, each preceded by a boundary delimiter line, + and the last one followed by a closing boundary delimiter line. + After its boundary delimiter line, each body part then consists of a + header area, a blank line, and a body area. Thus a body part is + similar to an RFC 822 message in syntax, but different in meaning. + + A body part is an entity and hence is NOT to be interpreted as + actually being an RFC 822 message. To begin with, NO header fields + are actually required in body parts. A body part that starts with a + blank line, therefore, is allowed and is a body part for which all + default values are to be assumed. In such a case, the absence of a + Content-Type header usually indicates that the corresponding body has + + + +Freed & Borenstein Standards Track [Page 17] + +RFC 2046 Media Types November 1996 + + + a content-type of "text/plain; charset=US-ASCII". + + The only header fields that have defined meaning for body parts are + those the names of which begin with "Content-". All other header + fields may be ignored in body parts. Although they should generally + be retained if at all possible, they may be discarded by gateways if + necessary. Such other fields are permitted to appear in body parts + but must not be depended on. "X-" fields may be created for + experimental or private purposes, with the recognition that the + information they contain may be lost at some gateways. + + NOTE: The distinction between an RFC 822 message and a body part is + subtle, but important. A gateway between Internet and X.400 mail, + for example, must be able to tell the difference between a body part + that contains an image and a body part that contains an encapsulated + message, the body of which is a JPEG image. In order to represent + the latter, the body part must have "Content-Type: message/rfc822", + and its body (after the blank line) must be the encapsulated message, + with its own "Content-Type: image/jpeg" header field. The use of + similar syntax facilitates the conversion of messages to body parts, + and vice versa, but the distinction between the two must be + understood by implementors. (For the special case in which parts + actually are messages, a "digest" subtype is also defined.) + + As stated previously, each body part is preceded by a boundary + delimiter line that contains the boundary delimiter. The boundary + delimiter MUST NOT appear inside any of the encapsulated parts, on a + line by itself or as the prefix of any line. This implies that it is + crucial that the composing agent be able to choose and specify a + unique boundary parameter value that does not contain the boundary + parameter value of an enclosing multipart as a prefix. + + All present and future subtypes of the "multipart" type must use an + identical syntax. Subtypes may differ in their semantics, and may + impose additional restrictions on syntax, but must conform to the + required syntax for the "multipart" type. This requirement ensures + that all conformant user agents will at least be able to recognize + and separate the parts of any multipart entity, even those of an + unrecognized subtype. + + As stated in the definition of the Content-Transfer-Encoding field + [RFC 2045], no encoding other than "7bit", "8bit", or "binary" is + permitted for entities of type "multipart". The "multipart" boundary + delimiters and header fields are always represented as 7bit US-ASCII + in any case (though the header fields may encode non-US-ASCII header + text as per RFC 2047) and data within the body parts can be encoded + on a part-by-part basis, with Content-Transfer-Encoding fields for + each appropriate body part. + + + +Freed & Borenstein Standards Track [Page 18] + +RFC 2046 Media Types November 1996 + + +5.1.1. Common Syntax + + This section defines a common syntax for subtypes of "multipart". + All subtypes of "multipart" must use this syntax. A simple example + of a multipart message also appears in this section. An example of a + more complex multipart message is given in RFC 2049. + + The Content-Type field for multipart entities requires one parameter, + "boundary". The boundary delimiter line is then defined as a line + consisting entirely of two hyphen characters ("-", decimal value 45) + followed by the boundary parameter value from the Content-Type header + field, optional linear whitespace, and a terminating CRLF. + + NOTE: The hyphens are for rough compatibility with the earlier RFC + 934 method of message encapsulation, and for ease of searching for + the boundaries in some implementations. However, it should be noted + that multipart messages are NOT completely compatible with RFC 934 + encapsulations; in particular, they do not obey RFC 934 quoting + conventions for embedded lines that begin with hyphens. This + mechanism was chosen over the RFC 934 mechanism because the latter + causes lines to grow with each level of quoting. The combination of + this growth with the fact that SMTP implementations sometimes wrap + long lines made the RFC 934 mechanism unsuitable for use in the event + that deeply-nested multipart structuring is ever desired. + + WARNING TO IMPLEMENTORS: The grammar for parameters on the Content- + type field is such that it is often necessary to enclose the boundary + parameter values in quotes on the Content-type line. This is not + always necessary, but never hurts. Implementors should be sure to + study the grammar carefully in order to avoid producing invalid + Content-type fields. Thus, a typical "multipart" Content-Type header + field might look like this: + + Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08j34c0p + + But the following is not valid: + + Content-Type: multipart/mixed; boundary=gc0pJq0M:08jU534c0p + + (because of the colon) and must instead be represented as + + Content-Type: multipart/mixed; boundary="gc0pJq0M:08jU534c0p" + + This Content-Type value indicates that the content consists of one or + more parts, each with a structure that is syntactically identical to + an RFC 822 message, except that the header area is allowed to be + completely empty, and that the parts are each preceded by the line + + + + +Freed & Borenstein Standards Track [Page 19] + +RFC 2046 Media Types November 1996 + + + --gc0pJq0M:08jU534c0p + + The boundary delimiter MUST occur at the beginning of a line, i.e., + following a CRLF, and the initial CRLF is considered to be attached + to the boundary delimiter line rather than part of the preceding + part. The boundary may be followed by zero or more characters of + linear whitespace. It is then terminated by either another CRLF and + the header fields for the next part, or by two CRLFs, in which case + there are no header fields for the next part. If no Content-Type + field is present it is assumed to be "message/rfc822" in a + "multipart/digest" and "text/plain" otherwise. + + NOTE: The CRLF preceding the boundary delimiter line is conceptually + attached to the boundary so that it is possible to have a part that + does not end with a CRLF (line break). Body parts that must be + considered to end with line breaks, therefore, must have two CRLFs + preceding the boundary delimiter line, the first of which is part of + the preceding body part, and the second of which is part of the + encapsulation boundary. + + Boundary delimiters must not appear within the encapsulated material, + and must be no longer than 70 characters, not counting the two + leading hyphens. + + The boundary delimiter line following the last body part is a + distinguished delimiter that indicates that no further body parts + will follow. Such a delimiter line is identical to the previous + delimiter lines, with the addition of two more hyphens after the + boundary parameter value. + + --gc0pJq0M:08jU534c0p-- + + NOTE TO IMPLEMENTORS: Boundary string comparisons must compare the + boundary value with the beginning of each candidate line. An exact + match of the entire candidate line is not required; it is sufficient + that the boundary appear in its entirety following the CRLF. + + There appears to be room for additional information prior to the + first boundary delimiter line and following the final boundary + delimiter line. These areas should generally be left blank, and + implementations must ignore anything that appears before the first + boundary delimiter line or after the last one. + + NOTE: These "preamble" and "epilogue" areas are generally not used + because of the lack of proper typing of these parts and the lack of + clear semantics for handling these areas at gateways, particularly + X.400 gateways. However, rather than leaving the preamble area + blank, many MIME implementations have found this to be a convenient + + + +Freed & Borenstein Standards Track [Page 20] + +RFC 2046 Media Types November 1996 + + + place to insert an explanatory note for recipients who read the + message with pre-MIME software, since such notes will be ignored by + MIME-compliant software. + + NOTE: Because boundary delimiters must not appear in the body parts + being encapsulated, a user agent must exercise care to choose a + unique boundary parameter value. The boundary parameter value in the + example above could have been the result of an algorithm designed to + produce boundary delimiters with a very low probability of already + existing in the data to be encapsulated without having to prescan the + data. Alternate algorithms might result in more "readable" boundary + delimiters for a recipient with an old user agent, but would require + more attention to the possibility that the boundary delimiter might + appear at the beginning of some line in the encapsulated part. The + simplest boundary delimiter line possible is something like "---", + with a closing boundary delimiter line of "-----". + + As a very simple example, the following multipart message has two + parts, both of them plain text, one of them explicitly typed and one + of them implicitly typed: + + From: Nathaniel Borenstein <nsb@bellcore.com> + To: Ned Freed <ned@innosoft.com> + Date: Sun, 21 Mar 1993 23:56:48 -0800 (PST) + Subject: Sample message + MIME-Version: 1.0 + Content-type: multipart/mixed; boundary="simple boundary" + + This is the preamble. It is to be ignored, though it + is a handy place for composition agents to include an + explanatory note to non-MIME conformant readers. + + --simple boundary + + This is implicitly typed plain US-ASCII text. + It does NOT end with a linebreak. + --simple boundary + Content-type: text/plain; charset=us-ascii + + This is explicitly typed plain US-ASCII text. + It DOES end with a linebreak. + + --simple boundary-- + + This is the epilogue. It is also to be ignored. + + + + + + +Freed & Borenstein Standards Track [Page 21] + +RFC 2046 Media Types November 1996 + + + The use of a media type of "multipart" in a body part within another + "multipart" entity is explicitly allowed. In such cases, for obvious + reasons, care must be taken to ensure that each nested "multipart" + entity uses a different boundary delimiter. See RFC 2049 for an + example of nested "multipart" entities. + + The use of the "multipart" media type with only a single body part + may be useful in certain contexts, and is explicitly permitted. + + NOTE: Experience has shown that a "multipart" media type with a + single body part is useful for sending non-text media types. It has + the advantage of providing the preamble as a place to include + decoding instructions. In addition, a number of SMTP gateways move + or remove the MIME headers, and a clever MIME decoder can take a good + guess at multipart boundaries even in the absence of the Content-Type + header and thereby successfully decode the message. + + The only mandatory global parameter for the "multipart" media type is + the boundary parameter, which consists of 1 to 70 characters from a + set of characters known to be very robust through mail gateways, and + NOT ending with white space. (If a boundary delimiter line appears to + end with white space, the white space must be presumed to have been + added by a gateway, and must be deleted.) It is formally specified + by the following BNF: + + boundary := 0*69<bchars> bcharsnospace + + bchars := bcharsnospace / " " + + bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / + "+" / "_" / "," / "-" / "." / + "/" / ":" / "=" / "?" + + Overall, the body of a "multipart" entity may be specified as + follows: + + dash-boundary := "--" boundary + ; boundary taken from the value of + ; boundary parameter of the + ; Content-Type field. + + multipart-body := [preamble CRLF] + dash-boundary transport-padding CRLF + body-part *encapsulation + close-delimiter transport-padding + [CRLF epilogue] + + + + + +Freed & Borenstein Standards Track [Page 22] + +RFC 2046 Media Types November 1996 + + + transport-padding := *LWSP-char + ; Composers MUST NOT generate + ; non-zero length transport + ; padding, but receivers MUST + ; be able to handle padding + ; added by message transports. + + encapsulation := delimiter transport-padding + CRLF body-part + + delimiter := CRLF dash-boundary + + close-delimiter := delimiter "--" + + preamble := discard-text + + epilogue := discard-text + + discard-text := *(*text CRLF) *text + ; May be ignored or discarded. + + body-part := MIME-part-headers [CRLF *OCTET] + ; Lines in a body-part must not start + ; with the specified dash-boundary and + ; the delimiter must not appear anywhere + ; in the body part. Note that the + ; semantics of a body-part differ from + ; the semantics of a message, as + ; described in the text. + + OCTET := <any 0-255 octet value> + + IMPORTANT: The free insertion of linear-white-space and RFC 822 + comments between the elements shown in this BNF is NOT allowed since + this BNF does not specify a structured header field. + + NOTE: In certain transport enclaves, RFC 822 restrictions such as + the one that limits bodies to printable US-ASCII characters may not + be in force. (That is, the transport domains may exist that resemble + standard Internet mail transport as specified in RFC 821 and assumed + by RFC 822, but without certain restrictions.) The relaxation of + these restrictions should be construed as locally extending the + definition of bodies, for example to include octets outside of the + US-ASCII range, as long as these extensions are supported by the + transport and adequately documented in the Content- Transfer-Encoding + header field. However, in no event are headers (either message + headers or body part headers) allowed to contain anything other than + US-ASCII characters. + + + +Freed & Borenstein Standards Track [Page 23] + +RFC 2046 Media Types November 1996 + + + NOTE: Conspicuously missing from the "multipart" type is a notion of + structured, related body parts. It is recommended that those wishing + to provide more structured or integrated multipart messaging + facilities should define subtypes of multipart that are syntactically + identical but define relationships between the various parts. For + example, subtypes of multipart could be defined that include a + distinguished part which in turn is used to specify the relationships + between the other parts, probably referring to them by their + Content-ID field. Old implementations will not recognize the new + subtype if this approach is used, but will treat it as + multipart/mixed and will thus be able to show the user the parts that + are recognized. + +5.1.2. Handling Nested Messages and Multiparts + + The "message/rfc822" subtype defined in a subsequent section of this + document has no terminating condition other than running out of data. + Similarly, an improperly truncated "multipart" entity may not have + any terminating boundary marker, and can turn up operationally due to + mail system malfunctions. + + It is essential that such entities be handled correctly when they are + themselves imbedded inside of another "multipart" structure. MIME + implementations are therefore required to recognize outer level + boundary markers at ANY level of inner nesting. It is not sufficient + to only check for the next expected marker or other terminating + condition. + +5.1.3. Mixed Subtype + + The "mixed" subtype of "multipart" is intended for use when the body + parts are independent and need to be bundled in a particular order. + Any "multipart" subtypes that an implementation does not recognize + must be treated as being of subtype "mixed". + +5.1.4. Alternative Subtype + + The "multipart/alternative" type is syntactically identical to + "multipart/mixed", but the semantics are different. In particular, + each of the body parts is an "alternative" version of the same + information. + + Systems should recognize that the content of the various parts are + interchangeable. Systems should choose the "best" type based on the + local environment and references, in some cases even through user + interaction. As with "multipart/mixed", the order of body parts is + significant. In this case, the alternatives appear in an order of + increasing faithfulness to the original content. In general, the + + + +Freed & Borenstein Standards Track [Page 24] + +RFC 2046 Media Types November 1996 + + + best choice is the LAST part of a type supported by the recipient + system's local environment. + + "Multipart/alternative" may be used, for example, to send a message + in a fancy text format in such a way that it can easily be displayed + anywhere: + + From: Nathaniel Borenstein <nsb@bellcore.com> + To: Ned Freed <ned@innosoft.com> + Date: Mon, 22 Mar 1993 09:41:09 -0800 (PST) + Subject: Formatted text mail + MIME-Version: 1.0 + Content-Type: multipart/alternative; boundary=boundary42 + + --boundary42 + Content-Type: text/plain; charset=us-ascii + + ... plain text version of message goes here ... + + --boundary42 + Content-Type: text/enriched + + ... RFC 1896 text/enriched version of same message + goes here ... + + --boundary42 + Content-Type: application/x-whatever + + ... fanciest version of same message goes here ... + + --boundary42-- + + In this example, users whose mail systems understood the + "application/x-whatever" format would see only the fancy version, + while other users would see only the enriched or plain text version, + depending on the capabilities of their system. + + In general, user agents that compose "multipart/alternative" entities + must place the body parts in increasing order of preference, that is, + with the preferred format last. For fancy text, the sending user + agent should put the plainest format first and the richest format + last. Receiving user agents should pick and display the last format + they are capable of displaying. In the case where one of the + alternatives is itself of type "multipart" and contains unrecognized + sub-parts, the user agent may choose either to show that alternative, + an earlier alternative, or both. + + + + + +Freed & Borenstein Standards Track [Page 25] + +RFC 2046 Media Types November 1996 + + + NOTE: From an implementor's perspective, it might seem more sensible + to reverse this ordering, and have the plainest alternative last. + However, placing the plainest alternative first is the friendliest + possible option when "multipart/alternative" entities are viewed + using a non-MIME-conformant viewer. While this approach does impose + some burden on conformant MIME viewers, interoperability with older + mail readers was deemed to be more important in this case. + + It may be the case that some user agents, if they can recognize more + than one of the formats, will prefer to offer the user the choice of + which format to view. This makes sense, for example, if a message + includes both a nicely- formatted image version and an easily-edited + text version. What is most critical, however, is that the user not + automatically be shown multiple versions of the same data. Either + the user should be shown the last recognized version or should be + given the choice. + + THE SEMANTICS OF CONTENT-ID IN MULTIPART/ALTERNATIVE: Each part of a + "multipart/alternative" entity represents the same data, but the + mappings between the two are not necessarily without information + loss. For example, information is lost when translating ODA to + PostScript or plain text. It is recommended that each part should + have a different Content-ID value in the case where the information + content of the two parts is not identical. And when the information + content is identical -- for example, where several parts of type + "message/external-body" specify alternate ways to access the + identical data -- the same Content-ID field value should be used, to + optimize any caching mechanisms that might be present on the + recipient's end. However, the Content-ID values used by the parts + should NOT be the same Content-ID value that describes the + "multipart/alternative" as a whole, if there is any such Content-ID + field. That is, one Content-ID value will refer to the + "multipart/alternative" entity, while one or more other Content-ID + values will refer to the parts inside it. + +5.1.5. Digest Subtype + + This document defines a "digest" subtype of the "multipart" Content- + Type. This type is syntactically identical to "multipart/mixed", but + the semantics are different. In particular, in a digest, the default + Content-Type value for a body part is changed from "text/plain" to + "message/rfc822". This is done to allow a more readable digest + format that is largely compatible (except for the quoting convention) + with RFC 934. + + Note: Though it is possible to specify a Content-Type value for a + body part in a digest which is other than "message/rfc822", such as a + "text/plain" part containing a description of the material in the + + + +Freed & Borenstein Standards Track [Page 26] + +RFC 2046 Media Types November 1996 + + + digest, actually doing so is undesireble. The "multipart/digest" + Content-Type is intended to be used to send collections of messages. + If a "text/plain" part is needed, it should be included as a seperate + part of a "multipart/mixed" message. + + A digest in this format might, then, look something like this: + + From: Moderator-Address + To: Recipient-List + Date: Mon, 22 Mar 1994 13:34:51 +0000 + Subject: Internet Digest, volume 42 + MIME-Version: 1.0 + Content-Type: multipart/mixed; + boundary="---- main boundary ----" + + ------ main boundary ---- + + ...Introductory text or table of contents... + + ------ main boundary ---- + Content-Type: multipart/digest; + boundary="---- next message ----" + + ------ next message ---- + + From: someone-else + Date: Fri, 26 Mar 1993 11:13:32 +0200 + Subject: my opinion + + ...body goes here ... + + ------ next message ---- + + From: someone-else-again + Date: Fri, 26 Mar 1993 10:07:13 -0500 + Subject: my different opinion + + ... another body goes here ... + + ------ next message ------ + + ------ main boundary ------ + +5.1.6. Parallel Subtype + + This document defines a "parallel" subtype of the "multipart" + Content-Type. This type is syntactically identical to + "multipart/mixed", but the semantics are different. In particular, + + + +Freed & Borenstein Standards Track [Page 27] + +RFC 2046 Media Types November 1996 + + + in a parallel entity, the order of body parts is not significant. + + A common presentation of this type is to display all of the parts + simultaneously on hardware and software that are capable of doing so. + However, composing agents should be aware that many mail readers will + lack this capability and will show the parts serially in any event. + +5.1.7. Other Multipart Subtypes + + Other "multipart" subtypes are expected in the future. MIME + implementations must in general treat unrecognized subtypes of + "multipart" as being equivalent to "multipart/mixed". + +5.2. Message Media Type + + It is frequently desirable, in sending mail, to encapsulate another + mail message. A special media type, "message", is defined to + facilitate this. In particular, the "rfc822" subtype of "message" is + used to encapsulate RFC 822 messages. + + NOTE: It has been suggested that subtypes of "message" might be + defined for forwarded or rejected messages. However, forwarded and + rejected messages can be handled as multipart messages in which the + first part contains any control or descriptive information, and a + second part, of type "message/rfc822", is the forwarded or rejected + message. Composing rejection and forwarding messages in this manner + will preserve the type information on the original message and allow + it to be correctly presented to the recipient, and hence is strongly + encouraged. + + Subtypes of "message" often impose restrictions on what encodings are + allowed. These restrictions are described in conjunction with each + specific subtype. + + Mail gateways, relays, and other mail handling agents are commonly + known to alter the top-level header of an RFC 822 message. In + particular, they frequently add, remove, or reorder header fields. + These operations are explicitly forbidden for the encapsulated + headers embedded in the bodies of messages of type "message." + +5.2.1. RFC822 Subtype + + A media type of "message/rfc822" indicates that the body contains an + encapsulated message, with the syntax of an RFC 822 message. + However, unlike top-level RFC 822 messages, the restriction that each + "message/rfc822" body must include a "From", "Date", and at least one + destination header is removed and replaced with the requirement that + at least one of "From", "Subject", or "Date" must be present. + + + +Freed & Borenstein Standards Track [Page 28] + +RFC 2046 Media Types November 1996 + + + It should be noted that, despite the use of the numbers "822", a + "message/rfc822" entity isn't restricted to material in strict + conformance to RFC822, nor are the semantics of "message/rfc822" + objects restricted to the semantics defined in RFC822. More + specifically, a "message/rfc822" message could well be a News article + or a MIME message. + + No encoding other than "7bit", "8bit", or "binary" is permitted for + the body of a "message/rfc822" entity. The message header fields are + always US-ASCII in any case, and data within the body can still be + encoded, in which case the Content-Transfer-Encoding header field in + the encapsulated message will reflect this. Non-US-ASCII text in the + headers of an encapsulated message can be specified using the + mechanisms described in RFC 2047. + +5.2.2. Partial Subtype + + The "partial" subtype is defined to allow large entities to be + delivered as several separate pieces of mail and automatically + reassembled by a receiving user agent. (The concept is similar to IP + fragmentation and reassembly in the basic Internet Protocols.) This + mechanism can be used when intermediate transport agents limit the + size of individual messages that can be sent. The media type + "message/partial" thus indicates that the body contains a fragment of + a larger entity. + + Because data of type "message" may never be encoded in base64 or + quoted-printable, a problem might arise if "message/partial" entities + are constructed in an environment that supports binary or 8bit + transport. The problem is that the binary data would be split into + multiple "message/partial" messages, each of them requiring binary + transport. If such messages were encountered at a gateway into a + 7bit transport environment, there would be no way to properly encode + them for the 7bit world, aside from waiting for all of the fragments, + reassembling the inner message, and then encoding the reassembled + data in base64 or quoted-printable. Since it is possible that + different fragments might go through different gateways, even this is + not an acceptable solution. For this reason, it is specified that + entities of type "message/partial" must always have a content- + transfer-encoding of 7bit (the default). In particular, even in + environments that support binary or 8bit transport, the use of a + content- transfer-encoding of "8bit" or "binary" is explicitly + prohibited for MIME entities of type "message/partial". This in turn + implies that the inner message must not use "8bit" or "binary" + encoding. + + + + + + +Freed & Borenstein Standards Track [Page 29] + +RFC 2046 Media Types November 1996 + + + Because some message transfer agents may choose to automatically + fragment large messages, and because such agents may use very + different fragmentation thresholds, it is possible that the pieces of + a partial message, upon reassembly, may prove themselves to comprise + a partial message. This is explicitly permitted. + + Three parameters must be specified in the Content-Type field of type + "message/partial": The first, "id", is a unique identifier, as close + to a world-unique identifier as possible, to be used to match the + fragments together. (In general, the identifier is essentially a + message-id; if placed in double quotes, it can be ANY message-id, in + accordance with the BNF for "parameter" given in RFC 2045.) The + second, "number", an integer, is the fragment number, which indicates + where this fragment fits into the sequence of fragments. The third, + "total", another integer, is the total number of fragments. This + third subfield is required on the final fragment, and is optional + (though encouraged) on the earlier fragments. Note also that these + parameters may be given in any order. + + Thus, the second piece of a 3-piece message may have either of the + following header fields: + + Content-Type: Message/Partial; number=2; total=3; + id="oc=jpbe0M2Yt4s@thumper.bellcore.com" + + Content-Type: Message/Partial; + id="oc=jpbe0M2Yt4s@thumper.bellcore.com"; + number=2 + + But the third piece MUST specify the total number of fragments: + + Content-Type: Message/Partial; number=3; total=3; + id="oc=jpbe0M2Yt4s@thumper.bellcore.com" + + Note that fragment numbering begins with 1, not 0. + + When the fragments of an entity broken up in this manner are put + together, the result is always a complete MIME entity, which may have + its own Content-Type header field, and thus may contain any other + data type. + +5.2.2.1. Message Fragmentation and Reassembly + + The semantics of a reassembled partial message must be those of the + "inner" message, rather than of a message containing the inner + message. This makes it possible, for example, to send a large audio + message as several partial messages, and still have it appear to the + recipient as a simple audio message rather than as an encapsulated + + + +Freed & Borenstein Standards Track [Page 30] + +RFC 2046 Media Types November 1996 + + + message containing an audio message. That is, the encapsulation of + the message is considered to be "transparent". + + When generating and reassembling the pieces of a "message/partial" + message, the headers of the encapsulated message must be merged with + the headers of the enclosing entities. In this process the following + rules must be observed: + + (1) Fragmentation agents must split messages at line + boundaries only. This restriction is imposed because + splits at points other than the ends of lines in turn + depends on message transports being able to preserve + the semantics of messages that don't end with a CRLF + sequence. Many transports are incapable of preserving + such semantics. + + (2) All of the header fields from the initial enclosing + message, except those that start with "Content-" and + the specific header fields "Subject", "Message-ID", + "Encrypted", and "MIME-Version", must be copied, in + order, to the new message. + + (3) The header fields in the enclosed message which start + with "Content-", plus the "Subject", "Message-ID", + "Encrypted", and "MIME-Version" fields, must be + appended, in order, to the header fields of the new + message. Any header fields in the enclosed message + which do not start with "Content-" (except for the + "Subject", "Message-ID", "Encrypted", and "MIME- + Version" fields) will be ignored and dropped. + + (4) All of the header fields from the second and any + subsequent enclosing messages are discarded by the + reassembly process. + +5.2.2.2. Fragmentation and Reassembly Example + + If an audio message is broken into two pieces, the first piece might + look something like this: + + X-Weird-Header-1: Foo + From: Bill@host.com + To: joe@otherhost.com + Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) + Subject: Audio mail (part 1 of 2) + Message-ID: <id1@host.com> + MIME-Version: 1.0 + Content-type: message/partial; id="ABC@host.com"; + + + +Freed & Borenstein Standards Track [Page 31] + +RFC 2046 Media Types November 1996 + + + number=1; total=2 + + X-Weird-Header-1: Bar + X-Weird-Header-2: Hello + Message-ID: <anotherid@foo.com> + Subject: Audio mail + MIME-Version: 1.0 + Content-type: audio/basic + Content-transfer-encoding: base64 + + ... first half of encoded audio data goes here ... + + and the second half might look something like this: + + From: Bill@host.com + To: joe@otherhost.com + Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) + Subject: Audio mail (part 2 of 2) + MIME-Version: 1.0 + Message-ID: <id2@host.com> + Content-type: message/partial; + id="ABC@host.com"; number=2; total=2 + + ... second half of encoded audio data goes here ... + + Then, when the fragmented message is reassembled, the resulting + message to be displayed to the user should look something like this: + + X-Weird-Header-1: Foo + From: Bill@host.com + To: joe@otherhost.com + Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) + Subject: Audio mail + Message-ID: <anotherid@foo.com> + MIME-Version: 1.0 + Content-type: audio/basic + Content-transfer-encoding: base64 + + ... first half of encoded audio data goes here ... + ... second half of encoded audio data goes here ... + + The inclusion of a "References" field in the headers of the second + and subsequent pieces of a fragmented message that references the + Message-Id on the previous piece may be of benefit to mail readers + that understand and track references. However, the generation of + such "References" fields is entirely optional. + + + + + +Freed & Borenstein Standards Track [Page 32] + +RFC 2046 Media Types November 1996 + + + Finally, it should be noted that the "Encrypted" header field has + been made obsolete by Privacy Enhanced Messaging (PEM) [RFC-1421, + RFC-1422, RFC-1423, RFC-1424], but the rules above are nevertheless + believed to describe the correct way to treat it if it is encountered + in the context of conversion to and from "message/partial" fragments. + +5.2.3. External-Body Subtype + + The external-body subtype indicates that the actual body data are not + included, but merely referenced. In this case, the parameters + describe a mechanism for accessing the external data. + + When a MIME entity is of type "message/external-body", it consists of + a header, two consecutive CRLFs, and the message header for the + encapsulated message. If another pair of consecutive CRLFs appears, + this of course ends the message header for the encapsulated message. + However, since the encapsulated message's body is itself external, it + does NOT appear in the area that follows. For example, consider the + following message: + + Content-type: message/external-body; + access-type=local-file; + name="/u/nsb/Me.jpeg" + + Content-type: image/jpeg + Content-ID: <id42@guppylake.bellcore.com> + Content-Transfer-Encoding: binary + + THIS IS NOT REALLY THE BODY! + + The area at the end, which might be called the "phantom body", is + ignored for most external-body messages. However, it may be used to + contain auxiliary information for some such messages, as indeed it is + when the access-type is "mail- server". The only access-type defined + in this document that uses the phantom body is "mail-server", but + other access-types may be defined in the future in other + specifications that use this area. + + The encapsulated headers in ALL "message/external-body" entities MUST + include a Content-ID header field to give a unique identifier by + which to reference the data. This identifier may be used for caching + mechanisms, and for recognizing the receipt of the data when the + access-type is "mail-server". + + Note that, as specified here, the tokens that describe external-body + data, such as file names and mail server commands, are required to be + in the US-ASCII character set. + + + + +Freed & Borenstein Standards Track [Page 33] + +RFC 2046 Media Types November 1996 + + + If this proves problematic in practice, a new mechanism may be + required as a future extension to MIME, either as newly defined + access-types for "message/external-body" or by some other mechanism. + + As with "message/partial", MIME entities of type "message/external- + body" MUST have a content-transfer-encoding of 7bit (the default). + In particular, even in environments that support binary or 8bit + transport, the use of a content- transfer-encoding of "8bit" or + "binary" is explicitly prohibited for entities of type + "message/external-body". + +5.2.3.1. General External-Body Parameters + + The parameters that may be used with any "message/external- body" + are: + + (1) ACCESS-TYPE -- A word indicating the supported access + mechanism by which the file or data may be obtained. + This word is not case sensitive. Values include, but + are not limited to, "FTP", "ANON-FTP", "TFTP", "LOCAL- + FILE", and "MAIL-SERVER". Future values, except for + experimental values beginning with "X-", must be + registered with IANA, as described in RFC 2048. + This parameter is unconditionally mandatory and MUST be + present on EVERY "message/external-body". + + (2) EXPIRATION -- The date (in the RFC 822 "date-time" + syntax, as extended by RFC 1123 to permit 4 digits in + the year field) after which the existence of the + external data is not guaranteed. This parameter may be + used with ANY access-type and is ALWAYS optional. + + (3) SIZE -- The size (in octets) of the data. The intent + of this parameter is to help the recipient decide + whether or not to expend the necessary resources to + retrieve the external data. Note that this describes + the size of the data in its canonical form, that is, + before any Content-Transfer-Encoding has been applied + or after the data have been decoded. This parameter + may be used with ANY access-type and is ALWAYS + optional. + + (4) PERMISSION -- A case-insensitive field that indicates + whether or not it is expected that clients might also + attempt to overwrite the data. By default, or if + permission is "read", the assumption is that they are + not, and that if the data is retrieved once, it is + never needed again. If PERMISSION is "read-write", + + + +Freed & Borenstein Standards Track [Page 34] + +RFC 2046 Media Types November 1996 + + + this assumption is invalid, and any local copy must be + considered no more than a cache. "Read" and "Read- + write" are the only defined values of permission. This + parameter may be used with ANY access-type and is + ALWAYS optional. + + The precise semantics of the access-types defined here are described + in the sections that follow. + +5.2.3.2. The 'ftp' and 'tftp' Access-Types + + An access-type of FTP or TFTP indicates that the message body is + accessible as a file using the FTP [RFC-959] or TFTP [RFC- 783] + protocols, respectively. For these access-types, the following + additional parameters are mandatory: + + (1) NAME -- The name of the file that contains the actual + body data. + + (2) SITE -- A machine from which the file may be obtained, + using the given protocol. This must be a fully + qualified domain name, not a nickname. + + (3) Before any data are retrieved, using FTP, the user will + generally need to be asked to provide a login id and a + password for the machine named by the site parameter. + For security reasons, such an id and password are not + specified as content-type parameters, but must be + obtained from the user. + + In addition, the following parameters are optional: + + (1) DIRECTORY -- A directory from which the data named by + NAME should be retrieved. + + (2) MODE -- A case-insensitive string indicating the mode + to be used when retrieving the information. The valid + values for access-type "TFTP" are "NETASCII", "OCTET", + and "MAIL", as specified by the TFTP protocol [RFC- + 783]. The valid values for access-type "FTP" are + "ASCII", "EBCDIC", "IMAGE", and "LOCALn" where "n" is a + decimal integer, typically 8. These correspond to the + representation types "A" "E" "I" and "L n" as specified + by the FTP protocol [RFC-959]. Note that "BINARY" and + "TENEX" are not valid values for MODE and that "OCTET" + or "IMAGE" or "LOCAL8" should be used instead. IF MODE + is not specified, the default value is "NETASCII" for + TFTP and "ASCII" otherwise. + + + +Freed & Borenstein Standards Track [Page 35] + +RFC 2046 Media Types November 1996 + + +5.2.3.3. The 'anon-ftp' Access-Type + + The "anon-ftp" access-type is identical to the "ftp" access type, + except that the user need not be asked to provide a name and password + for the specified site. Instead, the ftp protocol will be used with + login "anonymous" and a password that corresponds to the user's mail + address. + +5.2.3.4. The 'local-file' Access-Type + + An access-type of "local-file" indicates that the actual body is + accessible as a file on the local machine. Two additional parameters + are defined for this access type: + + (1) NAME -- The name of the file that contains the actual + body data. This parameter is mandatory for the + "local-file" access-type. + + (2) SITE -- A domain specifier for a machine or set of + machines that are known to have access to the data + file. This optional parameter is used to describe the + locality of reference for the data, that is, the site + or sites at which the file is expected to be visible. + Asterisks may be used for wildcard matching to a part + of a domain name, such as "*.bellcore.com", to indicate + a set of machines on which the data should be directly + visible, while a single asterisk may be used to + indicate a file that is expected to be universally + available, e.g., via a global file system. + +5.2.3.5. The 'mail-server' Access-Type + + The "mail-server" access-type indicates that the actual body is + available from a mail server. Two additional parameters are defined + for this access-type: + + (1) SERVER -- The addr-spec of the mail server from which + the actual body data can be obtained. This parameter + is mandatory for the "mail-server" access-type. + + (2) SUBJECT -- The subject that is to be used in the mail + that is sent to obtain the data. Note that keying mail + servers on Subject lines is NOT recommended, but such + mail servers are known to exist. This is an optional + parameter. + + + + + + +Freed & Borenstein Standards Track [Page 36] + +RFC 2046 Media Types November 1996 + + + Because mail servers accept a variety of syntaxes, some of which is + multiline, the full command to be sent to a mail server is not + included as a parameter in the content-type header field. Instead, + it is provided as the "phantom body" when the media type is + "message/external-body" and the access-type is mail-server. + + Note that MIME does not define a mail server syntax. Rather, it + allows the inclusion of arbitrary mail server commands in the phantom + body. Implementations must include the phantom body in the body of + the message it sends to the mail server address to retrieve the + relevant data. + + Unlike other access-types, mail-server access is asynchronous and + will happen at an unpredictable time in the future. For this reason, + it is important that there be a mechanism by which the returned data + can be matched up with the original "message/external-body" entity. + MIME mail servers must use the same Content-ID field on the returned + message that was used in the original "message/external-body" + entities, to facilitate such matching. + +5.2.3.6. External-Body Security Issues + + "Message/external-body" entities give rise to two important security + issues: + + (1) Accessing data via a "message/external-body" reference + effectively results in the message recipient performing + an operation that was specified by the message + originator. It is therefore possible for the message + originator to trick a recipient into doing something + they would not have done otherwise. For example, an + originator could specify a action that attempts + retrieval of material that the recipient is not + authorized to obtain, causing the recipient to + unwittingly violate some security policy. For this + reason, user agents capable of resolving external + references must always take steps to describe the + action they are to take to the recipient and ask for + explicit permisssion prior to performing it. + + The 'mail-server' access-type is particularly + vulnerable, in that it causes the recipient to send a + new message whose contents are specified by the + original message's originator. Given the potential for + abuse, any such request messages that are constructed + should contain a clear indication that they were + generated automatically (e.g. in a Comments: header + field) in an attempt to resolve a MIME + + + +Freed & Borenstein Standards Track [Page 37] + +RFC 2046 Media Types November 1996 + + + "message/external-body" reference. + + (2) MIME will sometimes be used in environments that + provide some guarantee of message integrity and + authenticity. If present, such guarantees may apply + only to the actual direct content of messages -- they + may or may not apply to data accessed through MIME's + "message/external-body" mechanism. In particular, it + may be possible to subvert certain access mechanisms + even when the messaging system itself is secure. + + It should be noted that this problem exists either with + or without the availabilty of MIME mechanisms. A + casual reference to an FTP site containing a document + in the text of a secure message brings up similar + issues -- the only difference is that MIME provides for + automatic retrieval of such material, and users may + place unwarranted trust is such automatic retrieval + mechanisms. + +5.2.3.7. Examples and Further Explanations + + When the external-body mechanism is used in conjunction with the + "multipart/alternative" media type it extends the functionality of + "multipart/alternative" to include the case where the same entity is + provided in the same format but via different accces mechanisms. + When this is done the originator of the message must order the parts + first in terms of preferred formats and then by preferred access + mechanisms. The recipient's viewer should then evaluate the list + both in terms of format and access mechanisms. + + With the emerging possibility of very wide-area file systems, it + becomes very hard to know in advance the set of machines where a file + will and will not be accessible directly from the file system. + Therefore it may make sense to provide both a file name, to be tried + directly, and the name of one or more sites from which the file is + known to be accessible. An implementation can try to retrieve remote + files using FTP or any other protocol, using anonymous file retrieval + or prompting the user for the necessary name and password. If an + external body is accessible via multiple mechanisms, the sender may + include multiple entities of type "message/external-body" within the + body parts of an enclosing "multipart/alternative" entity. + + However, the external-body mechanism is not intended to be limited to + file retrieval, as shown by the mail-server access-type. Beyond + this, one can imagine, for example, using a video server for external + references to video clips. + + + + +Freed & Borenstein Standards Track [Page 38] + +RFC 2046 Media Types November 1996 + + + The embedded message header fields which appear in the body of the + "message/external-body" data must be used to declare the media type + of the external body if it is anything other than plain US-ASCII + text, since the external body does not have a header section to + declare its type. Similarly, any Content-transfer-encoding other + than "7bit" must also be declared here. Thus a complete + "message/external-body" message, referring to an object in PostScript + format, might look like this: + + From: Whomever + To: Someone + Date: Whenever + Subject: whatever + MIME-Version: 1.0 + Message-ID: <id1@host.com> + Content-Type: multipart/alternative; boundary=42 + Content-ID: <id001@guppylake.bellcore.com> + + --42 + Content-Type: message/external-body; name="BodyFormats.ps"; + site="thumper.bellcore.com"; mode="image"; + access-type=ANON-FTP; directory="pub"; + expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" + + Content-type: application/postscript + Content-ID: <id42@guppylake.bellcore.com> + + --42 + Content-Type: message/external-body; access-type=local-file; + name="/u/nsb/writing/rfcs/RFC-MIME.ps"; + site="thumper.bellcore.com"; + expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" + + Content-type: application/postscript + Content-ID: <id42@guppylake.bellcore.com> + + --42 + Content-Type: message/external-body; + access-type=mail-server + server="listserv@bogus.bitnet"; + expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" + + Content-type: application/postscript + Content-ID: <id42@guppylake.bellcore.com> + + get RFC-MIME.DOC + + --42-- + + + +Freed & Borenstein Standards Track [Page 39] + +RFC 2046 Media Types November 1996 + + + Note that in the above examples, the default Content-transfer- + encoding of "7bit" is assumed for the external postscript data. + + Like the "message/partial" type, the "message/external-body" media + type is intended to be transparent, that is, to convey the data type + in the external body rather than to convey a message with a body of + that type. Thus the headers on the outer and inner parts must be + merged using the same rules as for "message/partial". In particular, + this means that the Content-type and Subject fields are overridden, + but the From field is preserved. + + Note that since the external bodies are not transported along with + the external body reference, they need not conform to transport + limitations that apply to the reference itself. In particular, + Internet mail transports may impose 7bit and line length limits, but + these do not automatically apply to binary external body references. + Thus a Content-Transfer-Encoding is not generally necessary, though + it is permitted. + + Note that the body of a message of type "message/external-body" is + governed by the basic syntax for an RFC 822 message. In particular, + anything before the first consecutive pair of CRLFs is header + information, while anything after it is body information, which is + ignored for most access-types. + +5.2.4. Other Message Subtypes + + MIME implementations must in general treat unrecognized subtypes of + "message" as being equivalent to "application/octet-stream". + + Future subtypes of "message" intended for use with email should be + restricted to "7bit" encoding. A type other than "message" should be + used if restriction to "7bit" is not possible. + +6. Experimental Media Type Values + + A media type value beginning with the characters "X-" is a private + value, to be used by consenting systems by mutual agreement. Any + format without a rigorous and public definition must be named with an + "X-" prefix, and publicly specified values shall never begin with + "X-". (Older versions of the widely used Andrew system use the "X- + BE2" name, so new systems should probably choose a different name.) + + In general, the use of "X-" top-level types is strongly discouraged. + Implementors should invent subtypes of the existing types whenever + possible. In many cases, a subtype of "application" will be more + appropriate than a new top-level type. + + + + +Freed & Borenstein Standards Track [Page 40] + +RFC 2046 Media Types November 1996 + + +7. Summary + + The five discrete media types provide provide a standardized + mechanism for tagging entities as "audio", "image", or several other + kinds of data. The composite "multipart" and "message" media types + allow mixing and hierarchical structuring of entities of different + types in a single message. A distinguished parameter syntax allows + further specification of data format details, particularly the + specification of alternate character sets. Additional optional + header fields provide mechanisms for certain extensions deemed + desirable by many implementors. Finally, a number of useful media + types are defined for general use by consenting user agents, notably + "message/partial" and "message/external-body". + +9. Security Considerations + + Security issues are discussed in the context of the + "application/postscript" type, the "message/external-body" type, and + in RFC 2048. Implementors should pay special attention to the + security implications of any media types that can cause the remote + execution of any actions in the recipient's environment. In such + cases, the discussion of the "application/postscript" type may serve + as a model for considering other media types with remote execution + capabilities. + + + + + + + + + + + + + + + + + + + + + + + + + + + +Freed & Borenstein Standards Track [Page 41] + +RFC 2046 Media Types November 1996 + + +9. Authors' Addresses + + For more information, the authors of this document are best contacted + via Internet mail: + + Ned Freed + Innosoft International, Inc. + 1050 East Garvey Avenue South + West Covina, CA 91790 + USA + + Phone: +1 818 919 3600 + Fax: +1 818 919 3614 + EMail: ned@innosoft.com + + + Nathaniel S. Borenstein + First Virtual Holdings + 25 Washington Avenue + Morristown, NJ 07960 + USA + + Phone: +1 201 540 8967 + Fax: +1 201 993 3032 + EMail: nsb@nsb.fv.com + + + MIME is a result of the work of the Internet Engineering Task Force + Working Group on RFC 822 Extensions. The chairman of that group, + Greg Vaudreuil, may be reached at: + + Gregory M. Vaudreuil + Octel Network Services + 17080 Dallas Parkway + Dallas, TX 75248-1905 + USA + + EMail: Greg.Vaudreuil@Octel.Com + + + + + + + + + + + + + +Freed & Borenstein Standards Track [Page 42] + +RFC 2046 Media Types November 1996 + + +Appendix A -- Collected Grammar + + This appendix contains the complete BNF grammar for all the syntax + specified by this document. + + By itself, however, this grammar is incomplete. It refers by name to + several syntax rules that are defined by RFC 822. Rather than + reproduce those definitions here, and risk unintentional differences + between the two, this document simply refers the reader to RFC 822 + for the remaining definitions. Wherever a term is undefined, it + refers to the RFC 822 definition. + + boundary := 0*69<bchars> bcharsnospace + + bchars := bcharsnospace / " " + + bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / + "+" / "_" / "," / "-" / "." / + "/" / ":" / "=" / "?" + + body-part := <"message" as defined in RFC 822, with all + header fields optional, not starting with the + specified dash-boundary, and with the + delimiter not occurring anywhere in the + body part. Note that the semantics of a + part differ from the semantics of a message, + as described in the text.> + + close-delimiter := delimiter "--" + + dash-boundary := "--" boundary + ; boundary taken from the value of + ; boundary parameter of the + ; Content-Type field. + + delimiter := CRLF dash-boundary + + discard-text := *(*text CRLF) + ; May be ignored or discarded. + + encapsulation := delimiter transport-padding + CRLF body-part + + epilogue := discard-text + + multipart-body := [preamble CRLF] + dash-boundary transport-padding CRLF + body-part *encapsulation + + + +Freed & Borenstein Standards Track [Page 43] + +RFC 2046 Media Types November 1996 + + + close-delimiter transport-padding + [CRLF epilogue] + + preamble := discard-text + + transport-padding := *LWSP-char + ; Composers MUST NOT generate + ; non-zero length transport + ; padding, but receivers MUST + ; be able to handle padding + ; added by message transports. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Freed & Borenstein Standards Track [Page 44] + |