diff options
Diffstat (limited to 'doc/rfc/rfc1049.txt')
-rw-r--r-- | doc/rfc/rfc1049.txt | 451 |
1 files changed, 451 insertions, 0 deletions
diff --git a/doc/rfc/rfc1049.txt b/doc/rfc/rfc1049.txt new file mode 100644 index 0000000..151afc9 --- /dev/null +++ b/doc/rfc/rfc1049.txt @@ -0,0 +1,451 @@ + + + + + + +Network Working Group M. Sirbu +Request for Comments: 1049 CMU + March 1988 + + A CONTENT-TYPE HEADER FIELD FOR INTERNET MESSAGES + +STATUS OF THIS MEMO + + This RFC suggests proposed additions to the Internet Mail Protocol, + RFC-822, for the Internet community, and requests discussion and + suggestions for improvements. Distribution of this memo is + unlimited. + +ABSTRACT + + A standardized Content-type field allows mail reading systems to + automatically identify the type of a structured message body and to + process it for display accordingly. The structured message body must + still conform to the RFC-822 requirements concerning allowable + characters. A mail reading system need not take any specific action + upon receiving a message with a valid Content-Type header field. The + ability to recognize this field and invoke the appropriate display + process accordingly will, however, improve the readability of + messages, and allow the exchange of messages containing mathematical + symbols, or foreign language characters. + + Table of Contents + + 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 1 + 2. Problems with Structured Messages . . . . . . . . . . . . . . . 3 + 3. The Content-type Header Field . . . . . . . . . . . . . . . . . 5 + 3.1. Type Values . . . . . . . . . . . . . . . . . . . . . . 5 + 3.2. Version Number . . . . . . . . . . . . . . . . . . . . . 6 + 3.3. Resource Reference . . . . . . . . . . . . . . . . . . . 6 + 3.4. Comment. . . . . . . . . . . . . . . . . . . . . . . . . 7 + 4. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . 7 + +1. Introduction + + As defined in RFC-822, [2], an electronic mail message consists of a + number of defined header fields, some containing structured + information (e.g., date, addresses), and a message body consisting of + an unstructured string of ASCII characters. + + The success of the Internet mail system has led to a desire to use + the mail system for sending around information with a greater degree + of structure, while remaining within the constraints imposed by the + limited character set. A prime example is the use of mail to send a + + + +Sirbu [Page 1] + +RFC 1049 Mail Content Type March 1988 + + + document with embedded TROFF formatting commands. A more + sophisticated example would be a message body encoded in a Page + Description Language (PDL) such as Postscript. In both cases, simply + mapping the ASCII characters to the screen or printer in the usual + fashion will not render the document image intended by the sender; an + additional processing step is required to produce an image of the + message text on a display device or a piece of paper. + + In both of these examples, the message body contains only the legal + character set, but the content has a structure which produces some + desirable result after appropriate processing by the recipient. If a + message header field could be used to indicate the structuring + technique used in the message body, then a sophisticated mail system + could use such a field to automatically invoke the appropriate + processing of the message body. For example, a header field which + indicated that the message body was encoded using Postscript could be + used to direct a mail system running under Sun Microsystem's NEWS + window manager to process the Postscript to produce the appropriate + page image on the screen. + + Private header fields (beginning with "X-") are already being used by + some systems to affect such a result (e.g., the Andrew Message System + developed at Carnegie Mellon University). However, the widespread + use of such techniques will require general agreement on the name and + allowed parameter values for a header field to be used for this + purpose. + + We propose that a new header field, "Content-type:" be recognized as + the standard field for indicating the structure of the message body. + The contents of the "Content-Type:" field are parameters which + specify what type of structure is used in the message body. + + Note that we are not proposing that the message body contain anything + other than ASCII characters as specified in RFC-822. Whatever + structuring is contained in the message body must be represented + using only the allowed ASCII characters. Thus, this proposal should + have no impact on existing mailers, only on mail reading systems. + + At the same time, this restriction eliminates the use of more general + structuring techniques such as Abstract Syntax Notation, (CCITT + Recommendation X.409) as used in the X.400 messaging standard, which + are octet-oriented. + + This is not the first proposal for structuring message bodies. + RFC-767 discusses a proposed technique for structuring multi-media + mail messages. We are also aware that many users already employ mail + to send TROFF, SCRIBE, TEX, Postscript or other structured + information. Such postprocessing as is required must be invoked + + + +Sirbu [Page 2] + +RFC 1049 Mail Content Type March 1988 + + + manually by the message recipient who looks at the message text + displayed as conventional ASCII and recognizes that it is structured + in some way that requires additional processing to be properly + rendered. Our proposal is designed to facilitate automatic + processing of messages by a mail reading system. + +2. Problems with Structured Messages + + Once we introduce the notion that a message body might require some + processing other than simply painting the characters to the screen we + raise a number of fundamental questions. These generally arise due + to the certainty that some receiving systems will have the facilities + to process the received message and some will not. The problem is + what to do in the presence of systems with different levels of + capability. + + First, we must recognize that the purpose of structured messages is + to be able to send types of information, ultimately intended for + human consumption, not expressable in plain ASCII. Thus, there is no + way in plain ASCII to send the italics, boldface, or greek characters + that can be expressed in Postscript. If some different processing is + necessary to render these glyphs, then that is the minimum price to + be paid in order to send them at all. + + Second, by insisting that the message body contain only ASCII, we + insure that it will not "break" current mail reading systems which + are not equipped to process the structure; the result on the screen + may not be readily interpretable by the human reader, however. + + If a message sender knows that the recipient cannot process + Postscript, he or she may prefer that the message be revised to + eliminate the use of italics and boldface, rather than appear + incomprehensible. If Postscript is being used because the message + contains passages in Greek, there may be no suitable ASCII + equivalent, however. + + Ideally, the details of structuring the message (or not) to conform + to the capabilities of the recipient system could be completely + hidden from the message sender. The distributed Internet mail system + would somehow determine the capabilities of the recipient system, and + convert the message automatically; or, if there was no way to send + Greek text in ASCII, inform the sender that his message could not be + transmitted. + + + + + + + + +Sirbu [Page 3] + +RFC 1049 Mail Content Type March 1988 + + + In practice, this is a difficult task. There are three possible + approaches: + + 1. Each mail system maintains a database of capabilities of + remote systems it knows how to send to. Such a database + would be very difficult to keep up to date. + + 2. The mail transport service negotiates with the receiving + system as to its capabilities. If the receiving system + cannot support the specified content type, the mail is + transformed into conventional ASCII before transmission. + This would require changes to all existing SMTP + implementations, and could not be implemented in the case + where RFC-822 type messages are being forwarded via Bitnet or + other networks which do not implement SMTP. + + 3. An expanded directory service maintains information on mail + processing capabilities of receiving hosts. This eliminates + the need for real-time negotiation with the final + destination, but still requires direct interaction with the + directory service. Since directory querying is part of mail + sending as opposed to mail composing/reading systems, this + requires changes to existing mailers as well as a major + change to the domain name directory service. + + We note in passing that the X.400 protocol implements approach number + 2, and that the Draft Recommendations for X.DS, the Directory + Service, would support option 3. + + In the interest of facilitating early usage of structured messages, + we choose not to recommend any of the three approaches described + above at the present time. In a forthcoming RFC we will propose a + solution based on option 2, requiring modification to mailers to + support negotiation over capabilities. For the present, then, users + would be obliged to keep their own private list of capabilities of + recipients and to take care that they do not send Postscript, TROFF + or other structured messages to recipients who cannot process them. + The penalty for failure to do so will be the frustration of the + recipient in trying to read a raw Postscript or TROFF file painted on + his or her screen. Some System Administrators may attempt to + implement option 1 for the benefit of their users, but this does not + impose a requirement for changes on any other mail system. + + We recognize that the long-term solution must require changes to + mailers. However, in order to begin now to standardize the header + fields, and to facilitate experimentation, we issue the present RFC. + + + + + +Sirbu [Page 4] + +RFC 1049 Mail Content Type March 1988 + + +3. The Content-type Header Field + + Whatever structuring technique is specified by the Content-type + field, it must be known precisely to both the sender and the + recipient of the message in order for the message to be properly + interpreted. In general, this means that the allowed parameter + values for the Content-type: field must identify a well-defined, + standardized, document structuring technique. We do not preclude, + however, the use of a Content-type: parameter value to specify a + private structuring technique known only to the sender and the + recipient. + + More precisely, we propose that the Content-type: header field + consist of up to four parameter values. The first, or type parameter + names the structuring technique; the second, optional, parameter is a + version number, ver-num, which indicates a particular version or + revision of the standardized structuring technique. The third + parameter is a resource reference, resource-ref, which may indicate a + standard database of information to be used in interpreting the + structured document. The last parameter is a comment. + + In the Extended Backus Naur Form of RFC-822, we have: + + Content-Type:= type [";" ver-num [";" 1#resource-ref]] [comment] + +3.1. Type Values + + Initially, the type parameter would be limited to the following set + of values: + + type:= "POSTSCRIPT"/"SCRIBE"/"SGML"/"TEX"/"TROFF"/ + "DVI"/"X-"atom + + These values are not case sensitive. POSTSCRIPT, Postscript, and + POStscriPT are all equivalent. + + POSTSCRIPT Indicates the enclosed document consists of + information encoded using the Postscript Page + Definition Language developed by Adobe Systems, + Inc. [1] + + SCRIBE Indicates the document contains embedded formatting + information according to the syntax used by the + Scribe document formatting language distributed by + the Unilogic Corporation. [6] + + SGML Indicates the document contains structuring + information to according the rules specified for + + + +Sirbu [Page 5] + +RFC 1049 Mail Content Type March 1988 + + + the Standard Generalized Markup Language, IS 8879, + as published by the International Organization for + Standardization. [3] Documents structured according + to the ISO DIS 8613--Office Docment Architecture and + Interchange Format--may also be encoded using SGML + syntax. + + TEX Indicates the document contains embedded formatting + information according to the syntax of the TEX + document production language. [4] + + TROFF Indicates the document contains embedded formatting + information according to the syntax specified for the + TROFF formatting package developed by AT&T Bell + Laboratories. [5] + + DVI Indicates the document contains information according + to the device independent file format produced by + TROFF or TEX. + + "X-"atom Any type value beginning with the characters "X-" is + a private value. + +3.2. Version Number + + Since standard structuring techniques in fact evolve over time, we + leave room for specifying a version number for the content type. + Valid values will depend upon the type parameter. + + ver-num:= local-part + + In particular, we have the following valid values: + + For type=POSTSCRIPT + + ver-num:= "1.0"/"2.0"/"null" + + For type=SCRIBE + + ver-num:= "3"/"4"/"5"/"null" + + For type=SGML + + ver-num:="IS.8879.1986"/"null" + +3.3. Resource Reference + + resource-ref:= local-part + + + +Sirbu [Page 6] + +RFC 1049 Mail Content Type March 1988 + + + As Apple has demonstrated with their implementation of the + Laserwriter, a very general document structuring technique can be + made more efficient by defining a set of macros or other similar + resources to be used in interpreting any transmitted stream. The + Macintosh transmits a LaserPrep file to the Laserwriter containing + font and macro definitions which can be called upon by subsequent + documents. The result is that documents as sent to the Laserwriter + are considerably more compact than if they had to include the + LaserPrep file each time. The Resource Reference parameter allows + specification of a well known resource, such as a LaserPrep file, + which should be used by the receiving system when processing the + message. + + Resource references could also include macro packages for use with + TEX or references to preprocessors such as eqn and tbl for use with + troff. Allowed values will vary according to the type parameter. + + In particular, we propose the following values: + + For type = POSTSCRIPT + + resource-ref:= "laserprep2.9"/"laserprep3.0"/"laserprep3.1"/ + "laserprep4.0"/local-part + + For type = TROFF + + resource-ref:= "eqn"/"tbl"/"me"/local-part + +3.4. Comment + + The comment field can be any additional comment text the user + desires. Comments are enclosed in parentheses as specified in + RFC-822. + +4. Conclusion + + A standardized Content-type field allows mail reading systems to + automatically identify the type of a structured message body and to + process it for display accordingly. The strcutured message body must + still conform to the RFC-822 requirements concerning allowable + characters. A mail reading system need not take any specific action + upon receiving a message with valid Content-Type header field. The + ability to recognize this field and invoke the appropriate display + process accordingly will, however, improve the readability of + messages, and allow the exchange of messages containing mathematical + symbols, or foreign language characters. + + + + + +Sirbu [Page 7] + +RFC 1049 Mail Content Type March 1988 + + + In the near term, the major use of a Content-Type: header field is + likely to be for designating the message body as containing a Page + Definition Language representation such as Postscript. + + Additional type values shall be registered with Internet Assigned + Numbers Coordinator at USC-ISI. Please contact: + + Joyce K. Reynolds + USC Information Sciences Institute + 4676 Admiralty Way + Marina del Rey, CA 90292-6695 + + 213-822-1511 JKReynolds@ISI.EDU + + REFERENCES + + 1. Adobe Systems, Inc. Postscript Language Reference Manual. + Addison-Wesley, Reading, Mass., 1985. + + 2. Crocker, David H. RFC-822: Standard for the Format of ARPA + Internet Text Messages. Network Information Center, + August 13, 1982. + + 3. ISO TC97/SC18. Standard Generalized Markup Language. + Tech. Rept. DIS 8879, ISO, 1986. + + 4. Knuth, Donald E. The TEXbook. Addison-Wesley, Reading, Mass., + 1984. + + 5. Ossanna, Joseph F. NROFF/TROFF User's Manual. Bell + Laboratories, Murray Hill, New Jersey, 1976. Computing Science + Technical Report No.54. + + 6. Unilogic. SCRIBE Document Production Software. Unilogic, 1985. + Fourth Edition. + + + + + + + + + + + + + + + + +Sirbu [Page 8] +
\ No newline at end of file |