summaryrefslogtreecommitdiff
path: root/doc/rfc/rfc3023.txt
diff options
context:
space:
mode:
authorThomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committerThomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
treee3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc3023.txt
parentea76e11061bda059ae9f9ad130a9895cc85607db (diff)
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc3023.txt')
-rw-r--r--doc/rfc/rfc3023.txt2187
1 files changed, 2187 insertions, 0 deletions
diff --git a/doc/rfc/rfc3023.txt b/doc/rfc/rfc3023.txt
new file mode 100644
index 0000000..7507859
--- /dev/null
+++ b/doc/rfc/rfc3023.txt
@@ -0,0 +1,2187 @@
+
+
+
+
+
+
+Network Working Group M. Murata
+Request for Comments: 3023 IBM Tokyo Research Laboratory
+Obsoletes: 2376 S. St.Laurent
+Updates: 2048 simonstl.com
+Category: Standards Track D. Kohn
+ Skymoon Ventures
+ January 2001
+
+
+ XML Media Types
+
+Status of this Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Copyright Notice
+
+ Copyright (C) The Internet Society (2001). All Rights Reserved.
+
+Abstract
+
+ This document standardizes five new media types -- text/xml,
+ application/xml, text/xml-external-parsed-entity, application/xml-
+ external-parsed-entity, and application/xml-dtd -- for use in
+ exchanging network entities that are related to the Extensible Markup
+ Language (XML). This document also standardizes a convention (using
+ the suffix '+xml') for naming media types outside of these five types
+ when those media types represent XML MIME (Multipurpose Internet Mail
+ Extensions) entities. XML MIME entities are currently exchanged via
+ the HyperText Transfer Protocol on the World Wide Web, are an
+ integral part of the WebDAV protocol for remote web authoring, and
+ are expected to have utility in many domains.
+
+ Major differences from RFC 2376 are (1) the addition of text/xml-
+ external-parsed-entity, application/xml-external-parsed-entity, and
+ application/xml-dtd, (2) the '+xml' suffix convention (which also
+ updates the RFC 2048 registration process), and (3) the discussion of
+ "utf-16le" and "utf-16be".
+
+
+
+
+
+
+
+
+
+Murata, et al. Standards Track [Page 1]
+
+RFC 3023 XML Media Types January 2001
+
+
+Table of Contents
+
+ 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
+ 2. Notational Conventions . . . . . . . . . . . . . . . . . . . 4
+ 3. XML Media Types . . . . . . . . . . . . . . . . . . . . . . 5
+ 3.1 Text/xml Registration . . . . . . . . . . . . . . . . . . . 7
+ 3.2 Application/xml Registration . . . . . . . . . . . . . . . . 9
+ 3.3 Text/xml-external-parsed-entity Registration . . . . . . . . 11
+ 3.4 Application/xml-external-parsed-entity Registration . . . . 12
+ 3.5 Application/xml-dtd Registration . . . . . . . . . . . . . . 13
+ 3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 14
+ 4. The Byte Order Mark (BOM) and Conversions to/from the UTF-16
+ Charset . . . . . . . . . . . . . . . . . . . . . . . . . . 15
+ 5. Fragment Identifiers . . . . . . . . . . . . . . . . . . . . 15
+ 6. The Base URI . . . . . . . . . . . . . . . . . . . . . . . . 15
+ 7. A Naming Convention for XML-Based Media Types . . . . . . . 16
+ 7.1 Referencing . . . . . . . . . . . . . . . . . . . . . . . . 18
+ 8. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 18
+ 8.1 Text/xml with UTF-8 Charset . . . . . . . . . . . . . . . . 19
+ 8.2 Text/xml with UTF-16 Charset . . . . . . . . . . . . . . . . 19
+ 8.3 Text/xml with UTF-16BE Charset . . . . . . . . . . . . . . . 19
+ 8.4 Text/xml with ISO-2022-KR Charset . . . . . . . . . . . . . 20
+ 8.5 Text/xml with Omitted Charset . . . . . . . . . . . . . . . 20
+ 8.6 Application/xml with UTF-16 Charset . . . . . . . . . . . . 20
+ 8.7 Application/xml with UTF-16BE Charset . . . . . . . . . . . 21
+ 8.8 Application/xml with ISO-2022-KR Charset . . . . . . . . . . 21
+ 8.9 Application/xml with Omitted Charset and UTF-16 XML MIME
+ Entity . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
+ 8.10 Application/xml with Omitted Charset and UTF-8 Entity . . . 22
+ 8.11 Application/xml with Omitted Charset and Internal Encoding
+ Declaration . . . . . . . . . . . . . . . . . . . . . . . . 22
+ 8.12 Text/xml-external-parsed-entity with UTF-8 Charset . . . . . 22
+ 8.13 Application/xml-external-parsed-entity with UTF-16 Charset . 23
+ 8.14 Application/xml-external-parsed-entity with UTF-16BE Charset 23
+ 8.15 Application/xml-dtd . . . . . . . . . . . . . . . . . . . . 23
+ 8.16 Application/mathml+xml . . . . . . . . . . . . . . . . . . . 24
+ 8.17 Application/xslt+xml . . . . . . . . . . . . . . . . . . . . 24
+ 8.18 Application/rdf+xml . . . . . . . . . . . . . . . . . . . . 24
+ 8.19 Image/svg+xml . . . . . . . . . . . . . . . . . . . . . . . 24
+ 8.20 INCONSISTENT EXAMPLE: Text/xml with UTF-8 Charset . . . . . 25
+ 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . 25
+ 10. Security Considerations . . . . . . . . . . . . . . . . . . 25
+ References . . . . . . . . . . . . . . . . . . . . . . . . . 27
+ Authors' Addresses . . . . . . . . . . . . . . . . . . . . . 31
+ A. Why Use the '+xml' Suffix for XML-Based MIME Types? . . . . 32
+ A.1 Why not just use text/xml or application/xml and let the XML
+ processor dispatch to the correct application based on the
+ referenced DTD? . . . . . . . . . . . . . . . . . . . . . . 32
+
+
+
+Murata, et al. Standards Track [Page 2]
+
+RFC 3023 XML Media Types January 2001
+
+
+ A.2 Why not create a new subtree (e.g., image/xml.svg) to
+ represent XML MIME types? . . . . . . . . . . . . . . . . . 32
+ A.3 Why not create a new top-level MIME type for XML-based media
+ types? . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
+ A.4 Why not just have the MIME processor 'sniff' the content to
+ determine whether it is XML? . . . . . . . . . . . . . . . . 33
+ A.5 Why not use a MIME parameter to specify that a media type
+ uses XML syntax? . . . . . . . . . . . . . . . . . . . . . . 33
+ A.6 How about labeling with parameters in the other direction
+ (e.g., application/xml; Content-Feature=iotp)? . . . . . . . 34
+ A.7 How about a new superclass MIME parameter that is defined to
+ apply to all MIME types (e.g., Content-Type:
+ application/iotp; $superclass=xml)? . . . . . . . . . . . . 34
+ A.8 What about adding a new parameter to the Content-Disposition
+ header or creating a new Content-Structure header to
+ indicate XML syntax? . . . . . . . . . . . . . . . . . . . . 35
+ A.9 How about a new Alternative-Content-Type header? . . . . . . 35
+ A.10 How about using a conneg tag instead (e.g., accept-features:
+ (syntax=xml))? . . . . . . . . . . . . . . . . . . . . . . . 35
+ A.11 How about a third-level content-type, such as text/xml/rdf? 35
+ A.12 Why use the plus ('+') character for the suffix '+xml'? . . 36
+ A.13 What is the semantic difference between application/foo and
+ application/foo+xml? . . . . . . . . . . . . . . . . . . . . 36
+ A.14 What happens when an even better markup language (e.g.,
+ EBML) is defined, or a new category of data? . . . . . . . . 36
+ A.15 Why must I use the '+xml' suffix for my new XML-based media
+ type? . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
+ B. Changes from RFC 2376 . . . . . . . . . . . . . . . . . . . 37
+ C. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 38
+ Full Copyright Statement . . . . . . . . . . . . . . . . . . 39
+
+1. Introduction
+
+ The World Wide Web Consortium has issued Extensible Markup Language
+ (XML) 1.0 (Second Edition)[XML]. To enable the exchange of XML
+ network entities, this document standardizes five new media types --
+ text/xml, application/xml, text/xml-external-parsed-entity,
+ application/xml-external-parsed-entity, and application/xml-dtd -- as
+ well as a naming convention for identifying XML-based MIME media
+ types.
+
+ XML entities are currently exchanged on the World Wide Web, and XML
+ is also used for property values and parameter marshalling by the
+ WebDAV[RFC2518] protocol for remote web authoring. Thus, there is a
+ need for a media type to properly label the exchange of XML network
+ entities.
+
+
+
+
+
+Murata, et al. Standards Track [Page 3]
+
+RFC 3023 XML Media Types January 2001
+
+
+ Although XML is a subset of the Standard Generalized Markup Language
+ (SGML) ISO 8879[SGML], which has been assigned the media types
+ text/sgml and application/sgml, there are several reasons why use of
+ text/sgml or application/sgml to label XML is inappropriate. First,
+ there exist many applications that can process XML, but that cannot
+ process SGML, due to SGML's larger feature set. Second, SGML
+ applications cannot always process XML entities, because XML uses
+ features of recent technical corrigenda to SGML. Third, the
+ definition of text/sgml and application/sgml in [RFC1874] includes
+ parameters for SGML bit combination transformation format (SGML-
+ bctf), and SGML boot attribute (SGML-boot). Since XML does not use
+ these parameters, it would be ambiguous if such parameters were given
+ for an XML MIME entity. For these reasons, the best approach for
+ labeling XML network entities is to provide new media types for XML.
+
+ Since XML is an integral part of the WebDAV Distributed Authoring
+ Protocol, and since World Wide Web Consortium Recommendations have
+ conventionally been assigned IETF tree media types, and since similar
+ media types (HTML, SGML) have been assigned IETF tree media types,
+ the XML media types also belong in the IETF media types tree.
+
+ Similarly, XML will be used as a foundation for other media types,
+ including types in every branch of the IETF media types tree. To
+ facilitate the processing of such types, media types based on XML,
+ but that are not identified using text/xml or application/xml, SHOULD
+ be named using a suffix of '+xml' as described in Section 7. This
+ will allow XML-based tools -- browsers, editors, search engines, and
+ other processors -- to work with all XML-based media types.
+
+2. Notational Conventions
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in [RFC2119].
+
+ As defined in [RFC2781], the three charsets "utf-16", "utf-16le", and
+ "utf-16be" are used to label UTF-16 text. In this document, "the
+ UTF-16 family" refers to those three charsets. By contrast, the
+ phrases "utf-16" or UTF-16 in this document refer specifically to the
+ single charset "utf-16".
+
+ As sometimes happens between two communities, both MIME and XML have
+ defined the term entity, with different meanings. Section 2.4 of
+ [RFC2045] says:
+
+ "The term 'entity' refers specifically to the MIME-defined header
+ fields and contents of either a message or one of the parts in the
+ body of a multipart entity."
+
+
+
+Murata, et al. Standards Track [Page 4]
+
+RFC 3023 XML Media Types January 2001
+
+
+ Section 4 of [XML] says:
+
+ "An XML document may consist of one or many storage units" called
+ entities that "have content" and are normally "identified by
+ name".
+
+ In this document, "XML MIME entity" is defined as the latter (an XML
+ entity) encapsulated in the former (a MIME entity).
+
+3. XML Media Types
+
+ This document standardizes five media types related to XML MIME
+ entities: text/xml, application/xml, text/xml-external-parsed-entity,
+ application/xml-external-parsed-entity, and application/xml-dtd.
+ Registration information for these media types is described in the
+ sections below.
+
+ Within the XML specification, XML MIME entities can be classified
+ into four types. In the XML terminology, they are called "document
+ entities", "external DTD subsets", "external parsed entities", and
+ "external parameter entities". The media types text/xml and
+ application/xml MAY be used for "document entities", while text/xml-
+ external-parsed-entity or application/xml-external-parsed-entity
+ SHOULD be used for "external parsed entities". The media type
+ application/xml-dtd SHOULD be used for "external DTD subsets" or
+ "external parameter entities". application/xml and text/xml MUST NOT
+ be used for "external parameter entities" or "external DTD subsets",
+ and MUST NOT be used for "external parsed entities" unless they are
+ also well-formed "document entities" and are referenced as such.
+ Note that [RFC2376] (which this document obsoletes) allowed such
+ usage, although in practice it is likely to have been rare.
+
+ Neither external DTD subsets nor external parameter entities parse as
+ XML documents, and while some XML document entities may be used as
+ external parsed entities and vice versa, there are many cases where
+ the two are not interchangeable. XML also has unparsed entities,
+ internal parsed entities, and internal parameter entities, but they
+ are not XML MIME entities.
+
+ If an XML document -- that is, the unprocessed, source XML document
+ -- is readable by casual users, text/xml is preferable to
+ application/xml. MIME user agents (and web user agents) that do not
+ have explicit support for text/xml will treat it as text/plain, for
+ example, by displaying the XML MIME entity as plain text.
+ Application/xml is preferable when the XML MIME entity is unreadable
+ by casual users. Similarly, text/xml-external-parsed-entity is
+
+
+
+
+
+Murata, et al. Standards Track [Page 5]
+
+RFC 3023 XML Media Types January 2001
+
+
+ preferable when an external parsed entity is readable by casual
+ users, but application/xml-external-parsed-entity is preferable when
+ a plain text display is inappropriate.
+
+ NOTE: Users are in general not used to text containing tags such
+ as <price>, and often find such tags quite disorienting or
+ annoying. If one is not sure, the conservative principle would
+ suggest using application/* instead of text/* so as not to put
+ information in front of users that they will quite likely not
+ understand.
+
+ The top-level media type "text" has some restrictions on MIME
+ entities and they are described in [RFC2045] and [RFC2046]. In
+ particular, the UTF-16 family, UCS-4, and UTF-32 are not allowed
+ (except over HTTP[RFC2616], which uses a MIME-like mechanism). Thus,
+ if an XML document or external parsed entity is encoded in such
+ character encoding schemes, it cannot be labeled as text/xml or
+ text/xml-external-parsed-entity (except for HTTP).
+
+ Text/xml and application/xml behave differently when the charset
+ parameter is not explicitly specified. If the default charset (i.e.,
+ US-ASCII) for text/xml is inconvenient for some reason (e.g., bad web
+ servers), application/xml provides an alternative (see "Optional
+ parameters" of application/xml registration in Section 3.2). The
+ same rules apply to the distinction between text/xml-external-
+ parsed-entity and application/xml-external-parsed-entity.
+
+ XML provides a general framework for defining sequences of structured
+ data. In some cases, it may be desirable to define new media types
+ that use XML but define a specific application of XML, perhaps due to
+ domain-specific security considerations or runtime information.
+ Furthermore, such media types may allow UTF-8 or UTF-16 only and
+ prohibit other charsets. This document does not prohibit such media
+ types and in fact expects them to proliferate. However, developers
+ of such media types are STRONGLY RECOMMENDED to use this document as
+ a basis for their registration. In particular, the charset parameter
+ SHOULD be used in the same manner, as described in Section 7.1, in
+ order to enhance interoperability.
+
+ An XML document labeled as text/xml or application/xml might contain
+ namespace declarations, stylesheet-linking processing instructions
+ (PIs), schema information, or other declarations that might be used
+ to suggest how the document is to be processed. For example, a
+ document might have the XHTML namespace and a reference to a CSS
+ stylesheet. Such a document might be handled by applications that
+ would use this information to dispatch the document for appropriate
+ processing.
+
+
+
+
+Murata, et al. Standards Track [Page 6]
+
+RFC 3023 XML Media Types January 2001
+
+
+3.1 Text/xml Registration
+
+ MIME media type name: text
+
+ MIME subtype name: xml
+
+ Mandatory parameters: none
+
+ Optional parameters: charset
+
+ Although listed as an optional parameter, the use of the charset
+ parameter is STRONGLY RECOMMENDED, since this information can be
+ used by XML processors to determine authoritatively the character
+ encoding of the XML MIME entity. The charset parameter can also
+ be used to provide protocol-specific operations, such as charset-
+ based content negotiation in HTTP. "utf-8" [RFC2279] is the
+ recommended value, representing the UTF-8 charset. UTF-8 is
+ supported by all conforming processors of [XML].
+
+ If the XML MIME entity is transmitted via HTTP, which uses a
+ MIME-like mechanism that is exempt from the restrictions on the
+ text top-level type (see section 19.4.1 of [RFC2616]), "utf-16"
+ [RFC2781]) is also recommended. UTF-16 is supported by all
+ conforming processors of [XML]. Since the handling of CR, LF and
+ NUL for text types in most MIME applications would cause undesired
+ transformations of individual octets in UTF-16 multi-octet
+ characters, gateways from HTTP to these MIME applications MUST
+ transform the XML MIME entity from text/xml; charset="utf-16" to
+ application/xml; charset="utf-16".
+
+ Conformant with [RFC2046], if a text/xml entity is received with
+ the charset parameter omitted, MIME processors and XML processors
+ MUST use the default charset value of "us-ascii"[ASCII]. In cases
+ where the XML MIME entity is transmitted via HTTP, the default
+ charset value is still "us-ascii". (Note: There is an
+ inconsistency between this specification and HTTP/1.1, which uses
+ ISO-8859-1[ISO8859] as the default for a historical reason. Since
+ XML is a new format, a new default should be chosen for better
+ I18N. US-ASCII was chosen, since it is the intersection of UTF-8
+ and ISO-8859-1 and since it is already used by MIME.)
+
+ There are several reasons that the charset parameter is
+ authoritative. First, some MIME processing engines do transcoding
+ of MIME bodies of the top-level media type "text" without
+ reference to any of the internal content. Thus, it is possible
+ that some agent might change text/xml; charset="iso-2022-jp" to
+ text/xml; charset="utf-8" without modifying the encoding
+ declaration of an XML document. Second, text/xml must be
+
+
+
+Murata, et al. Standards Track [Page 7]
+
+RFC 3023 XML Media Types January 2001
+
+
+ compatible with text/plain, since MIME agents that do not
+ understand text/xml will fallback to handling it as text/plain.
+ If the charset parameter for text/xml were not authoritative, such
+ fallback would cause data corruption. Third, recent web servers
+ have been improved so that users can specify the charset
+ parameter. Fourth, [RFC2130] specifies that the recommended
+ specification scheme is the "charset" parameter.
+
+ Since the charset parameter is authoritative, the charset is not
+ always declared within an XML encoding declaration. Thus, special
+ care is needed when the recipient strips the MIME header and
+ provides persistent storage of the received XML MIME entity (e.g.,
+ in a file system). Unless the charset is UTF-8 or UTF-16, the
+ recipient SHOULD also persistently store information about the
+ charset, perhaps by embedding a correct XML encoding declaration
+ within the XML MIME entity.
+
+ Encoding considerations: This media type MAY be encoded as
+ appropriate for the charset and the capabilities of the underlying
+ MIME transport. For 7-bit transports, data in UTF-8 MUST be
+ encoded in quoted-printable or base64. For 8-bit clean transport
+ (e.g., 8BITMIME[RFC1652] ESMTP or NNTP[RFC0977]), UTF-8 does not
+ need to be encoded. Over HTTP[RFC2616], no content-transfer-
+ encoding is necessary and UTF-16 may also be used.
+
+ Security considerations: See Section 10.
+
+ Interoperability considerations: XML has proven to be interoperable
+ across WebDAV clients and servers, and for import and export from
+ multiple XML authoring tools. For maximum interoperability,
+ validating processors are recommended. Although non-validating
+ processors may be more efficient, they are not required to handle
+ all features of XML. For further information, see sub-section 2.9
+ "Standalone Document Declaration" and section 5 "Conformance" of
+ [XML].
+
+ Published specification: Extensible Markup Language (XML) 1.0 (Second
+ Edition)[XML].
+
+ Applications which use this media type: XML is device-, platform-,
+ and vendor-neutral and is supported by a wide range of Web user
+ agents, WebDAV[RFC2518] clients and servers, as well as XML
+ authoring tools.
+
+ Additional information:
+
+
+
+
+
+
+Murata, et al. Standards Track [Page 8]
+
+RFC 3023 XML Media Types January 2001
+
+
+ Magic number(s): None.
+
+ Although no byte sequences can be counted on to always be
+ present, XML MIME entities in ASCII-compatible charsets
+ (including UTF-8) often begin with hexadecimal 3C 3F 78 6D 6C
+ ("<?xml"), and those in UTF-16 often begin with hexadecimal FE
+ FF 00 3C 00 3F 00 78 00 6D 00 6C or FF FE 3C 00 3F 00 78 00 6D
+ 00 6C 00 (the Byte Order Mark (BOM) followed by "<?xml"). For
+ more information, see Appendix F of [XML].
+
+ File extension(s): .xml
+
+ Macintosh File Type Code(s): "TEXT"
+
+ Person and email address for further information:
+
+ MURATA Makoto (FAMILY Given) <mmurata@trl.ibm.co.jp>
+
+ Simon St.Laurent <simonstl@simonstl.com>
+
+ Daniel Kohn <dan@dankohn.com>
+
+ Intended usage: COMMON
+
+ Author/Change controller: The XML specification is a work product of
+ the World Wide Web Consortium's XML Working Group, and was edited
+ by:
+
+ Tim Bray <tbray@textuality.com>
+
+ Jean Paoli <jeanpa@microsoft.com>
+
+ C. M. Sperberg-McQueen <cmsmcq@uic.edu>
+
+ Eve Maler <eve.maler@east.sun.com>
+
+ The W3C, and the W3C XML Core Working Group, have change control
+ over the XML specification.
+
+3.2 Application/xml Registration
+
+ MIME media type name: application
+
+ MIME subtype name: xml
+
+ Mandatory parameters: none
+
+
+
+
+
+Murata, et al. Standards Track [Page 9]
+
+RFC 3023 XML Media Types January 2001
+
+
+ Optional parameters: charset
+
+ Although listed as an optional parameter, the use of the charset
+ parameter is STRONGLY RECOMMENDED, since this information can be
+ used by XML processors to determine authoritatively the charset of
+ the XML MIME entity. The charset parameter can also be used to
+ provide protocol-specific operations, such as charset-based
+ content negotiation in HTTP.
+
+ "utf-8" [RFC2279] and "utf-16" [RFC2781] are the recommended
+ values, representing the UTF-8 and UTF-16 charsets, respectively.
+ These charsets are preferred since they are supported by all
+ conforming processors of [XML].
+
+ If an application/xml entity is received where the charset
+ parameter is omitted, no information is being provided about the
+ charset by the MIME Content-Type header. Conforming XML
+ processors MUST follow the requirements in section 4.3.3 of [XML]
+ that directly address this contingency. However, MIME processors
+ that are not XML processors SHOULD NOT assume a default charset if
+ the charset parameter is omitted from an application/xml entity.
+
+ There are several reasons that the charset parameter is
+ authoritative. First, recent web servers have been improved so
+ that users can specify the charset parameter. Second, [RFC2130]
+ specifies that the recommended specification scheme is the
+ "charset" parameter.
+
+ On the other hand, it has been argued that the charset parameter
+ should be omitted and the mechanism described in Appendix F of
+ [XML] (which is non-normative) should be solely relied on. This
+ approach would allow users to avoid configuration of the charset
+ parameter; an XML document stored in a file is likely to contain a
+ correct encoding declaration or BOM (if necessary), since the
+ operating system does not typically provide charset information
+ for files. If users would like to rely on the encoding
+ declaration or BOM and to hide charset information from protocols,
+ they may determine not to use the parameter.
+
+ Since the charset parameter is authoritative, the charset is not
+ always declared within an XML encoding declaration. Thus, special
+ care is needed when the recipient strips the MIME header and
+ provides persistent storage of the received XML MIME entity (e.g.,
+ in a file system). Unless the charset is UTF-8 or UTF-16, the
+ recipient SHOULD also persistently store information about the
+ charset, perhaps by embedding a correct XML encoding declaration
+ within the XML MIME entity.
+
+
+
+
+Murata, et al. Standards Track [Page 10]
+
+RFC 3023 XML Media Types January 2001
+
+
+ Encoding considerations: This media type MAY be encoded as
+ appropriate for the charset and the capabilities of the underlying
+ MIME transport. For 7-bit transports, data in either UTF-8 or
+ UTF-16 MUST be encoded in quoted-printable or base64. For 8-bit
+ clean transport (e.g., 8BITMIME[RFC1652] ESMTP or NNTP[RFC0977]),
+ UTF-8 is not encoded, but the UTF-16 family MUST be encoded in
+ base64. For binary clean transports (e.g., HTTP[RFC2616]), no
+ content-transfer-encoding is necessary.
+
+ Security considerations: See Section 10.
+
+ Interoperability considerations: Same as Section 3.1.
+
+ Published specification: Same as Section 3.1.
+
+ Applications which use this media type: Same as Section 3.1.
+
+ Additional information: Same as Section 3.1.
+
+ Person and email address for further information: Same as Section
+ 3.1.
+
+ Intended usage: COMMON
+
+ Author/Change controller: Same as Section 3.1.
+
+3.3 Text/xml-external-parsed-entity Registration
+
+ MIME media type name: text
+
+ MIME subtype name: xml-external-parsed-entity
+
+ Mandatory parameters: none
+
+ Optional parameters: charset
+
+ The charset parameter of text/xml-external-parsed-entity is
+ handled the same as that of text/xml as described in Section 3.1.
+
+ Encoding considerations: Same as Section 3.1.
+
+ Security considerations: See Section 10.
+
+ Interoperability considerations: XML external parsed entities are as
+ interoperable as XML documents, though they have a less tightly
+ constrained structure and therefore need to be referenced by XML
+ documents for proper handling by XML processors. Similarly, XML
+ documents cannot be reliably used as external parsed entities
+
+
+
+Murata, et al. Standards Track [Page 11]
+
+RFC 3023 XML Media Types January 2001
+
+
+ because external parsed entities are prohibited from having
+ standalone document declarations or DTDs. Identifying XML
+ external parsed entities with their own content type should
+ enhance interoperability of both XML documents and XML external
+ parsed entities.
+
+ Published specification: Same as Section 3.1.
+
+ Applications which use this media type: Same as Section 3.1.
+
+ Additional information:
+
+ Magic number(s): Same as Section 3.1.
+
+ File extension(s): .xml or .ent
+
+ Macintosh File Type Code(s): "TEXT"
+
+ Person and email address for further information: Same as Section
+ 3.1.
+
+ Intended usage: COMMON
+
+ Author/Change controller: Same as Section 3.1.
+
+3.4 Application/xml-external-parsed-entity Registration
+
+ MIME media type name: application
+
+ MIME subtype name: xml-external-parsed-entity
+
+ Mandatory parameters: none
+
+ Optional parameters: charset
+
+ The charset parameter of application/xml-external-parsed-entity is
+ handled the same as that of application/xml as described in
+ Section 3.2.
+
+ Encoding considerations: Same as Section 3.2.
+
+ Security considerations: See Section 10.
+
+ Interoperability considerations: Same as those for text/xml-
+ external-parsed-entity as described in Section 3.3.
+
+ Published specification: Same as text/xml as described in Section
+ 3.1.
+
+
+
+Murata, et al. Standards Track [Page 12]
+
+RFC 3023 XML Media Types January 2001
+
+
+ Applications which use this media type: Same as Section 3.1.
+
+ Additional information:
+
+ Magic number(s): Same as Section 3.1.
+
+ File extension(s): .xml or .ent
+
+ Macintosh File Type Code(s): "TEXT"
+
+ Person and email address for further information: Same as Section
+ 3.1.
+
+ Intended usage: COMMON
+
+ Author/Change controller: Same as Section 3.1.
+
+3.5 Application/xml-dtd Registration
+
+ MIME media type name: application
+
+ MIME subtype name: xml-dtd
+
+ Mandatory parameters: none
+
+ Optional parameters: charset
+
+ The charset parameter of application/xml-dtd is handled the same
+ as that of application/xml as described in Section 3.2.
+
+ Encoding considerations: Same as Section 3.2.
+
+ Security considerations: See Section 10.
+
+ Interoperability considerations: XML DTDs have proven to be
+ interoperable by DTD authoring tools and XML browsers, among
+ others.
+
+ Published specification: Same as text/xml as described in Section
+ 3.1.
+
+ Applications which use this media type: DTD authoring tools handle
+ external DTD subsets as well as external parameter entities. XML
+ browsers may also access external DTD subsets and external
+ parameter entities.
+
+
+
+
+
+
+Murata, et al. Standards Track [Page 13]
+
+RFC 3023 XML Media Types January 2001
+
+
+ Additional information:
+
+ Magic number(s): Same as Section 3.1.
+
+ File extension(s): .dtd or .mod
+
+ Macintosh File Type Code(s): "TEXT"
+
+ Person and email address for further information: Same as Section
+ 3.1.
+
+ Intended usage: COMMON
+
+ Author/Change controller: Same as Section 3.1.
+
+3.6 Summary
+
+ The following list applies to text/xml, text/xml-external-parsed-
+ entity, and XML-based media types under the top-level type "text"
+ that define the charset parameter according to this specification:
+
+ o Charset parameter is strongly recommended.
+
+ o If the charset parameter is not specified, the default is "us-
+ ascii". The default of "iso-8859-1" in HTTP is explicitly
+ overridden.
+
+ o No error handling provisions.
+
+ o An encoding declaration, if present, is irrelevant, but when
+ saving a received resource as a file, the correct encoding
+ declaration SHOULD be inserted.
+
+ The next list applies to application/xml, application/xml-external-
+ parsed-entity, application/xml-dtd, and XML-based media types under
+ top-level types other than "text" that define the charset parameter
+ according to this specification:
+
+ o Charset parameter is strongly recommended, and if present, it
+ takes precedence.
+
+ o If the charset parameter is omitted, conforming XML processors
+ MUST follow the requirements in section 4.3.3 of [XML].
+
+
+
+
+
+
+
+
+Murata, et al. Standards Track [Page 14]
+
+RFC 3023 XML Media Types January 2001
+
+
+4. The Byte Order Mark (BOM) and Conversions to/from the UTF-16 Charset
+
+ Section 4.3.3 of [XML] specifies that XML MIME entities in the
+ charset "utf-16" MUST begin with a byte order mark (BOM), which is a
+ hexadecimal octet sequence 0xFE 0xFF (or 0xFF 0xFE, depending on
+ endian). The XML Recommendation further states that the BOM is an
+ encoding signature, and is not part of either the markup or the
+ character data of the XML document.
+
+ Due to the presence of the BOM, applications that convert XML from
+ "utf-16" to a non-Unicode encoding MUST strip the BOM before
+ conversion. Similarly, when converting from another encoding into
+ "utf-16", the BOM MUST be added after conversion is complete.
+
+ In addition to the charset "utf-16", [RFC2781] introduces "utf-16le"
+ (little endian) and "utf-16be" (big endian) as well. The BOM is
+ prohibited for these charsets. When an XML MIME entity is encoded in
+ "utf-16le" or "utf-16be", it MUST NOT begin with the BOM but SHOULD
+ contain an encoding declaration. Conversion from "utf-16" to "utf-
+ 16be" or "utf-16le" and conversion in the other direction MUST strip
+ or add the BOM, respectively.
+
+5. Fragment Identifiers
+
+ Section 4.1 of [RFC2396] notes that the semantics of a fragment
+ identifier (the part of a URI after a "#") is a property of the data
+ resulting from a retrieval action, and that the format and
+ interpretation of fragment identifiers is dependent on the media type
+ of the retrieval result.
+
+ As of today, no established specifications define identifiers for XML
+ media types. However, a working draft published by W3C, namely "XML
+ Pointer Language (XPointer)", attempts to define fragment identifiers
+ for text/xml and application/xml. The current specification for
+ XPointer is available at http://www.w3.org/TR/xptr.
+
+6. The Base URI
+
+ Section 5.1 of [RFC2396] specifies that the semantics of a relative
+ URI reference embedded in a MIME entity is dependent on the base URI.
+ The base URI is either (1) the base URI embedded in the MIME entity,
+ (2) the base URI of the encapsulating MIME entity, (3) the URI used
+ to retrieve the MIME entity, or (4) the application-dependent default
+ base URI, where (1) has the highest precedence. [RFC2396] further
+ specifies that the mechanism for embedding the base URI is dependent
+ on the media type.
+
+
+
+
+
+Murata, et al. Standards Track [Page 15]
+
+RFC 3023 XML Media Types January 2001
+
+
+ As of today, no established specifications define mechanisms for
+ embedding the base URI in XML MIME entities. However, a Proposed
+ Recommendation published by W3C, namely "XML Base", attempts to
+ define such a mechanism for text/xml, application/xml, text/xml-
+ external-parsed-entity, and application/xml-external-parsed-entity.
+ The current specification for XML Base is available at
+ http://www.w3.org/TR/xmlbase.
+
+7. A Naming Convention for XML-Based Media Types
+
+ This document recommends the use of a naming convention (a suffix of
+ '+xml') for identifying XML-based MIME media types, whatever their
+ particular content may represent. This allows the use of generic XML
+ processors and technologies on a wide variety of different XML
+ document types at a minimum cost, using existing frameworks for media
+ type registration.
+
+ Although the use of a suffix was not considered as part of the
+ original MIME architecture, this choice is considered to provide the
+ most functionality with the least potential for interoperability
+ problems or lack of future extensibility. The alternatives to the '
+ +xml' suffix and the reason for its selection are described in
+ Appendix A.
+
+ As XML development continues, new XML document types are appearing
+ rapidly. Many of these XML document types would benefit from the
+ identification possibilities of a more specific MIME media type than
+ text/xml or application/xml can provide, and it is likely that many
+ new media types for XML-based document types will be registered in
+ the near and ongoing future.
+
+ While the benefits of specific MIME types for particular types of XML
+ documents are significant, all XML documents share common structures
+ and syntax that make possible common processing.
+
+ Some areas where 'generic' processing is useful include:
+
+ o Browsing - An XML browser can display any XML document with a
+ provided [CSS] or [XSLT] style sheet, whatever the vocabulary of
+ that document.
+
+ o Editing - Any XML editor can read, modify, and save any XML
+ document.
+
+ o Fragment identification - XPointers (work in progress) can work
+ with any XML document, whatever vocabulary it uses and whether or
+ not it uses XPointer for its own fragment identification.
+
+
+
+
+Murata, et al. Standards Track [Page 16]
+
+RFC 3023 XML Media Types January 2001
+
+
+ o Hypertext linking - XLink (work in progress) hypertext linking is
+ designed to connect any XML documents, regardless of vocabulary.
+
+ o Searching - XML-oriented search engines, web crawlers, agents, and
+ query tools should be able to read XML documents and extract the
+ names and content of elements and attributes even if the tools are
+ ignorant of the particular vocabulary used for elements and
+ attributes.
+
+ o Storage - XML-oriented storage systems, which keep XML documents
+ internally in a parsed form, should similarly be able to process,
+ store, and recreate any XML document.
+
+ o Well-formedness and validity checking - An XML processor can
+ confirm that any XML document is well-formed and that it is valid
+ (i.e., conforms to its declared DTD or Schema).
+
+ When a new media type is introduced for an XML-based format, the name
+ of the media type SHOULD end with '+xml'. This convention will allow
+ applications that can process XML generically to detect that the MIME
+ entity is supposed to be an XML document, verify this assumption by
+ invoking some XML processor, and then process the XML document
+ accordingly. Applications may match for types that represent XML
+ MIME entities by comparing the subtype to the pattern '*/*+xml'. (Of
+ course, 4 of the 5 media types defined in this document -- text/xml,
+ application/xml, text/xml-external-parsed-entity, and
+ application/xml-external-parsed-entity -- also represent XML MIME
+ entities while not conforming to the '*/*+xml' pattern.)
+
+ NOTE: Section 14.1 of HTTP[RFC2616] does not support Accept
+ headers of the form "Accept: */*+xml" and so this header MUST NOT
+ be used in this way. Instead, content negotiation[RFC2703] could
+ potentially be used if an XML-based MIME type were needed.
+
+ XML generic processing is not always appropriate for XML-based media
+ types. For example, authors of some such media types may wish that
+ the types remain entirely opaque except to applications that are
+ specifically designed to deal with that media type. By NOT following
+ the naming convention '+xml', such media types can avoid XML-generic
+ processing. Since generic processing will be useful in many cases,
+ however -- including in some situations that are difficult to predict
+ ahead of time -- those registering media types SHOULD use the '+xml'
+ convention unless they have a particularly compelling reason not to.
+
+ The registration process for these media types is described in
+ [RFC2048]. The registrar for the IETF tree will encourage new XML-
+ based media type registrations in the IETF tree to follow this
+ guideline. Registrars for other trees SHOULD follow this convention
+
+
+
+Murata, et al. Standards Track [Page 17]
+
+RFC 3023 XML Media Types January 2001
+
+
+ in order to ensure maximum interoperability of their XML-based
+ documents. Similarly, media subtypes that do not represent XML MIME
+ entities MUST NOT be allowed to register with a '+xml' suffix.
+
+7.1 Referencing
+
+ Registrations for new XML-based media types under the top-level type
+ "text" SHOULD, in specifying the charset parameter and encoding
+ considerations, define them as: "Same as [charset parameter /
+ encoding considerations] of text/xml as specified in RFC 3023."
+
+ Registrations for new XML-based media types under top-level types
+ other than "text" SHOULD, in specifying the charset parameter and
+ encoding considerations, define them as: "Same as [charset parameter
+ / encoding considerations] of application/xml as specified in RFC
+ 3023."
+
+ The use of the charset parameter is STRONGLY RECOMMENDED, since this
+ information can be used by XML processors to determine
+ authoritatively the charset of the XML MIME entity.
+
+ These registrations SHOULD specify that the XML-based media type
+ being registered has all of the security considerations described in
+ RFC 3023 plus any additional considerations specific to that media
+ type.
+
+ These registrations SHOULD also make reference to RFC 3023 in
+ specifying magic numbers, fragment identifiers, base URIs, and use of
+ the BOM.
+
+ These registrations MAY reference the text/xml registration in RFC
+ 3023 in specifying interoperability considerations, if these
+ considerations are not overridden by issues specific to that media
+ type.
+
+8. Examples
+
+ The examples below give the value of the MIME Content-type header and
+ the XML declaration (which includes the encoding declaration) inside
+ the XML MIME entity. For UTF-16 examples, the Byte Order Mark
+ character is denoted as "{BOM}", and the XML declaration is assumed
+ to come at the beginning of the XML MIME entity, immediately
+ following the BOM. Note that other MIME headers may be present, and
+ the XML MIME entity may contain other data in addition to the XML
+ declaration; the examples focus on the Content-type header and the
+ encoding declaration for clarity.
+
+
+
+
+
+Murata, et al. Standards Track [Page 18]
+
+RFC 3023 XML Media Types January 2001
+
+
+8.1 Text/xml with UTF-8 Charset
+
+ Content-type: text/xml; charset="utf-8"
+
+ <?xml version="1.0" encoding="utf-8"?>
+
+ This is the recommended charset value for use with text/xml. Since
+ the charset parameter is provided, MIME and XML processors MUST treat
+ the enclosed entity as UTF-8 encoded.
+
+ If sent using a 7-bit transport (e.g., SMTP[RFC0821]), the XML MIME
+ entity MUST use a content-transfer-encoding of either quoted-
+ printable or base64. For an 8-bit clean transport (e.g., 8BITMIME
+ ESMTP or NNTP), or a binary clean transport (e.g., HTTP), no
+ content-transfer-encoding is necessary.
+
+8.2 Text/xml with UTF-16 Charset
+
+ Content-type: text/xml; charset="utf-16"
+
+ {BOM}<?xml version='1.0' encoding='utf-16'?>
+
+ or
+
+ {BOM}<?xml version='1.0'?>
+
+ This is possible only when the XML MIME entity is transmitted via
+ HTTP, which uses a MIME-like mechanism and is a binary-clean
+ protocol, hence does not perform CR and LF transformations and allows
+ NUL octets. As described in [RFC2781], the UTF-16 family MUST NOT be
+ used with media types under the top-level type "text" except over
+ HTTP (see section 19.4.1 of [RFC2616] for details).
+
+ Since HTTP is binary clean, no content-transfer-encoding is
+ necessary.
+
+8.3 Text/xml with UTF-16BE Charset
+
+ Content-type: text/xml; charset="utf-16be"
+
+ <?xml version='1.0' encoding='utf-16be'?>
+
+ Observe that the BOM does not exist. This is again possible only
+ when the XML MIME entity is transmitted via HTTP.
+
+
+
+
+
+
+
+Murata, et al. Standards Track [Page 19]
+
+RFC 3023 XML Media Types January 2001
+
+
+8.4 Text/xml with ISO-2022-KR Charset
+
+ Content-type: text/xml; charset="iso-2022-kr"
+
+ <?xml version="1.0" encoding='iso-2022-kr'?>
+
+ This example shows text/xml with a Korean charset (e.g., Hangul)
+ encoded following the specification in [RFC1557]. Since the charset
+ parameter is provided, MIME and XML processors MUST treat the
+ enclosed entity as encoded per RFC 1557.
+
+ Since ISO-2022-KR has been defined to use only 7 bits of data, no
+ content-transfer-encoding is necessary with any transport.
+
+8.5 Text/xml with Omitted Charset
+
+ Content-type: text/xml
+
+ {BOM}<?xml version="1.0" encoding="utf-16"?>
+
+ or
+
+ {BOM}<?xml version="1.0"?>
+
+ This example shows text/xml with the charset parameter omitted. In
+ this case, MIME and XML processors MUST assume the charset is "us-
+ ascii", the default charset value for text media types specified in
+ [RFC2046]. The default of "us-ascii" holds even if the text/xml
+ entity is transported using HTTP.
+
+ Omitting the charset parameter is NOT RECOMMENDED for text/xml. For
+ example, even if the contents of the XML MIME entity are UTF-16 or
+ UTF-8, or the XML MIME entity has an explicit encoding declaration,
+ XML and MIME processors MUST assume the charset is "us-ascii".
+
+8.6 Application/xml with UTF-16 Charset
+
+ Content-type: application/xml; charset="utf-16"
+
+ {BOM}<?xml version="1.0" encoding="utf-16"?>
+
+ or
+
+ {BOM}<?xml version="1.0"?>
+
+ This is a recommended charset value for use with application/xml.
+ Since the charset parameter is provided, MIME and XML processors MUST
+ treat the enclosed entity as UTF-16 encoded.
+
+
+
+Murata, et al. Standards Track [Page 20]
+
+RFC 3023 XML Media Types January 2001
+
+
+ If sent using a 7-bit transport (e.g., SMTP) or an 8-bit clean
+ transport (e.g., 8BITMIME ESMTP or NNTP), the XML MIME entity MUST be
+ encoded in quoted-printable or base64. For a binary clean transport
+ (e.g., HTTP), no content-transfer-encoding is necessary.
+
+8.7 Application/xml with UTF-16BE Charset
+
+ Content-type: application/xml; charset="utf-16be"
+
+ <?xml version='1.0' encoding='utf-16be'?>
+
+ Observe that the BOM does not exist. Since the charset parameter is
+ provided, MIME and XML processors MUST treat the enclosed entity as
+ UTF-16BE encoded.
+
+8.8 Application/xml with ISO-2022-KR Charset
+
+ Content-type: application/xml; charset="iso-2022-kr"
+
+ <?xml version="1.0" encoding="iso-2022-kr"?>
+
+ This example shows application/xml with a Korean charset (e.g.,
+ Hangul) encoded following the specification in [RFC1557]. Since the
+ charset parameter is provided, MIME and XML processors MUST treat the
+ enclosed entity as encoded per RFC 1557, independent of whether the
+ XML MIME entity has an internal encoding declaration (this example
+ does show such a declaration, which agrees with the charset
+ parameter).
+
+ Since ISO-2022-KR has been defined to use only 7 bits of data, no
+ content-transfer-encoding is necessary with any transport.
+
+8.9 Application/xml with Omitted Charset and UTF-16 XML MIME Entity
+
+ Content-type: application/xml
+
+ {BOM}<?xml version='1.0' encoding="utf-16"?>
+
+ or
+
+ {BOM}<?xml version='1.0'?>
+
+ For this example, the XML MIME entity begins with a BOM. Since the
+ charset has been omitted, a conforming XML processor follows the
+ requirements of [XML], section 4.3.3. Specifically, the XML
+ processor reads the BOM, and thus knows deterministically that the
+ charset is UTF-16.
+
+
+
+
+Murata, et al. Standards Track [Page 21]
+
+RFC 3023 XML Media Types January 2001
+
+
+ An XML-unaware MIME processor SHOULD make no assumptions about the
+ charset of the XML MIME entity.
+
+8.10 Application/xml with Omitted Charset and UTF-8 Entity
+
+ Content-type: application/xml
+
+ <?xml version='1.0'?>
+
+ In this example, the charset parameter has been omitted, and there is
+ no BOM. Since there is no BOM, the XML processor follows the
+ requirements in section 4.3.3 of [XML], and optionally applies the
+ mechanism described in Appendix F (which is non-normative) of [XML]
+ to determine the charset encoding of UTF-8. The XML MIME entity does
+ not contain an encoding declaration, but since the encoding is UTF-8,
+ this is still a conforming XML MIME entity.
+
+ An XML-unaware MIME processor SHOULD make no assumptions about the
+ charset of the XML MIME entity.
+
+8.11 Application/xml with Omitted Charset and Internal Encoding
+ Declaration
+
+ Content-type: application/xml
+
+ <?xml version='1.0' encoding="iso-10646-ucs-4"?>
+
+ In this example, the charset parameter has been omitted, and there is
+ no BOM. However, the XML MIME entity does have an encoding
+ declaration inside the XML MIME entity that specifies the entity's
+ charset. Following the requirements in section 4.3.3 of [XML], and
+ optionally applying the mechanism described in Appendix F (non-
+ normative) of [XML], the XML processor determines the charset of the
+ XML MIME entity (in this example, UCS-4).
+
+ An XML-unaware MIME processor SHOULD make no assumptions about the
+ charset of the XML MIME entity.
+
+8.12 Text/xml-external-parsed-entity with UTF-8 Charset
+
+ Content-type: text/xml-external-parsed-entity; charset="utf-8"
+
+ <?xml encoding="utf-8"?>
+
+ This is the recommended charset value for use with text/xml-
+ external-parsed-entity. Since the charset parameter is provided,
+ MIME and XML processors MUST treat the enclosed entity as UTF-8
+ encoded.
+
+
+
+Murata, et al. Standards Track [Page 22]
+
+RFC 3023 XML Media Types January 2001
+
+
+ If sent using a 7-bit transport (e.g., SMTP), the XML MIME entity
+ MUST use a content-transfer-encoding of either quoted-printable or
+ base64. For an 8-bit clean transport (e.g., 8BITMIME ESMTP or NNTP),
+ or a binary clean transport (e.g., HTTP) no content-transfer-encoding
+ is necessary.
+
+8.13 Application/xml-external-parsed-entity with UTF-16 Charset
+
+ Content-type: application/xml-external-parsed-entity;
+ charset="utf-16"
+
+ {BOM}<?xml encoding="utf-16"?>
+
+ or
+
+ {BOM}<?xml?>
+
+ This is a recommended charset value for use with application/xml-
+ external-parsed-entity. Since the charset parameter is provided,
+ MIME and XML processors MUST treat the enclosed entity as UTF-16
+ encoded.
+
+ If sent using a 7-bit transport (e.g., SMTP) or an 8-bit clean
+ transport (e.g., 8BITMIME ESMTP or NNTP), the XML MIME entity MUST be
+ encoded in quoted-printable or base64. For a binary clean transport
+ (e.g., HTTP), no content-transfer-encoding is necessary.
+
+8.14 Application/xml-external-parsed-entity with UTF-16BE Charset
+
+ Content-type: application/xml-external-parsed-entity;
+ charset="utf-16be"
+
+ <?xml encoding="utf-16be"?>
+
+ Since the charset parameter is provided, MIME and XML processors MUST
+ treat the enclosed entity as UTF-16BE encoded.
+
+8.15 Application/xml-dtd
+
+ Content-type: application/xml-dtd; charset="utf-8"
+
+ <?xml encoding="utf-8"?>
+
+ Charset "utf-8" is a recommended charset value for use with
+ application/xml-dtd. Since the charset parameter is provided, MIME
+ and XML processors MUST treat the enclosed entity as UTF-8 encoded.
+
+
+
+
+
+Murata, et al. Standards Track [Page 23]
+
+RFC 3023 XML Media Types January 2001
+
+
+8.16 Application/mathml+xml
+
+ Content-type: application/mathml+xml
+
+ <?xml version="1.0" ?>
+
+ MathML documents are XML documents whose content describes
+ mathematical information, as defined by [MathML]. As a format based
+ on XML, MathML documents SHOULD use the '+xml' suffix convention in
+ their MIME content-type identifier. However, no content type has yet
+ been registered for MathML and so this media type should not be used
+ until such registration has been completed.
+
+8.17 Application/xslt+xml
+
+ Content-type: application/xslt+xml
+
+ <?xml version="1.0" ?>
+
+ Extensible Stylesheet Language (XSLT) documents are XML documents
+ whose content describes stylesheets for other XML documents, as
+ defined by [XSLT]. As a format based on XML, XSLT documents SHOULD
+ use the '+xml' suffix convention in their MIME content-type
+ identifier. However, no content type has yet been registered for
+ XSLT and so this media type should not be used until such
+ registration has been completed.
+
+8.18 Application/rdf+xml
+
+ Content-type: application/rdf+xml
+
+ <?xml version="1.0" ?>
+
+ RDF documents identified using this MIME type are XML documents whose
+ content describes metadata, as defined by [RDF]. As a format based
+ on XML, RDF documents SHOULD use the '+xml' suffix convention in
+ their MIME content-type identifier. However, no content type has yet
+ been registered for RDF and so this media type should not be used
+ until such registration has been completed.
+
+8.19 Image/svg+xml
+
+ Content-type: image/svg+xml
+
+ <?xml version="1.0" ?>
+
+
+
+
+
+
+Murata, et al. Standards Track [Page 24]
+
+RFC 3023 XML Media Types January 2001
+
+
+ Scalable Vector Graphics (SVG) documents are XML documents whose
+ content describes graphical information, as defined by [SVG]. As a
+ format based on XML, SVG documents SHOULD use the '+xml' suffix
+ convention in their MIME content-type identifier. However, no
+ content type has yet been registered for SVG and so this media type
+ should not be used until such registration has been completed.
+
+8.20 INCONSISTENT EXAMPLE: Text/xml with UTF-8 Charset
+
+ Content-type: text/xml; charset="utf-8"
+
+ <?xml version="1.0" encoding="iso-8859-1"?>
+
+ Since the charset parameter is provided in the Content-Type header,
+ MIME and XML processors MUST treat the enclosed entity as UTF-8
+ encoded. That is, the "iso-8859-1" encoding MUST be ignored.
+
+ Processors generating XML MIME entities MUST NOT label conflicting
+ charset information between the MIME Content-Type and the XML
+ declaration.
+
+9. IANA Considerations
+
+ As described in Section 7, this document updates the [RFC2048]
+ registration process for XML-based MIME types.
+
+10. Security Considerations
+
+ XML, as a subset of SGML, has all of the same security considerations
+ as specified in [RFC1874], and likely more, due to its expected
+ ubiquitous deployment.
+
+ To paraphrase section 3 of RFC 1874, XML MIME entities contain
+ information to be parsed and processed by the recipient's XML system.
+ These entities may contain and such systems may permit explicit
+ system level commands to be executed while processing the data. To
+ the extent that an XML system will execute arbitrary command strings,
+ recipients of XML MIME entities may be a risk. In general, it may be
+ possible to specify commands that perform unauthorized file
+ operations or make changes to the display processor's environment
+ that affect subsequent operations.
+
+ In general, any information stored outside of the direct control of
+ the user -- including CSS style sheets, XSL transformations, entity
+ declarations, and DTDs -- can be a source of insecurity, by either
+ obvious or subtle means. For example, a tiny "whiteout attack"
+ modification made to a "master" style sheet could make words in
+ critical locations disappear in user documents, without directly
+
+
+
+Murata, et al. Standards Track [Page 25]
+
+RFC 3023 XML Media Types January 2001
+
+
+ modifying the user document or the stylesheet it references. Thus,
+ the security of any XML document is vitally dependent on all of the
+ documents recursively referenced by that document.
+
+ The entity lists and DTDs for XHTML 1.0[XHTML], for instance, are
+ likely to be a commonly used set of information. Many developers
+ will use and trust them, few of whom will know much about the level
+ of security on the W3C's servers, or on any similarly trusted
+ repository.
+
+ The simplest attack involves adding declarations that break
+ validation. Adding extraneous declarations to a list of character
+ entities can effectively "break the contract" used by documents. A
+ tiny change that produces a fatal error in a DTD could halt XML
+ processing on a large scale. Extraneous declarations are fairly
+ obvious, but more sophisticated tricks, like changing attributes from
+ being optional to required, can be difficult to track down. Perhaps
+ the most dangerous option available to crackers is redefining default
+ values for attributes: e.g., if developers have relied on defaulted
+ attributes for security, a relatively small change might expose
+ enormous quantities of information.
+
+ Apart from the structural possibilities, another option, "entity
+ spoofing," can be used to insert text into documents, vandalizing and
+ perhaps conveying an unintended message. Because XML 1.0 permits
+ multiple entity declarations, and the first declaration takes
+ precedence, it's possible to insert malicious content where an entity
+ is used, such as by inserting the full text of Winnie the Pooh in
+ every occurrence of &mdash;.
+
+ Use of the digital signatures work currently underway by the xmldsig
+ working group may eventually ameliorate the dangers of referencing
+ external documents not under one's own control.
+
+ Use of XML is expected to be varied, and widespread. XML is under
+ scrutiny by a wide range of communities for use as a common syntax
+ for community-specific metadata. For example, the Dublin
+ Core[RFC2413] group is using XML for document metadata, and a new
+ effort has begun that is considering use of XML for medical
+ information. Other groups view XML as a mechanism for marshalling
+ parameters for remote procedure calls. More uses of XML will
+ undoubtedly arise.
+
+ Security considerations will vary by domain of use. For example, XML
+ medical records will have much more stringent privacy and security
+ considerations than XML library metadata. Similarly, use of XML as a
+ parameter marshalling syntax necessitates a case by case security
+ review.
+
+
+
+Murata, et al. Standards Track [Page 26]
+
+RFC 3023 XML Media Types January 2001
+
+
+ XML may also have some of the same security concerns as plain text.
+ Like plain text, XML can contain escape sequences that, when
+ displayed, have the potential to change the display processor
+ environment in ways that adversely affect subsequent operations.
+ Possible effects include, but are not limited to, locking the
+ keyboard, changing display parameters so subsequent displayed text is
+ unreadable, or even changing display parameters to deliberately
+ obscure or distort subsequent displayed material so that its meaning
+ is lost or altered. Display processors SHOULD either filter such
+ material from displayed text or else make sure to reset all important
+ settings after a given display operation is complete.
+
+ Some terminal devices have keys whose output, when pressed, can be
+ changed by sending the display processor a character sequence. If
+ this is possible the display of a text object containing such
+ character sequences could reprogram keys to perform some illicit or
+ dangerous action when the key is subsequently pressed by the user.
+ In some cases not only can keys be programmed, they can be triggered
+ remotely, making it possible for a text display operation to directly
+ perform some unwanted action. As such, the ability to program keys
+ SHOULD be blocked either by filtering or by disabling the ability to
+ program keys entirely.
+
+ Note that it is also possible to construct XML documents that make
+ use of what XML terms "entity references" (using the XML meaning of
+ the term "entity" as described in Section 2), to construct repeated
+ expansions of text. Recursive expansions are prohibited by [XML] and
+ XML processors are required to detect them. However, even non-
+ recursive expansions may cause problems with the finite computing
+ resources of computers, if they are performed many times.
+
+References
+
+ [ASCII] "US-ASCII. Coded Character Set -- 7-Bit American Standard
+ Code for Information Interchange", ANSI X3.4-1986, 1986.
+
+ [CSS] Bos, B., Lie, H.W., Lilley, C. and I. Jacobs, "Cascading
+ Style Sheets, level 2 (CSS2) Specification", World Wide
+ Web Consortium Recommendation REC-CSS2, May 1998,
+ <http://www.w3.org/TR/REC-CSS2/>.
+
+ [ISO8859] "ISO-8859. International Standard -- Information
+ Processing -- 8-bit Single-Byte Coded Graphic Character
+ Sets -- Part 1: Latin alphabet No. 1, ISO-8859-1:1987",
+ 1987.
+
+
+
+
+
+
+Murata, et al. Standards Track [Page 27]
+
+RFC 3023 XML Media Types January 2001
+
+
+ [MathML] Ion, P. and R. Miner, "Mathematical Markup Language
+ (MathML) 1.01", World Wide Web Consortium Recommendation
+ REC-MathML, July 1999, <http://www.w3.org/TR/REC-MathML/>.
+
+ [PNG] Boutell, T., "PNG (Portable Network Graphics)
+ Specification", World Wide Web Consortium Recommendation
+ REC-png, October 1996, <http://www.w3.org/TR/REC-png>.
+
+ [RDF] Lassila, O. and R.R. Swick, "Resource Description
+ Framework (RDF) Model and Syntax Specification", World
+ Wide Web Consortium Recommendation REC-rdf-syntax,
+ February 1999, <http://www.w3.org/TR/REC-rdf-syntax/>.
+
+ [RFC0821] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC
+ 821, August 1982.
+
+ [RFC0977] Kantor, B. and P. Lapsley, "Network News Transfer
+ Protocol", RFC 977, February 1986.
+
+ [RFC1557] Choi, U., Chon, K. and H. Park, "Korean Character Encoding
+ for Internet Messages", RFC 1557, December 1993.
+
+ [RFC1652] Klensin, J., Freed, N., Rose, M., Stefferud, E. and D.
+ Crocker, "SMTP Service Extension for 8bit-MIMEtransport",
+ RFC 1652, July 1994.
+
+ [RFC1874] Levinson, E., "SGML Media Types", RFC 1874, December 1995.
+
+ [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+ Extensions (MIME) Part One: Format of Internet Message
+ Bodies", RFC 2045, November 1996.
+
+ [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+ Extensions (MIME) Part Two: Media Types", RFC 2046,
+ November 1996.
+
+ [RFC2048] Freed, N., Klensin, J. and J. Postel, "Multipurpose
+ Internet Mail Extensions (MIME) Part Four: Registration
+ Procedures", RFC 2048, November 1996.
+
+ [RFC2060] Crispin, M., "Internet Message Access Protocol - Version
+ 4rev1", RFC 2060, December 1996.
+
+ [RFC2077] Nelson, S., Parks, C. and Mitra, "The Model Primary
+ Content Type for Multipurpose Internet Mail Extensions",
+ RFC 2077, January 1997.
+
+
+
+
+
+Murata, et al. Standards Track [Page 28]
+
+RFC 3023 XML Media Types January 2001
+
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+ [RFC2130] Weider, C., Preston, C., Simonsen, K., Alvestrand, H.,
+ Atkinson, R., Crispin, M. and P. Svanberg, "The Report of
+ the IAB Character Set Workshop held 29 February - 1 March,
+ 1996", RFC 2130, April 1997.
+
+ [RFC2279] Yergeau, F., "UTF-8, a transformation format of ISO
+ 10646", RFC 2279, January 1998.
+
+ [RFC2376] Whitehead, E. and M. Murata, "XML Media Types", RFC 2376,
+ July 1998.
+
+ [RFC2396] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform
+ Resource Identifiers (URI): Generic Syntax.", RFC 2396,
+ August 1998.
+
+ [RFC2413] Weibel, S., Kunze, J., Lagoze, C. and M. Wolf, "Dublin
+ Core Metadata for Resource Discovery", RFC 2413, September
+ 1998.
+
+ [RFC2445] Dawson, F. and D. Stenerson, "Internet Calendaring and
+ Scheduling Core Object Specification (iCalendar)", RFC
+ 2445, November 1998.
+
+ [RFC2518] Goland, Y., Whitehead, E., Faizi, A., Carter, S. and D.
+ Jensen, "HTTP Extensions for Distributed Authoring --
+ WEBDAV", RFC 2518, February 1999.
+
+ [RFC2616] Fielding, R., Gettys, J., Mogul, J., Nielsen, H.,
+ Masinter, L., Leach, P. and T. Berners-Lee, "Hypertext
+ Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
+
+ [RFC2629] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629,
+ June 1999.
+
+ [RFC2703] Klyne, G., "Protocol-independent Content Negotiation
+ Framework", RFC 2703, September 1999.
+
+ [RFC2781] Hoffman, P. and F. Yergeau, "UTF-16, an encoding of ISO
+ 10646", RFC 2781, Februrary 2000.
+
+ [RFC2801] Burdett, D., "Internet Open Trading Protocol - IOTP
+ Version 1.0", RFC 2801, April 2000.
+
+
+
+
+
+
+Murata, et al. Standards Track [Page 29]
+
+RFC 3023 XML Media Types January 2001
+
+
+ [SGML] International Standard Organization, "Information
+ Processing -- Text and Office Systems -- Standard
+ Generalized Markup Language (SGML)", ISO 8879, October
+ 1986.
+
+ [SVG] Ferraiolo, J., "Scalable Vector Graphics (SVG)", World
+ Wide Web Consortium Candidate Recommendation SVG, November
+ 2000, <http://www.w3.org/TR/SVG>.
+
+ [XHTML] Pemberton, S. and et al, "XHTML 1.0: The Extensible
+ HyperText Markup Language", World Wide Web Consortium
+ Recommendation xhtml1, January 2000,
+ <http://www.w3.org/TR/xhtml1>.
+
+ [XML] Bray, T., Paoli, J., Sperberg-McQueen, C.M. and E. Maler,
+ "Extensible Markup Language (XML) 1.0 (Second Edition)",
+ World Wide Web Consortium Recommendation REC-xml, October
+ 2000, <http://www.w3.org/TR/REC-xml>.
+
+ [XSLT] Clark, J., "XSL Transformations (XSLT) Version 1.0", World
+ Wide Web Consortium Recommendation xslt, November 1999,
+ <http://www.w3.org/TR/xslt>.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Murata, et al. Standards Track [Page 30]
+
+RFC 3023 XML Media Types January 2001
+
+
+Authors' Addresses
+
+ MURATA Makoto (FAMILY Given)
+ IBM Tokyo Research Laboratory
+ 1623-14, Shimotsuruma
+ Yamato-shi, Kanagawa-ken 242-8502
+ Japan
+
+ Phone: +81-46-215-4678
+ EMail: mmurata@trl.ibm.co.jp
+
+
+ Simon St.Laurent
+ simonstl.com
+ 1259 Dryden Road
+ Ithaca, New York 14850
+ USA
+
+ EMail: simonstl@simonstl.com
+ URI: http://www.simonstl.com/
+
+
+ Dan Kohn
+ Skymoon Ventures
+ 3045 Park Boulevard
+ Palo Alto, California 94306
+ USA
+
+ Phone: +1-650-327-2600
+ EMail: dan@dankohn.com
+ URI: http://www.dankohn.com/
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Murata, et al. Standards Track [Page 31]
+
+RFC 3023 XML Media Types January 2001
+
+
+Appendix A. Why Use the '+xml' Suffix for XML-Based MIME Types?
+
+ Although the use of a suffix was not considered as part of the
+ original MIME architecture, this choice is considered to provide the
+ most functionality with the least potential for interoperability
+ problems or lack of future extensibility. The alternatives to the
+ '+xml' suffix and the reason for its selection are described below.
+
+A.1 Why not just use text/xml or application/xml and let the XML
+ processor dispatch to the correct application based on the
+ referenced DTD?
+
+ text/xml and application/xml remain useful in many situations,
+ especially for document-oriented applications that involve combining
+ XML with a stylesheet in order to present the data. However, XML is
+ also used to define entirely new data types, and an XML-based format
+ such as image/svg+xml fits the definition of a MIME media type
+ exactly as well as image/png[PNG] does. (Note that image/svg+xml is
+ not yet registered.) Although extra functionality is available for
+ MIME processors that are also XML processors, XML-based media types
+ -- even when treated as opaque, non-XML media types -- are just as
+ useful as any other media type and should be treated as such.
+
+ Since MIME dispatchers work off of the MIME type, use of text/xml or
+ application/xml to label discrete media types will hinder correct
+ dispatching and general interoperability. Finally, many XML
+ documents use neither DTDs nor namespaces, yet are perfectly legal
+ XML.
+
+A.2 Why not create a new subtree (e.g., image/xml.svg) to represent XML
+ MIME types?
+
+ The subtree under which a media type is registered -- IETF, vendor
+ (*/vnd.*), or personal (*/prs.*); see [RFC2048] for details -- is
+ completely orthogonal from whether the media type uses XML syntax or
+ not. The suffix approach allows XML document types to be identified
+ within any subtree. The vendor subtree, for example, is likely to
+ include a large number of XML-based document types. By using a
+ suffix, rather than setting up a separate subtree, those types may
+ remain in the same location in the tree of MIME types that they would
+ have occupied had they not been based on XML.
+
+A.3 Why not create a new top-level MIME type for XML-based media types?
+
+ The top-level MIME type (e.g., model/*[RFC2077]) determines what kind
+ of content the type is, not what syntax it uses. For example, agents
+ using image/* to signal acceptance of any image format should
+ certainly be given access to media type image/svg+xml, which is in
+
+
+
+Murata, et al. Standards Track [Page 32]
+
+RFC 3023 XML Media Types January 2001
+
+
+ all respects a standard image subtype. It just happens to use XML to
+ describe its syntax. The two aspects of the media type are
+ completely orthogonal.
+
+ XML-based data types will most likely be registered in ALL top-level
+ categories. Potential, though currently unregistered, examples could
+ include application/mathml+xml[MathML] and image/svg+xml[SVG].
+
+A.4 Why not just have the MIME processor 'sniff' the content to
+ determine whether it is XML?
+
+ Rather than explicitly labeling XML-based media types, the processor
+ could look inside each type and see whether or not it is XML. The
+ processor could also cache a list of XML-based media types.
+
+ Although this method might work acceptably for some mail
+ applications, it would fail completely in many other uses of MIME.
+ For instance, an XML-based web crawler would have no way of
+ determining whether a file is XML except to fetch it and check. The
+ same issue applies in some IMAP4[RFC2060] mail applications, where
+ the client first fetches the MIME type as part of the message
+ structure and then decides whether to fetch the MIME entity.
+ Requiring these fetches just to determine whether the MIME type is
+ XML could have significant bandwidth and latency disadvantages in
+ many situations.
+
+ Sniffing XML also isn't as simple as it might seem. DOCTYPE
+ declarations aren't required, and they can appear fairly deep into a
+ document under certain unpreventable circumstances. (E.g., the XML
+ declaration, comments, and processing instructions can occupy space
+ before the DOCTYPE declaration.) Even sniffing the DOCTYPE isn't
+ completely reliable, thanks to a variety of issues involving default
+ values for namespaces within external DTDs and overrides inside the
+ internal DTD. Finally, the variety in potential character encodings
+ (something XML provides tools to deal with), also makes reliable
+ sniffing less likely.
+
+A.5 Why not use a MIME parameter to specify that a media type uses XML
+ syntax?
+
+ For example, one could use "Content-Type: application/iotp;
+ alternate-type=text/xml" or "Content-Type: application/iotp;
+ syntax=xml".
+
+ Section 5 of [RFC2045] says that "Parameters are modifiers of the
+ media subtype, and as such do not fundamentally affect the nature of
+ the content". However, all XML-based media types are by their nature
+
+
+
+
+Murata, et al. Standards Track [Page 33]
+
+RFC 3023 XML Media Types January 2001
+
+
+ always XML. Parameters, as they have been defined in the MIME
+ architecture, are never invariant across all instantiations of a
+ media type.
+
+ More practically, very few if any MIME dispatchers and other MIME
+ agents support dispatching off of a parameter. While MIME agents on
+ the receiving side will need to be updated in either case to support
+ (or fall back to) generic XML processing, it has been suggested that
+ it is easier to implement this functionality when acting off of the
+ media type rather than a parameter. More important, sending agents
+ require no update to properly tag an image as "image/svg+xml", but
+ few if any sending agents currently support always tagging certain
+ content types with a parameter.
+
+A.6 How about labeling with parameters in the other direction (e.g.,
+ application/xml; Content-Feature=iotp)?
+
+ This proposal fails under the simplest case, of a user with neither
+ knowledge of XML nor an XML-capable MIME dispatcher. In that case,
+ the user's MIME dispatcher is likely to dispatch the content to an
+ XML processing application when the correct default behavior should
+ be to dispatch the content to the application responsible for the
+ content type (e.g., an ecommerce engine for
+ application/iotp+xml[RFC2801], once this media type is registered).
+
+ Note that even if the user had already installed the appropriate
+ application (e.g., the ecommerce engine), and that installation had
+ updated the MIME registry, many operating system level MIME
+ registries such as .mailcap in Unix and HKEY_CLASSES_ROOT in Windows
+ do not currently support dispatching off a parameter, and cannot
+ easily be upgraded to do so. And, even if the operating system were
+ upgraded to support this, each MIME dispatcher would also separately
+ need to be upgraded.
+
+A.7 How about a new superclass MIME parameter that is defined to apply
+ to all MIME types (e.g., Content-Type: application/iotp;
+ $superclass=xml)?
+
+ This combines the problems of Appendix A.5 and Appendix A.6.
+
+ If the sender attaches an image/svg+xml file to a message and
+ includes the instructions "Please copy the French text on the road
+ sign", someone with an XML-aware MIME client and an XML browser but
+ no support for SVG can still probably open the file and copy the
+ text. By contrast, with superclasses, the sender must add superclass
+ support to her existing mailer AND the receiver must add superclass
+ support to his before this transaction can work correctly.
+
+
+
+
+Murata, et al. Standards Track [Page 34]
+
+RFC 3023 XML Media Types January 2001
+
+
+ If the receiver comes to rely on the superclass tag being present and
+ applications are deployed relying on that tag (as always seems to
+ happen), then only upgraded senders will be able to interoperate with
+ those receiving applications.
+
+A.8 What about adding a new parameter to the Content-Disposition header
+ or creating a new Content-Structure header to indicate XML syntax?
+
+ This has nearly identical problems to Appendix A.7, in that it
+ requires both senders and receivers to be upgraded, and few if any
+ operating systems and MIME dispatchers support working off of
+ anything other than the MIME type.
+
+A.9 How about a new Alternative-Content-Type header?
+
+ This is better than Appendix A.8, in that no extra functionality
+ needs to be added to a MIME registry to support dispatching of
+ information other than standard content types. However, it still
+ requires both sender and receiver to be upgraded, and it will also
+ fail in many cases (e.g., web hosting to an outsourced server), where
+ the user can set MIME types (often through implicit mapping to file
+ extensions), but has no way of adding arbitrary HTTP headers.
+
+A.10 How about using a conneg tag instead (e.g., accept-features:
+ (syntax=xml))?
+
+ When the conneg protocol is fully defined, this may potentially be a
+ reasonable thing to do. But given the limited current state of
+ conneg[RFC2703] development, it is not a credible replacement for a
+ MIME-based solution.
+
+A.11 How about a third-level content-type, such as text/xml/rdf?
+
+ MIME explicitly defines two levels of content type, the top-level for
+ the kind of content and the second-level for the specific media type.
+ [RFC2048] extends this in an interoperable way by using prefixes to
+ specify separate trees for IETF, vendor, and personal registrations.
+ This specification also extends the two-level type by using the '
+ +xml' suffix. In both cases, processors that are unaware of these
+ later specifications treat them as opaque and continue to
+ interoperate. By contrast, adding a third-level type would break the
+ current MIME architecture and cause numerous interoperability
+ failures.
+
+
+
+
+
+
+
+
+Murata, et al. Standards Track [Page 35]
+
+RFC 3023 XML Media Types January 2001
+
+
+A.12 Why use the plus ('+') character for the suffix '+xml'?
+
+ As specified in Section 5.1 of [RFC2045], a tspecial can't be used:
+
+ tspecials :=
+ "(" / ")" / "<" / ">" / "@" /
+ "," / ";" / ":" / "\" / <">
+ "/" / "[" / "]" / "?" / "="
+
+ It was thought that "." would not be a good choice since it is
+ already used as an additional hierarchy delimiter. Also, "*" has a
+ common wildcard meaning, and "-" and "_" are common word separators
+ and easily confused. The characters %'`#& are frequently used for
+ quoting or comments and so are not ideal.
+
+ That leaves: ~!$^+{}|
+
+ Note that "-" is used heavily in the current registry. "$" and "_"
+ are used once each. The others are currently unused.
+
+ It was thought that '+' expressed the semantics that a MIME type can
+ be treated (for example) as both scalable vector graphics AND ALSO as
+ XML; it is both simultaneously.
+
+A.13 What is the semantic difference between application/foo and
+ application/foo+xml?
+
+ MIME processors that are unaware of XML will treat the '+xml' suffix
+ as completely opaque, so it is essential that no extra semantics be
+ assigned to its presence. Therefore, application/foo and
+ application/foo+xml SHOULD be treated as completely independent media
+ types. Although, for example, text/calendar+xml could be an XML
+ version of text/calendar[RFC2445], it is possible that this
+ (hypothetical) new media type would include new semantics as well as
+ new syntax, and in any case, there would be many applications that
+ support text/calendar but had not yet been upgraded to support
+ text/calendar+xml.
+
+A.14 What happens when an even better markup language (e.g., EBML) is
+ defined, or a new category of data?
+
+ In the ten years that MIME has existed, XML is the first generic data
+ format that has seemed to justify special treatment, so it is hoped
+ that no further suffixes will be necessary. However, if some are
+ later defined, and these documents were also XML, they would need to
+ specify that the '+xml' suffix is always the outermost suffix (e.g.,
+
+
+
+
+
+Murata, et al. Standards Track [Page 36]
+
+RFC 3023 XML Media Types January 2001
+
+
+ application/foo+ebml+xml not application/foo+xml+ebml). If they were
+ not XML, then they would use a regular suffix (e.g.,
+ application/foo+ebml).
+
+A.15 Why must I use the '+xml' suffix for my new XML-based media type?
+
+ You don't have to, but unless you have a good reason to explicitly
+ disallow generic XML processing, you should use the suffix so as not
+ to curtail the options of future users and developers.
+
+ Whether the inventors of a media type, today, design it for dispatch
+ to generic XML processing machinery (and most won't) is not the
+ critical issue. The core notion is that the knowledge that some
+ media type happens to use XML syntax opens the door to unanticipated
+ kinds of processing beyond those envisioned by its inventors, and on
+ this basis identifying such encoding is a good and useful thing.
+
+ Developers of new media types are often tightly focused on a
+ particular type of processing that meets current needs. But there is
+ no need to rule out generic processing as well, which could make your
+ media type more valuable over time. It is believed that registering
+ with the '+xml' suffix will cause no interoperability problems
+ whatsoever, while it may enable significant new functionality and
+ interoperability now and in the future. So, the conservative
+ approach is to include the '+xml' suffix.
+
+Appendix B. Changes from RFC 2376
+
+ There are numerous and significant differences between this
+ specification and [RFC2376], which it obsoletes. This appendix
+ summarizes the major differences only.
+
+ First, text/xml-external-parsed-entity and application/xml-external-
+ parsed-entity are added as media types for external parsed entities,
+ and text/xml and application/xml are now prohibited.
+
+ Second, application/xml-dtd is added as a media type for external DTD
+ subsets and external parameter entities, and text/xml and
+ application/xml are now prohibited.
+
+ Third, "utf-16le" and "utf-16be" are added. RFC 2781 has introduced
+ these BOM-less variations of the UTF-16 family.
+
+ Fourth, a naming convention ('+xml') for XML-based media types has
+ been added, which also updates [RFC2048] as described in Section 7.
+ By following this convention, an XML-based media type can be easily
+ recognized as such.
+
+
+
+
+Murata, et al. Standards Track [Page 37]
+
+RFC 3023 XML Media Types January 2001
+
+
+Appendix C. Acknowledgements
+
+ This document reflects the input of numerous participants to the
+ ietf-xml-mime@imc.org mailing list, though any errors are the
+ responsibility of the authors. Special thanks to:
+
+ Mark Baker, James Clark, Dan Connolly, Martin Duerst, Ned Freed,
+ Yaron Goland, Rick Jelliffe, Larry Masinter, David Megginson, Keith
+ Moore, Chris Newman, Gavin Nicol, Marshall Rose, Jim Whitehead and
+ participants of the XML activity at the W3C.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Murata, et al. Standards Track [Page 38]
+
+RFC 3023 XML Media Types January 2001
+
+
+Full Copyright Statement
+
+ Copyright (C) The Internet Society (2001). All Rights Reserved.
+
+ This document and translations of it may be copied and furnished to
+ others, and derivative works that comment on or otherwise explain it
+ or assist in its implementation may be prepared, copied, published
+ and distributed, in whole or in part, without restriction of any
+ kind, provided that the above copyright notice and this paragraph are
+ included on all such copies and derivative works. However, this
+ document itself may not be modified in any way, such as by removing
+ the copyright notice or references to the Internet Society or other
+ Internet organizations, except as needed for the purpose of
+ developing Internet standards in which case the procedures for
+ copyrights defined in the Internet Standards process must be
+ followed, or as required to translate it into languages other than
+ English.
+
+ The limited permissions granted above are perpetual and will not be
+ revoked by the Internet Society or its successors or assigns.
+
+ This document and the information contained herein is provided on an
+ "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
+ TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
+ BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
+ HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
+ MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Acknowledgement
+
+ Funding for the RFC Editor function is currently provided by the
+ Internet Society.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Murata, et al. Standards Track [Page 39]
+