Internet Engineering Task Force (IETF)                       A. Melnikov
Request for Comments: 6657                                 Isode Limited
Updates: 2046                                                 J. Reschke
Category: Standards Track                                     greenbytes
ISSN: 2070-1721                                                July 2012
Update to MIME regarding "charset" Parameter Handling
in Textual Media Types
Abstract
This document changes RFC 2046 rules regarding default "charset"
parameter values for "text/*" media types to better align with common
usage by existing clients and servers.
Status of This Memo
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 5741.
Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc6657.
Copyright Notice
Copyright (c) 2012 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Melnikov & Reschke Standards Track [Page 1]
RFC 6657 MIME Charset Default Update July 2012
Table of Contents
1. Introduction and Overview
2. Conventions Used in This Document
3. New Rules for Default "charset" Parameter Values for "text/*" Media Types
4. Default "charset" Parameter Value for "text/plain" Media Type
5. Security Considerations
6. IANA Considerations
7. References
7.1. Normative References
7.2. Informative References
Appendix A. Acknowledgements
1. Introduction and Overview
RFC 2046 specified that the default "charset" parameter (i.e., the
value used when the parameter is not specified) is "US-ASCII"
(Section 4.1.2 of [RFC2046]). RFC 2616 changed the default for use
by HTTP (Hypertext Transfer Protocol) to be "ISO-8859-1" (Section
3.7.1 of [RFC2616]). This encoding is not very common for new
"text/*" media types, and a special rule in the HTTP specification
adds confusion about which specification ([RFC2046] or [RFC2616]) is
authoritative with regard to the default charset for "text/*" media
types.
Many complex text subtypes such as "text/html" [RFC2854] and
"text/xml" [RFC3023] have internal (to their format) means of
describing the charset. Many existing User Agents ignore the
"US-ASCII" default rule for at least "text/html" and "text/xml".
This document changes RFC 2046 rules regarding default "charset"
parameter values for "text/*" media types to better align with common
usage by existing clients and servers. It does not change the
defaults for any currently registered media type.
2. Conventions Used in This Document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
3. New Rules for Default "charset" Parameter Values for "text/*" Media
Types
Section 4.1.2 of [RFC2046] says:
The default character set, which must be assumed in the absence of
a charset parameter, is US-ASCII.
As explained in the Introduction section, this rule is considered
outdated, so this document replaces it with the following set of
rules:
Each subtype of the "text" media type that uses the "charset"
parameter can define its own default value for the "charset"
parameter, including the absence of any default.
In order to improve interoperability with deployed agents, "text/*"
media type registrations SHOULD either
a. specify that the "charset" parameter is not used for the defined
subtype, because the charset information is transported inside
the payload (such as in "text/xml"), or
b. require explicit unconditional inclusion of the "charset"
parameter, eliminating the need for a default value.
In accordance with option (a) above, registrations for "text/*" media
types that can transport charset information inside the corresponding
payloads (such as "text/html" and "text/xml") SHOULD NOT specify the
use of a "charset" parameter, nor any default value, in order to
avoid conflicting interpretations should the "charset" parameter
value and the value specified in the payload disagree.
Thus, new subtypes of the "text" media type SHOULD NOT define a
default "charset" value. If there is a strong reason to do so
despite this advice, they SHOULD use the "UTF-8" [RFC3629] charset as
the default.
Regardless of what approach is chosen, all new "text/*" registrations
MUST clearly specify how the charset is determined; relying on the
default defined in Section 4.1.2 of [RFC2046] is no longer permitted.
However, existing "text/*" registrations that fail to specify how the
charset is determined still default to US-ASCII.
Specifications covering the "charset" parameter, and what default
value, if any, is used, are subtype-specific, NOT protocol-specific.
Protocols that use MIME, therefore, MUST NOT override default charset
values for "text/*" media types to be different for their specific
protocol. The protocol definitions MUST leave that to the subtype
definitions.
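The rules in this section can be illustrated with a minimal Python
sketch. It is not part of the specification. The names CharsetPolicy,
SubtypeRule, RULES, LEGACY_RULE, and resolve_charset, as well as the
"text/example-*" subtypes, are hypothetical and exist only for this
illustration. The sketch assumes that a subtype absent from the table
is an existing registration that does not say how its charset is
determined, so it falls back to US-ASCII.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CharsetPolicy(Enum):
    IN_BAND = "in-band"      # option (a): charset is carried inside the payload
    REQUIRED = "required"    # option (b): the "charset" parameter is mandatory
    DEFAULTED = "defaulted"  # subtype defines its own default value


@dataclass(frozen=True)
class SubtypeRule:
    policy: CharsetPolicy
    default: Optional[str] = None


# Hypothetical rule table. Only the "text/plain" entry reflects this document
# (Section 4); the "text/example-*" names are invented for illustration.
RULES = {
    "text/plain": SubtypeRule(CharsetPolicy.DEFAULTED, "US-ASCII"),
    "text/example-inband": SubtypeRule(CharsetPolicy.IN_BAND),
    "text/example-strict": SubtypeRule(CharsetPolicy.REQUIRED),
}

# Assumption: a subtype missing from RULES is an existing registration that
# does not say how the charset is determined, so it falls back to US-ASCII.
LEGACY_RULE = SubtypeRule(CharsetPolicy.DEFAULTED, "US-ASCII")


def resolve_charset(media_type: str,
                    charset_param: Optional[str] = None) -> Optional[str]:
    """Return the charset label to use, or None when the payload must be inspected."""
    rule = RULES.get(media_type.strip().lower(), LEGACY_RULE)
    if rule.policy is CharsetPolicy.IN_BAND:
        # The registration does not use the parameter; the caller reads the payload.
        return None
    if rule.policy is CharsetPolicy.REQUIRED:
        if charset_param is None:
            raise ValueError(f"{media_type} requires an explicit charset parameter")
        return charset_param
    # DEFAULTED: an explicit parameter wins, otherwise use the subtype's default.
    return charset_param if charset_param is not None else rule.default
```

The lookup is keyed on the media type alone and takes no protocol
argument, which mirrors the requirement that defaults be
subtype-specific rather than protocol-specific.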
4. Default "charset" Parameter Value for "text/plain" Media Type
The default "charset" parameter value for "text/plain" is unchanged
from [RFC2046] and remains as "US-ASCII".
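As a hedged illustration of this default, the following Python sketch
uses the standard library's email.message.Message to read the
"charset" parameter from a Content-Type header value and falls back to
US-ASCII when the parameter is absent. The function name
text_plain_charset is hypothetical. Note that get_content_charset()
returns the label in lower case.

```python
from email.message import Message


def text_plain_charset(content_type_header: str) -> str:
    """Return the charset for a text/plain Content-Type value (lower-cased)."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    if msg.get_content_type() != "text/plain":
        raise ValueError("this sketch only handles text/plain")
    # failobj supplies the value used when no "charset" parameter is present.
    return msg.get_content_charset(failobj="us-ascii")
```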
5. Security Considerations
Guessing of the "charset" parameter can lead to security issues such
as content buffer overflows, denial of service, or bypass of
filtering mechanisms. However, this document does not promote
guessing, but encourages use of charset information that is specified
by the sender.
Conflicting information in-band vs. out-of-band can also lead to
similar security problems, and this document recommends the use of
charset information that is more likely to be correct (for example,
in-band over out-of-band).
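One way to apply the "in-band over out-of-band" preference is sketched
below in Python for an XML-like payload. This is a simplified
illustration, not a conforming XML encoding detector: it assumes an
ASCII-compatible encoding, ignores byte order marks, and only looks
for an encoding declaration at the very start of the payload. The
function names in_band_charset and choose_charset are hypothetical.

```python
import re
from typing import Optional

# Matches an encoding pseudo-attribute in a leading XML declaration.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def in_band_charset(payload: bytes) -> Optional[str]:
    """Return the charset declared inside the payload, if any."""
    match = _XML_DECL.match(payload)
    return match.group(1).decode("ascii") if match else None


def choose_charset(payload: bytes, out_of_band: Optional[str]) -> Optional[str]:
    """Prefer the in-band declaration; fall back to the out-of-band value."""
    declared = in_band_charset(payload)
    return declared if declared is not None else out_of_band
```

When neither source supplies a value, the sketch returns None instead
of guessing, in keeping with the concern about guessing raised above.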
6. IANA Considerations
IANA has updated the "text" subregistry of the Media Types registry
(<http://www.iana.org/assignments/media-types/text/>) to add the
following preamble: "See [RFC6657] for information about 'charset'
parameter handling for text media types."
Also, IANA has added this RFC to the list of references at the
beginning of the Application for Media Type
(<http://www.iana.org/form/media-types>).
7. References
7.1. Normative References
[RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
Extensions (MIME) Part Two: Media Types", RFC 2046,
November 1996.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
10646", STD 63, RFC 3629, November 2003.
7.2. Informative References
[RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
[RFC2854] Connolly, D. and L. Masinter, "The 'text/html' Media
Type", RFC 2854, June 2000.
[RFC3023] Murata, M., St. Laurent, S., and D. Kohn, "XML Media
Types", RFC 3023, January 2001.
Appendix A. Acknowledgements
Many thanks to Ned Freed and John Klensin for comments and ideas that
motivated creation of this document, and to Carsten Bormann, Murray
S. Kucherawy, Barry Leiba, and Henri Sivonen for feedback and text
suggestions.
Authors' Addresses
Alexey Melnikov
Isode Limited
5 Castle Business Village
36 Station Road
Hampton, Middlesex TW12 2BX
UK
EMail: Alexey.Melnikov@isode.com
Julian F. Reschke
greenbytes GmbH
Hafenweg 16
Muenster, NW 48155
Germany
EMail: julian.reschke@greenbytes.de
URI: http://greenbytes.de/tech/webdav/