Network Working Group E. Levinson
Request for Comments: 1874 Accurate Information Systems, Inc.
Category: Experimental December 1995
SGML Media Types
Status of this Memo
This memo defines an Experimental Protocol for the Internet
community. This memo does not specify an Internet standard of any
kind. Discussion and suggestions for improvement are requested.
Distribution of this memo is unlimited.
Abstract
This document proposes new media sub-types of Text/SGML and
Application/SGML. These media types can be used in the exchange of
SGML documents and their entities. Specific details for the exchange
or encapsulation of groups of related SGML entities using MIME are
currently being considered by the mimesgml Working Group
<sgml-internet@ebt.com>.
1. Introduction
A need exists to transfer the elements of documents constructed
using the Standard Generalized Markup Language (SGML) [ISO-8879].
While the specific details of such transfers are being considered,
general agreement exists on the need to register basic media types
for the SGML entities not covered by existing types.
The Standard Generalized Markup Language (SGML) is used to encode
document structure; a rigorous description of it is left to
[ISO-8879]. The terms used in the present document attempt to be
consistent with SGML terminology and usage.
2. The SGML Media-Types
There are two media-types for SGML parsable entities, Text/SGML and
Application/SGML. Both have the same optional parameters. Text/SGML
provides a fallback to Text/Plain for those without SGML capability.
Senders should base the choice between text and application media-
types on the entity's content. Text is suggested for entities that
would be meaningful to a human being without SGML processing.
Application/SGML is recommended for all others.
Levinson Experimental [Page 1]
^L
RFC 1874 SGML Media Types December 1995
2.1. Text/SGML
MIME type name: Text
MIME subtype name: SGML
Required parameters: none
Optional parameters: charset, SGML-bctf, SGML-boot
Encoding considerations: may be encoded
Security considerations: see section 3 below
Published specification: ISO 8879:1986
Person and email address to contact for further information:
E. Levinson <ELevinson@Accurate.com>
The Text/SGML media-type can be employed when the contents of the
SGML entity are intended to be read by a human and are in a readily
comprehensible form; that is, the content can be easily discerned by
someone without SGML display software. Each record in the SGML
entity, delimited by record start (RS) and record end (RE) codes,
must correspond to a line in the Text/SGML body part.
SGML entities that do not meet the above requirements should use the
Application/SGML media-type.
See section 2.3 for a description of the parameters.
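As an illustration only, a Text/SGML body part might be assembled with
Python's standard email package as sketched below. The helper name
build_text_sgml, the sample entity, and the decision to state the
default parameter values explicitly are assumptions of this sketch and
are not part of the registration.

```python
from email.message import EmailMessage


def build_text_sgml(entity_text, charset="us-ascii", bctf="identity"):
    """Wrap a human-readable SGML entity in a Text/SGML body part.

    Hypothetical helper: each line of entity_text stands for one
    record of the SGML entity.
    """
    part = EmailMessage()
    part.set_content(
        entity_text,
        subtype="sgml",              # gives Content-Type: text/sgml
        charset=charset,             # fallback hint for non-SGML systems
        params={"SGML-bctf": bctf},  # used by SGML capable systems
    )
    return part


# Made-up sample entity, one record per line.
sample = "<memo>\n<to>Reader</to>\n<body>Hello.</body>\n</memo>\n"
part = build_text_sgml(sample)
print(part["Content-Type"])
```

A recipient without SGML capability can still show such a part as
plain text, which is the fallback behavior described in section 2.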
2.2. Application/SGML
MIME type name: Application
MIME subtype name: SGML
Required parameters: none
Optional parameters: SGML-bctf, SGML-boot
Encoding considerations: may be encoded
Security considerations: see section 3 below
Published specification: ISO-8879
Person and email address to contact for further information:
E. Levinson <ELevinson@Accurate.com>
Use the Application/SGML media-type for SGML text entities that are
not appropriate for Text/SGML. When used, each record start (RS) and
record end (RE) character shall be explicitly represented by the bit
combination specified in the SGML declaration.
The parameters are described in the next section.
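A minimal sketch of an Application/SGML body part, again using
Python's standard email package, might look as follows. The helper
name build_application_sgml and the sample octets are assumptions made
for illustration. The sample writes each bit combination as two
octets, more significant octet first, which is what the "fixed-2"
value listed in the Appendix describes.

```python
from email.message import EmailMessage


def build_application_sgml(entity_octets, bctf="identity"):
    """Wrap the octets of an SGML entity in an Application/SGML part.

    Hypothetical helper: entity_octets must already be the result of
    applying the named BCTF to the entity's bit combinations.
    """
    part = EmailMessage()
    part.set_content(
        entity_octets,
        maintype="application",
        subtype="sgml",
        params={"SGML-bctf": bctf},
    )
    return part


# "<memo>hi</memo>" with every bit combination written as two octets.
octets = b"".join(ord(ch).to_bytes(2, "big") for ch in "<memo>hi</memo>")
part = build_application_sgml(octets, bctf="fixed-2")
print(part["Content-Type"])
print(part["Content-Transfer-Encoding"])
```

Because the payload is handed over as bytes, the library applies a
content transfer encoding, which is consistent with the "may be
encoded" entry in the registration above.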
2.3. SGML Sub-type Parameters
The parameters for the Text/ and Application/SGML subtypes are
defined below.
charset The charset parameter for Text/SGML is defined in
[RFC-1521]; the valid values and their meanings are
registered by the Internet Assigned Numbers
Authority (IANA) [RFC-1590]. The default charset
value for all Text content-types is "us-ascii"
[RFC-1521].
The charset parameter is provided to permit
non-SGML capable systems to provide reasonable
behavior when Text/SGML defaults to Text/Plain.
SGML capable systems will use the SGML-bctf
parameter.
SGML-bctf The SGML-bctf (SGML bit combination transformation
format) parameter describes the method used to
transform the entity's sequence of constant width
binary numbers (called "bit combinations" in
[ISO-8879, 4.24]) into the octet stream contained in
the MIME body part.
Valid values for SGML-bctf are the BCTF notation
names defined in Annex C of [ISO-10744] and are
reproduced for convenience in the Appendix. The
default value is "identity", i.e. perform no
transformation.
SGML-boot The SGML-boot parameter value is the content-ID of
a MIME body part (Application/Octet-stream) that
satisfies the requirements of the boot attribute
in [ISO-10744]. The Appendix contains a summary
of those requirements. The SGML-boot parameter is
only applicable if the SGML entity is a document
entity.
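On the receiving side, the defaults described above could be applied
roughly as in the following sketch. The function name sgml_parameters,
the sample header, and the content-ID value are hypothetical. The
sketch only reads the parameters and does not interpret the entity.

```python
from email import message_from_string


def sgml_parameters(part):
    """Return the SGML parameters of a body part with defaults applied."""
    if part.get_content_type() not in ("text/sgml", "application/sgml"):
        raise ValueError("not an SGML media type")
    params = {
        # "identity" is the default, i.e. perform no transformation.
        "SGML-bctf": part.get_param("SGML-bctf", "identity"),
        # Optional content-ID of the boot body part; None when absent.
        "SGML-boot": part.get_param("SGML-boot"),
    }
    if part.get_content_maintype() == "text":
        # charset is a Text/SGML parameter only; the default is us-ascii.
        params["charset"] = part.get_param("charset", "us-ascii")
    return params


raw = (
    'Content-Type: Text/SGML; SGML-boot="boot.1@example.com"\n'
    "\n"
    "<memo>hi</memo>\n"
)
info = sgml_parameters(message_from_string(raw))
print(info)
```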
3. Security Considerations
SGML entities contain information to be parsed and processed by the
recipient's SGML system. Those entities may contain, and such systems
may permit, explicit system level commands to be executed while
processing the data. To the extent that an SGML system will execute
arbitrary command strings, recipients of SGML entities may be at risk.
Parsable SGML entities may also contain explicit processing
instructions for a presentation or composition system; use of such
instructions presents concerns similar to those of
Application/PostScript.
4. References
[ISO-8859-1]
Information processing -- 8-bit Single-Byte Coded Graphic
Character Sets -- Part 1: Latin Alphabet No. 1, ISO
8859-1:1987.
[ISO-8879]
ISO 8879:1986, Information processing -- Text and office
systems -- Standard Generalized Markup Language (SGML).
[ISO-10744]
ISO/IEC 10744:1992, Information technology --
Hypermedia/Time-based Structuring Language (HyTime) (as
modified by First Proposed Technical Corrigendum, ISO/IEC
JTC1/SC18 N5027)
[RFC-1521]
Borenstein, N., and N. Freed, "MIME (Multipurpose Internet
Mail Extensions) Part One: Mechanisms for Specifying and
Describing the Format of Internet Message Bodies", RFC
1521, Bellcore, Innosoft, September 1993.
[RFC-1590]
Postel, J., "Media Type Registration Procedure", RFC 1590,
USC/Information Sciences Institute, March 1994.
[RFC-1642]
Goldsmith, D., and M. Davis, "UTF-7, A Mail-Safe
Transformation Format of UNICODE", RFC 1642, Taligent,
Inc., July 1994.
5. Author's Address
Ed Levinson
Accurate Information Systems, Inc.
2 Industrial Way
Eatontown, NJ 07724
EMail: ELevinson@Accurate.com
APPENDIX
ISO-10744 BCTF Values and Boot Attribute
A.1. Bit Combination Transformation Format (BCTF) Values
The following list of Bit Combination Transformation Format (BCTF)
values is provided as a convenience. The authoritative source is
[ISO-10744].
identity Each bit combination is represented by a single
octet; this BCTF can be used only for entities all
of whose bit combinations have a value not exceeding
255.
fixed-2 Each bit combination is represented by exactly 2
octets, with the more significant octet first; this
BCTF can be used only for entities all of whose bit
combinations have a value not exceeding 65535.
fixed-3 Each bit combination is represented by exactly 3
octets, with a more significant octet preceding any
less significant octets; this BCTF can be used only
for entities all of whose bit combinations have a
value not exceeding 16777215.
fixed-4 Each bit combination is represented by exactly 4
octets, with a more significant octet preceding any
less significant octets.
utf-8 Each bit combination is represented by a variable
number of octets according to UCS Transformation
Format 8 defined in Annex P to be added by the first
proposed draft amendment (PDAM 1) to ISO/IEC
10646-1:1993.
utf-7 Each bit combination is represented by a variable
number of octets in the range 0 through 127 as
described in [RFC-1642]; this BCTF can be used only
for entities all of whose bit combinations have a
value not exceeding 65535.
euc-jp Each bit combination is treated as a pair of octets,
most significant octet first, encoding a character
using the
Extended_UNIX_Code_Fixed_Width_for_Japanese charset,
and is transformed into the variable length sequence
of octets that would encode that character using the
Extended_UNIX_Code_Packed_Format_for_Japanese charset.
sjis Each bit combination is treated as a pair of octets,
most significant octet first, encoding a character
using the
Extended_UNIX_Code_Fixed_Width_for_Japanese charset,
and is transformed into the variable length sequence
of octets that would encode that character using the
Shift_JIS charset.
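The fixed-width values in the list above lend themselves to a short
sketch. The function names encode_bctf and decode_bctf are
hypothetical, and only "identity", "fixed-2", "fixed-3", and "fixed-4"
are covered; the variable-length formats are left out. The
authoritative definitions remain those of [ISO-10744].

```python
# Octets per bit combination for the fixed-width BCTF values.
FIXED_WIDTHS = {"identity": 1, "fixed-2": 2, "fixed-3": 3, "fixed-4": 4}


def encode_bctf(bit_combinations, bctf="identity"):
    """Turn a sequence of non-negative integers into an octet string."""
    width = FIXED_WIDTHS.get(bctf)
    if width is None:
        raise ValueError("unsupported BCTF in this sketch: " + bctf)
    limit = 256 ** width - 1  # 255, 65535, 16777215, ...
    out = bytearray()
    for value in bit_combinations:
        if not 0 <= value <= limit:
            raise ValueError("%d does not fit in %s" % (value, bctf))
        out += value.to_bytes(width, "big")  # more significant octet first
    return bytes(out)


def decode_bctf(octets, bctf="identity"):
    """Inverse of encode_bctf for the same fixed-width values."""
    width = FIXED_WIDTHS.get(bctf)
    if width is None:
        raise ValueError("unsupported BCTF in this sketch: " + bctf)
    if len(octets) % width:
        raise ValueError("octet count is not a multiple of %d" % width)
    return [
        int.from_bytes(octets[i:i + width], "big")
        for i in range(0, len(octets), width)
    ]


encoded = encode_bctf([0x41, 0x3042], "fixed-2")
print(encoded.hex())
print(decode_bctf(encoded, "fixed-2"))
```

Rejecting out-of-range values mirrors the restriction stated for each
entry, for example that "identity" can be used only when no bit
combination exceeds 255.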
A.2. The Boot Attribute
The body part specified by the SGML-boot parameter contains a
sequence of triplets of positive integers separated by white space.
The triplets correspond to the described character set portion
[ISO-8879, 13.1.1.2] of the SGML declaration. SGML-boot provides the
capability to identify the character set of the document's SGML
declaration when it uses significant SGML characters [ibid., 4.298]
in the SGML reference concrete syntax [ibid., 13.4] that have a
character number [ibid., 4.44] in the document's character set that
differs from us-ascii. The default value is "0 128 0", meaning that
all characters are us-ascii.
Notes: (1) The triplet <dscn noc bscn> has the following meaning:
starting with character number dscn in the us-ascii character set,
renumber noc characters starting at bscn and incrementing by one.
Thus "0 128 0" represents the identity mapping. (2) The document's
declaration itself may also redefine the significant SGML characters;
the boot attribute is intended to bootstrap the SGML system's parse
of the declaration.
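One possible reading of the triplet notation is sketched below. The
function name parse_boot is hypothetical, and the direction of the
mapping (us-ascii character number to character number in the
document's character set) is this sketch's interpretation of note (1);
[ISO-10744] remains the authority.

```python
def parse_boot(body="0 128 0"):
    """Expand boot triplets into a character number mapping.

    Each triplet <dscn noc bscn> maps us-ascii character numbers
    dscn, dscn+1, ..., dscn+noc-1 to bscn, bscn+1, ..., bscn+noc-1.
    """
    numbers = [int(token) for token in body.split()]
    if len(numbers) % 3:
        raise ValueError("boot data must be a sequence of triplets")
    mapping = {}
    for i in range(0, len(numbers), 3):
        dscn, noc, bscn = numbers[i:i + 3]
        for offset in range(noc):
            mapping[dscn + offset] = bscn + offset
    return mapping


default = parse_boot()
print(default[65])

# Made-up example: renumber the 26 upper-case letters starting at 193.
shifted = parse_boot("65 26 193")
print(shifted[65], shifted[90])
```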