Diffstat (limited to 'doc/rfc/rfc3230.txt')
-rw-r--r-- doc/rfc/rfc3230.txt 731
1 files changed, 731 insertions, 0 deletions
diff --git a/doc/rfc/rfc3230.txt b/doc/rfc/rfc3230.txt
new file mode 100644
index 0000000..fd51ea0
--- /dev/null
+++ b/doc/rfc/rfc3230.txt
@@ -0,0 +1,731 @@
+
+
+
+
+
+
+Network Working Group J. Mogul
+Request for Comments: 3230 Compaq WRL
+Category: Standards Track A. Van Hoff
+ Marimba
+ January 2002
+
+
+ Instance Digests in HTTP
+
+Status of this Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Copyright Notice
+
+ Copyright (C) The Internet Society (2002). All Rights Reserved.
+
+Abstract
+
+ HTTP/1.1 defines a Content-MD5 header that allows a server to include
+ a digest of the response body. However, this is specifically defined
+ to cover the body of the actual message, not the contents of the full
+ file (which might be quite different, if the response is a Content-
+ Range, or uses a delta encoding). Also, the Content-MD5 is limited
+ to one specific digest algorithm; other algorithms, such as SHA-1
+ (Secure Hash Standard), may be more appropriate in some
+ circumstances. Finally, HTTP/1.1 provides no explicit mechanism by
+ which a client may request a digest. This document proposes HTTP
+ extensions that solve these problems.
+
+Table of Contents
+
+ 1 Introduction.................................................... 2
+ 1.1 Other limitations of HTTP/1.1............................ 3
+ 2 Goals........................................................... 4
+ 3 Terminology..................................................... 5
+ 4 Specification................................................... 6
+ 4.1 Protocol parameter specifications........................ 6
+ 4.1.1 Digest algorithms................................. 6
+ 4.2 Instance digests......................................... 7
+ 4.3 Header specifications.................................... 8
+ 4.3.1 Want-Digest....................................... 8
+ 4.3.2 Digest............................................ 9
+ 5 Negotiation of Content-MD5...................................... 9
+
+
+
+Mogul, et. al. Standards Track [Page 1]
+
+RFC 3230 Instance Digests in HTTP January 2002
+
+
+ 6 IANA Considerations............................................. 10
+ 7 Security Considerations......................................... 10
+ 8 Acknowledgements................................................ 10
+ 9 References...................................................... 10
+ 10 Authors' Addresses............................................. 12
+ 11 Full Copyright Statement....................................... 13
+
+1 Introduction
+
+ Although HTTP is typically layered over a reliable transport
+ protocol, such as TCP, this does not guarantee reliable transport of
+ information from sender to receiver. Various problems, including
+ undetected transmission errors, programming errors, corruption of
+ stored data, and malicious intervention can cause errors in the
+ transmitted information.
+
+ A common approach to the problem of data integrity in a network
+ protocol or distributed system, such as HTTP, is the use of digests,
+ checksums, or hash values. The sender computes a digest and sends it
+ with the data; the recipient computes a digest of the received data,
+ and then verifies the integrity of this data by comparing the
+ digests.
+
+ Checksums are used at virtually all layers of the IP stack. However,
+ different digest algorithms might be used at each layer, for reasons
+ of computational cost, because the size and nature of the data being
+ protected varies, and because the possible threats to data integrity
+ vary. For example, Ethernet uses a Cyclic Redundancy Check (CRC).
+ The IPv4 protocol uses a ones-complement checksum over the IP header
+ (but not the rest of the packet). TCP uses a ones-complement
+ checksum over the TCP header and data, and includes a "pseudo-header"
+ to detect certain kinds of programming errors.
+
+ HTTP/1.1 [4] includes a mechanism for ensuring message integrity, the
+ Content-MD5 header. This header is actually defined for MIME-
+ conformant messages in a standalone specification [10]. According to
+ the HTTP/1.1 specification,
+
+ The Content-MD5 entity-header field [...] is an MD5 digest of the
+ entity-body for the purpose of providing an end-to-end message
+ integrity check (MIC) of the entity-body.
+
+ HTTP/1.1 borrowed Content-MD5 from the MIME world based on an analogy
+ between MIME messages (e.g., electronic mail messages) and HTTP
+ messages (requests to or responses from an HTTP server).
+
+ As discussed in more detail in section 3, this analogy between MIME
+ messages and HTTP messages has resulted in some confusion. In
+ particular, while a MIME message is self-contained, an HTTP message
+ might not contain the entire representation of the current state of a
+ resource. (More precisely, an HTTP response might not contain an
+ entire "instance"; see section 3 for a definition of this term.)
+
+ There are at least two situations where this distinction is an issue:
+
+ 1. When an HTTP server sends a 206 (Partial Content) response, as
+ defined in HTTP/1.1. The client may form its view of an
+ instance (e.g., an HTML document) by combining a cache entry
+ with the partial content in the message.
+
+ 2. When an HTTP server uses a "delta encoding", as proposed in a
+ separate document [9]. A delta encoding represents the changes
+ between the current instance of a resource and a previous
+ instance, and is an efficient way of reducing the bandwidth
+ required for cache updates. The client forms its view of an
+ instance by applying the delta in the message to one of its
+ cache entries.
+
+ We include these two kinds of transformations in a potentially
+ broader category we call "instance manipulations."
+
+ In each of these cases, the server might use a Content-MD5 header to
+ protect the integrity of the response message. However, because the
+ MIC in a Content-MD5 header field applies only to the entity in that
+ message, and not to the entire instance being reassembled, it cannot
+ protect against errors due to data corruption (e.g., of cache
+ entries), programming errors (e.g., improper application of a partial
+ content or delta), certain malicious attacks [9], or corruption of
+ certain HTTP headers in transit.
+
+ Thus, the Content-MD5 header, while useful and sufficient in many
+ cases, is not sufficient for verifying instance integrity in all uses
+ of HTTP.
+
+ The Digest Authentication mechanism [5] provides (in addition to its
+ other goals) a message-digest function similar to Content-MD5, except
+ that it includes certain header fields. Like Content-MD5, it covers
+ a specific message, not an entire instance.
+
+1.1 Other limitations of HTTP/1.1
+
+ Checksums are not free. Computing a digest takes CPU resources, and
+ might add latency to the generation of a message. (Some of these
+ costs can be avoided by careful caching at the sender's end, but in
+ many cases such a cache would not have a useful hit ratio.)
+ Transmitting a digest consumes HTTP header space (and therefore
+ increases latency and network bandwidth requirements). If the
+ message recipient does not intend to use the digest, why should the
+ message sender waste resources computing and sending it?
+
+ The Content-MD5 header, of course, implies the use of the MD5
+ algorithm [15]. Other algorithms, however, might be more appropriate
+ for some purposes. These include the SHA-1 algorithm [12] and
+ various "fingerprinting" algorithms [7]. HTTP currently provides no
+ standardized support for the use of these algorithms.
+
+ HTTP/1.1 apparently assumes that the choice to generate a digest is
+ up to the sender, and provides no mechanism for the recipient to
+ indicate whether a checksum would be useful, or what checksum
+ algorithms it would understand.
+
+2 Goals
+
+ The goals of this proposal are:
+
+ 1. Digest coverage for entire instances communicated via HTTP.
+
+ 2. Support for multiple digest algorithms.
+
+ 3. Negotiation of the use of digests.
+
+ The goals do not include:
+
+ - header integrity
+ The digest mechanisms described here cover only the bodies of
+ instances, and do not protect the integrity of associated
+ "entity headers" or other message headers.
+
+ - authentication
+ The digest mechanisms described here are not meant to support
+ authentication of the source of a digest or of a message or
+ instance. These mechanisms, therefore, are not sufficient
+ defense against many kinds of malicious attacks.
+
+ - privacy
+ Digest mechanisms do not provide message privacy.
+
+ - authorization
+ The digest mechanisms described here are not meant to support
+ authorization or other kinds of access controls.
+
+ The Digest Access Authentication mechanism [5] can provide some
+ integrity for certain HTTP headers, and does provide authentication.
+
+3 Terminology
+
+ HTTP/1.1 [4] defines the following terms:
+
+ resource A network data object or service that can be
+ identified by a URI, as defined in section 3.2.
+ Resources may be available in multiple
+ representations (e.g. multiple languages, data
+ formats, size, resolutions) or vary in other ways.
+
+ entity The information transferred as the payload of a
+ request or response. An entity consists of
+ metainformation in the form of entity-header fields
+ and content in the form of an entity-body, as
+ described in section 7.
+
+ variant A resource may have one, or more than one,
+ representation(s) associated with it at any given
+ instant. Each of these representations is termed a
+ `variant.' Use of the term `variant' does not
+ necessarily imply that the resource is subject to
+ content negotiation.
+
+ The dictionary definition for "entity" is "something that has
+ separate and distinct existence and objective or conceptual reality"
+ [8]. Unfortunately, the definition for "entity" in HTTP/1.1 is
+ similar to that used in MIME [6], based on an entirely false analogy
+ between MIME and HTTP.
+
+ In MIME, electronic mail messages do have distinct and separate
+ existences. MIME defines "entity" as something that "refers
+ specifically to the MIME-defined header fields and contents of either
+ a message or one of the parts in the body of a multipart entity."
+
+ In HTTP, however, a response message to a GET does not have a
+ distinct and separate existence. Rather, it is describing the
+ current state of a resource (or a variant, subject to a set of
+ constraints). The HTTP/1.1 specification provides no term to
+ describe "the value that would be returned in response to a GET
+ request at the current time for the selected variant of the specified
+ resource." This leads to awkward wordings in the HTTP/1.1
+ specification in places where this concept is necessary.
+
+ It is too late to fix the terminological failure in the HTTP/1.1
+ specification, so we instead define a new term, for use in this
+ document:
+
+ instance The entity that would be returned in a status-200
+ response to a GET request, at the current time, for
+ the selected variant of the specified resource,
+ with the application of zero or more content-
+ codings, but without the application of any
+ instance manipulations or transfer-codings.
+
+ It is convenient to think of an entity tag, in HTTP/1.1, as being
+ associated with an instance, rather than an entity. That is, for a
+ given resource, two different response messages might include the
+ same entity tag, but two different instances of the resource should
+ never be associated with the same (strong) entity tag.
+
+ We also define this term:
+
+ instance manipulation
+ An operation on one or more instances which may
+ result in an instance being conveyed from server to
+ client in parts, or in more than one response
+ message. For example, a range selection or a delta
+ encoding. Instance manipulations are end-to-end,
+ and often involve the use of a cache at the client.
+
+4 Specification
+
+ In this specification, the key words "MUST", "MUST NOT", "SHOULD",
+ "SHOULD NOT", and "MAY" are to be interpreted as described in RFC
+ 2119 [2].
+
+4.1 Protocol parameter specifications
+
+4.1.1 Digest algorithms
+
+ Digest algorithm values are used to indicate a specific digest
+ computation. For some algorithms, one or more parameters may be
+ supplied.
+
+ digest-algorithm = token
+
+ The BNF for "parameter" is as is used in RFC 2616 [4]. All digest-
+ algorithm values are case-insensitive.
+
+ The Internet Assigned Numbers Authority (IANA) acts as a registry for
+ digest-algorithm values. Initially, the registry contains the
+ following tokens:
+
+ MD5 The MD5 algorithm, as specified in RFC 1321 [15].
+ The output of this algorithm is encoded using the
+ base64 encoding [1].
+
+ SHA The SHA-1 algorithm [12]. The output of this
+ algorithm is encoded using the base64 encoding [1].
+
+ UNIXsum The algorithm computed by the UNIX "sum" command,
+ as defined by the Single UNIX Specification,
+ Version 2 [13]. The output of this algorithm is an
+ ASCII decimal-digit string representing the 16-bit
+ checksum, which is the first word of the output of
+ the UNIX "sum" command.
+
+ UNIXcksum The algorithm computed by the UNIX "cksum" command,
+ as defined by the Single UNIX Specification,
+ Version 2 [13]. The output of this algorithm is an
+ ASCII digit string representing the 32-bit CRC,
+ which is the first word of the output of the UNIX
+ "cksum" command.
+
+ If other digest-algorithm values are defined, the associated encoding
+ MUST either be represented as a quoted string, or MUST NOT include
+ ";" or "," in the character sets used for the encoding.
+
+4.2 Instance digests
+
+ An instance digest is the representation of the output of a digest
+ algorithm, together with an indication of the algorithm used (and any
+ parameters).
+
+ instance-digest = digest-algorithm "="
+ <encoded digest output>
+
+ The digest is computed on the entire instance associated with the
+ message. The instance is a snapshot of the resource prior to the
+ application of any instance manipulation or transfer-coding (see
+ section 3). The byte order used to compute the digest is the
+ transmission byte order defined for the content-type of the instance.
+
+ Note: the digest is computed before the application of any
+ instance manipulation. If a range or a delta-coding [9] is used,
+ the computation of the digest after the computation of the range
+ or delta would not provide a digest useful for checking the
+ integrity of the reassembled instance.
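
The note above can be illustrated with a minimal Python sketch. It
assumes a hypothetical 43-byte instance, a client that already caches
the first 19 bytes, and the MD5 digest-algorithm; the function names
are illustrative and are not defined by this specification.

```python
import base64
import hashlib


def md5_instance_digest(instance: bytes) -> str:
    """Build an 'MD5=<base64>' instance-digest over the complete instance."""
    raw = hashlib.md5(instance).digest()
    return "MD5=" + base64.b64encode(raw).decode("ascii")


def reassembled_matches(cached_prefix: bytes, range_body: bytes, digest: str) -> bool:
    """Check a reassembled instance against the digest sent with a range response."""
    return md5_instance_digest(cached_prefix + range_body) == digest


# Hypothetical instance; the server computes the digest before selecting a range.
instance = b"The quick brown fox jumps over the lazy dog"
digest = md5_instance_digest(instance)

# The client holds the first 19 bytes in its cache and receives the rest as a range.
cached_prefix, range_body = instance[:19], instance[19:]

print(reassembled_matches(cached_prefix, range_body, digest))  # whole instance
print(md5_instance_digest(range_body) == digest)               # range body only
```

The first check succeeds because the digest covers the whole instance;
the second fails because a digest of the range body alone says nothing
about the reassembled result.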
+
+ The encoded digest output uses the encoding format defined for the
+ specific digest-algorithm. For example, if the digest-algorithm is
+ "MD5", the encoding is base64; if the digest-algorithm is "UNIXsum",
+ the encoding is an ASCII string of decimal digits.
+
+ Examples:
+
+ MD5=HUXZLQLMuI/KZ5KDcJPcOA==
+ sha=thvDyvhfIqlvFe+A9MYgxAfm1q5=
+ UNIXsum=30637
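
As a rough sketch of how such values might be produced, the following
Python function builds an instance-digest for the "MD5" and "SHA"
tokens using the standard hashlib and base64 modules. The function
name is an assumption made for illustration, and the UNIXsum and
UNIXcksum algorithms are deliberately left out.

```python
import base64
import hashlib


def instance_digest(algorithm: str, instance: bytes) -> str:
    """Return an instance-digest such as 'MD5=<base64>' for the given bytes."""
    name = algorithm.lower()  # digest-algorithm values are case-insensitive
    if name == "md5":
        raw = hashlib.md5(instance).digest()
    elif name == "sha":
        raw = hashlib.sha1(instance).digest()  # the "SHA" token denotes SHA-1
    else:
        raise ValueError(f"unsupported digest-algorithm: {algorithm}")
    # Both algorithms encode their output with base64.
    return f"{algorithm}={base64.b64encode(raw).decode('ascii')}"


if __name__ == "__main__":
    body = b"hypothetical instance body"
    print(instance_digest("MD5", body))
    print(instance_digest("sha", body))
```

The token is echoed back in the case the caller supplied, since the
comparison of digest-algorithm values is case-insensitive.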
+
+4.3 Header specifications
+
+ The following headers are defined.
+
+4.3.1 Want-Digest
+
+ The Want-Digest message header field indicates the sender's desire to
+ receive an instance digest on messages associated with the Request-
+ URI.
+
+ Want-Digest = "Want-Digest" ":"
+ #(digest-algorithm [ ";" "q" "=" qvalue])
+
+ If a digest-algorithm is not accompanied by a qvalue, it is treated
+ as if its associated qvalue were 1.0.
+
+ The sender is willing to accept a digest-algorithm if and only if it
+ is listed in a Want-Digest header field of a message, and its qvalue
+ is non-zero.
+
+ If multiple acceptable digest-algorithm values are given, the
+ sender's preferred digest-algorithm is the one (or ones) with the
+ highest qvalue.
+
+ Examples:
+
+ Want-Digest: md5
+ Want-Digest: MD5;q=0.3, sha;q=1
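
One possible way for a recipient of Want-Digest to apply these rules
is sketched below in Python. The function names and the "supported"
list are assumptions for illustration; the parser handles only the
simple "token;q=value" form shown in the examples and does no error
recovery.

```python
from typing import Dict, List, Optional


def parse_want_digest(value: str) -> Dict[str, float]:
    """Map each lower-cased digest-algorithm to its qvalue (1.0 when absent)."""
    prefs: Dict[str, float] = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        key, _, val = params.strip().partition("=")
        q = float(val) if key.strip().lower() == "q" and val.strip() else 1.0
        prefs[name.strip().lower()] = q
    return prefs


def choose_algorithm(want_digest: str, supported: List[str]) -> Optional[str]:
    """Pick the supported algorithm with the highest non-zero qvalue, if any."""
    prefs = parse_want_digest(want_digest)
    acceptable = [a for a in supported if prefs.get(a.lower(), 0.0) > 0.0]
    if not acceptable:
        return None
    return max(acceptable, key=lambda a: prefs[a.lower()])


if __name__ == "__main__":
    print(choose_algorithm("MD5;q=0.3, sha;q=1", ["md5", "sha"]))
```

An algorithm that is not listed, or that is listed with a zero qvalue,
is never chosen, which mirrors the "if and only if" rule above.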
+
+4.3.2 Digest
+
+ The Digest message header field provides a message digest of the
+ instance described by the message.
+
+ Digest = "Digest" ":" #(instance-digest)
+
+ The instance described by a message might be fully contained in the
+ message-body, partially-contained in the message-body, or not at all
+ contained in the message-body. The instance is specified by the
+ Request-URI and any cache-validator contained in the message.
+
+ A Digest header field MAY contain multiple instance-digest values.
+ This could be useful for responses expected to reside in caches
+ shared by users with different browsers, for example.
+
+ A recipient MAY ignore any or all of the instance-digests in a Digest
+ header field.
+
+ A sender MAY send an instance-digest using a digest-algorithm without
+ knowing whether the recipient supports the digest-algorithm, or even
+ knowing that the recipient will ignore it.
+
+ Examples:
+
+ Digest: md5=HUXZLQLMuI/KZ5KDcJPcOA==
+ Digest: SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=,unixsum=30637
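
A recipient that chooses to check a Digest header might proceed along
the lines of the following Python sketch. The function names are
illustrative assumptions. The parser splits each item at the first
"=" only, because base64 padding may itself contain "=", and it does
not handle quoted-string encodings that a future digest-algorithm
could define.

```python
import base64
import hashlib
from typing import Dict, Optional

# Only the two base64-encoded algorithms are checked in this sketch.
_HASHES = {"md5": hashlib.md5, "sha": hashlib.sha1}


def parse_digest(value: str) -> Dict[str, str]:
    """Map each lower-cased digest-algorithm to its encoded digest output."""
    digests: Dict[str, str] = {}
    for item in value.split(","):
        name, sep, encoded = item.strip().partition("=")
        if sep and name:
            digests[name.strip().lower()] = encoded.strip()
    return digests


def check_instance(instance: bytes, header_value: str) -> Optional[bool]:
    """Return True or False for the first MD5 or SHA digest, None if none is usable."""
    for name, encoded in parse_digest(header_value).items():
        if name in _HASHES:
            raw = _HASHES[name](instance).digest()
            return base64.b64encode(raw).decode("ascii") == encoded
    return None


if __name__ == "__main__":
    print(parse_digest("SHA=thvDyvhfIqlvFe+A9MYgxAfm1q5=,unixsum=30637"))
```

Returning None when no usable digest is present reflects that a
recipient MAY ignore any or all of the instance-digests it receives.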
+
+5 Negotiation of Content-MD5
+
+ HTTP/1.1 provides a Content-MD5 header field, but does not provide
+ any mechanism for requesting its use (or non-use). The Want-Digest
+ header field defined in this document provides the basis for such a
+ mechanism.
+
+ First, we add to the set of digest-algorithm values (in section
+ 4.1.1) the token "contentMD5", with the provision that this digest-
+ algorithm MUST NOT be used in a Digest header field.
+
+ The presence of the "contentMD5" digest-algorithm with a non-zero
+ qvalue in a Want-Digest header field indicates that the sender wishes
+ to receive a Content-MD5 header on messages associated with the
+ Request-URI.
+
+ The presence of the "contentMD5" digest-algorithm with a zero qvalue
+ in a Want-Digest header field indicates that the sender will ignore
+ Content-MD5 headers on messages associated with the Request-URI.
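
The three cases (requested, ignored, not mentioned) could be told
apart as in the following Python sketch. The function name and its
three-valued return are assumptions for illustration, and the parsing
covers only the simple "token;q=value" form.

```python
from typing import Optional


def wants_content_md5(want_digest: str) -> Optional[bool]:
    """Interpret the "contentMD5" token in a Want-Digest header value.

    Returns True if Content-MD5 is requested (non-zero qvalue), False if
    the sender will ignore it (zero qvalue), and None if the token is absent.
    """
    for item in want_digest.split(","):
        name, _, params = item.strip().partition(";")
        if name.strip().lower() != "contentmd5":
            continue
        key, _, val = params.strip().partition("=")
        q = float(val) if key.strip().lower() == "q" and val.strip() else 1.0
        return q > 0.0
    return None


if __name__ == "__main__":
    for header in ("contentMD5", "contentMD5;q=0", "md5, sha;q=0.5"):
        print(header, "->", wants_content_md5(header))
```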
+
+6 IANA Considerations
+
+ The Internet Assigned Numbers Authority (IANA) administers the name
+ space for digest-algorithm values. Values and their meaning must be
+ documented in an RFC or other peer-reviewed, permanent, and readily
+ available reference, in sufficient detail so that interoperability
+ between independent implementations is possible. Subject to these
+ constraints, name assignments are First Come, First Served (see RFC
+ 2434 [11]).
+
+7 Security Considerations
+
+ This document specifies a data integrity mechanism that protects HTTP
+ instance data, but not HTTP entity headers, from certain kinds of
+ accidental corruption. It is also useful in detecting at least one
+ spoofing attack [9]. However, it is not intended as general
+ protection against malicious tampering with HTTP messages.
+
+ The HTTP Digest Access Authentication mechanism [5] provides some
+ protection against malicious tampering.
+
+8 Acknowledgements
+
+ It is not clear who first realized that the Content-MD5 header field
+ is not sufficient to provide data integrity when ranges or deltas are
+ used.
+
+ Laurent Demailly may have been the first to suggest an algorithm-
+ independent checksum header for HTTP [3]. Dave Raggett suggested the
+ use of the term "digest" instead of "checksum" [14].
+
+9 References
+
+ [1] Freed, N. and N. Borenstein, "MIME (Multipurpose Internet
+ Mail Extensions) Part One: Mechanisms for Specifying and
+ Describing the Format of Internet Message Bodies", RFC 2049,
+ November 1996.
+
+ [2] Bradner, S., "Key words for use in RFCs to Indicate Requirement
+ Levels", BCP 14, RFC 2119, March 1997.
+
+ [3] Laurent Demailly. Re: Revised Charter.
+ http://www.ics.uci.edu/pub/ietf/http/hypermail/1995q4/0165.html.
+
+ [4] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L.,
+ Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol --
+ HTTP/1.1", RFC 2616, June 1999.
+
+ [5] Franks, J., Hallam-Baker, P., Hostetler, J., Lawrence, S.,
+ Leach, P., Luotonen, A. and L. Stewart, "HTTP Authentication:
+ Basic and Digest Access Authentication", RFC 2617, June 1999.
+
+ [6] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+ Extensions (MIME) Part One: Format of Internet Message Bodies",
+ RFC 2045, November 1996.
+
+ [7] Nevin Heintze. Scalable Document Fingerprinting. Proc. Second
+ USENIX Workshop on Electronic Commerce, USENIX, Oakland, CA,
+ November, 1996, pp. 191-200.
+ http://www.cs.cmu.edu/afs/cs/user/nch/www/koala/main.html.
+
+ [8] Merriam-Webster. Webster's Seventh New Collegiate Dictionary.
+ G. & C. Merriam Co., Springfield, MA, 1963.
+
+ [9] Mogul, J., Krishnamurthy, B., Douglis, F., Feldmann, A., Goland,
+ Y. and A. van Hoff, "Delta encoding in HTTP", RFC 3229, December
+ 2001.
+
+ [10] Myers, J. and M. Rose, "The Content-MD5 Header Field", RFC 1864,
+ October 1995.
+
+ [11] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA
+ Considerations Section in RFCs", BCP 26, RFC 2434, October 1998.
+
+ [12] National Institute of Standards and Technology. Secure Hash
+ Standard. FEDERAL INFORMATION PROCESSING STANDARDS PUBLICATION
+ 180-1, U.S. Department of Commerce, April, 1995.
+ http://csrc.nist.gov/fips/fip180-1.txt.
+
+ [13] The Open Group. The Single UNIX Specification, Version 2 - 6
+ Vol Set for UNIX 98. Document number T912, The Open Group,
+ February, 1997.
+
+ [14] Dave Raggett. Re: Revised Charter.
+ http://www.ics.uci.edu/pub/ietf/http/hypermail/1995q4/0182.html.
+
+ [15] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, April
+ 1992.
+
+10 Authors' Addresses
+
+ Jeffrey C. Mogul
+ Western Research Laboratory
+ Compaq Computer Corporation
+ 250 University Avenue
+ Palo Alto, California, 94305, U.S.A.
+
+ EMail: JeffMogul@acm.org
+ Phone: 1 650 617 3304 (email preferred)
+
+ Arthur van Hoff
+ Marimba, Inc.
+ 440 Clyde Avenue
+ Mountain View, CA 94043
+
+ EMail: avh@marimba.com
+ Phone: 1 (650) 930 5283
+
+11 Full Copyright Statement
+
+ Copyright (C) The Internet Society (2002). All Rights Reserved.
+
+ This document and translations of it may be copied and furnished to
+ others, and derivative works that comment on or otherwise explain it
+ or assist in its implementation may be prepared, copied, published
+ and distributed, in whole or in part, without restriction of any
+ kind, provided that the above copyright notice and this paragraph are
+ included on all such copies and derivative works. However, this
+ document itself may not be modified in any way, such as by removing
+ the copyright notice or references to the Internet Society or other
+ Internet organizations, except as needed for the purpose of
+ developing Internet standards in which case the procedures for
+ copyrights defined in the Internet Standards process must be
+ followed, or as required to translate it into languages other than
+ English.
+
+ The limited permissions granted above are perpetual and will not be
+ revoked by the Internet Society or its successors or assigns.
+
+ This document and the information contained herein is provided on an
+ "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
+ TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
+ BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
+ HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
+ MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Acknowledgement
+
+ Funding for the RFC Editor function is currently provided by the
+ Internet Society.