summaryrefslogtreecommitdiff
path: root/doc/rfc/rfc2938.txt
diff options
context:
space:
mode:
Diffstat (limited to 'doc/rfc/rfc2938.txt')
-rw-r--r--doc/rfc/rfc2938.txt1011
1 files changed, 1011 insertions, 0 deletions
diff --git a/doc/rfc/rfc2938.txt b/doc/rfc/rfc2938.txt
new file mode 100644
index 0000000..b762086
--- /dev/null
+++ b/doc/rfc/rfc2938.txt
@@ -0,0 +1,1011 @@
+
+
+
+
+
+
+Network Working Group G. Klyne
+Request for Comments: 2938 Content Technologies
+Updates: 2533 L. Masinter
+Category: Standards Track AT&T
+ September 2000
+
+
+ Identifying Composite Media Features
+
+Status of this Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Copyright Notice
+
+ Copyright (C) The Internet Society (2000). All Rights Reserved.
+
+Abstract
+
+ In RFC 2533, an expression format is presented for describing media
+ feature capabilities as a combination of simple media feature tags.
+
+ This document describes an abbreviated format for a composite media
+ feature set, based upon a hash of the feature expression describing
+ that composite.
+
+Table of Contents
+
+ 1. Introduction ................................................2
+ 1.1 Organization of this document ...............................2
+ 1.2 Terminology and document conventions ........................2
+ 2. Motivation and goals ........................................3
+ 3. Composite feature representation ............................4
+ 3.1 Feature set hashed reference format .........................5
+ 3.1.1 Hash value calculation ......................................6
+ 3.1.2 Base-32 value representation ................................7
+ 3.2 Resolving feature set identifiers ...........................8
+ 3.2.1 Query protocol ..............................................8
+ 3.2.2 Inline feature set details ..................................9
+ 4. Examples ...................................................10
+ 5. Internationalization Considerations ........................12
+ 6. Security Considerations ....................................13
+ 7. Acknowledgements ...........................................13
+ 8. References .................................................13
+
+
+
+Klyne & Masinter Standards Track [Page 1]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+ 9. Authors' Addresses .........................................15
+ 10. Appendix A: The birthday paradox ...........................16
+ 11. Full Copyright Statement ...................................18
+
+1. Introduction
+
+ In "A Syntax for Describing Media Feature Sets" [1], an expression
+ format is presented for describing media feature capabilities as a
+ combination of simple media feature tags [2].
+
+ This document proposes an abbreviated format for a composite media
+ feature set, based upon a hash of the feature expression describing
+ that composite.
+
+ This memo extends and builds upon the expression syntax described in
+ RFC 2533 [1], and it is assumed that the reader is familiar with the
+ interpretation of feature set expressions described there.
+
+1.1 Organization of this document
+
+ Section 2 sets out some of the background and goals for feature set
+ references.
+
+ Section 3 presents a syntax for feature set references, and describes
+ how they are related to feature set expressions.
+
+1.2 Terminology and document conventions
+
+ This section defines a number of terms and other document
+ conventions, which are used with specific meaning in this memo. The
+ terms are listed in alphabetical order.
+
+ dereference
+ the act of replacing a feature set reference with its
+ corresponding feature set expression. Also called
+ "resolution".
+
+ feature set
+ some set of media features described by a media feature
+ assertion, as described in "A Syntax for Describing Media
+ Feature Sets" [1]. (See that memo for a more formal
+ definition of this term.)
+
+ feature set expression
+ a string that describes some feature set, formulated
+ according to the rules in "A Syntax for Describing Media
+ feature sets" [1] (and possibly extended by other
+ specifications).
+
+
+
+Klyne & Masinter Standards Track [Page 2]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+ feature set reference
+ a brief construct that references some feature set. (See
+ also: "dereference".)
+
+ feature set tag
+ a name that conforms to the syntax of a feature tag [2] that
+ is used to denote a feature set rather than a single
+ feature.
+
+ resolution
+ (See "dereference").
+
+ This specification uses syntax notation and conventions described
+ in RFC 2234, "Augmented BNF for Syntax Specifications: ABNF" [3].
+
+ NOTE: Comments like this provide additional nonessential
+ information about the rationale behind this document. Such
+ information is not needed for building a conformant
+ implementation, but may help those who wish to understand the
+ design in greater depth.
+
+2. Motivation and goals
+
+ The range of media feature capabilities of a message handling system
+ can be quite extensive, and the corresponding feature set expression
+ [1] can reach a significant size.
+
+ A requirement has been identified to allow recurring feature sets to
+ be identified by a single reference value, which can be combined with
+ other elements in a feature set expression. It is anticipated that
+ mechanisms will be provided that allow the recipient of such a
+ feature set reference to discover the corresponding feature set
+ expression, but any such mechanism is beyond the scope of this
+ specification.
+
+ Thus, the goals for this proposal are:
+
+ o to provide an abbreviated form for referencing an arbitrary
+ feature set expression.
+
+ o the meaning of (i.e., the corresponding feature set expression) a
+ feature set reference should be independent of any particular
+ mechanism that may be used to dereference it.
+
+ o to be able to verify whether a given feature set expression
+ corresponds to some feature set reference without having to
+ perform an explicit dereferencing operation (i.e., without
+ incurring additional network traffic).
+
+
+
+Klyne & Masinter Standards Track [Page 3]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+ o for protocol processors that conform to RFC 2533 [1] to be able to
+ sensibly handle a feature set reference without explicit knowledge
+ of its meaning (i.e., the introduction of feature set references
+ should not break existing feature expression processors). That
+ is, the applicable interpretation and processing rules of RFC 2533
+ [1] apply equally to expressions containing feature set
+ references.
+
+ NOTE: This proposal does not attempt to address the "override"
+ or "default" problem. (Where a feature set may be referenced and
+ selectively modified.)
+
+ Some circumstances in which such an abbreviated form might be used
+ include:
+
+ o A media feature expression that contains a repeated sub-
+ expression. If the sub-expression is quite large, space can be
+ saved by writing it out once, then using the abbreviated form to
+ reference it.
+
+ o A capability that is common to a range of devices, such as a given
+ class of fax machine where are large number of feature tags are
+ involved, but only a small number of common feature sets. If the
+ recipient understands, or can discover, that some abbreviation
+ stands for a given feature set then feature expression size can be
+ reduced by using the abbreviation.
+
+ If feature set abbreviations are used in this way, it may be that
+ they can be interpreted by a simple table lookup rather than full
+ feature expression parsing. (Making this useful in practice will
+ depend on crafting the feature subsets appropriately.)
+
+ Examples of such usage are given in section 4 of this memo.
+
+ This memo does not specify how a program that receives a feature set
+ abbreviation should discover the corresponding feature set
+ expression: see section 3.2.
+
+3. Composite feature representation
+
+ This specification hinges on two central ideas:
+
+ o the use of auxiliary predicates (introduced in RFC 2533 [1]) to
+ form the basis of a feature set identifier, and
+
+ o the use of a token based on a hash function computed over the
+ referenced feature set expression.
+
+
+
+
+Klyne & Masinter Standards Track [Page 4]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+ A key reason to use a hash function to generate an identifier is to
+ define a global name space without requiring a central naming
+ authority. New feature set tags can be introduced by any party
+ following the appropriate rules of formulation, without reference to
+ any centralized authority.
+
+ Local resolution services may be needed to map feature set tags to
+ their corresponding feature set expressions, but these are not able
+ to vary the meaning of any given tag. Failure of a resolution
+ service to return the correct expression is detectable by a calling
+ application, which should reject any incorrect value supplied.
+
+ NOTE: where a feature set reference is used, its meaning is
+ defined by substitution of the referenced feature expression into
+ the referencing expression. When all references have been thus
+ replaced, the result is interpreted as a normal feature
+ expression.
+
+ In particular, if a referenced feature expression contains some
+ feature tag that is also constrained by the referencing
+ expression, the constraints are interpreted per RFC 2533 [1],
+ without regard for their origin. E.g., (using some notation
+ introduced below):
+ (& (pix-x=100) (pix-y<=300)
+ (h.SBB5REAOMHC09CP2GM4V07PQP0) )
+ where (h.SBB5REAOMHC09CP2GM4V07PQP0) resolves to:
+ (& (pix-x<=200) (pix-y<=150) )
+ yields a result equivalent to:
+ (& (pix-x=100) (pix-y<=150) )
+
+3.1 Feature set hashed reference format
+
+ This specification introduces a special form of auxiliary predicate
+ name with the following syntax:
+
+ fname = "h." 1*BASE32DIGIT
+ BASE32DIGIT = DIGIT
+ / "A" / "B" / "C" / "D" / "E" / "F" / "G" / "H"
+ / "I" / "J" / "K" / "L" / "M" / "N" / "O" / "P"
+ / "Q" / "R" / "S" / "T" / "U" / "V"
+
+ The sequence of base-32 digits represents the value of a hash
+ function calculated over the corresponding feature set expression
+ (see following sections). Note that the above syntax allows upper-
+ or lower-case letters for base-32 digits (per RFC 2234 [3]).
+
+
+
+
+
+
+Klyne & Masinter Standards Track [Page 5]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+ Thus, within a feature set expression, a hashed feature set reference
+ would have the following form:
+
+ (h.123456789abcdefghijklmnopq)
+
+3.1.1 Hash value calculation
+
+ The hash value is calculated using the MD5 algorithm [6] over the
+ text of the referenced feature set expression subjected to certain
+ normalizations. The feature expression must conform to the syntax
+ given for 'filter' in RFC 2533 [1]:
+
+ filter = "(" filtercomp ")" *( ";" parameter )
+
+ The steps for calculating a hash value are:
+
+ 1. Whitespace normalization: all spaces, CR, LF, TAB and any other
+ layout control characters that may be embedded in the feature
+ expression string, other than those contained within quoted
+ strings, are removed (or ignored for the purpose of hash value
+ computation).
+
+ 2. Case normalization: all lower case letters in the feature
+ expression, other than those contained within quoted strings, are
+ converted to upper case. That is, unquoted characters with US-
+ ASCII values 97 to 122 (decimal) are changed to corresponding
+ characters in the range 65 to 90.
+
+ 3. Hash computation: the MD5 algorithm, described in RFC 1321 [6], is
+ applied to the normalized feature expression string (represented
+ as a sequence of octets containing US-ASCII character codes; see
+ also section 5).
+
+ The result obtained in step 3 is a 128-bit (16 octet) value that
+ is converted to a base-32 representation to form the feature set
+ reference.
+
+ NOTE: under some circumstances, removal of ALL whitespace may
+ result in an invalid feature expression string. This should not
+ be a problem as this is done only for the purpose of calculating
+ a hash value, and significantly different feature expressions are
+ expected to differ in ways other than their whitespace.
+
+ NOTE: case normalization is deemed appropriate since feature tag
+ and token matching is case insensitive.
+
+
+
+
+
+
+Klyne & Masinter Standards Track [Page 6]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+3.1.2 Base-32 value representation
+
+ RFC 1321 [6] describes how to calculate an MD5 hash value that is a
+ sequence of 16 octets. This is then required to be coded as a base-
+ 32 value, which is a sequence of base-32 digit characters.
+
+ Each successive character in a base-32 value represents 5 successive
+ bits of the underlying octet sequence. Thus, each group of 8
+ characters represents a sequence of 5 octets (40 bits):
+
+ 1 2 3
+ 01234567 89012345 67890123 45678901 23456789
+ +--------+--------+--------+--------+--------+
+ |< 1 >< 2| >< 3 ><|.4 >< 5.|>< 6 ><.|7 >< 8 >|
+ +--------+--------+--------+--------+--------+
+ <===> 8th character
+ <====> 7th character
+ <===> 6th character
+ <====> 5th character
+ <====> 4th character
+ <===> 3rd character
+ <====> 2nd character
+ <===> 1st character
+
+ The value (i.e. sequence of bits) represented by each base-32 digit
+ character is indicated by the following table:
+
+ "0" 0 "A" 10 "K" 20 "U" 30
+ "1" 1 "B" 11 "L" 21 "V" 31
+ "2" 2 "C" 12 "M" 22
+ "3" 3 "D" 13 "N" 23
+ "4" 4 "E" 14 "O" 24
+ "5" 5 "F" 15 "P" 25
+ "6" 6 "G" 16 "Q" 26
+ "7" 7 "H" 17 "R" 27
+ "8" 8 "I" 18 "S" 28
+ "9" 9 "J" 19 "T" 29
+
+ When encoding a base-32 value, each full group of 5 octets is
+ represented by a sequence of 8 characters indicated above. If a
+ group of less than 5 octets remain after this, they are encoded using
+ as many additional characters as may be needed: 1, 2, 3 or 4 octets
+ are encoded by 2, 4, 5 or 7 characters respectively. Any spare bits
+ represented by the base-32 digit characters are selected to be zero.
+
+
+
+
+
+
+
+Klyne & Masinter Standards Track [Page 7]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+ When decoding a base-32 value, the reverse mapping is applied: each
+ full group of 8 characters codes a sequence of 5 octets. A final
+ group of 2, 4, 5 or 7 characters codes a sequence of 1, 2, 3 or 4
+ octets respectively. Any spare bits represented by the final group
+ of characters are discarded.
+
+ Thus, for a 128-bit (16 octet) MD5 hash value, the first 15 octets
+ are coded as 24 base 32 digit characters, and the final octet is
+ coded by two characters.
+
+ NOTE: Base64 representation (per MIME [4]) would be more compact
+ (21 rather than 26 characters for the MD5 128-bit hash value),
+ but an auxiliary predicate name is defined (by [1]) to have the
+ same syntax as a feature tag, and the feature tag matching rules
+ (per [2]) state that feature tag matching is case insensitive.
+
+ Base36 representation was considered (i.e., using all letters
+ "A"-"Z") but was not used because this would require extended
+ precision multiplication and division operations to encode and
+ decode the hash values.
+
+3.2 Resolving feature set identifiers
+
+ This memo does not mandate any particular mechanism for dereferencing
+ a feature set identifier. It is expected that specific dereferencing
+ mechanisms will be specified for any application or protocol that
+ uses them.
+
+ The following sections describe some ways that feature set
+ dereferencing information may be incorporated into a feature set
+ expression. These are based on auxiliary predicate definitions
+ within a "where" clause [1].
+
+ When a hashed feature set reference is used, conformance to the
+ hashing rules takes precedence over any other determination of the
+ feature expression. Any expression, however obtained, may not be
+ substituted for the hash-based reference unless it yields the correct
+ hash value.
+
+3.2.1 Query protocol
+
+ A protocol providing request/response type queries (e.g., HTTP, LDAP,
+ etc.) might be set up to provide a resolution service.
+
+ Thus, a query to a server associated with the capabilities could be
+ performed on the feature set identifier. The response returned would
+ be a CONNEG expression; e.g.,
+
+
+
+
+Klyne & Masinter Standards Track [Page 8]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+ (h.SBB5REAOMHC09CP2GM4V07PQP0)
+ where
+ (h.SBB5REAOMHC09CP2GM4V07PQP0) :- (& (pix-x<=200) (pix-y<=150) )
+ end
+
+ or just:
+
+ (& (pix-x<=200) (pix-y<=150) )
+
+ This result would be combined with the original expression to
+ obtain a result not including the hash based predicate.
+
+ This process might be further enhanced by using URN resolution
+ mechanisms (e.g., DNS NAPTR [10]) to discover the resolution
+ protocol and server.
+
+3.2.2 Inline feature set details
+
+ In this case, a reference is resolved by including its definition
+ inline in an expression.
+
+ The feature set expression associated with a reference value may be
+ specified directly in a "where" clause, using the auxiliary
+ predicate definition syntax [1]; e.g.,
+
+ (& (dpi=100) (h.SBB5REAOMHC09CP2GM4V07PQP0) )
+ where
+ (h.SBB5REAOMHC09CP2GM4V07PQP0) :- (& (pix-x<=200) (pix-y<=150) )
+ end
+
+ This form might be used on request (where the request mechanism is
+ defined by the invoking application protocol), or when the originator
+ believes the recipient may not understand the reference.
+
+ It is an error if the inline feature expression does not yield the
+ hash value contained in auxiliary predicate name.
+
+ NOTE: viewed in isolation, this format does not have any obvious
+ value, in that the (h.xxx) form of auxiliary predicate could be
+ replaced by any arbitrary name.
+
+ It is anticipated that this form might be used as a follow-up
+ response in a sequence along the lines of:
+ A> Capabilities are:
+ (& (dpi=100) (h.SBB5REAOMHC09CP2GM4V07PQP0) )
+ B> Do not understand:
+ (h.SBB5REAOMHC09CP2GM4V07PQP0)
+
+
+
+
+Klyne & Masinter Standards Track [Page 9]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+ A> Capabilities are:
+ (& (dpi=100) (h.SBB5REAOMHC09CP2GM4V07PQP0) )
+ where
+ (h.SBB5REAOMHC09CP2GM4V07PQP0) :- (& (pix-x<=200)
+ (pix-y<=150) )
+ end
+
+4. Examples
+
+ The following are some examples of feature set expressions containing
+ feature set references:
+
+ (& (dpi=100) (h.SBB5REAOMHC09CP2GM4V07PQP0) )
+
+ (& (dpi=100) (h.SBB5REAOMHC09CP2GM4V07PQP0) )
+ where
+ (h.SBB5REAOMHC09CP2GM4V07PQP0) :-
+ (& (pix-x<=200) (pix-y<=150) )
+ end
+
+ (h.QGEOPMCF02P09QC016CEPU22FO)
+ where
+ (h.QGEOPMCF02P09QC016CEPU22FO) :-
+ (| (& (ua-media=continuous) (dpi=200) (dpi-xyratio=200/100)
+ (color=Binary) (paper-size=B4) (image-coding=MH) )
+ (& (ua-media=continuous) (dpi=200) (dpi-xyratio=200/100)
+ (color=Binary) (paper-size=B4) (image-coding=MR) )
+ (& (ua-media=stationery) (dpi=300) (dpi-xyratio=1)
+ (color=Binary) (paper-size=A4) (image-coding=JBIG) )
+ (& (ua-media=transparency) (dpi=300) (dpi-xyratio=1)
+
+ (color=Binary) (paper-size=A4) (image-coding=JBIG) ) )
+ end
+
+ The following examples are based on Internet fax work, and show how a
+ feature-hash might be used to express the commonly-used features. A
+ form of Internet fax system that is expected to be quite common is a
+ so-called "simple mode" system, whose capabilities are described by
+ the following feature expression:
+
+ (& (image-file-structure=TIFF-minimal)
+ (MRC-mode=0)
+ (color=Binary)
+ (image-coding=MH) (MRC-mode=0)
+ (| (& (dpi=204) (dpi-xyratio=[204/98,204/196]) )
+ (& (dpi=200) (dpi-xyratio=[200/100,1]) ) )
+ (size-x<=2150/254)
+ (paper-size=A4)
+
+
+
+Klyne & Masinter Standards Track [Page 10]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+ (ua-media=stationery) )
+
+ This might be expressed by the hash-based feature set identifier:
+
+ (h.MSB955PVIRT1QOHET9AJT5JM3O)
+
+ The following example describes capabilities of a full-color
+ Internet fax system. Note a number of feature values are
+ applicable in common with '(color=grey)' and '(color=full)':
+
+ (& (image-file-structure=TIFF)
+ (MRC-mode=0)
+ (| (& (color=Binary)
+ (image-coding=[MH,MR,MMR])
+ (| (& (dpi=204) (dpi-xyratio=[204/98,204/196]) )
+ (& (dpi=200) (dpi-xyratio=[200/100,1]) )
+ (& (dpi=300) (dpi-xyratio=1) ) ) )
+ (& (color=grey)
+ (image-coding=JPEG)
+ (image-coding-constraint=JPEG-T4E)
+ (color-levels<=256)
+ (color-space=CIELAB)
+ (color-illuminant=D50)
+ (CIELAB-L-min>=0)
+ (CIELAB-L-max<=100)
+ (dpi=[100,200,300]) (dpi-xyratio=1) )
+ (& (color=full)
+ (image-coding=JPEG)
+ (image-coding-constraint=JPEG-T4E)
+ (color-subsampling=["1:1:1","4:1:1"])
+ (color-levels<=16777216)
+ (color-space=CIELAB)
+ (color-illuminant=D50)
+ (CIELAB-L-min>=0)
+ (CIELAB-L-max<=100)
+ (CIELAB-a-min>=-85)
+ (CIELAB-a-max<=85)
+ (CIELAB-b-min>=-75)
+ (CIELAB-b-max<=125)
+ (dpi=[100,200,300]) (dpi-xyratio=1) ) )
+ (size-x<=2150/254)
+ (paper-size=[letter,A4,B4]) )
+ (ua-media=stationery) )
+
+
+
+
+
+
+
+
+Klyne & Masinter Standards Track [Page 11]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+ Separating out the common capabilities yields:
+
+ (& (image-file-structure=TIFF)
+ (MRC-mode=0)
+ (| (& (color=Binary)
+ (image-coding=[MH,MR,MMR])
+ (| (& (dpi=204) (dpi-xyratio=[204/98,204/196]) )
+ (& (dpi=200) (dpi-xyratio=[200/100,1]) )
+ (& (dpi=300) (dpi-xyratio=1) ) ) )
+ (& (color=grey)
+ (color-levels<=256)
+ (h.QVSEM8V2LMJ8VOR7V682J7079O) )
+ (& (color=full)
+ (color-subsampling=["1:1:1","4:1:1"])
+ (color-levels<=16777216)
+ (CIELAB-a-min>=-85)
+ (CIELAB-a-max<=85)
+ (CIELAB-b-min>=-75)
+ (CIELAB-b-max<=125)
+ (h.QVSEM8V2LMJ8VOR7V682J7079O) ) )
+ (size-x<=2150/254)
+ (paper-size=[letter,A4,B4]) )
+ (ua-media=stationery) )
+ where
+ (h.QVSEM8V2LMJ8VOR7V682J7079O) :-
+ (& (image-coding=JPEG)
+ (image-coding-constraint=JPEG-T4E)
+ (color-space=CIELAB)
+ (color-illuminant=D50)
+ (CIELAB-L-min>=0)
+ (CIELAB-L-max<=100)
+ (dpi=[100,200,300]) (dpi-xyratio=1) )
+ end
+
+5. Internationalization Considerations
+
+ Feature set expressions and URI strings are currently defined to
+ consist of only characters from the US-ASCII repertoire [1,5]; under
+ these circumstances this specification is not impacted by
+ internationalization considerations (other than any already
+ applicable to URIs [5]).
+
+ But, if future revisions of the feature set syntax permit non-US-
+ ASCII characters (e.g. within quoted strings), then some canonical
+ representation must be defined for the purposes of calculating hash
+ values. One choice might be to use a UTF-8 equivalent representation
+ as the basis for calculating the feature set hash. Another choice
+
+
+
+
+Klyne & Masinter Standards Track [Page 12]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+ might be to leave this as an application protocol issue (but this
+ could lead to non-interoperable feature sets between different
+ protocols).
+
+ Another conceivable issue is that of up-casing the feature expression
+ in preparation for computing a hash value. This does not apply to
+ the content of strings so is not likely to be an issue. But if
+ changes are made that do permit non-US-ASCII characters in feature
+ tags or token strings, consideration must be given to properly
+ defining how case conversion is to be performed.
+
+6. Security Considerations
+
+ For the most part, security considerations are the same as those that
+ apply for capability identification in general [1,2,9].
+
+ A possible added consideration is that use of a specific feature set
+ identifier may reveal more information about a system than is
+ necessary for a transaction at hand.
+
+7. Acknowledgements
+
+ Ideas here have been improved by early discussions with Martin
+ Duerst, Al Gilman and Ted Hardie. Useful suggestions for improvement
+ were provided by Maurizio Codogno.
+
+8. References
+
+ [1] Klyne, G., "A Syntax for Describing Media Feature Sets", RFC
+ 2533, March 1999.
+
+ [2] Mutz, A. and T. Hardie, "Media Feature Tag Registration
+ Procedure", RFC 2506, March 1999.
+
+ [3] Crocker, D. and P. Overell, "Augmented BNF for Syntax
+ Specifications: ABNF", RFC 2234, November 1997.
+
+ [4] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+ Extensions (MIME) Part 1: Format of Internet message bodies",
+ RFC 2045, November 1996.
+
+ [5] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource
+ Identifiers (URI): Generic Syntax", RFC 2396, August 1998.
+
+ [6] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, April
+ 1992.
+
+
+
+
+
+Klyne & Masinter Standards Track [Page 13]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+ [7] "Applied Cryptography"
+ Bruce Schneier
+ John Wiley and Sons, 1996 (second edition)
+ ISBN 0-471-12845-7 (cloth)
+ ISBN 0-471-11709-9 (paper)
+
+ [8] Klyne, G., "Protocol-independent Content Negotiation Framework",
+ RFC 2703, September 1999.
+
+ [9] "Numerical Recipes"
+ William H Press, Brian P Flannery, Saul A Teukolski and
+ William T Vetterling
+ Cambridge University Press (1986)
+ ISBN 0 521 30811 9
+ (The Gamma function approximation is presented in chapter 6 on
+ "Special Functions". There have been several later editions of
+ this book published, so the chapter reference may change.)
+
+ [10] Daniel, R. and M. Mealling, "Resolution of Uniform Resource
+ Identifiers using the Domain Name System", RFC 2168, June 1997.
+
+ [11] Java source code of feature set matching algorithm, with feature
+ set hash computation option. Linked from
+ <http://www.imc.org/ietf-medfree/>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Klyne & Masinter Standards Track [Page 14]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+9. Authors' Addresses
+
+ Graham Klyne
+ Content Technologies Ltd.
+ 1220 Parkview,
+ Arlington Business Park
+ Theale
+ Reading, RG7 4SA
+ United Kingdom
+
+ Phone: +44 118 930 1300
+ Fax: +44 118 930 1301
+ EMail: GK@ACM.ORG
+
+
+ Larry Masinter
+ AT&T Labs
+ 75 Willow Road
+ Menlo Park, CA 94025
+
+ Phone: +1-650-463-7059
+ EMail: LMM@acm.org
+ http://larry.masinter.net
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Klyne & Masinter Standards Track [Page 15]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+10. Appendix A: The birthday paradox
+
+ NOTE: this entire section is commentary, and does not affect the
+ feature set reference specification in any way.
+
+ The use of a hash value to represent an arbitrary feature set is
+ based on a presumption that no two distinct feature sets will yield
+ the same hash value.
+
+ There is a small but distinct possibility that two different feature
+ sets will indeed yield the same hash value.
+
+ We assume that the 128-bit hash function distributes hash values for
+ feature sets, even those with very small differences, randomly and
+ evenly through the range of 2^128 (approximately 3*10^38) possible
+ values. This is a fundamental property of a good digest algorithm
+ like MD5. Thus, the chance that any two distinct feature set
+ expressions yield the same hash is less than 1 in 10^38. This is
+ negligible when compared with, say, the probability that a receiving
+ system will fail having received data conforming to a negotiated
+ feature set.
+
+ But when the number of distinct feature sets in circulation
+ increases, the probability of repeating a hash value increases
+ surprisingly. This is illustrated by the "birthday paradox": given
+ a random collection of just 23 people, there is a greater than even
+ chance that there exists some pair with the same birthday. This
+ topic is discussed further in sections 7.4 and 7.5 of Bruce
+ Schneier's "Applied Cryptography" [7].
+
+ The table below shows the "birthday paradox" probabilities that at
+ least one pair of feature sets has the same hash value for different
+ numbers of feature sets in use.
+
+ Number of feature Probability of two
+ sets in use sets with the same
+ hash value
+ 1 0
+ 2 3E-39
+ 10 1E-37
+ 1E3 1E-33
+ 1E6 1E-27
+ 1E9 1E-21
+ 1E12 1E-15
+ 1E15 1E-9
+ 1E18 1E-3
+
+
+
+
+
+Klyne & Masinter Standards Track [Page 16]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+ The above probability computations are approximate, being
+ performed using logarithms of a Gamma function
+ approximation by Lanczos [9]. The probability formula is
+ 'P=1-(m!/((m-n)! m^n))', where 'm' is the total number of
+ possible hash values (2^128) and 'n' is the number of
+ feature sets in use.
+
+ If original feature set expressions are generated manually, or only
+ in response to some manually constrained process, the total number
+ of feature sets in circulation is likely to remain very small in
+ relation to the total number of possible hash values.
+
+ The outcome of all this is: assuming that the feature sets are
+ manually generated, even taking account of the birthday paradox
+ effect, the probability of incorrectly identifying a feature set
+ using a hash value is still negligibly small when compared with
+ other possible failure modes.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Klyne & Masinter Standards Track [Page 17]
+
+RFC 2938 Identifying Composite Media Features September 2000
+
+
+11. Full Copyright Statement
+
+ Copyright (C) The Internet Society (2000). All Rights Reserved.
+
+ This document and translations of it may be copied and furnished to
+ others, and derivative works that comment on or otherwise explain it
+ or assist in its implementation may be prepared, copied, published
+ and distributed, in whole or in part, without restriction of any
+ kind, provided that the above copyright notice and this paragraph are
+ included on all such copies and derivative works. However, this
+ document itself may not be modified in any way, such as by removing
+ the copyright notice or references to the Internet Society or other
+ Internet organizations, except as needed for the purpose of
+ developing Internet standards in which case the procedures for
+ copyrights defined in the Internet Standards process must be
+ followed, or as required to translate it into languages other than
+ English.
+
+ The limited permissions granted above are perpetual and will not be
+ revoked by the Internet Society or its successors or assigns.
+
+ This document and the information contained herein is provided on an
+ "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
+ TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
+ BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
+ HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
+ MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Acknowledgement
+
+ Funding for the RFC Editor function is currently provided by the
+ Internet Society.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Klyne & Masinter Standards Track [Page 18]
+