diff options
Diffstat (limited to 'doc/rfc/rfc2169.txt')
-rw-r--r-- | doc/rfc/rfc2169.txt | 507 |
1 files changed, 507 insertions, 0 deletions
diff --git a/doc/rfc/rfc2169.txt b/doc/rfc/rfc2169.txt new file mode 100644 index 0000000..f5db5a4 --- /dev/null +++ b/doc/rfc/rfc2169.txt @@ -0,0 +1,507 @@ + + + + + + +Network Working Group R. Daniel +Request for Comments: 2169 Los Alamos National Laboratory +Category: Experimental June 1997 + + + A Trivial Convention for using HTTP in URN Resolution + +Status of this Memo +=================== + + This memo defines an Experimental Protocol for the Internet + community. This memo does not specify an Internet standard of any + kind. Discussion and suggestions for improvement are requested. + Distribution of this memo is unlimited. + +Abstract: +========= + + The Uniform Resource Names Working Group (URN-WG) was formed to + specify persistent, location-independent names for network accessible + resources, as well as resolution mechanisms to retrieve the resources + given such a name. At this time the URN-WG is considering one + particular resolution mechanism, the NAPTR proposal [1]. That + proposal specifies how a client may find a "resolver" for a URN. A + resolver is a database that can provide information about the + resource identified by a URN, such as the resource's location, a + bibliographic description, or even the resource itself. The protocol + used for the client to communicate with the resolver is not specified + in the NAPTR proposal. Instead, the NAPTR resource record provides a + field that indicates the "resolution protocol" and "resolution + service requests" offered by the resolver. + + This document specifies the "THTTP" resolution protocol - a trivial + convention for encoding resolution service requests and responses as + HTTP 1.0 or 1.1 requests and responses. The primary goal of THTTP is + to be simple to implement so that existing HTTP servers may easily + add support for URN resolution. We expect that the databases used by + early resolvers will be useful when more sophisticated resolution + protocols are developed later. + +1.0 Introduction: +================== + + The NAPTR specification[1] defined a new DNS resource record which + may be used to discover resolvers for Uniform Resource Identifiers. + That resource record provides the "services" field to specify the + "resolution protocol" spoken by the resolver, as well as the + "resolution services" it offers. Resolution protocols mentioned in + + + +Daniel Experimental [Page 1] + +RFC 2169 HTTP in URN Resolution June 1997 + + + that specification are Z3950, THTTP, RCDS, HDL, and RWHOIS. (That + list is expected to grow over time). The NAPTR specification also + lists a variety of resolution services, such as N2L (given a URN, + return a URL); N2R (Given a URN, return the named resource), etc. + + This document specifies the "THTTP" (Trivial HTTP) resolution + protocol. THTTP is a simple convention for encoding resolution + service requests and responses as HTTP 1.0 or 1.1 requests and + responses. The primary goal of THTTP is to have a URN resolution + protocol that can easily be added to existing HTTP daemons. Other + resolution protocols are expected to arise over time, so this + document serves a secondary purpose of illustrating the information + that needs to be specified for a URN resolution protocol. One of the + resolution protocols we expect to be developed is an extension of + HTTP with new methods for the resolution services. Therefore, we use + "THTTP" as the identifier for this protocol to leave "HTTP" for later + developments. + + The reader is assumed to be familiar with the HTTP/1.0 [2] and 1.1 + [3] specifications. Implementors of this specification should be + familiar with CGI scripts, or server-specific interfaces, for + database lookups. + +2.0 General Approach: +===================== + + The general approach used to encode resolution service requests in + THTTP is quite simple: + + GET /uri-res/<service>?<uri> HTTP/1.0 + + For example, if we have the URN "urn:foo:12345-54321" and want a URL, + we would send the request: + + GET /uri-res/N2L?urn:foo:12345-54321 HTTP/1.0 + + The request could also be encoded as an HTTP 1.1 request. This would + look like: + + GET /uri-res/N2L?urn:foo:12345-54321 HTTP/1.1 + Host: <whatever host we are sending the request to> + + Responses from the HTTP server follow standard HTTP practice. Status + codes, such as 200 (OK) or 404 (Not Found) shall be returned. The + normal rules for determining cachability, negotiating formats, etc. + apply. + + + + + +Daniel Experimental [Page 2] + +RFC 2169 HTTP in URN Resolution June 1997 + + + Handling these requests on the server side is easy to implement using + CGI or other, server-specific, extension mechanisms. CGI scripts + will see the incoming URI in the QUERY_STRING environment variable. + Any %encoded characters in the URN will remain in their %encoded + state in that string. The script can take the URN, look it up in a + database, and return the requested information. + + One caveat should be kept in mind. The URN syntax document [4] + discusses the notion of lexical equivalance and requires that + resolvers return identical results for URNs that are lexically + equivalent. Implementors of this specification must be careful to + obey that rule. For example, the two requests below MUST return + identical results, since the URNs are lexically equivalent. + GET /uri-res/N2L?urn:cid:foo@huh.com HTTP/1.0 + GET /uri-res/N2L?URN:CID:foo@huh.com HTTP/1.0 + +3.0 Service-specific details: +============================= + + This section goes through the various resolution services established + in the URN services document [5] and states how to encode each of + them, how the results should be returned, and any special status + codes that are likely to arise. + + Unless stated otherwise, the THTTP requests are formed according to + the simple convention above, either for HTTP/1.0 or HTTP/1.1. The + response is assumed to be an entity with normal headers and body + unless stated otherwise. (N2L is the only request that need not + return a body). + +3.1 N2L (URN to URL): +---------------------- + + The request is encoded as above. The URL MUST be returned in a + Location: header for the convienience of the user in the most common + case of wanting the resource. If the lookup is successful, a 30X + status line SHOULD be returned. HTTP/1.1 clients should be sent the + 303 status code. HTTP/1.0 clients should be sent the 302 (Moved + temporarily) status code unless the resolver has particular reasons + for using 301 (moved permanently) or 304 (not modified) codes. + + Note that access controls may be applied to this, or any other, + resolution service request. Therefore the 401 (unauthorized) and 403 + (forbidden) status codes are legal responses. The server may wish to + provide a body in the response to explain the reason for refusing + access, and/or to provide alternate information about the resource, + such as the price it will cost to obtain the resource's URL. + + + + +Daniel Experimental [Page 3] + +RFC 2169 HTTP in URN Resolution June 1997 + + +3.2 N2Ls (URN to URLs): +------------------------ + + The request is encoded as above. The result is a list of 0 or more + URLs. The Internet Media Type (aka ContentType) of the result may be + negotiated using standard HTTP mechanisms if desired. At a minimum + the resolver should support the text/uri-list media type. (See + Appendix A for the definition of this media type). That media type is + suitable for machine-processing of the list of URLs. Resolvers may + also return the results as text/html, text/plain, or any other media + type they deem suitable. + + No matter what the particular media type, the result MUST be a list + of the URLs which may be used to obtain an instance of the resource + identified by the URN. All URIs shall be encoded according to the URI + specification [6]. + + If the client has requested the result be returned as text/html or + application/html, the result should be a valid HTML docment + containing the fragment: + <UL> + <LI><A HREF="...url 1...">...url 1...</A> + <LI><A HREF="...url 2...">...url 2...</A> + etc. + </UL> + where the strings ...url n... are replaced by the n'th URL in the + list. + +3.3 N2R (URN to Resource): +--------------------------- + + The request is encoded as above. The resource is returned using + standard HTTP mechanisms. The request may be modified using the + Accept: header as in normal HTTP to specify that the result be given + in a preferred Internet Media Type. + +3.4 N2Rs (URN to Resources): +----------------------------- + + This resolution service returns multiple instances of a resource, for + example, GIF and JPEG versions of an image. The judgment about the + resources being "the same" resides with the naming authority that + issued the URN. + + The request is encoded as above. The result shall be a MIME + multipart/alternative message with the alternative versions of the + resource in seperate body parts. If there is only one version of the + resource identified by the URN, it MAY be returned without the + + + +Daniel Experimental [Page 4] + +RFC 2169 HTTP in URN Resolution June 1997 + + + multipart/alternative wrapper. Resolver software SHOULD look at the + Accept: header, if any, and only return versions of the resource that + are acceptable according to that header. + +3.5 N2C (URN to URC): +---------------------- + + URCs (Uniform Resource Characteristics) are descriptions of other + resources. This request allows us to obtain a description of the + resource identified by a URN, as opposed to the resource itself. The + description might be a bibliographic citation, a digital signature, a + revision history, etc. This document does not specify the content of + any response to a URC request. That content is expected to vary from + one resolver to another. + + The format of any response to a N2C request MUST be communicated + using the ContentType header, as is standard HTTP practice. The + Accept: header SHOULD be honored. + +3.6 N2Ns (URN to URNs): +------------------------ + + While URNs are supposed to identify one and only one resource, that + does not mean that a resource may have one and only one URN. For + example, consider a resource that has something like "current- + weather-map" for one URN and "weather-map-for-datetime-x" for another + URN. The N2Ns service request lets us obtain lists of URNs that are + believed equivalent at the time of the request. As the weathermap + example shows, some of the equivalances will be transitory, so the + standard HTTP mechanisms for communicating cachability MUST be + honored. + + The request is encoded as above. The result is a list of all the + URNs, known to the resolver, which identify the same resource as the + input URN. The result shall be encoded as for the N2Ls request above + (text/uri-list unless specified otherwise by an Accept: header). + +3.7 L2Ns (URL to URNs): +---------------------- + + The request is encoded as above. The response is a list of any URNs + known to be assigned to the resource at the given URL. The result + shall be encoded as for the N2Ls and N2Ns requests. + + + + + + + + +Daniel Experimental [Page 5] + +RFC 2169 HTTP in URN Resolution June 1997 + + +3.8 L2Ls (URL to URLs): +------------------------ + + The request is encoded as described above. The result is a list of + all the URLs that the resolver knows are associated with the resource + located by the given URL. This is encoded as for the N2Ls, N2Ns, and + L2Ns requests. + +3.9 L2C (URL to URC): +---------------------- + + The request is encoded as above, the response is the same as for the + N2C request. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Daniel Experimental [Page 6] + +RFC 2169 HTTP in URN Resolution June 1997 + + +Appendix A: The text/uri-list Internet Media Type +================================================= +[This appendix will be augmented or replaced by the registration of the +text/uri-list IMT once that registration has been performed]. + + Several of the resolution service requests, such as N2Ls, N2Ns, L2Ns, + L2Ls, result in a list of URIs being returned to the client. The + text/uri-list Internet Media Type is defined to provide a simple + format for the automatic processing of such lists of URIs. + + The format of text/uri-list resources is: + + 1) Any lines beginning with the '#' character are comment lines + and are ignored during processing. (Note that '#' is a character + that may appear in URIs, so it only denotes a comment when it is the + first character on a line). + 2) The remaining non-comment lines MUST be URIs (URNs or URLs), encoded + according to the URI specification RFC[6]. Each URI shall appear on + one and only one line. + 3) As for all text/* formats, lines are terminated with a CR LF pair, + although clients should be liberal in accepting lines with only + one of those characters. + + In applications where one URI has been mapped to a list of URIs, such + as in response to the N2Ls request, the first line of the text/uri- + list response SHOULD be a comment giving the original URI. + + An example of such a result for the N2L request is shown below in + figure 1. + + # urn:cid:foo@huh.org + http://www.huh.org/cid/foo.html + http://www.huh.org/cid/foo.pdf + ftp://ftp.foo.org/cid/foo.txt + + Figure 1: Example of the text/uri-list format + +Appendix B: n2l.pl script +========================== + + This is a simple CGI script for the N2L resolution service. It + assumes the presence of a DBM database to store the URN to URL + mappings. This script does not specify standard behavior, it is + provided merely as a courtesy for implementors. In fact, this script + does not process incoming Accept: headers, nor does it generate + status codes. Such behavior should be part of a real script for any + of the resolution services. + + + + +Daniel Experimental [Page 7] + +RFC 2169 HTTP in URN Resolution June 1997 + + + #!/bin/perl + # N2L - performs urn to url resolution + + $n2l_File = "...filename for DBM database..."; + + + $urn = $ENV{'QUERY_STRING'} ; + + # Sanity check on the URN. Minimum length of a valid URN is + # 7 characters - "urn:", a 1-character Namespace ID, ":", and + # a 1-character namespace-specific string. More elaborate + # sanity checks should be part of a real resolver script. + if(length($urn)<7) + { + $error=1; + } + + if(!$error) + { + # Convert lexically equivalent versions of a URI into + # a canonical version for DB lookups. + $urn =~ s/^urn:([^:]*):(.*)$/sprintf("urn:%s:%s", lc $1, $2)/ie; + + dbmopen(%lu,$n2l_File,0444); + if($lu{$urn}) + { + $url=$lu{$urn}; + print STDOUT "Location: $url\n\n"; + }else{ + $error=2; + } + dbmclose(%lu); + } + + if($error) + { + print "Content-Type: text/html \n\n"; + print "<html>\n"; + print "<head><title>URN Resolution: N2L</title></head>\n"; + print "<BODY>\n"; + print "<h1>URN to URL resolution failed for the URN:</h1>\n"; + print "<hr><h3>$urn</h3>\n"; + print "</body>\n"; + print "</html>\n"; + } + + exit; + + + + +Daniel Experimental [Page 8] + +RFC 2169 HTTP in URN Resolution June 1997 + + +References: +=========== + + [1] Daniel, Ron and Michael Mealling, RFC 2168, "Resolution of Uniform + Resource Identifiers using the Domain Name System", June 1997. + + [2] Berners-Lee, T, R. Fielding, H. Frystyk, RFC 1945, "Hypertext + Transfer Protocol -- HTTP/1.0", T. Berners-Lee, May 1996. + + [3] Fielding, R., J. Gettys, J.C. Mogul, H. Frystyk, T. Berners-Lee, + RFC 2068, "Hypertext Transfer Protocol -- HTTP/1.1", Jan. 1997. + + [4] Moats, R., RFC 2141, "URN Syntax", May 1997. + + [5] URN-WG. "URN Resolution Services". Work In Progress. + + [6] Berners-Lee, T., RFC 1630, "Universal Resource Identifiers in WWW: + A Unifying Syntax for the Expression of Names and Addresses of + Objects on the Network as used in the World-Wide Web", June 1994. + +Security Considerations +======================= + + Communications with a resolver may be of a sensitive nature. Some + resolvers will hold information that should only be released to + authorized users. The results from resolvers may be the target of + spoofing, especially once electronic commerce transactions are common + and there is money to be made by directing users to pirate + repositories rather than repositories which pay royalties to + rightsholders. Resolution requests may be of interest to traffic + analysts. The requests may also be subject to spoofing. + + The requests and responses in this draft are amenable to encoding, + signing, and authentication in the manner of any other HTTP traffic. + +Author Contact Information: +=========================== + + Advanced Computing Lab, MS B287 + Los Alamos National Laboratory + Los Alamos, NM, USA, 87545 + voice: +1 505 665 0597 + fax: +1 505 665 4939 + email: rdaniel@lanl.gov + + + + + + + +Daniel Experimental [Page 9] + |