summaryrefslogtreecommitdiff
path: root/doc/rfc/rfc2168.txt
diff options
context:
space:
mode:
Diffstat (limited to 'doc/rfc/rfc2168.txt')
-rw-r--r--doc/rfc/rfc2168.txt1123
1 files changed, 1123 insertions, 0 deletions
diff --git a/doc/rfc/rfc2168.txt b/doc/rfc/rfc2168.txt
new file mode 100644
index 0000000..3eed1bd
--- /dev/null
+++ b/doc/rfc/rfc2168.txt
@@ -0,0 +1,1123 @@
+
+
+
+
+
+
+Network Working Group R. Daniel
+Request for Comments: 2168 Los Alamos National Laboratory
+Category: Experimental M. Mealling
+ Network Solutions, Inc.
+ June 1997
+
+
+ Resolution of Uniform Resource Identifiers
+ using the Domain Name System
+
+Status of this Memo
+===================
+
+ This memo defines an Experimental Protocol for the Internet
+ community. This memo does not specify an Internet standard of any
+ kind. Discussion and suggestions for improvement are requested.
+ Distribution of this memo is unlimited.
+
+Abstract:
+=========
+
+ Uniform Resource Locators (URLs) are the foundation of the World Wide
+ Web, and are a vital Internet technology. However, they have proven
+ to be brittle in practice. The basic problem is that URLs typically
+ identify a particular path to a file on a particular host. There is
+ no graceful way of changing the path or host once the URL has been
+ assigned. Neither is there a graceful way of replicating the resource
+ located by the URL to achieve better network utilization and/or fault
+ tolerance. Uniform Resource Names (URNs) have been hypothesized as a
+ adjunct to URLs that would overcome such problems. URNs and URLs are
+ both instances of a broader class of identifiers known as Uniform
+ Resource Identifiers (URIs).
+
+ The requirements document for URN resolution systems[15] defines the
+ concept of a "resolver discovery service". This document describes
+ the first, experimental, RDS. It is implemented by a new DNS Resource
+ Record, NAPTR (Naming Authority PoinTeR), that provides rules for
+ mapping parts of URIs to domain names. By changing the mapping
+ rules, we can change the host that is contacted to resolve a URI.
+ This will allow a more graceful handling of URLs over long time
+ periods, and forms the foundation for a new proposal for Uniform
+ Resource Names.
+
+
+
+
+
+
+
+
+
+Daniel & Mealling Experimental [Page 1]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ In addition to locating resolvers, the NAPTR provides for other
+ naming systems to be grandfathered into the URN world, provides
+ independence between the name assignment system and the resolution
+ protocol system, and allows multiple services (Name to Location, Name
+ to Description, Name to Resource, ...) to be offered. In conjunction
+ with the SRV RR, the NAPTR record allows those services to be
+ replicated for the purposes of fault tolerance and load balancing.
+
+Introduction:
+=============
+
+ Uniform Resource Locators have been a significant advance in
+ retrieving Internet-accessible resources. However, their brittle
+ nature over time has been recognized for several years. The Uniform
+ Resource Identifier working group proposed the development of Uniform
+ Resource Names to serve as persistent, location-independent
+ identifiers for Internet resources in order to overcome most of the
+ problems with URLs. RFC-1737 [1] sets forth requirements on URNs.
+
+ During the lifetime of the URI-WG, a number of URN proposals were
+ generated. The developers of several of those proposals met in a
+ series of meetings, resulting in a compromise known as the Knoxville
+ framework. The major principle behind the Knoxville framework is
+ that the resolution system must be separate from the way names are
+ assigned. This is in marked contrast to most URLs, which identify the
+ host to contact and the protocol to use. Readers are referred to [2]
+ for background on the Knoxville framework and for additional
+ information on the context and purpose of this proposal.
+
+ Separating the way names are resolved from the way they are
+ constructed provides several benefits. It allows multiple naming
+ approaches and resolution approaches to compete, as it allows
+ different protocols and resolvers to be used. There is just one
+ problem with such a separation - how do we resolve a name when it
+ can't give us directions to its resolver?
+
+ For the short term, DNS is the obvious candidate for the resolution
+ framework, since it is widely deployed and understood. However, it is
+ not appropriate to use DNS to maintain information on a per-resource
+ basis. First of all, DNS was never intended to handle that many
+ records. Second, the limited record size is inappropriate for catalog
+ information. Third, domain names are not appropriate as URNs.
+
+ Therefore our approach is to use DNS to locate "resolvers" that can
+ provide information on individual resources, potentially including
+ the resource itself. To accomplish this, we "rewrite" the URI into a
+ domain name following the rules provided in NAPTR records. Rewrite
+ rules provide considerable power, which is important when trying to
+
+
+
+Daniel & Mealling Experimental [Page 2]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ meet the goals listed above. However, collections of rules can become
+ difficult to understand. To lessen this problem, the NAPTR rules are
+ *always* applied to the original URI, *never* to the output of
+ previous rules.
+
+ Locating a resolver through the rewrite procedure may take multiple
+ steps, but the beginning is always the same. The start of the URI is
+ scanned to extract its colon-delimited prefix. (For URNs, the prefix
+ is always "urn:" and we extract the following colon-delimited
+ namespace identifier [3]). NAPTR resolution begins by taking the
+ extracted string, appending the well-known suffix ".urn.net", and
+ querying the DNS for NAPTR records at that domain name. Based on the
+ results of this query, zero or more additional DNS queries may be
+ needed to locate resolvers for the URI. The details of the
+ conversation between the client and the resolver thus located are
+ outside the bounds of this draft. Three brief examples of this
+ procedure are given in the next section.
+
+ The NAPTR RR provides the level of indirection needed to keep the
+ naming system independent of the resolution system, its protocols,
+ and services. Coupled with the new SRV resource record proposal[4]
+ there is also the potential for replicating the resolver on multiple
+ hosts, overcoming some of the most significant problems of URLs. This
+ is an important and subtle point. Not only do the NAPTR and SRV
+ records allow us to replicate the resource, we can replicate the
+ resolvers that know about the replicated resource. Preventing a
+ single point of failure at the resolver level is a significant
+ benefit. Separating the resolution procedure from the way names are
+ constructed has additional benefits. Different resolution procedures
+ can be used over time, and resolution procedures that are determined
+ to be useful can be extended to deal with additional namespaces.
+
+Caveats
+=======
+
+ The NAPTR proposal is the first resolution procedure to be considered
+ by the URN-WG. There are several concerns about the proposal which
+ have motivated the group to recommend it for publication as an
+ Experimental rather than a standards-track RFC.
+
+ First, URN resolution is new to the IETF and we wish to gain
+ operational experience before recommending any procedure for the
+ standards track. Second, the NAPTR proposal is based on DNS and
+ consequently inherits concerns about security and administration. The
+ recent advancement of the DNSSEC and secure update drafts to Proposed
+ Standard reduce these concerns, but we wish to experiment with those
+ new capabilities in the context of URN administration. A third area
+ of concern is the potential for a noticeable impact on the DNS. We
+
+
+
+Daniel & Mealling Experimental [Page 3]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ believe that the proposal makes appropriate use of caching and
+ additional information, but it is best to go slow where the potential
+ for impact on a core system like the DNS is concerned. Fourth, the
+ rewrite rules in the NAPTR proposal are based on regular expressions.
+ Since regular expressions are difficult for humans to construct
+ correctly, concerns exist about the usability and maintainability of
+ the rules. This is especially true where international character sets
+ are concerned. Finally, the URN-WG is developing a requirements
+ document for URN Resolution Services[15], but that document is not
+ complete. That document needs to precede any resolution service
+ proposals on the standards track.
+
+Terminology
+===========
+
+ "Must" or "Shall" - Software that does not behave in the manner that
+ this document says it must is not conformant to this
+ document.
+ "Should" - Software that does not follow the behavior that this
+ document says it should may still be conformant, but is
+ probably broken in some fundamental way.
+ "May" - Implementations may or may not provide the described
+ behavior, while still remaining conformant to this
+ document.
+
+Brief overview and examples of the NAPTR RR:
+============================================
+
+ A detailed description of the NAPTR RR will be given later, but to
+ give a flavor for the proposal we first give a simple description of
+ the record and three examples of its use.
+
+ The key fields in the NAPTR RR are order, preference, service, flags,
+ regexp, and replacement:
+
+ * The order field specifies the order in which records MUST be
+ processed when multiple NAPTR records are returned in response to a
+ single query. A naming authority may have delegated a portion of
+ its namespace to another agency. Evaluating the NAPTR records in
+ the correct order is necessary for delegation to work properly.
+
+ * The preference field specifies the order in which records SHOULD be
+ processed when multiple NAPTR records have the same value of
+ "order". This field lets a service provider specify the order in
+ which resolvers are contacted, so that more capable machines are
+ contacted in preference to less capable ones.
+
+
+
+
+
+Daniel & Mealling Experimental [Page 4]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ * The service field specifies the resolution protocol and resolution
+ service(s) that will be available if the rewrite specified by the
+ regexp or replacement fields is applied. Resolution protocols are
+ the protocols used to talk with a resolver. They will be specified
+ in other documents, such as [5]. Resolution services are operations
+ such as N2R (URN to Resource), N2L (URN to URL), N2C (URN to URC),
+ etc. These will be discussed in the URN Resolution Services
+ document[6], and their behavior in a particular resolution protocol
+ will be given in the specification for that protocol (see [5] for a
+ concrete example).
+
+ * The flags field contains modifiers that affect what happens in the
+ next DNS lookup, typically for optimizing the process. Flags may
+ also affect the interpretation of the other fields in the record,
+ therefore, clients MUST skip NAPTR records which contain an unknown
+ flag value.
+
+ * The regexp field is one of two fields used for the rewrite rules,
+ and is the core concept of the NAPTR record. The regexp field is a
+ String containing a sed-like substitution expression. (The actual
+ grammar for the substitution expressions is given later in this
+ draft). The substitution expression is applied to the original URN
+ to determine the next domain name to be queried. The regexp field
+ should be used when the domain name to be generated is conditional
+ on information in the URI. If the next domain name is always known,
+ which is anticipated to be a common occurrence, the replacement
+ field should be used instead.
+
+ * The replacement field is the other field that may be used for the
+ rewrite rule. It is an optimization of the rewrite process for the
+ case where the next domain name is fixed instead of being
+ conditional on the content of the URI. The replacement field is a
+ domain name (subject to compression if a DNS sender knows that a
+ given recipient is able to decompress names in this RR type's RDATA
+ field). If the rewrite is more complex than a simple substitution
+ of a domain name, the replacement field should be set to . and the
+ regexp field used.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Daniel & Mealling Experimental [Page 5]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ Note that the client applies all the substitutions and performs all
+ lookups, they are not performed in the DNS servers. Note also that it
+ is the belief of the developers of this document that regexps should
+ rarely be used. The replacement field seems adequate for the vast
+ majority of situations. Regexps are only necessary when portions of a
+ namespace are to be delegated to different resolvers. Finally, note
+ that the regexp and replacement fields are, at present, mutually
+ exclusive. However, developers of client software should be aware
+ that a new flag might be defined which requires values in both
+ fields.
+
+Example 1
+---------
+
+ Consider a URN that uses the hypothetical DUNS namespace. DUNS
+ numbers are identifiers for approximately 30 million registered
+ businesses around the world, assigned and maintained by Dunn and
+ Bradstreet. The URN might look like:
+
+ urn:duns:002372413:annual-report-1997
+
+ The first step in the resolution process is to find out about the
+ DUNS namespace. The namespace identifier, "duns", is extracted from
+ the URN, prepended to urn.net, and the NAPTRs for duns.urn.net looked
+ up. It might return records of the form:
+
+duns.urn.net
+;; order pref flags service regexp replacement
+ IN NAPTR 100 10 "s" "dunslink+N2L+N2C" "" dunslink.udp.isi.dandb.com
+ IN NAPTR 100 20 "s" "rcds+N2C" "" rcds.udp.isi.dandb.com
+ IN NAPTR 100 30 "s" "http+N2L+N2C+N2R" "" http.tcp.isi.dandb.com
+
+ The order field contains equal values, indicating that no name
+ delegation order has to be followed. The preference field indicates
+ that the provider would like clients to use the special dunslink
+ protocol, followed by the RCDS protocol, and that HTTP is offered as
+ a last resort. All the records specify the "s" flag, which will be
+ explained momentarily. The service fields say that if we speak
+ dunslink, we will be able to issue either the N2L or N2C requests to
+ obtain a URL or a URC (description) of the resource. The Resource
+ Cataloging and Distribution Service (RCDS)[7] could be used to get a
+ URC for the resource, while HTTP could be used to get a URL, URC, or
+ the resource itself. All the records supply the next domain name to
+ query, none of them need to be rewritten with the aid of regular
+ expressions.
+
+
+
+
+
+
+Daniel & Mealling Experimental [Page 6]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ The general case might require multiple NAPTR rewrites to locate a
+ resolver, but eventually we will come to the "terminal NAPTR". Once
+ we have the terminal NAPTR, our next probe into the DNS will be for a
+ SRV or A record instead of another NAPTR. Rather than probing for a
+ non-existent NAPTR record to terminate the loop, the flags field is
+ used to indicate a terminal lookup. If it has a value of "s", the
+ next lookup should be for SRV RRs, "a" denotes that A records should
+ sought. A "p" flag is also provided to indicate that the next action
+ is Protocol-specific, but that looking up another NAPTR will not be
+ part of it.
+
+ Since our example RR specified the "s" flag, it was terminal.
+ Assuming our client does not know the dunslink protocol, our next
+ action is to lookup SRV RRs for rcds.udp.isi.dandb.com, which will
+ tell us hosts that can provide the necessary resolution service. That
+ lookup might return:
+
+ ;; Pref Weight Port Target
+ rcds.udp.isi.dandb.com IN SRV 0 0 1000 defduns.isi.dandb.com
+ IN SRV 0 0 1000 dbmirror.com.au
+ IN SRV 0 0 1000 ukmirror.com.uk
+
+ telling us three hosts that could actually do the resolution, and
+ giving us the port we should use to talk to their RCDS server. (The
+ reader is referred to the SRV proposal [4] for the interpretation of
+ the fields above).
+
+ There is opportunity for significant optimization here. We can return
+ the SRV records as additional information for terminal NAPTRs (and
+ the A records as additional information for those SRVs). While this
+ recursive provision of additional information is not explicitly
+ blessed in the DNS specifications, it is not forbidden, and BIND does
+ take advantage of it [8]. This is a significant optimization. In
+ conjunction with a long TTL for *.urn.net records, the average number
+ of probes to DNS for resolving DUNS URNs would approach one.
+ Therefore, DNS server implementors SHOULD provide additional
+ information with NAPTR responses. The additional information will be
+ either SRV or A records. If SRV records are available, their A
+ records should be provided as recursive additional information.
+
+ Note that the example NAPTR records above are intended to represent
+ the reply the client will see. They are not quite identical to what
+ the domain administrator would put into the zone files. For one
+ thing, the administrator should supply the trailing '.' character on
+ any FQDNs.
+
+
+
+
+
+
+Daniel & Mealling Experimental [Page 7]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+Example 2
+---------
+
+ Consider a URN namespace based on MIME Content-Ids. The URN might
+ look like this:
+
+ urn:cid:199606121851.1@mordred.gatech.edu
+
+ (Note that this example is chosen for pedagogical purposes, and does
+ not conform to the recently-approved CID URL scheme.)
+
+ The first step in the resolution process is to find out about the CID
+ namespace. The namespace identifier, cid, is extracted from the URN,
+ prepended to urn.net, and the NAPTR for cid.urn.net looked up. It
+ might return records of the form:
+
+ cid.urn.net
+ ;; order pref flags service regexp replacement
+ IN NAPTR 100 10 "" "" "/urn:cid:.+@([^\.]+\.)(.*)$/\2/i" .
+
+ We have only one NAPTR response, so ordering the responses is not a
+ problem. The replacement field is empty, so we check the regexp
+ field and use the pattern provided there. We apply that regexp to the
+ entire URN to see if it matches, which it does. The \2 part of the
+ substitution expression returns the string "gatech.edu". Since the
+ flags field does not contain "s" or "a", the lookup is not terminal
+ and our next probe to DNS is for more NAPTR records:
+ lookup(query=NAPTR, "gatech.edu").
+
+ Note that the rule does not extract the full domain name from the
+ CID, instead it assumes the CID comes from a host and extracts its
+ domain. While all hosts, such as mordred, could have their very own
+ NAPTR, maintaining those records for all the machines at a site as
+ large as Georgia Tech would be an intolerable burden. Wildcards are
+ not appropriate here since they only return results when there is no
+ exactly matching names already in the system.
+
+ The record returned from the query on "gatech.edu" might look like:
+
+gatech.edu IN NAPTR
+;; order pref flags service regexp replacement
+ IN NAPTR 100 50 "s" "z3950+N2L+N2C" "" z3950.tcp.gatech.edu
+ IN NAPTR 100 50 "s" "rcds+N2C" "" rcds.udp.gatech.edu
+ IN NAPTR 100 50 "s" "http+N2L+N2C+N2R" "" http.tcp.gatech.edu
+
+
+
+
+
+
+
+Daniel & Mealling Experimental [Page 8]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ Continuing with our example, we note that the values of the order and
+ preference fields are equal in all records, so the client is free to
+ pick any record. The flags field tells us that these are the last
+ NAPTR patterns we should see, and after the rewrite (a simple
+ replacement in this case) we should look up SRV records to get
+ information on the hosts that can provide the necessary service.
+
+ Assuming we prefer the Z39.50 protocol, our lookup might return:
+
+ ;; Pref Weight Port Target
+ z3950.tcp.gatech.edu IN SRV 0 0 1000 z3950.gatech.edu
+ IN SRV 0 0 1000 z3950.cc.gatech.edu
+ IN SRV 0 0 1000 z3950.uga.edu
+
+ telling us three hosts that could actually do the resolution, and
+ giving us the port we should use to talk to their Z39.50 server.
+
+ Recall that the regular expression used \2 to extract a domain name
+ from the CID, and \. for matching the literal '.' characters
+ seperating the domain name components. Since '\' is the escape
+ character, literal occurances of a backslash must be escaped by
+ another backslash. For the case of the cid.urn.net record above, the
+ regular expression entered into the zone file should be
+ "/urn:cid:.+@([^\\.]+\\.)(.*)$/\\2/i". When the client code actually
+ receives the record, the pattern will have been converted to
+ "/urn:cid:.+@([^.]+\.)(.*)$/\2/i".
+
+Example 3
+---------
+
+ Even if URN systems were in place now, there would still be a
+ tremendous number of URLs. It should be possible to develop a URN
+ resolution system that can also provide location independence for
+ those URLs. This is related to the requirement in [1] to be able to
+ grandfather in names from other naming systems, such as ISO Formal
+ Public Identifiers, Library of Congress Call Numbers, ISBNs, ISSNs,
+ etc.
+
+ The NAPTR RR could also be used for URLs that have already been
+ assigned. Assume we have the URL for a very popular piece of
+ software that the publisher wishes to mirror at multiple sites around
+ the world:
+
+ http://www.foo.com/software/latest-beta.exe
+
+
+
+
+
+
+
+Daniel & Mealling Experimental [Page 9]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ We extract the prefix, "http", and lookup NAPTR records for
+ http.urn.net. This might return a record of the form
+
+ http.urn.net IN NAPTR
+ ;; order pref flags service regexp replacement
+ 100 90 "" "" "!http://([^/:]+)!\1!i" .
+
+ This expression returns everything after the first double slash and
+ before the next slash or colon. (We use the '!' character to delimit
+ the parts of the substitution expression. Otherwise we would have to
+ use backslashes to escape the forward slashes, and would have a
+ regexp in the zone file that looked like
+ "/http:\\/\\/([^\\/:]+)/\\1/i".).
+
+ Applying this pattern to the URL extracts "www.foo.com". Looking up
+ NAPTR records for that might return:
+
+ www.foo.com
+ ;; order pref flags service regexp replacement
+ IN NAPTR 100 100 "s" "http+L2R" "" http.tcp.foo.com
+ IN NAPTR 100 100 "s" "ftp+L2R" "" ftp.tcp.foo.com
+
+ Looking up SRV records for http.tcp.foo.com would return information
+ on the hosts that foo.com has designated to be its mirror sites. The
+ client can then pick one for the user.
+
+NAPTR RR Format
+===============
+
+ The format of the NAPTR RR is given below. The DNS type code for
+ NAPTR is 35.
+
+ Domain TTL Class Order Preference Flags Service Regexp
+ Replacement
+
+ where:
+
+ Domain
+ The domain name this resource record refers to.
+ TTL
+ Standard DNS Time To Live field
+ Class
+ Standard DNS meaning
+
+
+
+
+
+
+
+
+Daniel & Mealling Experimental [Page 10]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ Order
+ A 16-bit integer specifying the order in which the NAPTR
+ records MUST be processed to ensure correct delegation of
+ portions of the namespace over time. Low numbers are processed
+ before high numbers, and once a NAPTR is found that "matches"
+ a URN, the client MUST NOT consider any NAPTRs with a higher
+ value for order.
+
+ Preference
+ A 16-bit integer which specifies the order in which NAPTR
+ records with equal "order" values SHOULD be processed, low
+ numbers being processed before high numbers. This is similar
+ to the preference field in an MX record, and is used so domain
+ administrators can direct clients towards more capable hosts
+ or lighter weight protocols.
+
+ Flags
+ A String giving flags to control aspects of the rewriting and
+ interpretation of the fields in the record. Flags are single
+ characters from the set [A-Z0-9]. The case of the alphabetic
+ characters is not significant.
+
+ At this time only three flags, "S", "A", and "P", are defined.
+ "S" means that the next lookup should be for SRV records
+ instead of NAPTR records. "A" means that the next lookup
+ should be for A records. The "P" flag says that the remainder
+ of the resolution shall be carried out in a Protocol-specific
+ fashion, and we should not do any more DNS queries.
+
+ The remaining alphabetic flags are reserved. The numeric flags
+ may be used for local experimentation. The S, A, and P flags
+ are all mutually exclusive, and resolution libraries MAY
+ signal an error if more than one is given. (Experimental code
+ and code for assisting in the creation of NAPTRs would be more
+ likely to signal such an error than a client such as a
+ browser). We anticipate that multiple flags will be allowed in
+ the future, so implementers MUST NOT assume that the flags
+ field can only contain 0 or 1 characters. Finally, if a client
+ encounters a record with an unknown flag, it MUST ignore it
+ and move to the next record. This test takes precedence even
+ over the "order" field. Since flags can control the
+ interpretation placed on fields, a novel flag might change the
+ interpretation of the regexp and/or replacement fields such
+ that it is impossible to determine if a record matched a URN.
+
+
+
+
+
+
+
+Daniel & Mealling Experimental [Page 11]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ Service
+ Specifies the resolution service(s) available down this
+ rewrite path. It may also specify the particular protocol that
+ is used to talk with a resolver. A protocol MUST be specified
+ if the flags field states that the NAPTR is terminal. If a
+ protocol is specified, but the flags field does not state that
+ the NAPTR is terminal, the next lookup MUST be for a NAPTR.
+ The client MAY choose not to perform the next lookup if the
+ protocol is unknown, but that behavior MUST NOT be relied
+ upon.
+
+ The service field may take any of the values below (using the
+ Augmented BNF of RFC 822[9]):
+
+ service_field = [ [protocol] *("+" rs)]
+ protocol = ALPHA *31ALPHANUM
+ rs = ALPHA *31ALPHANUM
+ // The protocol and rs fields are limited to 32
+ // characters and must start with an alphabetic.
+ // The current set of "known" strings are:
+ // protocol = "rcds" / "thttp" / "hdl" / "rwhois" / "z3950"
+ // rs = "N2L" / "N2Ls" / "N2R" / "N2Rs" / "N2C"
+ // / "N2Ns" / "L2R" / "L2Ns" / "L2Ls" / "L2C"
+
+ i.e. an optional protocol specification followed by 0 or more
+ resolution services. Each resolution service is indicated by
+ an initial '+' character.
+
+ Note that the empty string is also a valid service field. This
+ will typically be seen at the top levels of a namespace, when
+ it is impossible to know what services and protocols will be
+ offered by a particular publisher within that name space.
+
+ At this time the known protocols are rcds[7], hdl[10] (binary,
+ UDP-based protocols), thttp[5] (a textual, TCP-based
+ protocol), rwhois[11] (textual, UDP or TCP based), and
+ Z39.50[12] (binary, TCP-based). More will be allowed later.
+ The names of the protocols must be formed from the characters
+ [a-Z0-9]. Case of the characters is not significant.
+
+ The service requests currently allowed will be described in
+ more detail in [6], but in brief they are:
+ N2L - Given a URN, return a URL
+ N2Ls - Given a URN, return a set of URLs
+ N2R - Given a URN, return an instance of the resource.
+ N2Rs - Given a URN, return multiple instances of the
+ resource, typically encoded using
+ multipart/alternative.
+
+
+
+Daniel & Mealling Experimental [Page 12]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ N2C - Given a URN, return a collection of meta-
+ information on the named resource. The format of
+ this response is the subject of another document.
+ N2Ns - Given a URN, return all URNs that are also
+ identifers for the resource.
+ L2R - Given a URL, return the resource.
+ L2Ns - Given a URL, return all the URNs that are
+ identifiers for the resource.
+ L2Ls - Given a URL, return all the URLs for instances of
+ of the same resource.
+ L2C - Given a URL, return a description of the
+ resource.
+
+ The actual format of the service request and response will be
+ determined by the resolution protocol, and is the subject for
+ other documents (e.g. [5]). Protocols need not offer all
+ services. The labels for service requests shall be formed from
+ the set of characters [A-Z0-9]. The case of the alphabetic
+ characters is not significant.
+
+ Regexp
+ A STRING containing a substitution expression that is applied
+ to the original URI in order to construct the next domain name
+ to lookup. The grammar of the substitution expression is given
+ in the next section.
+
+ Replacement
+ The next NAME to query for NAPTR, SRV, or A records depending
+ on the value of the flags field. As mentioned above, this may
+ be compressed.
+
+Substitution Expression Grammar:
+================================
+
+ The content of the regexp field is a substitution expression. True
+ sed(1) substitution expressions are not appropriate for use in this
+ application for a variety of reasons, therefore the contents of the
+ regexp field MUST follow the grammar below:
+
+subst_expr = delim-char ere delim-char repl delim-char *flags
+delim-char = "/" / "!" / ... (Any non-digit or non-flag character other
+ than backslash '\'. All occurances of a delim_char in a
+ subst_expr must be the same character.)
+ere = POSIX Extended Regular Expression (see [13], section
+ 2.8.4)
+repl = dns_str / backref / repl dns_str / repl backref
+dns_str = 1*DNS_CHAR
+backref = "\" 1POS_DIGIT
+
+
+
+Daniel & Mealling Experimental [Page 13]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+flags = "i"
+DNS_CHAR = "-" / "0" / ... / "9" / "a" / ... / "z" / "A" / ... / "Z"
+POS_DIGIT = "1" / "2" / ... / "9" ; 0 is not an allowed backref
+value domain name (see RFC-1123 [14]).
+
+ The result of applying the substitution expression to the original
+ URI MUST result in a string that obeys the syntax for DNS host names
+ [14]. Since it is possible for the regexp field to be improperly
+ specified, such that a non-conforming host name can be constructed,
+ client software SHOULD verify that the result is a legal host name
+ before making queries on it.
+
+ Backref expressions in the repl portion of the substitution
+ expression are replaced by the (possibly empty) string of characters
+ enclosed by '(' and ')' in the ERE portion of the substitution
+ expression. N is a single digit from 1 through 9, inclusive. It
+ specifies the N'th backref expression, the one that begins with the
+ N'th '(' and continues to the matching ')'. For example, the ERE
+ (A(B(C)DE)(F)G)
+ has backref expressions:
+ \1 = ABCDEFG
+ \2 = BCDE
+ \3 = C
+ \4 = F
+ \5..\9 = error - no matching subexpression
+
+ The "i" flag indicates that the ERE matching SHALL be performed in a
+ case-insensitive fashion. Furthermore, any backref replacements MAY
+ be normalized to lower case when the "i" flag is given.
+
+ The first character in the substitution expression shall be used as
+ the character that delimits the components of the substitution
+ expression. There must be exactly three non-escaped occurrences of
+ the delimiter character in a substitution expression. Since escaped
+ occurrences of the delimiter character will be interpreted as
+ occurrences of that character, digits MUST NOT be used as delimiters.
+ Backrefs would be confused with literal digits were this allowed.
+ Similarly, if flags are specified in the substitution expression, the
+ delimiter character must not also be a flag character.
+
+
+
+
+
+
+
+
+
+
+
+
+Daniel & Mealling Experimental [Page 14]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+Advice to domain administrators:
+================================
+
+ Beware of regular expressions. Not only are they a pain to get
+ correct on their own, but there is the previously mentioned
+ interaction with DNS. Any backslashes in a regexp must be entered
+ twice in a zone file in order to appear once in a query response.
+ More seriously, the need for double backslashes has probably not been
+ tested by all implementors of DNS servers. We anticipate that urn.net
+ will be the heaviest user of regexps. Only when delegating portions
+ of namespaces should the typical domain administrator need to use
+ regexps.
+
+ On a related note, beware of interactions with the shell when
+ manipulating regexps from the command line. Since '\' is a common
+ escape character in shells, there is a good chance that when you
+ think you are saying "\\" you are actually saying "\". Similar
+ caveats apply to characters such as
+
+ The "a" flag allows the next lookup to be for A records rather than
+ SRV records. Since there is no place for a port specification in the
+ NAPTR record, when the "A" flag is used the specified protocol must
+ be running on its default port.
+
+ The URN Sytnax draft defines a canonical form for each URN, which
+ requires %encoding characters outside a limited repertoire. The
+ regular expressions MUST be written to operate on that canonical
+ form. Since international character sets will end up with extensive
+ use of %encoded characters, regular expressions operating on them
+ will be essentially impossible to read or write by hand.
+
+Usage
+=====
+
+ For the edification of implementers, pseudocode for a client routine
+ using NAPTRs is given below. This code is provided merely as a
+ convience, it does not have any weight as a standard way to process
+ NAPTR records. Also, as is the case with pseudocode, it has never
+ been executed and may contain logical errors. You have been warned.
+
+ //
+ // findResolver(URN)
+ // Given a URN, find a host that can resolve it.
+ //
+ findResolver(string URN) {
+ // prepend prefix to urn.net
+ sprintf(key, "%s.urn.net", extractNS(URN));
+ do {
+
+
+
+Daniel & Mealling Experimental [Page 15]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ rewrite_flag = false;
+ terminal = false;
+ if (key has been seen) {
+ quit with a loop detected error
+ }
+ add key to list of "seens"
+ records = lookup(type=NAPTR, key); // get all NAPTR RRs for 'key'
+
+ discard any records with an unknown value in the "flags" field.
+ sort NAPTR records by "order" field and "preference" field
+ (with "order" being more significant than "preference").
+ n_naptrs = number of NAPTR records in response.
+ curr_order = records[0].order;
+ max_order = records[n_naptrs-1].order;
+
+ // Process current batch of NAPTRs according to "order" field.
+ for (j=0; j < n_naptrs && records[j].order <= max_order; j++) {
+ if (unknown_flag) // skip this record and go to next one
+ continue;
+ newkey = rewrite(URN, naptr[j].replacement, naptr[j].regexp);
+ if (!newkey) // Skip to next record if the rewrite didn't
+ match continue;
+ // We did do a rewrite, shrink max_order to current value
+ // so that delegation works properly
+ max_order = naptr[j].order;
+ // Will we know what to do with the protocol and services
+ // specified in the NAPTR? If not, try next record.
+ if(!isKnownProto(naptr[j].services)) {
+ continue;
+ }
+ if(!isKnownService(naptr[j].services)) {
+ continue;
+ }
+
+ // At this point we have a successful rewrite and we will
+ // know how to speak the protocol and request a known
+ // resolution service. Before we do the next lookup, check
+ // some optimization possibilities.
+
+ if (strcasecmp(flags, "S")
+ || strcasecmp(flags, "P"))
+ || strcasecmp(flags, "A")) {
+ terminal = true;
+ services = naptr[j].services;
+ addnl = any SRV and/or A records returned as additional
+ info for naptr[j].
+ }
+ key = newkey;
+
+
+
+Daniel & Mealling Experimental [Page 16]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ rewriteflag = true;
+ break;
+ }
+ } while (rewriteflag && !terminal);
+
+ // Did we not find our way to a resolver?
+ if (!rewrite_flag) {
+ report an error
+ return NULL;
+ }
+
+
+ // Leave rest to another protocol?
+ if (strcasecmp(flags, "P")) {
+ return key as host to talk to;
+ }
+
+ // If not, keep plugging
+ if (!addnl) { // No SRVs came in as additional info, look them up
+ srvs = lookup(type=SRV, key);
+ }
+
+ sort SRV records by preference, weight, ...
+ foreach (SRV record) { // in order of preference
+ try contacting srv[j].target using the protocol and one of the
+ resolution service requests from the "services" field of the
+ last NAPTR record.
+ if (successful)
+ return (target, protocol, service);
+ // Actually we would probably return a result, but this
+ // code was supposed to just tell us a good host to talk to.
+ }
+ die with an "unable to find a host" error;
+ }
+
+Notes:
+======
+
+ - A client MUST process multiple NAPTR records in the order
+ specified by the "order" field, it MUST NOT simply use the first
+ record that provides a known protocol and service combination.
+
+
+
+
+
+
+
+
+
+
+Daniel & Mealling Experimental [Page 17]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ - If a record at a particular order matches the URI, but the
+ client doesn't know the specified protocol and service, the
+ client SHOULD continue to examine records that have the same
+ order. The client MUST NOT consider records with a higher value
+ of order. This is necessary to make delegation of portions of
+ the namespace work. The order field is what lets site
+ administrators say "all requests for URIs matching pattern x go
+ to server 1, all others go to server 2".
+ (A match is defined as:
+ 1) The NAPTR provides a replacement domain name
+ or
+ 2) The regular expression matches the URN
+ )
+
+ - When multiple RRs have the same "order", the client should use
+ the value of the preference field to select the next NAPTR to
+ consider. However, because of preferred protocols or services,
+ estimates of network distance and bandwidth, etc. clients may
+ use different criteria to sort the records.
+ - If the lookup after a rewrite fails, clients are strongly
+ encouraged to report a failure, rather than backing up to pursue
+ other rewrite paths.
+ - When a namespace is to be delegated among a set of resolvers,
+ regexps must be used. Each regexp appears in a separate NAPTR
+ RR. Administrators should do as little delegation as possible,
+ because of limitations on the size of DNS responses.
+ - Note that SRV RRs impose additional requirements on clients.
+
+Acknowledgments:
+=================
+
+ The editors would like to thank Keith Moore for all his consultations
+ during the development of this draft. We would also like to thank
+ Paul Vixie for his assistance in debugging our implementation, and
+ his answers on our questions. Finally, we would like to acknowledge
+ our enormous intellectual debt to the participants in the Knoxville
+ series of meetings, as well as to the participants in the URI and URN
+ working groups.
+
+References:
+===========
+
+ [1] Sollins, Karen and Larry Masinter, "Functional Requirements
+ for Uniform Resource Names", RFC-1737, Dec. 1994.
+
+ [2] The URN Implementors, Uniform Resource Names: A Progress Report,
+ http://www.dlib.org/dlib/february96/02arms.html, D-Lib Magazine,
+ February 1996.
+
+
+
+Daniel & Mealling Experimental [Page 18]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+ [3] Moats, Ryan, "URN Syntax", RFC-2141, May 1997.
+
+ [4] Gulbrandsen, A. and P. Vixie, "A DNS RR for specifying
+ the location of services (DNS SRV)", RFC-2052, October 1996.
+
+ [5] Daniel, Jr., Ron, "A Trivial Convention for using HTTP in URN
+ Resolution", RFC-2169, June 1997.
+
+ [6] URN-WG, "URN Resolution Services", Work in Progress.
+
+ [7] Moore, Keith, Shirley Browne, Jason Cox, and Jonathan Gettler,
+ Resource Cataloging and Distribution System, Technical Report
+ CS-97-346, University of Tennessee, Knoxville, December 1996
+
+ [8] Paul Vixie, personal communication.
+
+ [9] Crocker, Dave H. "Standard for the Format of ARPA Internet Text
+ Messages", RFC-822, August 1982.
+
+ [10] Orth, Charles and Bill Arms; Handle Resolution Protocol
+ Specification, http://www.handle.net/docs/client_spec.html
+
+ [11] Williamson, S., M. Kosters, D. Blacka, J. Singh, K. Zeilstra,
+ "Referral Whois Protocol (RWhois)", RFC-2167, June 1997.
+
+ [12] Information Retrieval (Z39.50): Application Service Definition
+ and Protocol Specification, ANSI/NISO Z39.50-1995, July 1995.
+
+ [13] IEEE Standard for Information Technology - Portable Operating
+ System Interface (POSIX) - Part 2: Shell and Utilities (Vol. 1);
+ IEEE Std 1003.2-1992; The Institute of Electrical and
+ Electronics Engineers; New York; 1993. ISBN:1-55937-255-9
+
+ [14] Braden, R., "Requirements for Internet Hosts - Application and
+ and Support", RFC-1123, Oct. 1989.
+
+ [15] Sollins, Karen, "Requirements and a Framework for URN Resolution
+ Systems", November 1996, Work in Progress.
+
+
+
+
+
+
+
+
+
+
+
+
+
+Daniel & Mealling Experimental [Page 19]
+
+RFC 2168 Resolution of URIs Using the DNS June 1997
+
+
+Security Considerations
+=======================
+
+ The use of "urn.net" as the registry for URN namespaces is subject to
+ denial of service attacks, as well as other DNS spoofing attacks. The
+ interactions with DNSSEC are currently being studied. It is expected
+ that NAPTR records will be signed with SIG records once the DNSSEC
+ work is deployed.
+
+ The rewrite rules make identifiers from other namespaces subject to
+ the same attacks as normal domain names. Since they have not been
+ easily resolvable before, this may or may not be considered a
+ problem.
+
+ Regular expressions should be checked for sanity, not blindly passed
+ to something like PERL.
+
+ This document has discussed a way of locating a resolver, but has not
+ discussed any detail of how the communication with the resolver takes
+ place. There are significant security considerations attached to the
+ communication with a resolver. Those considerations are outside the
+ scope of this document, and must be addressed by the specifications
+ for particular resolver communication protocols.
+
+Author Contact Information:
+===========================
+
+ Ron Daniel
+ Los Alamos National Laboratory
+ MS B287
+ Los Alamos, NM, USA, 87545
+ voice: +1 505 665 0597
+ fax: +1 505 665 4939
+ email: rdaniel@lanl.gov
+
+
+ Michael Mealling
+ Network Solutions
+ 505 Huntmar Park Drive
+ Herndon, VA 22070
+ voice: (703) 742-0400
+ fax: (703) 742-9552
+ email: michaelm@internic.net
+ URL: http://www.netsol.com/
+
+
+
+
+
+
+
+Daniel & Mealling Experimental [Page 20]
+