From 4bfd864f10b68b71482b35c818559068ef8d5797 Mon Sep 17 00:00:00 2001
From: Thomas Voss <mail@thomasvoss.com>
Date: Wed, 27 Nov 2024 20:54:24 +0100
Subject: doc: Add RFC documents

---
 doc/rfc/rfc2396.txt | 2243 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 2243 insertions(+)
 create mode 100644 doc/rfc/rfc2396.txt

diff --git a/doc/rfc/rfc2396.txt b/doc/rfc/rfc2396.txt
new file mode 100644
index 0000000..5bd5211
--- /dev/null
+++ b/doc/rfc/rfc2396.txt
@@ -0,0 +1,2243 @@
+
+
+
+
+
+
+Network Working Group                                     T. Berners-Lee
+Request for Comments: 2396                                       MIT/LCS
+Updates: 1808, 1738                                          R. Fielding
+Category: Standards Track                                    U.C. Irvine
+                                                             L. Masinter
+                                                       Xerox Corporation
+                                                             August 1998
+
+
+           Uniform Resource Identifiers (URI): Generic Syntax
+
+Status of this Memo
+
+   This document specifies an Internet standards track protocol for the
+   Internet community, and requests discussion and suggestions for
+   improvements.  Please refer to the current edition of the "Internet
+   Official Protocol Standards" (STD 1) for the standardization state
+   and status of this protocol.  Distribution of this memo is unlimited.
+
+Copyright Notice
+
+   Copyright (C) The Internet Society (1998).  All Rights Reserved.
+
+IESG Note
+
+   This paper describes a "superset" of operations that can be applied
+   to URI.  It consists of both a grammar and a description of basic
+   functionality for URI.  To understand what is a valid URI, both the
+   grammar and the associated description have to be studied.  Some of
+   the functionality described is not applicable to all URI schemes, and
+   some operations are only possible when certain media types are
+   retrieved using the URI, regardless of the scheme used.
+
+Abstract
+
+   A Uniform Resource Identifier (URI) is a compact string of characters
+   for identifying an abstract or physical resource.  This document
+   defines the generic syntax of URI, including both absolute and
+   relative forms, and guidelines for their use; it revises and replaces
+   the generic definitions in RFC 1738 and RFC 1808.
+
+   This document defines a grammar that is a superset of all valid URI,
+   such that an implementation can parse the common components of a URI
+   reference without knowing the scheme-specific requirements of every
+   possible identifier type.  This document does not define a generative
+   grammar for URI; that task will be performed by the individual
+   specifications of each URI scheme.
+
+
+
+
+Berners-Lee, et. al.        Standards Track                     [Page 1]
+
+RFC 2396                   URI Generic Syntax                August 1998
+
+
+1. Introduction
+
+   Uniform Resource Identifiers (URI) provide a simple and extensible
+   means for identifying a resource.  This specification of URI syntax
+   and semantics is derived from concepts introduced by the World Wide
+   Web global information initiative, whose use of such objects dates
+   from 1990 and is described in "Universal Resource Identifiers in WWW"
+   [RFC1630].  The specification of URI is designed to meet the
+   recommendations laid out in "Functional Recommendations for Internet
+   Resource Locators" [RFC1736] and "Functional Requirements for Uniform
+   Resource Names" [RFC1737].
+
+   This document updates and merges "Uniform Resource Locators"
+   [RFC1738] and "Relative Uniform Resource Locators" [RFC1808] in order
+   to define a single, generic syntax for all URI.  It excludes those
+   portions of RFC 1738 that defined the specific syntax of individual
+   URL schemes; those portions will be updated as separate documents, as
+   will the process for registration of new URI schemes.  This document
+   does not discuss the issues and recommendations for dealing with
+   characters outside of the US-ASCII character set [ASCII]; those
+   recommendations are discussed in a separate document.
+
+   All significant changes from the prior RFCs are noted in Appendix G.
+
+1.1. Overview of URI
+
+   URI are characterized by the following definitions:
+
+      Uniform
+         Uniformity provides several benefits: it allows different types
+         of resource identifiers to be used in the same context, even
+         when the mechanisms used to access those resources may differ;
+         it allows uniform semantic interpretation of common syntactic
+         conventions across different types of resource identifiers; it
+         allows introduction of new types of resource identifiers
+         without interfering with the way that existing identifiers are
+         used; and, it allows the identifiers to be reused in many
+         different contexts, thus permitting new applications or
+         protocols to leverage a pre-existing, large, and widely-used
+         set of resource identifiers.
+
+      Resource
+         A resource can be anything that has identity.  Familiar
+         examples include an electronic document, an image, a service
+         (e.g., "today's weather report for Los Angeles"), and a
+         collection of other resources.  Not all resources are network
+         "retrievable"; e.g., human beings, corporations, and bound
+         books in a library can also be considered resources.
+
+         The resource is the conceptual mapping to an entity or set of
+         entities, not necessarily the entity which corresponds to that
+         mapping at any particular instance in time.  Thus, a resource
+         can remain constant even when its content---the entities to
+         which it currently corresponds---changes over time, provided
+         that the conceptual mapping is not changed in the process.
+
+      Identifier
+         An identifier is an object that can act as a reference to
+         something that has identity.  In the case of URI, the object is
+         a sequence of characters with a restricted syntax.
+
+   Having identified a resource, a system may perform a variety of
+   operations on the resource, as might be characterized by such words
+   as `access', `update', `replace', or `find attributes'.
+
+1.2. URI, URL, and URN
+
+   A URI can be further classified as a locator, a name, or both.  The
+   term "Uniform Resource Locator" (URL) refers to the subset of URI
+   that identify resources via a representation of their primary access
+   mechanism (e.g., their network "location"), rather than identifying
+   the resource by name or by some other attribute(s) of that resource.
+   The term "Uniform Resource Name" (URN) refers to the subset of URI
+   that are required to remain globally unique and persistent even when
+   the resource ceases to exist or becomes unavailable.
+
+   The URI scheme (Section 3.1) defines the namespace of the URI, and
+   thus may further restrict the syntax and semantics of identifiers
+   using that scheme.  This specification defines those elements of the
+   URI syntax that are either required of all URI schemes or are common
+   to many URI schemes.  It thus defines the syntax and semantics that
+   are needed to implement a scheme-independent parsing mechanism for
+   URI references, such that the scheme-dependent handling of a URI can
+   be postponed until the scheme-dependent semantics are needed.  We use
+   the term URL below when describing syntax or semantics that only
+   apply to locators.
+
+   Although many URL schemes are named after protocols, this does not
+   imply that the only way to access the URL's resource is via the named
+   protocol.  Gateways, proxies, caches, and name resolution services
+   might be used to access some resources, independent of the protocol
+   of their origin, and the resolution of some URL may require the use
+   of more than one protocol (e.g., both DNS and HTTP are typically used
+   to access an "http" URL's resource when it can't be found in a local
+   cache).
+
+   A URN differs from a URL in that its primary purpose is persistent
+   labeling of a resource with an identifier.  That identifier is drawn
+   from one of a set of defined namespaces, each of which has its own
+   set name structure and assignment procedures.  The "urn" scheme has
+   been reserved to establish the requirements for a standardized URN
+   namespace, as defined in "URN Syntax" [RFC2141] and its related
+   specifications.
+
+   Most of the examples in this specification demonstrate URL, since
+   they allow the most varied use of the syntax and often have a
+   hierarchical namespace.  A parser of the URI syntax is capable of
+   parsing both URL and URN references as a generic URI; once the scheme
+   is determined, the scheme-specific parsing can be performed on the
+   generic URI components.  In other words, the URI syntax is a superset
+   of the syntax of all URI schemes.
+
+1.3. Example URI
+
+   The following examples illustrate URI that are in common use.
+
+   ftp://ftp.is.co.za/rfc/rfc1808.txt
+      -- ftp scheme for File Transfer Protocol services
+
+   gopher://spinaltap.micro.umn.edu/00/Weather/California/Los%20Angeles
+      -- gopher scheme for Gopher and Gopher+ Protocol services
+
+   http://www.math.uio.no/faq/compression-faq/part1.html
+      -- http scheme for Hypertext Transfer Protocol services
+
+   mailto:mduerst@ifi.unizh.ch
+      -- mailto scheme for electronic mail addresses
+
+   news:comp.infosystems.www.servers.unix
+      -- news scheme for USENET news groups and articles
+
+   telnet://melvyl.ucop.edu/
+      -- telnet scheme for interactive services via the TELNET Protocol
+
+1.4. Hierarchical URI and Relative Forms
+
+   An absolute identifier refers to a resource independent of the
+   context in which the identifier is used.  In contrast, a relative
+   identifier refers to a resource by describing the difference within a
+   hierarchical namespace between the current context and an absolute
+   identifier of the resource.
+
+   Some URI schemes support a hierarchical naming system, where the
+   hierarchy of the name is denoted by a "/" delimiter separating the
+   components in the scheme. This document defines a scheme-independent
+   `relative' form of URI reference that can be used in conjunction with
+   a `base' URI (of a hierarchical scheme) to produce another URI. The
+   syntax of hierarchical URI is described in Section 3; the relative
+   URI calculation is described in Section 5.
+
+1.5. URI Transcribability
+
+   The URI syntax was designed with global transcribability as one of
+   its main concerns. A URI is a sequence of characters from a very
+   limited set, i.e. the letters of the basic Latin alphabet, digits,
+   and a few special characters.  A URI may be represented in a variety
+   of ways: e.g., ink on paper, pixels on a screen, or a sequence of
+   octets in a coded character set.  The interpretation of a URI depends
+   only on the characters used and not how those characters are
+   represented in a network protocol.
+
+   The goal of transcribability can be described by a simple scenario.
+   Imagine two colleagues, Sam and Kim, sitting in a pub at an
+   international conference and exchanging research ideas.  Sam asks Kim
+   for a location to get more information, so Kim writes the URI for the
+   research site on a napkin.  Upon returning home, Sam takes out the
+   napkin and types the URI into a computer, which then retrieves the
+   information to which Kim referred.
+
+   There are several design concerns revealed by the scenario:
+
+      o  A URI is a sequence of characters, which is not always
+         represented as a sequence of octets.
+
+      o  A URI may be transcribed from a non-network source, and thus
+         should consist of characters that are most likely to be able to
+         be typed into a computer, within the constraints imposed by
+         keyboards (and related input devices) across languages and
+         locales.
+
+      o  A URI often needs to be remembered by people, and it is easier
+         for people to remember a URI when it consists of meaningful
+         components.
+
+   These design concerns are not always in alignment.  For example, it
+   is often the case that the most meaningful name for a URI component
+   would require characters that cannot be typed into some systems.  The
+   ability to transcribe the resource identifier from one medium to
+   another was considered more important than having its URI consist of
+   the most meaningful of components.  In local and regional contexts
+   and with improving technology, users might benefit from being able to
+   use a wider range of characters; such use is not defined in this
+   document.
+
+1.6. Syntax Notation and Common Elements
+
+   This document uses two conventions to describe and define the syntax
+   for URI.  The first, called the layout form, is a general description
+   of the order of components and component separators, as in
+
+      <first>/<second>;<third>?<fourth>
+
+   The component names are enclosed in angle-brackets and any characters
+   outside angle-brackets are literal separators.  Whitespace should be
+   ignored.  These descriptions are used informally and do not define
+   the syntax requirements.
+
+   The second convention is a BNF-like grammar, used to define the
+   formal URI syntax.  The grammar is that of [RFC822], except that "|"
+   is used to designate alternatives.  Briefly, rules are separated from
+   definitions by an equal "=", indentation is used to continue a rule
+   definition over more than one line, literals are quoted with "",
+   parentheses "(" and ")" are used to group elements, optional elements
+   are enclosed in "[" and "]" brackets, and elements may be preceded
+   with <n>* to designate n or more repetitions of the following
+   element; n defaults to 0.
+
+   Unlike many specifications that use a BNF-like grammar to define the
+   bytes (octets) allowed by a protocol, the URI grammar is defined in
+   terms of characters.  Each literal in the grammar corresponds to the
+   character it represents, rather than to the octet encoding of that
+   character in any particular coded character set.  How a URI is
+   represented in terms of bits and bytes on the wire is dependent upon
+   the character encoding of the protocol used to transport it, or the
+   charset of the document which contains it.
+
+   The following definitions are common to many elements:
+
+      alpha    = lowalpha | upalpha
+
+      lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" |
+                 "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" |
+                 "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
+
+      upalpha  = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
+                 "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
+                 "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
+
+      digit    = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
+                 "8" | "9"
+
+      alphanum = alpha | digit
+
+   The complete URI syntax is collected in Appendix A.
+
+2. URI Characters and Escape Sequences
+
+   URI consist of a restricted set of characters, primarily chosen to
+   aid transcribability and usability both in computer systems and in
+   non-computer communications. Characters used conventionally as
+   delimiters around URI were excluded.  The restricted set of
+   characters consists of digits, letters, and a few graphic symbols
+   chosen from those common to most of the character encodings and
+   input facilities available to Internet users.
+
+      uric          = reserved | unreserved | escaped
+
+   Within a URI, characters are either used as delimiters, or to
+   represent strings of data (octets) within the delimited portions.
+   Octets are either represented directly by a character (using the US-
+   ASCII character for that octet [ASCII]) or by an escape encoding.
+   This representation is elaborated below.
+
+2.1. URI and non-ASCII characters
+
+   The relationship between URI and characters has been a source of
+   confusion for characters that are not part of US-ASCII. To describe
+   the relationship, it is useful to distinguish between a "character"
+   (as a distinguishable semantic entity) and an "octet" (an 8-bit
+   byte). There are two mappings, one from URI characters to octets, and
+   a second from octets to original characters:
+
+   URI character sequence->octet sequence->original character sequence
+
+   A URI is represented as a sequence of characters, not as a sequence
+   of octets. That is because URI might be "transported" by means that
+   are not through a computer network, e.g., printed on paper, read over
+   the radio, etc.
+
+   A URI scheme may define a mapping from URI characters to octets;
+   whether this is done depends on the scheme. Commonly, within a
+   delimited component of a URI, a sequence of characters may be used to
+   represent a sequence of octets. For example, the character "a"
+   represents the octet 97 (decimal), while the character sequence "%",
+   "0", "a" represents the octet 10 (decimal).
+
+   There is a second translation for some resources: the sequence of
+   octets defined by a component of the URI is subsequently used to
+   represent a sequence of characters. A 'charset' defines this mapping.
+   There are many charsets in use in Internet protocols. For example,
+   UTF-8 [UTF-8] defines a mapping from sequences of octets to sequences
+   of characters in the repertoire of ISO 10646.
+
+   In the simplest case, the original character sequence contains only
+   characters that are defined in US-ASCII, and the two levels of
+   mapping are simple and easily invertible: each 'original character'
+   is represented as the octet for the US-ASCII code for it, which is,
+   in turn, represented as either the US-ASCII character, or else the
+   "%" escape sequence for that octet.
+
+   For original character sequences that contain non-ASCII characters,
+   however, the situation is more difficult. Internet protocols that
+   transmit octet sequences intended to represent character sequences
+   are expected to provide some way of identifying the charset used, if
+   there might be more than one [RFC2277].  However, there is currently
+   no provision within the generic URI syntax to accomplish this
+   identification. An individual URI scheme may require a single
+   charset, define a default charset, or provide a way to indicate the
+   charset used.
+
+   It is expected that a systematic treatment of character encoding
+   within URI will be developed as a future modification of this
+   specification.
+
+2.2. Reserved Characters
+
+   Many URI include components consisting of, or delimited by, certain
+   special characters.  These characters are called "reserved", since
+   their usage within the URI component is limited to their reserved
+   purpose.  If the data for a URI component would conflict with the
+   reserved purpose, then the conflicting data must be escaped before
+   forming the URI.
+
+      reserved    = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
+                    "$" | ","
+
+   The "reserved" syntax class above refers to those characters that are
+   allowed within a URI, but which may not be allowed within a
+   particular component of the generic URI syntax; they are used as
+   delimiters of the components described in Section 3.
+
+   Characters in the "reserved" set are not reserved in all contexts.
+   The set of characters actually reserved within any given URI
+   component is defined by that component. In general, a character is
+   reserved if the semantics of the URI changes if the character is
+   replaced with its escaped US-ASCII encoding.
+
+2.3. Unreserved Characters
+
+   Data characters that are allowed in a URI but do not have a reserved
+   purpose are called unreserved.  These include upper and lower case
+   letters, decimal digits, and a limited set of punctuation marks and
+   symbols.
+
+      unreserved  = alphanum | mark
+
+      mark        = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
+
+   Unreserved characters can be escaped without changing the semantics
+   of the URI, but this should not be done unless the URI is being used
+   in a context that does not allow the unescaped character to appear.
+
+2.4. Escape Sequences
+
+   Data must be escaped if it does not have a representation using an
+   unreserved character; this includes data that does not correspond to
+   a printable character of the US-ASCII coded character set, or that
+   corresponds to any US-ASCII character that is disallowed, as
+   explained below.
+
+2.4.1. Escaped Encoding
+
+   An escaped octet is encoded as a character triplet, consisting of the
+   percent character "%" followed by the two hexadecimal digits
+   representing the octet code. For example, "%20" is the escaped
+   encoding for the US-ASCII space character.
+
+      escaped     = "%" hex hex
+      hex         = digit | "A" | "B" | "C" | "D" | "E" | "F" |
+                            "a" | "b" | "c" | "d" | "e" | "f"
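
The escaped encoding above can be sketched in Python as follows. This is an illustration only: the names `escape_octets` and `unescape` are hypothetical, and escaping every octet that is not an "unreserved" character (Section 2.3) is an assumption made here for simplicity, since an individual component may also permit some reserved characters as data.

```python
import string

# unreserved = alphanum | mark (Section 2.3)
UNRESERVED = set(string.ascii_letters + string.digits + "-_.!~*'()")


def escape_octets(data: bytes) -> str:
    """Represent octets as URI characters, escaping all but unreserved ones."""
    out = []
    for octet in data:
        ch = chr(octet)
        if ch in UNRESERVED:
            out.append(ch)                       # represented directly
        else:
            out.append("%{:02X}".format(octet))  # escaped = "%" hex hex
    return "".join(out)


def unescape(text: str) -> bytes:
    """Map URI characters back to octets, decoding "%" hex hex triplets."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "%":
            triplet = text[i + 1:i + 3]
            if len(triplet) != 2 or any(c not in string.hexdigits for c in triplet):
                raise ValueError("malformed escape at index %d" % i)
            out.append(int(triplet, 16))
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)
```

With these definitions, `escape_octets(b"Los Angeles")` yields `Los%20Angeles`, and `unescape("%0a")` yields the single octet 10 (decimal), matching the example in Section 2.1.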
+
+2.4.2. When to Escape and Unescape
+
+   A URI is always in an "escaped" form, since escaping or unescaping a
+   completed URI might change its semantics.  Normally, the only time
+   escape encodings can safely be made is when the URI is being created
+   from its component parts; each component may have its own set of
+   characters that are reserved, so only the mechanism responsible for
+   generating or interpreting that component can determine whether or
+   not escaping a character will change its semantics. Likewise, a URI
+   must be separated into its components before the escaped characters
+   within those components can be safely decoded.
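
A short Python sketch of why the order matters, using the standard library's `urllib.parse.unquote`. The path below is a made-up example in which "%2F" is an escaped "/" carried as data inside one segment.

```python
from urllib.parse import unquote

path = "/a%2Fb/c"  # hypothetical path; "%2F" is data, not a delimiter

# Safe order: split on the "/" delimiter first, then unescape each segment.
segments = [unquote(segment) for segment in path.split("/")[1:]]

# Unsafe order: unescaping first turns the data "/" into a delimiter.
wrong = unquote(path).split("/")[1:]
```

Here `segments` has two entries (`a/b` and `c`), while `wrong` has three (`a`, `b`, `c`), so decoding the whole URI before separating it changes the hierarchy.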
+
+   In some cases, data that could be represented by an unreserved
+   character may appear escaped; for example, some of the unreserved
+   "mark" characters are automatically escaped by some systems.  If the
+   given URI scheme defines a canonicalization algorithm, then
+   unreserved characters may be unescaped according to that algorithm.
+   For example, "%7e" is sometimes used instead of "~" in an http URL
+   path, but the two are equivalent for an http URL.
+
+   Because the percent "%" character always has the reserved purpose of
+   being the escape indicator, it must be escaped as "%25" in order to
+   be used as data within a URI.  Implementers should be careful not to
+   escape or unescape the same string more than once, since unescaping
+   an already unescaped string might lead to misinterpreting a percent
+   data character as another escaped character, or vice versa in the
+   case of escaping an already escaped string.
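
The hazard of repeated escaping or unescaping can be seen in a few lines of Python with the standard library's `urllib.parse.quote` and `unquote`. The data string is a made-up example; passing `safe=""` is a choice made here so that no reserved character is left unescaped.

```python
from urllib.parse import quote, unquote

data = "100%41"                  # hypothetical data containing a literal "%"

escaped = quote(data, safe="")   # the "%" data character becomes "%25"
once = unquote(escaped)          # one unescape recovers the original data
twice = unquote(once)            # a second unescape misreads "%41" as "A"

double_escaped = quote(escaped, safe="")  # escaping twice also alters the URI
```

After running this, `escaped` is `100%2541`, `once` equals `data`, `twice` is `100A`, and `double_escaped` is `100%252541`.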
+
+2.4.3. Excluded US-ASCII Characters
+
+   Although they are disallowed within the URI syntax, we include here a
+   description of those US-ASCII characters that have been excluded and
+   the reasons for their exclusion.
+
+   The control characters in the US-ASCII coded character set are not
+   used within a URI, both because they are non-printable and because
+   they are likely to be misinterpreted by some control mechanisms.
+
+   control     = <US-ASCII coded characters 00-1F and 7F hexadecimal>
+
+   The space character is excluded because significant spaces may
+   disappear and insignificant spaces may be introduced when URI are
+   transcribed or typeset or subjected to the treatment of word-
+   processing programs.  Whitespace is also used to delimit URI in many
+   contexts.
+
+   space       = <US-ASCII coded character 20 hexadecimal>
+
+   The angle-bracket "<" and ">" and double-quote (") characters are
+   excluded because they are often used as the delimiters around URI in
+   text documents and protocol fields.  The character "#" is excluded
+   because it is used to delimit a URI from a fragment identifier in URI
+   references (Section 4). The percent character "%" is excluded because
+   it is used for the encoding of escaped characters.
+
+   delims      = "<" | ">" | "#" | "%" | <">
+
+   Other characters are excluded because gateways and other transport
+   agents are known to sometimes modify such characters, or they are
+   used as delimiters.
+
+   unwise      = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"
+
+   Data corresponding to excluded characters must be escaped in order to
+   be properly represented within a URI.
+
+3. URI Syntactic Components
+
+   The URI syntax is dependent upon the scheme.  In general, absolute
+   URI are written as follows:
+
+      <scheme>:<scheme-specific-part>
+
+   An absolute URI contains the name of the scheme being used (<scheme>)
+   followed by a colon (":") and then a string (the <scheme-specific-
+   part>) whose interpretation depends on the scheme.
+
+   The URI syntax does not require that the scheme-specific-part have
+   any general structure or set of semantics which is common among all
+   URI.  However, a subset of URI do share a common syntax for
+   representing hierarchical relationships within the namespace.  This
+   "generic URI" syntax consists of a sequence of four main components:
+
+      <scheme>://<authority><path>?<query>
+
+   each of which, except <scheme>, may be absent from a particular URI.
+   For example, some URI schemes do not allow an <authority> component,
+   and others do not use a <query> component.
+
+      absoluteURI   = scheme ":" ( hier_part | opaque_part )
+
+   URI that are hierarchical in nature use the slash "/" character for
+   separating hierarchical components.  For some file systems, a "/"
+   character (used to denote the hierarchical structure of a URI) is the
+   delimiter used to construct a file name hierarchy, and thus the URI
+   path will look similar to a file pathname.  This does NOT imply that
+   the resource is a file or that the URI maps to an actual filesystem
+   pathname.
+
+      hier_part     = ( net_path | abs_path ) [ "?" query ]
+
+      net_path      = "//" authority [ abs_path ]
+
+      abs_path      = "/"  path_segments
+
+   URI that do not make use of the slash "/" character for separating
+   hierarchical components are considered opaque by the generic URI
+   parser.
+
+      opaque_part   = uric_no_slash *uric
+
+      uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
+                      "&" | "=" | "+" | "$" | ","
+
+   We use the term <path> to refer to both the <abs_path> and
+   <opaque_part> constructs, since they are mutually exclusive for any
+   given URI and can be parsed as a single component.
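
As a rough illustration of scheme-independent parsing, the Python sketch below splits an absolute URI into the components named in the grammar above. The function name `split_absolute_uri` and the dictionary layout are hypothetical. The sketch assumes the input carries no fragment identifier and does not check that each component uses only its allowed characters.

```python
import re

# scheme = alpha *( alpha | digit | "+" | "-" | "." )
_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def split_absolute_uri(uri: str) -> dict:
    """Split <scheme>:<scheme-specific-part> into generic components."""
    scheme, colon, rest = uri.partition(":")
    if not colon or _SCHEME.fullmatch(scheme) is None:
        raise ValueError("not an absolute URI: %r" % uri)
    parts = {"scheme": scheme, "authority": None, "path": "",
             "query": None, "opaque": False}
    if not rest.startswith("/"):
        # opaque_part: no generic structure beyond the scheme
        parts["path"] = rest
        parts["opaque"] = True
        return parts
    # hier_part = ( net_path | abs_path ) [ "?" query ]
    rest, mark, query = rest.partition("?")
    if mark:
        parts["query"] = query
    if rest.startswith("//"):
        # net_path = "//" authority [ abs_path ]
        authority, slash, tail = rest[2:].partition("/")
        parts["authority"] = authority
        parts["path"] = slash + tail
    else:
        # abs_path = "/" path_segments
        parts["path"] = rest
    return parts
```

For example, `split_absolute_uri("telnet://melvyl.ucop.edu/")` reports the authority `melvyl.ucop.edu` and the path `/`, while `split_absolute_uri("mailto:mduerst@ifi.unizh.ch")` is reported as opaque.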
+
+3.1. Scheme Component
+
+   Just as there are many different methods of access to resources,
+   there are a variety of schemes for identifying such resources.  The
+   URI syntax consists of a sequence of components separated by reserved
+   characters, with the first component defining the semantics for the
+   remainder of the URI string.
+
+   Scheme names consist of a sequence of characters beginning with a
+   lower case letter and followed by any combination of lower case
+   letters, digits, plus ("+"), period ("."), or hyphen ("-").  For
+   resiliency, programs interpreting URI should treat upper case letters
+   as equivalent to lower case in scheme names (e.g., allow "HTTP" as
+   well as "http").
+
+      scheme        = alpha *( alpha | digit | "+" | "-" | "." )
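
The scheme rule translates directly into a regular expression. The Python sketch below checks a candidate scheme name against it and folds upper case to lower case, as recommended above for resiliency; the function name `normalize_scheme` is hypothetical.

```python
import re

# scheme = alpha *( alpha | digit | "+" | "-" | "." )
_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def normalize_scheme(name: str) -> str:
    """Validate a scheme name and return it in lower case."""
    if _SCHEME.fullmatch(name) is None:
        raise ValueError("invalid scheme name: %r" % name)
    return name.lower()
```

A call such as `normalize_scheme("HTTP")` returns `http`, while a name that starts with a digit is rejected.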
+
+   Relative URI references are distinguished from absolute URI in that
+   they do not begin with a scheme name.  Instead, the scheme is
+   inherited from the base URI, as described in Section 5.2.
+
+3.2. Authority Component
+
+   Many URI schemes include a top hierarchical element for a naming
+   authority, such that the namespace defined by the remainder of the
+   URI is governed by that authority.  This authority component is
+   typically defined by an Internet-based server or a scheme-specific
+   registry of naming authorities.
+
+      authority     = server | reg_name
+
+   The authority component is preceded by a double slash "//" and is
+   terminated by the next slash "/", question-mark "?", or by the end of
+   the URI.  Within the authority component, the characters ";", ":",
+   "@", "?", and "/" are reserved.
+
+   An authority component is not required for a URI scheme to make use
+   of relative references.  A base URI without an authority component
+   implies that any relative reference will also be without an authority
+   component.
+
+3.2.1. Registry-based Naming Authority
+
+   The structure of a registry-based naming authority is specific to the
+   URI scheme, but constrained to the allowed characters for an
+   authority component.
+
+      reg_name      = 1*( unreserved | escaped | "$" | "," |
+                          ";" | ":" | "@" | "&" | "=" | "+" )
+
+3.2.2. Server-based Naming Authority
+
+   URL schemes that involve the direct use of an IP-based protocol to a
+   specified server on the Internet use a common syntax for the server
+   component of the URI's scheme-specific data:
+
+      <userinfo>@<host>:<port>
+
+   where <userinfo> may consist of a user name and, optionally, scheme-
+   specific information about how to gain authorization to access the
+   server.  The parts "<userinfo>@" and ":<port>" may be omitted.
+
+      server        = [ [ userinfo "@" ] hostport ]
+
+   The user information, if present, is followed by a commercial at-sign
+   "@".
+
+      userinfo      = *( unreserved | escaped |
+                         ";" | ":" | "&" | "=" | "+" | "$" | "," )
+
+   Some URL schemes use the format "user:password" in the userinfo
+   field. This practice is NOT RECOMMENDED, because the passing of
+   authentication information in clear text (such as URI) has proven to
+   be a security risk in almost every case where it has been used.
+
+   The host is a domain name of a network host, or its IPv4 address as a
+   set of four decimal digit groups separated by ".".  Literal IPv6
+   addresses are not supported.
+
+      hostport      = host [ ":" port ]
+      host          = hostname | IPv4address
+      hostname      = *( domainlabel "." ) toplabel [ "." ]
+      domainlabel   = alphanum | alphanum *( alphanum | "-" ) alphanum
+      toplabel      = alpha | alpha *( alphanum | "-" ) alphanum
+      IPv4address   = 1*digit "." 1*digit "." 1*digit "." 1*digit
+      port          = *digit
+
+   Hostnames take the form described in Section 3 of [RFC1034] and
+   Section 2.1 of [RFC1123]: a sequence of domain labels separated by
+   ".", each domain label starting and ending with an alphanumeric
+   character and possibly also containing "-" characters.  The rightmost
+   domain label of a fully qualified domain name will never start with a
+   digit, thus syntactically distinguishing domain names from IPv4
+   addresses, and may be followed by a single "." if it is necessary to
+   distinguish between the complete domain name and any local domain.
+   To actually be "Uniform" as a resource locator, a URL hostname should
+   be a fully qualified domain name.  In practice, however, the host
+   component may be a local domain literal.
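+
+   The host grammar above can be checked mechanically.  The following
+   Python sketch is illustrative only and is not part of this
+   specification; the names classify_host, HOSTNAME_RE, and IPV4_RE
+   are assumptions made for the example.  It tests the IPv4address
+   rule first, which is safe because a toplabel never starts with a
+   digit.
+

```python
import re

# Regular expressions transcribed from the host grammar above.
DOMAINLABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
TOPLABEL = r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
HOSTNAME_RE = re.compile(r"(?:" + DOMAINLABEL + r"\.)*" + TOPLABEL + r"\.?")
IPV4_RE = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+")


def classify_host(host):
    """Return "IPv4address", "hostname", or None for a host string."""
    if IPV4_RE.fullmatch(host):
        return "IPv4address"
    if HOSTNAME_RE.fullmatch(host):
        return "hostname"
    return None  # matches neither alternative of the host rule
```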
+
+      Note: A suitable representation for including a literal IPv6
+      address as the host part of a URL is desired, but has not yet been
+      determined or implemented in practice.
+
+   The port is the network port number for the server.  Most schemes
+   designate protocols that have a default port number.  Another port
+   number may optionally be supplied, in decimal, separated from the
+   host by a colon.  If the port is omitted, the default port number is
+   assumed.
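+
+   As a non-normative sketch, a server-based authority might be split
+   into its parts as follows.  The function name split_server and the
+   use of None for an omitted part are assumptions of this example;
+   the parts are not validated against the grammar.
+

```python
def split_server(authority):
    """Split a server-based authority into (userinfo, host, port).

    An omitted part is returned as None; a port that is present but
    empty (as in "host:") is returned as "".
    """
    userinfo = None
    hostport = authority
    if "@" in authority:
        # "@" cannot occur inside userinfo or hostport, so the first
        # occurrence is the delimiter.
        userinfo, hostport = authority.split("@", 1)
    host, colon, port = hostport.partition(":")
    return userinfo, host, (port if colon else None)
```
+
+   Under these assumptions, "user@example.com:8080" yields the parts
+   "user", "example.com", and "8080".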
+
+3.3. Path Component
+
+   The path component contains data, specific to the authority (or the
+   scheme if there is no authority component), identifying the resource
+   within the scope of that scheme and authority.
+
+      path          = [ abs_path | opaque_part ]
+
+      path_segments = segment *( "/" segment )
+      segment       = *pchar *( ";" param )
+      param         = *pchar
+
+      pchar         = unreserved | escaped |
+                      ":" | "@" | "&" | "=" | "+" | "$" | ","
+
+   The path may consist of a sequence of path segments separated by a
+   single slash "/" character.  Within a path segment, the characters
+   "/", ";", "=", and "?" are reserved.  Each path segment may include a
+   sequence of parameters, indicated by the semicolon ";" character.
+   The parameters are not significant to the parsing of relative
+   references.
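+
+   A minimal, non-normative Python sketch of this structure follows.
+   It takes a path_segments string (the part after the leading slash
+   of an abs_path) and separates each segment from its parameters.
+   The function name is hypothetical, and escaped octets are left
+   untouched.
+

```python
def split_path_segments(path_segments):
    """Return a list of (segment_name, [params]) pairs."""
    result = []
    for segment in path_segments.split("/"):
        # Within a segment, ";" introduces each parameter.
        name, *params = segment.split(";")
        result.append((name, params))
    return result
```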
+
+3.4. Query Component
+
+   The query component is a string of information to be interpreted by
+   the resource.
+
+      query         = *uric
+
+   Within a query component, the characters ";", "/", "?", ":", "@",
+   "&", "=", "+", ",", and "$" are reserved.
+
+4. URI References
+
+   The term "URI-reference" is used here to denote the common usage of a
+   resource identifier.  A URI reference may be absolute or relative,
+   and may have additional information attached in the form of a
+   fragment identifier.  However, "the URI" that results from such a
+   reference includes only the absolute URI after the fragment
+   identifier (if any) is removed and after any relative URI is resolved
+   to its absolute form.  Although it is possible to limit the
+   discussion of URI syntax and semantics to that of the absolute
+   result, most usage of URI is within general URI references, and it is
+   impossible to obtain the URI from such a reference without also
+   parsing the fragment and resolving the relative form.
+
+      URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
+
+   The syntax for relative URI is a shortened form of that for absolute
+   URI, where some prefix of the URI is missing and certain path
+   components ("." and "..") have a special meaning when, and only when,
+   interpreting a relative path.  The relative URI syntax is defined in
+   Section 5.
+
+4.1. Fragment Identifier
+
+   When a URI reference is used to perform a retrieval action on the
+   identified resource, the optional fragment identifier, separated from
+   the URI by a crosshatch ("#") character, consists of additional
+   reference information to be interpreted by the user agent after the
+   retrieval action has been successfully completed.  As such, it is not
+   part of a URI, but is often used in conjunction with a URI.
+
+      fragment      = *uric
+
+   The semantics of a fragment identifier are a property of the data
+   resulting from a retrieval action, regardless of the type of URI used
+   in the reference.  Therefore, the format and interpretation of
+   fragment identifiers depend on the media type [RFC2046] of the
+   retrieval result.  The character restrictions described in Section 2
+   for URI also apply to the fragment in a URI-reference.  Individual
+   media types may define additional restrictions or structure within
+   the fragment for specifying different types of "partial views" that
+   can be identified within that media type.
+
+   A fragment identifier is only meaningful when a URI reference is
+   intended for retrieval and the result of that retrieval is a document
+   for which the identified fragment is consistently defined.
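+
+   In a valid URI reference the "#" character appears only as the
+   fragment delimiter, so the fragment can be separated at the first
+   "#".  The Python sketch below is illustrative; the function name
+   and the use of None for an absent fragment are assumptions.
+

```python
def split_fragment(reference):
    """Return (uri_part, fragment); fragment is None when "#" is absent."""
    uri_part, hash_sign, fragment = reference.partition("#")
    return uri_part, (fragment if hash_sign else None)
```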
+
+4.2. Same-document References
+
+   A URI reference that does not contain a URI is a reference to the
+   current document.  In other words, an empty URI reference within a
+   document is interpreted as a reference to the start of that document,
+   and a reference containing only a fragment identifier is a reference
+   to the identified fragment of that document.  Traversal of such a
+   reference should not result in an additional retrieval action.
+   However, if the URI reference occurs in a context that is always
+   intended to result in a new request, as in the case of HTML's FORM
+   element, then an empty URI reference represents the base URI of the
+   current document and should be replaced by that URI when transformed
+   into a request.
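+
+   One possible test for a same-document reference is sketched below
+   in Python.  The function name is hypothetical, and the special
+   handling of contexts that always issue a new request (such as an
+   HTML FORM) is left to the caller.
+

```python
def is_same_document(reference):
    """True when nothing precedes the optional "#fragment"."""
    return reference.partition("#")[0] == ""
```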
+
+4.3. Parsing a URI Reference
+
+   A URI reference is typically parsed according to the four main
+   components and fragment identifier in order to determine what
+   components are present and whether the reference is relative or
+   absolute.  The individual components are then parsed for their
+   subparts and, if not opaque, to verify their validity.
+
+   Although the BNF defines what is allowed in each component, it is
+   ambiguous in terms of differentiating between an authority component
+   and a path component that begins with two slash characters.  The
+   greedy algorithm is used for disambiguation: the left-most matching
+   rule soaks up as much of the URI reference string as it is capable of
+   matching.  In other words, the authority component wins.
+
+   Readers familiar with regular expressions should see Appendix B for a
+   concrete parsing example and test oracle.
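+
+   The greedy rule can be observed by applying the regular expression
+   given in Appendix B.  The Python wrapper below is a sketch rather
+   than a reference implementation; the name parse_reference and the
+   use of None for an undefined component are assumptions.
+

```python
import re

# The expression given in Appendix B of this document.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def parse_reference(reference):
    """Return (scheme, authority, path, query, fragment).

    Undefined components are None; the path is never None.
    """
    m = URI_RE.match(reference)  # every group is optional, so this matches
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)
```
+
+   With this sketch, the reference "//example.com/a" parses with the
+   authority "example.com" and the path "/a", rather than as a path
+   that begins with two slashes.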
+
+5. Relative URI References
+
+   It is often the case that a group or "tree" of documents has been
+   constructed to serve a common purpose; the vast majority of URI in
+   these documents point to resources within the tree rather than
+   outside of it.  Similarly, documents located at a particular site are
+   much more likely to refer to other resources at that site than to
+   resources at remote sites.
+
+   Relative addressing of URI allows document trees to be partially
+   independent of their location and access scheme.  For instance, it is
+   possible for a single set of hypertext documents to be simultaneously
+   accessible and traversable via each of the "file", "http", and "ftp"
+   schemes if the documents refer to each other using relative URI.
+   Furthermore, such document trees can be moved, as a whole, without
+   changing any of the relative references.  Experience within the WWW
+   has demonstrated that the ability to perform relative referencing is
+   necessary for the long-term usability of embedded URI.
+
+   The syntax for relative URI takes advantage of the <hier_part> syntax
+   of <absoluteURI> (Section 3) in order to express a reference that is
+   relative to the namespace of another hierarchical URI.
+
+      relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]
+
+   A relative reference beginning with two slash characters is termed a
+   network-path reference, as defined by <net_path> in Section 3.  Such
+   references are rarely used.
+
+   A relative reference beginning with a single slash character is
+   termed an absolute-path reference, as defined by <abs_path> in
+   Section 3.
+
+   A relative reference that does not begin with a scheme name or a
+   slash character is termed a relative-path reference.
+
+      rel_path      = rel_segment [ abs_path ]
+
+      rel_segment   = 1*( unreserved | escaped |
+                          ";" | "@" | "&" | "=" | "+" | "$" | "," )
+
+   Within a relative-path reference, the complete path segments "." and
+   ".." have special meanings: "the current hierarchy level" and "the
+   level above this hierarchy level", respectively.  Although this is
+   very similar to their use within Unix-based filesystems to indicate
+   directory levels, these path components are only considered special
+   when resolving a relative-path reference to its absolute form
+   (Section 5.2).
+
+   Authors should be aware that a path segment which contains a colon
+   character cannot be used as the first segment of a relative URI path
+   (e.g., "this:that"), because it would be mistaken for a scheme name.
+   It is therefore necessary to precede such segments with other
+   segments (e.g., "./this:that") in order for them to be referenced as
+   a relative path.
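+
+   The three kinds of relative reference, and the colon pitfall just
+   described, can be illustrated with a short Python sketch.  The
+   function name and the returned labels are assumptions; empty and
+   fragment-only references (Section 4.2) are not treated specially
+   and fall into the last case.
+

```python
import re

# A leading scheme name followed by ":" (see the scheme rule in Section 3).
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*:")


def classify_reference(reference):
    """Name the kind of reference, judged by its leading characters."""
    if SCHEME_RE.match(reference):
        return "absolute"
    if reference.startswith("//"):
        return "network-path"
    if reference.startswith("/"):
        return "absolute-path"
    return "relative-path"
```
+
+   Here "this:that" is classified as absolute, because "this" reads
+   as a scheme name, while "./this:that" is a relative-path reference.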
+
+   It is not necessary for all URI within a given scheme to be
+   restricted to the <hier_part> syntax, since the hierarchical
+   properties of that syntax are only necessary when relative URI are
+   used within a particular document.  Documents can only make use of
+   relative URI when their base URI fits within the <hier_part> syntax.
+   It is assumed that any document which contains a relative reference
+   will also have a base URI that obeys the syntax.  In other words,
+   relative URI cannot be used within a document that has an unsuitable
+   base URI.
+
+   Some URI schemes do not allow a hierarchical syntax matching the
+   <hier_part> syntax, and thus cannot use relative references.
+
+5.1. Establishing a Base URI
+
+   The term "relative URI" implies that there exists some absolute "base
+   URI" against which the relative reference is applied.  Indeed, the
+   base URI is necessary to define the semantics of any relative URI
+   reference; without it, a relative reference is meaningless.  In order
+   for relative URI to be usable within a document, the base URI of that
+   document must be known to the parser.
+
+   The base URI of a document can be established in one of four ways,
+   listed below in order of precedence.  The order of precedence can be
+   thought of in terms of layers, where the innermost defined base URI
+   has the highest precedence.  This can be visualized graphically as:
+
+      .----------------------------------------------------------.
+      |  .----------------------------------------------------.  |
+      |  |  .----------------------------------------------.  |  |
+      |  |  |  .----------------------------------------.  |  |  |
+      |  |  |  |  .----------------------------------.  |  |  |  |
+      |  |  |  |  |       <relative_reference>       |  |  |  |  |
+      |  |  |  |  `----------------------------------'  |  |  |  |
+      |  |  |  | (5.1.1) Base URI embedded in the       |  |  |  |
+      |  |  |  |         document's content             |  |  |  |
+      |  |  |  `----------------------------------------'  |  |  |
+      |  |  | (5.1.2) Base URI of the encapsulating entity |  |  |
+      |  |  |         (message, document, or none).        |  |  |
+      |  |  `----------------------------------------------'  |  |
+      |  | (5.1.3) URI used to retrieve the entity            |  |
+      |  `----------------------------------------------------'  |
+      | (5.1.4) Default Base URI is application-dependent        |
+      `----------------------------------------------------------'
+
+5.1.1. Base URI within Document Content
+
+   Within certain document media types, the base URI of the document can
+   be embedded within the content itself such that it can be readily
+   obtained by a parser.  This can be useful for descriptive documents,
+   such as tables of contents, which may be transmitted to others through
+   protocols other than their usual retrieval context (e.g., E-Mail or
+   USENET news).
+
+   It is beyond the scope of this document to specify how, for each
+   media type, the base URI can be embedded.  It is assumed that user
+   agents manipulating such media types will be able to obtain the
+   appropriate syntax from that media type's specification.  An example
+   of how the base URI can be embedded in the Hypertext Markup Language
+   (HTML) [RFC1866] is provided in Appendix D.
+
+   A mechanism for embedding the base URI within MIME container types
+   (e.g., the message and multipart types) is defined by MHTML
+   [RFC2110].  Protocols that do not use the MIME message header syntax,
+   but which do allow some form of tagged metainformation to be included
+   within messages, may define their own syntax for defining the base
+   URI as part of a message.
+
+5.1.2. Base URI from the Encapsulating Entity
+
+   If no base URI is embedded, the base URI of a document is defined by
+   the document's retrieval context.  For a document that is enclosed
+   within another entity (such as a message or another document), the
+   retrieval context is that entity; thus, the default base URI of the
+   document is the base URI of the entity in which the document is
+   encapsulated.
+
+5.1.3. Base URI from the Retrieval URI
+
+   If no base URI is embedded and the document is not encapsulated
+   within some other entity (e.g., the top level of a composite entity),
+   then, if a URI was used to retrieve the base document, that URI shall
+   be considered the base URI.  Note that if the retrieval was the
+   result of a redirected request, the last URI used (i.e., that which
+   resulted in the actual retrieval of the document) is the base URI.
+
+5.1.4. Default Base URI
+
+   If none of the conditions described in Sections 5.1.1 through 5.1.3
+   apply, then the base URI is defined by the context of the
+   application.  Since this definition is necessarily
+   application-dependent, failing
+   to define the base URI using one of the other methods may result in
+   the same content being interpreted differently by different types of
+   application.
+
+   It is the responsibility of the distributor(s) of a document
+   containing relative URI to ensure that the base URI for that document
+   can be established.  It must be emphasized that relative URI cannot
+   be used reliably in situations where the document's base URI is not
+   well-defined.
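+
+   The order of precedence in Sections 5.1.1 through 5.1.4 amounts to
+   taking the first base URI that is defined.  The Python sketch below
+   assumes the caller has already extracted each candidate; the
+   function and parameter names are hypothetical.
+

```python
def choose_base_uri(embedded=None, encapsulating=None,
                    retrieval=None, default=None):
    """Return the highest-precedence base URI that is defined.

    The parameters follow Sections 5.1.1 to 5.1.4 in order; None
    means that no base URI is available at that layer.
    """
    for candidate in (embedded, encapsulating, retrieval, default):
        if candidate is not None:
            return candidate
    return None  # no base URI could be established
```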
+
+5.2. Resolving Relative References to Absolute Form
+
+   This section describes an example algorithm for resolving URI
+   references that might be relative to a given base URI.
+
+   The base URI is established according to the rules of Section 5.1 and
+   parsed into the four main components as described in Section 3.  Note
+   that only the scheme component is required to be present in the base
+   URI; the other components may be empty or undefined.  A component is
+   undefined if its preceding separator does not appear in the URI
+   reference; the path component is never undefined, though it may be
+   empty.  The base URI's query component is not used by the resolution
+   algorithm and may be discarded.
+
+   For each URI reference, the following steps are performed in order:
+
+   1) The URI reference is parsed into the potential four components and
+      fragment identifier, as described in Section 4.3.
+
+   2) If the path component is empty and the scheme, authority, and
+      query components are undefined, then it is a reference to the
+      current document and we are done.  Otherwise, the reference URI's
+      query and fragment components are defined as found (or not found)
+      within the URI reference and not inherited from the base URI.
+
+   3) If the scheme component is defined, indicating that the reference
+      starts with a scheme name, then the reference is interpreted as an
+      absolute URI and we are done.  Otherwise, the reference URI's
+      scheme is inherited from the base URI's scheme component.
+
+      Due to a loophole in prior specifications [RFC1630], some parsers
+      allow the scheme name to be present in a relative URI if it is the
+      same as the base URI scheme.  Unfortunately, this can conflict
+      with the correct parsing of non-hierarchical URI.  For backwards
+      compatibility, an implementation may work around such references
+      by removing the scheme if it matches that of the base URI and the
+      scheme is known to always use the <hier_part> syntax.  The parser
+      can then continue with the steps below for the remainder of the
+      reference components.  Validating parsers should mark such a
+      malformed relative reference as an error.
+
+   4) If the authority component is defined, then the reference is a
+      network-path and we skip to step 7.  Otherwise, the reference
+      URI's authority is inherited from the base URI's authority
+      component, which will also be undefined if the URI scheme does not
+      use an authority component.
+
+   5) If the path component begins with a slash character ("/"), then
+      the reference is an absolute-path and we skip to step 7.
+
+   6) If this step is reached, then we are resolving a relative-path
+      reference.  The relative path needs to be merged with the base
+      URI's path.  Although there are many ways to do this, we will
+      describe a simple method using a separate string buffer.
+
+      a) All but the last segment of the base URI's path component is
+         copied to the buffer.  In other words, any characters after the
+         last (right-most) slash character, if any, are excluded.
+
+      b) The reference's path component is appended to the buffer
+         string.
+
+      c) All occurrences of "./", where "." is a complete path segment,
+         are removed from the buffer string.
+
+      d) If the buffer string ends with "." as a complete path segment,
+         that "." is removed.
+
+      e) All occurrences of "<segment>/../", where <segment> is a
+         complete path segment not equal to "..", are removed from the
+         buffer string.  Removal of these path segments is performed
+         iteratively, removing the leftmost matching pattern on each
+         iteration, until no matching pattern remains.
+
+      f) If the buffer string ends with "<segment>/..", where <segment>
+         is a complete path segment not equal to "..", that
+         "<segment>/.." is removed.
+
+      g) If the resulting buffer string still begins with one or more
+         complete path segments of "..", then the reference is
+         considered to be in error.  Implementations may handle this
+         error by retaining these components in the resolved path (i.e.,
+         treating them as part of the final URI), by removing them from
+         the resolved path (i.e., discarding relative levels above the
+         root), or by avoiding traversal of the reference.
+
+      h) The remaining buffer string is the reference URI's new path
+         component.
+
+   7) The resulting URI components, including any inherited from the
+      base URI, are recombined to give the absolute form of the URI
+      reference.  Using pseudocode, this would be
+
+         result = ""
+
+         if scheme is defined then
+             append scheme to result
+             append ":" to result
+
+         if authority is defined then
+             append "//" to result
+             append authority to result
+
+         append path to result
+
+         if query is defined then
+             append "?" to result
+             append query to result
+
+         if fragment is defined then
+             append "#" to result
+             append fragment to result
+
+         return result
+
+      Note that we must be careful to preserve the distinction between a
+      component that is undefined, meaning that its separator was not
+      present in the reference, and a component that is empty, meaning
+      that the separator was present and was immediately followed by the
+      next component separator or the end of the reference.
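+
+   The pseudocode of step 7 translates directly into Python.  In this
+   sketch, which is illustrative only, an undefined component is
+   represented by None and an empty one by the empty string; that
+   representation and the function name are assumptions.
+

```python
def recombine(scheme, authority, path, query, fragment):
    """Rebuild a URI string; None marks an undefined component."""
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if authority is not None:
        result += "//" + authority
    result += path  # the path is never undefined, though it may be empty
    if query is not None:
        result += "?" + query
    if fragment is not None:
        result += "#" + fragment
    return result
```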
+
+   The above algorithm is intended to provide an example by which the
+   output of implementations can be tested -- implementation of the
+   algorithm itself is not required.  For example, some systems may find
+   it more efficient to implement step 6 as a pair of segment stacks
+   being merged, rather than as a series of string pattern replacements.
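+
+   As one possible rendering of step 6, the Python sketch below
+   follows the string-buffer method using regular expressions.  It is
+   not normative.  The function name is hypothetical, an empty
+   segment is assumed not to count as a removable <segment>, and any
+   leading ".." segments are retained, which is one of the error
+   handling choices permitted by step 6g.
+

```python
import re

# A "." or ".." counts only when it is a complete path segment, that is,
# when it sits at the start of the buffer or directly after a "/".
SEG_START = r"(?:^|(?<=/))"
DOTDOT_MID = re.compile(SEG_START + r"(?!\.\./)[^/]+/\.\./")
DOTDOT_END = re.compile(SEG_START + r"(?!\.\./)[^/]+/\.\.$")


def merge_paths(base_path, ref_path):
    """Merge a relative path with a base path (steps 6a to 6h)."""
    # a) and b): base path up to its last slash, then the reference path
    buf = base_path[: base_path.rfind("/") + 1] + ref_path
    # c) remove every "./" in which "." is a complete segment
    buf = re.sub(SEG_START + r"\./", "", buf)
    # d) remove a trailing "." that is a complete segment
    buf = re.sub(SEG_START + r"\.$", "", buf)
    # e) remove the leftmost "<segment>/../" until none remains
    while True:
        buf, count = DOTDOT_MID.subn("", buf, count=1)
        if count == 0:
            break
    # f) remove a trailing "<segment>/.."
    buf = DOTDOT_END.sub("", buf)
    # g) leading ".." segments, if any, are kept as part of the path
    # h) the buffer is the new path component
    return buf
```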
+
+      Note: Some WWW client applications will fail to separate the
+      reference's query component from its path component before merging
+      the base and reference paths in step 6 above.  This may result in
+      a loss of information if the query component contains the strings
+      "/../" or "/./".
+
+   Resolution examples are provided in Appendix C.
+
+6. URI Normalization and Equivalence
+
+   In many cases, different URI strings may identify the same
+   resource.  For example, the host names used in URL are case
+   insensitive, so the URL <http://www.XEROX.com> is equivalent to
+   <http://www.xerox.com>.  In general, the rules for equivalence and
+   the definition of a normal form, if any, are scheme dependent.  When
+   a scheme uses elements of the common syntax, it will also use the
+   common syntax equivalence rules, namely that the scheme and hostname
+   are case insensitive and that a URL with an explicit ":port", where
+   the port is the default for the scheme, is equivalent to one where
+   the port is elided.
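+
+   The common syntax equivalence rules can be sketched in Python as
+   follows.  This is an illustration, not a normative algorithm: the
+   function name is hypothetical, the table of default ports lists
+   only two well-known examples, and scheme-specific rules are not
+   applied.
+

```python
import re

# The expression given in Appendix B of this document.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

# Illustrative entries only; each scheme defines its own default port.
DEFAULT_PORTS = {"http": "80", "ftp": "21"}


def normalize(uri):
    """Lower-case the scheme and host and drop an explicit default port."""
    m = URI_RE.match(uri)
    scheme, authority = m.group(2), m.group(4)
    if scheme is None or authority is None:
        return uri  # the common syntax rules do not apply
    scheme = scheme.lower()
    userinfo, at, hostport = authority.rpartition("@")
    host, colon, port = hostport.partition(":")
    if port == DEFAULT_PORTS.get(scheme):
        colon = port = ""
    authority = userinfo + at + host.lower() + colon + port
    # Everything after the authority is copied unchanged.
    return scheme + "://" + authority + uri[m.end(4):]
```
+
+   Two URL can then be compared by comparing their normalized forms.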
+
+7. Security Considerations
+
+   A URI does not in itself pose a security threat.  Users should beware
+   that there is no general guarantee that a URL, which at one time
+   located a given resource, will continue to do so.  Nor is there any
+   guarantee that a URL will not locate a different resource at some
+   later point in time, due to the lack of any constraint on how a given
+   authority apportions its namespace.  Such a guarantee can only be
+   obtained from the person(s) controlling that namespace and the
+   resource in question.  A specific URI scheme may include additional
+   semantics, such as name persistence, if those semantics are required
+   of all naming authorities for that scheme.
+
+   It is sometimes possible to construct a URL such that an attempt to
+   perform a seemingly harmless, idempotent operation, such as the
+   retrieval of an entity associated with the resource, will in fact
+   cause a possibly damaging remote operation to occur.  The unsafe URL
+   is typically constructed by specifying a port number other than that
+   reserved for the network protocol in question.  The client
+   unwittingly contacts a site that is in fact running a different
+   protocol.  The content of the URL contains instructions that, when
+   interpreted according to this other protocol, cause an unexpected
+   operation.  An example has been the use of a gopher URL to cause an
+   unintended or impersonating message to be sent via an SMTP server.
+
+   Caution should be exercised when using any URL that specifies a port
+   number other than the default for the protocol, especially when it is
+   a number within the reserved space.
+
+   Care should be taken when a URL contains escaped delimiters for a
+   given protocol (for example, CR and LF characters for telnet
+   protocols) that these are not unescaped before transmission.  This
+   might violate the protocol, but avoids the potential for such
+   characters to be used to simulate an extra operation or parameter in
+   that protocol, which might lead to an unexpected and possibly harmful
+   remote operation being performed.
+
+   It is clearly unwise to use a URL that contains a password which is
+   intended to be secret.  In particular, the use of a password within
+   the 'userinfo' component of a URL is strongly discouraged except in
+   those rare cases where the 'password' parameter is intended to be
+   public.
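+
+   A client might apply the cautions of this section with checks like
+   those in the Python sketch below.  It is illustrative only.  The
+   function name, the wording of the warnings, the short table of
+   default ports, and the reading of "reserved space" as ports below
+   1024 are all assumptions of the example.
+

```python
import re

# The expression given in Appendix B of this document.
URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

# Illustrative entries only; each scheme defines its own default port.
DEFAULT_PORTS = {"http": "80", "ftp": "21", "gopher": "70"}


def security_warnings(url):
    """Return a list of warning strings for a URL; empty if none apply."""
    warnings = []
    m = URI_RE.match(url)
    scheme = (m.group(2) or "").lower()
    authority = m.group(4) or ""
    userinfo, at, hostport = authority.rpartition("@")
    port = hostport.partition(":")[2]
    if at and ":" in userinfo:
        warnings.append("userinfo appears to contain a password")
    if port and port != DEFAULT_PORTS.get(scheme):
        note = "non-default port " + port
        # Assumption: the reserved space is taken to be ports below 1024.
        if port.isdecimal() and int(port) < 1024:
            note += " (assumed reserved range)"
        warnings.append(note)
    if re.search(r"%0[AaDd]", url):
        warnings.append("escaped CR or LF present")
    return warnings
```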
+
+8. Acknowledgements
+
+   This document was derived from RFC 1738 [RFC1738] and RFC 1808
+   [RFC1808]; the acknowledgements in those specifications still apply.
+   In addition, contributions by Gisle Aas, Martin Beet, Martin Duerst,
+   Jim Gettys, Martijn Koster, Dave Kristol, Daniel LaLiberte, Foteos
+   Macrides, James Marshall, Ryan Moats, Keith Moore, and Lauren Wood
+   are gratefully acknowledged.
+
+9. References
+
+   [RFC2277] Alvestrand, H., "IETF Policy on Character Sets and
+             Languages", BCP 18, RFC 2277, January 1998.
+
+   [RFC1630] Berners-Lee, T., "Universal Resource Identifiers in WWW: A
+             Unifying Syntax for the Expression of Names and Addresses
+             of Objects on the Network as used in the World-Wide Web",
+             RFC 1630, June 1994.
+
+   [RFC1738] Berners-Lee, T., Masinter, L., and M. McCahill, Editors,
+             "Uniform Resource Locators (URL)", RFC 1738, December 1994.
+
+   [RFC1866] Berners-Lee, T., and D. Connolly, "HyperText Markup Language
+             Specification -- 2.0", RFC 1866, November 1995.
+
+   [RFC1123] Braden, R., Editor, "Requirements for Internet Hosts --
+             Application and Support", STD 3, RFC 1123, October 1989.
+
+   [RFC822]  Crocker, D., "Standard for the Format of ARPA Internet Text
+             Messages", STD 11, RFC 822, August 1982.
+
+   [RFC1808] Fielding, R., "Relative Uniform Resource Locators", RFC
+             1808, June 1995.
+
+   [RFC2046] Freed, N., and N. Borenstein, "Multipurpose Internet Mail
+             Extensions (MIME) Part Two: Media Types", RFC 2046,
+             November 1996.
+
+   [RFC1736] Kunze, J., "Functional Recommendations for Internet
+             Resource Locators", RFC 1736, February 1995.
+
+   [RFC2141] Moats, R., "URN Syntax", RFC 2141, May 1997.
+
+   [RFC1034] Mockapetris, P., "Domain Names - Concepts and Facilities",
+             STD 13, RFC 1034, November 1987.
+
+   [RFC2110] Palme, J., and A. Hopmann, "MIME E-mail Encapsulation of
+             Aggregate Documents, such as HTML (MHTML)", RFC 2110, March
+             1997.
+
+   [RFC1737] Sollins, K., and L. Masinter, "Functional Requirements for
+             Uniform Resource Names", RFC 1737, December 1994.
+
+   [ASCII]   US-ASCII. "Coded Character Set -- 7-bit American Standard
+             Code for Information Interchange", ANSI X3.4-1986.
+
+   [UTF-8]   Yergeau, F., "UTF-8, a transformation format of ISO 10646",
+             RFC 2279, January 1998.
+
+10. Authors' Addresses
+
+   Tim Berners-Lee
+   World Wide Web Consortium
+   MIT Laboratory for Computer Science, NE43-356
+   545 Technology Square
+   Cambridge, MA 02139
+
+   Fax: +1(617)258-8682
+   EMail: timbl@w3.org
+
+
+   Roy T. Fielding
+   Department of Information and Computer Science
+   University of California, Irvine
+   Irvine, CA  92697-3425
+
+   Fax: +1(949)824-1715
+   EMail: fielding@ics.uci.edu
+
+
+   Larry Masinter
+   Xerox PARC
+   3333 Coyote Hill Road
+   Palo Alto, CA 94034
+
+   Fax: +1(415)812-4333
+   EMail: masinter@parc.xerox.com
+
+A. Collected BNF for URI
+
+      URI-reference = [ absoluteURI | relativeURI ] [ "#" fragment ]
+      absoluteURI   = scheme ":" ( hier_part | opaque_part )
+      relativeURI   = ( net_path | abs_path | rel_path ) [ "?" query ]
+
+      hier_part     = ( net_path | abs_path ) [ "?" query ]
+      opaque_part   = uric_no_slash *uric
+
+      uric_no_slash = unreserved | escaped | ";" | "?" | ":" | "@" |
+                      "&" | "=" | "+" | "$" | ","
+
+      net_path      = "//" authority [ abs_path ]
+      abs_path      = "/"  path_segments
+      rel_path      = rel_segment [ abs_path ]
+
+      rel_segment   = 1*( unreserved | escaped |
+                          ";" | "@" | "&" | "=" | "+" | "$" | "," )
+
+      scheme        = alpha *( alpha | digit | "+" | "-" | "." )
+
+      authority     = server | reg_name
+
+      reg_name      = 1*( unreserved | escaped | "$" | "," |
+                          ";" | ":" | "@" | "&" | "=" | "+" )
+
+      server        = [ [ userinfo "@" ] hostport ]
+      userinfo      = *( unreserved | escaped |
+                         ";" | ":" | "&" | "=" | "+" | "$" | "," )
+
+      hostport      = host [ ":" port ]
+      host          = hostname | IPv4address
+      hostname      = *( domainlabel "." ) toplabel [ "." ]
+      domainlabel   = alphanum | alphanum *( alphanum | "-" ) alphanum
+      toplabel      = alpha | alpha *( alphanum | "-" ) alphanum
+      IPv4address   = 1*digit "." 1*digit "." 1*digit "." 1*digit
+      port          = *digit
+
+      path          = [ abs_path | opaque_part ]
+      path_segments = segment *( "/" segment )
+      segment       = *pchar *( ";" param )
+      param         = *pchar
+      pchar         = unreserved | escaped |
+                      ":" | "@" | "&" | "=" | "+" | "$" | ","
+
+      query         = *uric
+
+      fragment      = *uric
+
+      uric          = reserved | unreserved | escaped
+      reserved      = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
+                      "$" | ","
+      unreserved    = alphanum | mark
+      mark          = "-" | "_" | "." | "!" | "~" | "*" | "'" |
+                      "(" | ")"
+
+      escaped       = "%" hex hex
+      hex           = digit | "A" | "B" | "C" | "D" | "E" | "F" |
+                              "a" | "b" | "c" | "d" | "e" | "f"
+
+      alphanum      = alpha | digit
+      alpha         = lowalpha | upalpha
+
+      lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" |
+                 "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" |
+                 "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
+      upalpha  = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
+                 "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
+                 "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"
+      digit    = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
+                 "8" | "9"
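+
+   As an illustration only, a few of the rules above can be transcribed
+   into regular expressions.  The sketch below is not part of the
+   grammar; the names SCHEME, ABS_PATH, is_scheme and is_abs_path are
+   hypothetical, and only the "scheme" and "abs_path" productions are
+   covered.

```python
import re

# Character classes transcribed from the collected BNF above.
ALPHA = "A-Za-z"
DIGIT = "0-9"
MARK = r"\-_.!~*'()"                    # mark
UNRESERVED = f"{ALPHA}{DIGIT}{MARK}"    # unreserved = alphanum | mark
ESCAPED = r"%[0-9A-Fa-f]{2}"            # escaped = "%" hex hex

# scheme = alpha *( alpha | digit | "+" | "-" | "." )
SCHEME = re.compile(rf"[{ALPHA}][{ALPHA}{DIGIT}+\-.]*")

# pchar = unreserved | escaped | ":" | "@" | "&" | "=" | "+" | "$" | ","
PCHAR = rf"(?:[{UNRESERVED}:@&=+$,]|{ESCAPED})"
# segment = *pchar *( ";" param ), with param = *pchar
SEGMENT = rf"{PCHAR}*(?:;{PCHAR}*)*"
# abs_path = "/" path_segments, path_segments = segment *( "/" segment )
ABS_PATH = re.compile(rf"/{SEGMENT}(?:/{SEGMENT})*")


def is_scheme(text):
    """True if the whole string matches the 'scheme' rule."""
    return SCHEME.fullmatch(text) is not None


def is_abs_path(text):
    """True if the whole string matches the 'abs_path' rule."""
    return ABS_PATH.fullmatch(text) is not None
```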
+
+B. Parsing a URI Reference with a Regular Expression
+
+   As described in Section 4.3, the generic URI syntax is not sufficient
+   to disambiguate the components of some forms of URI.  Since the
+   "greedy algorithm" described in that section is identical to the
+   disambiguation method used by POSIX regular expressions, it is
+   natural and commonplace to use a regular expression for parsing the
+   potential four components and fragment identifier of a URI reference.
+
+   The following line is the regular expression for breaking down a URI
+   reference into its components.
+
+      ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
+       12            3  4          5       6  7        8 9
+
+   The numbers in the second line above are only to assist readability;
+   they indicate the reference points for each subexpression (i.e., each
+   paired parenthesis).  We refer to the value matched for subexpression
+   <n> as $<n>.  For example, matching the above expression to
+
+      http://www.ics.uci.edu/pub/ietf/uri/#Related
+
+   results in the following subexpression matches:
+
+      $1 = http:
+      $2 = http
+      $3 = //www.ics.uci.edu
+      $4 = www.ics.uci.edu
+      $5 = /pub/ietf/uri/
+      $6 = <undefined>
+      $7 = <undefined>
+      $8 = #Related
+      $9 = Related
+
+   where <undefined> indicates that the component is not present, as is
+   the case for the query component in the above example.  Therefore, we
+   can determine the value of the four components and fragment as
+
+      scheme    = $2
+      authority = $4
+      path      = $5
+      query     = $7
+      fragment  = $9
+
+   and, going in the opposite direction, we can recreate a URI reference
+   from its components using the algorithm in step 7 of Section 5.2.
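+
+   A minimal sketch of this appendix in Python follows.  The regular
+   expression is the one given above; the function names
+   split_uri_reference and recompose are hypothetical, and recompose
+   assumes the recombination described in step 7 of Section 5.2 (each
+   defined component is written back with its delimiter).  An
+   <undefined> subexpression is represented as None.

```python
import re

# The expression from this appendix, unchanged.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri_reference(reference):
    """Split a URI reference into its four components and fragment.

    A component that is not present is returned as None; the path is
    always a string, possibly empty.
    """
    match = URI_REFERENCE.match(reference)  # every part is optional
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


def recompose(parts):
    """Rebuild a URI reference from the dictionary produced above."""
    result = ""
    if parts["scheme"] is not None:
        result += parts["scheme"] + ":"
    if parts["authority"] is not None:
        result += "//" + parts["authority"]
    result += parts["path"]
    if parts["query"] is not None:
        result += "?" + parts["query"]
    if parts["fragment"] is not None:
        result += "#" + parts["fragment"]
    return result


parts = split_uri_reference("http://www.ics.uci.edu/pub/ietf/uri/#Related")
```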
+
+C. Examples of Resolving Relative URI References
+
+   Within an object with a well-defined base URI of
+
+      http://a/b/c/d;p?q
+
+   the relative URI would be resolved as follows:
+
+C.1.  Normal Examples
+
+      g:h           =  g:h
+      g             =  http://a/b/c/g
+      ./g           =  http://a/b/c/g
+      g/            =  http://a/b/c/g/
+      /g            =  http://a/g
+      //g           =  http://g
+      ?y            =  http://a/b/c/?y
+      g?y           =  http://a/b/c/g?y
+      #s            =  (current document)#s
+      g#s           =  http://a/b/c/g#s
+      g?y#s         =  http://a/b/c/g?y#s
+      ;x            =  http://a/b/c/;x
+      g;x           =  http://a/b/c/g;x
+      g;x?y#s       =  http://a/b/c/g;x?y#s
+      .             =  http://a/b/c/
+      ./            =  http://a/b/c/
+      ..            =  http://a/b/
+      ../           =  http://a/b/
+      ../g          =  http://a/b/g
+      ../..         =  http://a/
+      ../../        =  http://a/
+      ../../g       =  http://a/g
+
+C.2.  Abnormal Examples
+
+   Although the following abnormal examples are unlikely to occur in
+   normal practice, all URI parsers should be capable of resolving them
+   consistently.  Each example uses the same base as above.
+
+   An empty reference refers to the start of the current document.
+
+      <>            =  (current document)
+
+   Parsers must be careful in handling the case where there are more
+   relative path ".." segments than there are hierarchical levels in the
+   base URI's path.  Note that the ".." syntax cannot be used to change
+   the authority component of a URI.
+
+      ../../../g    =  http://a/../g
+      ../../../../g =  http://a/../../g
+
+   In practice, some implementations strip leading relative symbolic
+   elements (".", "..") after applying a relative URI calculation, based
+   on the theory that compensating for obvious author errors is better
+   than allowing the request to fail.  Thus, the above two references
+   will be interpreted as "http://a/g" by some implementations.
+
+   Similarly, parsers must avoid treating "." and ".." as special when
+   they are not complete components of a relative path.
+
+      /./g          =  http://a/./g
+      /../g         =  http://a/../g
+      g.            =  http://a/b/c/g.
+      .g            =  http://a/b/c/.g
+      g..           =  http://a/b/c/g..
+      ..g           =  http://a/b/c/..g
+
+   Less likely are cases where the relative URI uses unnecessary or
+   nonsensical forms of the "." and ".." complete path segments.
+
+      ./../g        =  http://a/b/g
+      ./g/.         =  http://a/b/c/g/
+      g/./h         =  http://a/b/c/g/h
+      g/../h        =  http://a/b/c/h
+      g;x=1/./y     =  http://a/b/c/g;x=1/y
+      g;x=1/../y    =  http://a/b/c/y
+
+   All client applications remove the query component from the base URI
+   before resolving relative URI.  However, some applications fail to
+   separate the reference's query and/or fragment components from a
+   relative path before merging it with the base path.  This error is
+   rarely noticed, since typical usage of a fragment never includes the
+   hierarchy ("/") character, and the query component is not normally
+   used within relative references.
+
+      g?y/./x       =  http://a/b/c/g?y/./x
+      g?y/../x      =  http://a/b/c/g?y/../x
+      g#s/./x       =  http://a/b/c/g#s/./x
+      g#s/../x      =  http://a/b/c/g#s/../x
+
+   Some parsers allow the scheme name to be present in a relative URI if
+   it is the same as the base URI scheme.  This is considered to be a
+   loophole in prior specifications of partial URI [RFC1630]. Its use
+   should be avoided.
+
+      http:g        =  http:g           ; for validating parsers
+                    |  http://a/b/c/g   ; for backwards compatibility
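+
+   The examples in this appendix can be reproduced with a short
+   sketch of the resolution steps of Section 5.2.  It is an
+   illustration under stated assumptions, not a reference
+   implementation: the function names are hypothetical, the base URI
+   is assumed to be absolute with a path beginning with "/", excess
+   ".." segments are retained as in the examples above, a reference
+   carrying a scheme is returned unchanged (the validating
+   behaviour), and the string "(current document)" is only a
+   placeholder mirroring the notation used here.

```python
import re

# Regular expression from Appendix B; groups 2, 4, 5, 7 and 9 hold the
# scheme, authority, path, query and fragment.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)
CURRENT_DOCUMENT = "(current document)"  # placeholder, as in the examples


def split_reference(reference):
    m = URI_REFERENCE.match(reference)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)


def merge_paths(base_path, ref_path):
    """Step 6: resolve a relative path against the base path."""
    # 6a, 6b: keep the base path up to its last "/", append the reference.
    buffer = base_path[: base_path.rfind("/") + 1] + ref_path
    segments = buffer.split("/")
    first = 1 if buffer.startswith("/") else 0  # skip "" before a leading "/"
    # 6c: drop "." segments that are followed by "/".
    # 6d: drop a trailing "." but keep the slash before it.
    last = "" if segments[-1] == "." else segments[-1]
    middle = [s for s in segments[first:-1] if s != "."]
    segments = segments[:first] + middle + [last]
    # 6e, 6f: repeatedly remove the leftmost "<segment>/.." pair, where
    # <segment> is not itself "..".
    i = first
    while i + 1 < len(segments):
        if segments[i] != ".." and segments[i + 1] == "..":
            # A pair at the very end leaves an empty final segment, so
            # the result keeps its trailing "/".
            segments[i : i + 2] = [] if i + 2 < len(segments) else [""]
            i = first
        else:
            i += 1
    # 6g: any ".." still at the front is retained.
    return "/".join(segments)


def resolve(base, reference):
    scheme, authority, path, query, fragment = split_reference(reference)
    suffix = "" if fragment is None else "#" + fragment
    # Step 2: nothing but an optional fragment means "current document".
    if path == "" and scheme is None and authority is None and query is None:
        return CURRENT_DOCUMENT + suffix
    # Step 3: a reference with a scheme is treated as absolute.
    if scheme is not None:
        return reference
    base_scheme, base_authority, base_path, _, _ = split_reference(base)
    # Steps 4 to 6: inherit the authority unless the reference has one,
    # and merge the paths unless the reference path is absolute.
    if authority is None:
        authority = base_authority
        if not path.startswith("/"):
            path = merge_paths(base_path, path)
    # Step 7: recombine the components.
    result = base_scheme + ":"
    if authority is not None:
        result += "//" + authority
    result += path
    if query is not None:
        result += "?" + query
    return result + suffix
```

+   For instance, resolve("http://a/b/c/d;p?q", "../g") yields
+   "http://a/b/g", matching the table in C.1.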
+
+D. Embedding the Base URI in HTML documents
+
+   It is useful to consider an example of how the base URI of a document
+   can be embedded within the document's content.  In this appendix, we
+   describe how documents written in the Hypertext Markup Language
+   (HTML) [RFC1866] can include an embedded base URI.  This appendix
+   does not form a part of the URI specification and should not be
+   considered as anything more than a descriptive example.
+
+   HTML defines a special element "BASE" which, when present in the
+   "HEAD" portion of a document, signals that the parser should use the
+   BASE element's "HREF" attribute as the base URI for resolving any
+   relative URI.  The "HREF" attribute must be an absolute URI.  Note
+   that, in HTML, element and attribute names are case-insensitive.  For
+   example:
+
+      <!doctype html public "-//IETF//DTD HTML//EN">
+      <HTML><HEAD>
+      <TITLE>An example HTML document</TITLE>
+      <BASE href="http://www.ics.uci.edu/Test/a/b/c">
+      </HEAD><BODY>
+      ... <A href="../x">a hypertext anchor</A> ...
+      </BODY></HTML>
+
+   A parser reading the example document should interpret the given
+   relative URI "../x" as representing the absolute URI
+
+      <http://www.ics.uci.edu/Test/a/x>
+
+   regardless of the context in which the example document was obtained.
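+
+   The behaviour described above can be sketched with the Python
+   standard library.  The class name BaseAwareLinkParser and the
+   retrieval URI "http://example.org/elsewhere/doc.html" are
+   hypothetical.  Note that urllib.parse.urljoin implements a later
+   revision of the resolution rules than this document; the two agree
+   for the "../x" reference used here, but that agreement should not
+   be assumed for every case in Appendix C.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

EXAMPLE_DOCUMENT = """<!doctype html public "-//IETF//DTD HTML//EN">
<HTML><HEAD>
<TITLE>An example HTML document</TITLE>
<BASE href="http://www.ics.uci.edu/Test/a/b/c">
</HEAD><BODY>
... <A href="../x">a hypertext anchor</A> ...
</BODY></HTML>
"""


class BaseAwareLinkParser(HTMLParser):
    """Collect anchor targets, honouring an embedded BASE element."""

    def __init__(self, retrieval_uri):
        super().__init__()
        # Until a BASE element is seen, the retrieval URI is the base.
        self.base = retrieval_uri
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases element and attribute names, which
        # matches the case-insensitivity noted above.
        href = dict(attrs).get("href")
        if href is None:
            return
        if tag == "base":
            self.base = href
        elif tag == "a":
            self.links.append(urljoin(self.base, href))


parser = BaseAwareLinkParser("http://example.org/elsewhere/doc.html")
parser.feed(EXAMPLE_DOCUMENT)
links = parser.links
```

+   The retrieval URI passed to the constructor has no effect on the
+   result, because the embedded BASE element takes precedence.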
+
+E. Recommendations for Delimiting URI in Context
+
+   URI are often transmitted through formats that do not provide a clear
+   context for their interpretation.  For example, there are many
+   occasions when URI are included in plain text; examples include text
+   sent in electronic mail, USENET news messages, and, most importantly,
+   printed on paper.  In such cases, it is important to be able to
+   delimit the URI from the rest of the text, and in particular from
+   punctuation marks that might be mistaken for part of the URI.
+
+   In practice, URI are delimited in a variety of ways, but usually
+   within double-quotes "http://test.com/", angle brackets
+   <http://test.com/>, or just using whitespace
+
+                             http://test.com/
+
+   These wrappers do not form part of the URI.
+
+   In the case where a fragment identifier is associated with a URI
+   reference, the fragment would be placed within the brackets as well
+   (separated from the URI with a "#" character).
+
+   In some cases, extra whitespace (spaces, linebreaks, tabs, etc.) may
+   need to be added to break long URI across lines. The whitespace
+   should be ignored when extracting the URI.
+
+   No whitespace should be introduced after a hyphen ("-") character.
+   Because some typesetters and printers may (erroneously) introduce a
+   hyphen at the end of line when breaking a line, the interpreter of a
+   URI containing a line break immediately after a hyphen should ignore
+   all unescaped whitespace around the line break, and should be aware
+   that the hyphen may or may not actually be part of the URI.
+
+   Using <> angle brackets around each URI is especially recommended as
+   a delimiting style for URI that contain whitespace.
+
+   The prefix "URL:" (with or without a trailing space) was recommended
+   as a way to help distinguish a URL from other bracketed
+   designators, although this is not common in practice.
+
+   For robustness, software that accepts user-typed URI should attempt
+   to recognize and strip both delimiters and embedded whitespace.
+
+   For example, the text:
+
+      Yes, Jim, I found it under "http://www.w3.org/Addressing/",
+      but you can probably pick it up from <ftp://ds.internic.
+      net/rfc/>.  Note the warning in <http://www.ics.uci.edu/pub/
+      ietf/uri/historical.html#WARNING>.
+
+   contains the URI references
+
+      http://www.w3.org/Addressing/
+      ftp://ds.internic.net/rfc/
+      http://www.ics.uci.edu/pub/ietf/uri/historical.html#WARNING
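+
+   One possible way to recover these references in software is
+   sketched below.  It is an assumption-laden illustration: the
+   function name extract_uri_references is hypothetical, only
+   double-quote and angle-bracket delimiters are handled (not bare
+   whitespace), every quoted or bracketed string is treated as a
+   candidate without checking that it is a valid URI, and all
+   whitespace inside the delimiters is discarded.

```python
import re

# Either <...> or "..." ; the delimiters are not part of the URI.
DELIMITED = re.compile(r'<([^<>]*)>|"([^"]*)"')

TEXT = """
Yes, Jim, I found it under "http://www.w3.org/Addressing/",
but you can probably pick it up from <ftp://ds.internic.
net/rfc/>.  Note the warning in <http://www.ics.uci.edu/pub/
ietf/uri/historical.html#WARNING>.
"""


def extract_uri_references(text):
    """Return the delimited candidates found in text, in order."""
    found = []
    for match in DELIMITED.finditer(text):
        bracketed, quoted = match.group(1), match.group(2)
        candidate = bracketed if bracketed is not None else quoted
        # Whitespace added to break a long URI across lines is ignored.
        candidate = re.sub(r"\s+", "", candidate)
        # Drop the optional "URL:" prefix if it is present.
        if candidate.startswith("URL:"):
            candidate = candidate[len("URL:"):]
        found.append(candidate)
    return found


references = extract_uri_references(TEXT)
```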
+
+F. Abbreviated URLs
+
+   The URL syntax was designed for unambiguous reference to network
+   resources and extensibility via the URL scheme.  However, as URL
+   identification and usage have become commonplace, traditional media
+   (television, radio, newspapers, billboards, etc.) have increasingly
+   used abbreviated URL references.  That is, a reference consisting of
+   only the authority and path portions of the identified resource, such
+   as
+
+      www.w3.org/Addressing/
+
+   or simply the DNS hostname on its own.  Such references are primarily
+   intended for human interpretation rather than machine, with the
+   assumption that context-based heuristics are sufficient to complete
+   the URL (e.g., most hostnames beginning with "www" are likely to have
+   a URL prefix of "http://").  Although there is no standard set of
+   heuristics for disambiguating abbreviated URL references, many client
+   implementations allow them to be entered by the user and
+   heuristically resolved.  It should be noted that such heuristics may
+   change over time, particularly when new URL schemes are introduced.
+
+   Since an abbreviated URL has the same syntax as a relative URL path,
+   abbreviated URL references cannot be used in contexts where relative
+   URLs are expected.  This limits the use of abbreviated URLs to places
+   where there is no defined base URL, such as dialog boxes and off-line
+   advertisements.
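+
+   Since there is no standard set of heuristics, the following is
+   only one hypothetical example of how a client might complete an
+   abbreviated URL.  The function name complete_abbreviated_url, the
+   "ftp." rule, and the default of "http://" are assumptions made for
+   illustration; the scheme test also misreads input such as
+   "host:8080/x" as already having a scheme, which shows how fragile
+   such heuristics are.

```python
import re

# Anything that starts like "scheme:" is left alone.
HAS_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+\-.]*:")


def complete_abbreviated_url(text):
    """Guess a full URL for input typed where no base URL is defined."""
    if HAS_SCHEME.match(text):
        return text
    host = text.split("/", 1)[0].lower()
    if host.startswith("ftp."):
        # Assumed rule: hostnames beginning with "ftp." use FTP.
        return "ftp://" + text
    # Hostnames beginning with "www", and everything else by default.
    return "http://" + text
```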
+
+G. Summary of Non-editorial Changes
+
+G.1. Additions
+
+   Section 4 (URI References) was added to stem the confusion regarding
+   "what is a URI" and how to describe fragment identifiers given that
+   they are not part of the URI, but are part of the URI syntax and
+   parsing concerns.  In addition, it provides a reference definition
+   for use by other IETF specifications (HTML, HTTP, etc.) that have
+   previously attempted to redefine the URI syntax in order to account
+   for the presence of fragment identifiers in URI references.
+
+   Section 2.4 was rewritten to clarify a number of misinterpretations
+   and to leave room for fully internationalized URI.
+
+   Appendix F on abbreviated URLs was added to describe the shortened
+   references often seen on television and in magazine advertisements
+   and to explain why they are not used in other contexts.
+
+G.2. Modifications from both RFC 1738 and RFC 1808
+
+   Changed to URI syntax instead of just URL.
+
+   Confusion regarding the terms "character encoding", the URI
+   "character set", and the escaping of characters with %<hex><hex>
+   equivalents has (hopefully) been reduced.  Many of the BNF rule names
+   regarding the character sets have been changed to more accurately
+   describe their purpose and to encompass all "characters" rather than
+   just US-ASCII octets.  Unless otherwise noted here, these
+   modifications do not affect the URI syntax.
+
+   Both RFC 1738 and RFC 1808 refer to the "reserved" set of characters
+   as if URI-interpreting software were limited to a single set of
+   characters with a reserved purpose (i.e., as meaning something other
+   than the data to which the characters correspond), and that this set
+   was fixed by the URI scheme.  However, this has not been true in
+   practice; any character that is interpreted differently when it is
+   escaped is, in effect, reserved.  Furthermore, the interpreting
+   engine on an HTTP server is often dependent on the resource, not just
+   the URI scheme.  The description of reserved characters has been
+   changed accordingly.
+
+   The plus "+", dollar "$", and comma "," characters have been added to
+   those in the "reserved" set, since they are treated as reserved
+   within the query component.
+
+   The tilde "~" character was added to those in the "unreserved" set,
+   since it is extensively used on the Internet in spite of the
+   difficulty of transcribing it with some keyboards.
+
+   The syntax for URI scheme has been changed to require that all
+   schemes begin with an alpha character.
+
+   The "user:password" form in the previous BNF was changed to a
+   "userinfo" token, and the possibility that it might be
+   "user:password" made scheme specific. In particular, the use of
+   passwords in the clear is not even suggested by the syntax.
+
+   The question-mark "?" character was removed from the set of allowed
+   characters for the userinfo in the authority component, since testing
+   showed that many applications treat it as reserved for separating the
+   query component from the rest of the URI.
+
+   The semicolon ";" character was added to those stated as being
+   reserved within the authority component, since several new schemes
+   are using it as a separator within userinfo to indicate the type of
+   user authentication.
+
+   RFC 1738 specified that the path was separated from the authority
+   portion of a URI by a slash.  RFC 1808 followed suit, but with a
+   fudge of carrying around the separator as a "prefix" in order to
+   describe the parsing algorithm.  RFC 1630 never had this problem,
+   since it considered the slash to be part of the path.  In writing
+   this specification, it was found to be impossible to accurately
+   describe and retain the difference between the two URI
+      <foo:/bar>   and   <foo:bar>
+   without either considering the slash to be part of the path (as
+   corresponds to actual practice) or creating a separate component just
+   to hold that slash.  We chose the former.
+
+G.3. Modifications from RFC 1738
+
+   The definition of specific URL schemes and their scheme-specific
+   syntax and semantics has been moved to separate documents.
+
+   The URL host was defined as a fully-qualified domain name.  However,
+   many URLs are used without fully-qualified domain names (in contexts
+   for which the full qualification is not necessary), without any host
+   (as in some file URLs), or with a host of "localhost".
+
+   The URL port is now *digit instead of 1*digit, since systems are
+   expected to handle the case where the ":" separator between host and
+   port is supplied without a port.
+
+   The recommendations for delimiting URI in context (Appendix E) have
+   been adjusted to reflect current practice.
+
+G.4. Modifications from RFC 1808
+
+   RFC 1808 (Section 4) defined an empty URL reference (a reference
+   containing nothing aside from the fragment identifier) as being a
+   reference to the base URL.  Unfortunately, that definition could be
+   interpreted, upon selection of such a reference, as a new retrieval
+   action on that resource.  Since the normal intent of such references
+   is for the user agent to change its view of the current document to
+   the beginning of the specified fragment within that document, not to
+   make an additional request of the resource, a description of how to
+   correctly interpret an empty reference has been added in Section 4.
+
+   The description of the mythical Base header field has been replaced
+   with a reference to the Content-Location header field defined by
+   MHTML [RFC2110].
+
+   RFC 1808 described various schemes as either having or not having the
+   properties of the generic URI syntax.  However, the only requirement
+   is that the particular document containing the relative references
+   have a base URI that abides by the generic URI syntax, regardless of
+   the URI scheme, so the associated description has been updated to
+   reflect that.
+
+   The BNF term <net_loc> has been replaced with <authority>, since the
+   latter more accurately describes its use and purpose.  Likewise, the
+   authority is no longer restricted to the IP server syntax.
+
+   Extensive testing of current client applications demonstrated that
+   the majority of deployed systems do not use the ";" character to
+   indicate trailing parameter information, and that the presence of a
+   semicolon in a path segment does not affect the relative parsing of
+   that segment.  Therefore, parameters have been removed as a separate
+   component and may now appear in any path segment.  Their influence
+   has been removed from the algorithm for resolving a relative URI
+   reference.  The resolution examples in Appendix C have been modified
+   to reflect this change.
+
+   Implementations are now allowed to work around malformed relative
+   references that are prefixed by the same scheme as the base URI, but
+   only for schemes known to use the <hier_part> syntax.
+
+H.  Full Copyright Statement
+
+   Copyright (C) The Internet Society (1998).  All Rights Reserved.
+
+   This document and translations of it may be copied and furnished to
+   others, and derivative works that comment on or otherwise explain it
+   or assist in its implementation may be prepared, copied, published
+   and distributed, in whole or in part, without restriction of any
+   kind, provided that the above copyright notice and this paragraph are
+   included on all such copies and derivative works.  However, this
+   document itself may not be modified in any way, such as by removing
+   the copyright notice or references to the Internet Society or other
+   Internet organizations, except as needed for the purpose of
+   developing Internet standards in which case the procedures for
+   copyrights defined in the Internet Standards process must be
+   followed, or as required to translate it into languages other than
+   English.
+
+   The limited permissions granted above are perpetual and will not be
+   revoked by the Internet Society or its successors or assigns.
+
+   This document and the information contained herein is provided on an
+   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
+   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
+   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
+   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
+   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
-- 
cgit v1.2.3