Diffstat (limited to 'doc/rfc/rfc2186.txt')
-rw-r--r-- doc/rfc/rfc2186.txt 507
1 files changed, 507 insertions, 0 deletions
diff --git a/doc/rfc/rfc2186.txt b/doc/rfc/rfc2186.txt
new file mode 100644
index 0000000..c0cecb3
--- /dev/null
+++ b/doc/rfc/rfc2186.txt
@@ -0,0 +1,507 @@
+
+
+
+
+
+
+Network Working Group                                         D. Wessels
+Request for Comments: 2186                                     K. Claffy
+Category: Informational                  National Laboratory for Applied
+                                                   Network Research/UCSD
+                                                          September 1997
+
+ Internet Cache Protocol (ICP), version 2
+
+Status of this Memo
+
+ This memo provides information for the Internet community. This memo
+ does not specify an Internet standard of any kind. Distribution of
+ this memo is unlimited.
+
+Abstract
+
+ This document describes version 2 of the Internet Cache Protocol
+ (ICPv2) as currently implemented in two World-Wide Web proxy cache
+ packages[3,5]. ICP is a lightweight message format used for
+ communicating among Web caches. ICP is used to exchange hints about
+ the existence of URLs in neighbor caches. Caches exchange ICP
+ queries and replies to gather information to use in selecting the
+ most appropriate location from which to retrieve an object.
+
+ This document describes only the format and fields of ICP messages.
+ A companion document (RFC2187) describes the application of ICP to
+ Web caches. Several independent caching implementations now use ICP,
+ and we consider it important to codify the existing practical uses of
+ ICP for those trying to implement, deploy, and extend its use for
+ their own purposes.
+
+1. Introduction
+
+ ICP is a message format used for communicating between Web caches.
+ Although Web caches use HTTP[1] for the transfer of object data,
+ caches benefit from a simpler, lighter communication protocol. ICP
+ is primarily used in a cache mesh to locate specific Web objects in
+ neighboring caches. One cache sends an ICP query to its neighbors.
+ The neighbors send back ICP replies indicating a "HIT" or a "MISS."
+
+
+
+
+
+
+
+
+
+
+
+
+Wessels & Claffy Informational [Page 1]
+
+RFC 2186 ICP September 1997
+
+
+ In current practice, ICP is implemented on top of UDP, but there is
+ no requirement that it be limited to UDP. We feel that ICP over UDP
+ offers features important to Web caching applications. An ICP
+ query/reply exchange needs to occur quickly, typically within a
+ second or two. A cache cannot wait longer than that before beginning
+ to retrieve an object. Failure to receive a reply message most
+ likely means the network path is either congested or broken. In
+ either case we would not want to select that neighbor. As an
+ indication of immediate network conditions between neighbor caches,
+ ICP over a lightweight protocol such as UDP is better than one with
+ the overhead of TCP.
+
+ In addition to its use as an object location protocol, ICP messages
+ can be used for cache selection. Failure to receive a reply from a
+ cache may indicate a network or system failure. The ICP reply may
+ include information that could assist selection of the most
+ appropriate source from which to retrieve an object.
+
+ ICP was initially developed by Peter Danzig et al. at the
+ University of Southern California as a central part of hierarchical
+ caching in the Harvest research project[3].
+
+ICP Message Format
+
+ The ICP message format consists of a 20-octet fixed header plus a
+ variable sized payload (see Figure 1).
+
+ NOTE: All fields must be represented in network byte order.
+
+ Opcode
+ One of the opcodes defined below.
+
+ Version
+ The ICP protocol version number. At the time of this writing,
+ both versions two and three are in use. This document describes
+ only version two. The version number field allows for future
+ development of this protocol.
+
+ Message Length
+ The total length, in octets, of the ICP message. ICP messages MUST
+ NOT exceed 16,384 octets in length.
+
+  0                   1                   2                   3
+  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |     Opcode    |    Version    |         Message Length        |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                         Request Number                        |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                            Options                            |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                          Option Data                          |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                      Sender Host Address                      |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                                                               |
+ |                            Payload                            |
+ /                                                               /
+ /                                                               /
+ |                                                               |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ FIGURE 1: ICP message format.
+
+ Request Number
+ An opaque identifier. When responding to a query, this value must
+ be copied into the reply message.
+
+ Options
+ A 32-bit field of option flags that allows extension of this
+ version of the protocol in certain, limited ways. See "ICP Option
+ Flags" below.
+
+ Option Data
+ A four-octet field to support optional features. The following
+ ICP features make use of this field:
+
+ The ICP_FLAG_SRC_RTT option uses the low 16 bits of Option Data to
+ return RTT measurements. The ICP_FLAG_SRC_RTT option is further
+ described below.
+
+ Sender Host Address
+ The IPv4 address of the host sending the ICP message. This field
+ should probably not be trusted over what is provided by
+ getpeername(), accept(), and recvfrom(). There is some ambiguity
+ over the original purpose of this field. In practice it is not used.
+
+ Payload
+ The contents of the Payload field vary depending on the Opcode,
+ but most often it contains a null-terminated URL string.
+
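Editorial sketch (not part of the committed file): the fixed header
described above can be unpacked with Python's struct module. This is a
minimal sketch under stated assumptions, not a reference implementation.
The names ICP_HEADER, IcpHeader, and parse_icp are hypothetical, and the
check that Message Length equals the datagram size assumes one ICP
message per UDP datagram.

```python
import struct
from typing import NamedTuple, Tuple

# "!" selects network byte order with no padding, so the struct is exactly
# 20 octets: opcode, version, message length, request number, options,
# option data, sender host address.
ICP_HEADER = struct.Struct("!BBHIII4s")
ICP_MAX_LENGTH = 16384  # limit on Message Length stated in the text


class IcpHeader(NamedTuple):
    """Hypothetical container for the fixed header; not defined by the memo."""
    opcode: int
    version: int
    length: int
    request_number: int
    options: int
    option_data: int
    sender: bytes


def parse_icp(datagram: bytes) -> Tuple[IcpHeader, bytes]:
    """Split one datagram into (header, payload); raise ValueError if malformed."""
    if len(datagram) < ICP_HEADER.size:
        raise ValueError("shorter than the 20-octet fixed header")
    header = IcpHeader(*ICP_HEADER.unpack_from(datagram))
    # Assumption: over UDP one datagram carries exactly one ICP message.
    if header.length != len(datagram) or header.length > ICP_MAX_LENGTH:
        raise ValueError("bad Message Length")
    return header, datagram[ICP_HEADER.size:]
```

The payload is returned uninterpreted because its layout depends on the
opcode.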
+2. ICP Opcodes
+
+ The following table shows currently defined ICP opcodes:
+
+ Value   Name
+ -----   -------------------
+     0   ICP_OP_INVALID
+     1   ICP_OP_QUERY
+     2   ICP_OP_HIT
+     3   ICP_OP_MISS
+     4   ICP_OP_ERR
+   5-9   UNUSED
+    10   ICP_OP_SECHO
+    11   ICP_OP_DECHO
+ 12-20   UNUSED
+    21   ICP_OP_MISS_NOFETCH
+    22   ICP_OP_DENIED
+    23   ICP_OP_HIT_OBJ
+
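Editorial sketch (not part of the committed file): the opcode table maps
naturally onto an enumeration. The class name IcpOpcode and the helper
classify are hypothetical; returning None for unused values is one
possible way to let a caller drop such messages without replying.

```python
from enum import IntEnum
from typing import Optional


class IcpOpcode(IntEnum):
    """Opcode values copied from the table; member names drop the ICP_OP_ prefix."""
    INVALID = 0
    QUERY = 1
    HIT = 2
    MISS = 3
    ERR = 4
    SECHO = 10
    DECHO = 11
    MISS_NOFETCH = 21
    DENIED = 22
    HIT_OBJ = 23


def classify(opcode_octet: int) -> Optional[IcpOpcode]:
    """Return the matching opcode, or None for unused or unrecognized values."""
    try:
        return IcpOpcode(opcode_octet)
    except ValueError:
        return None
```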
+ ICP_OP_INVALID
+ A place holder to detect zero-filled or malformed messages. A
+ cache must never intentionally send an ICP_OP_INVALID message.
+ ICP_OP_ERR should be used instead.
+
+ ICP_OP_QUERY
+ A query message. NOTE this opcode has a different payload format
+ than most of the others. First is the requester's IPv4 address,
+ followed by a URL. The Requester Host Address is not that of the
+ cache generating the ICP message, but rather the address of the
+ cache's client that originated the request. The Requester Host
+ Address is often zero filled. An ICP message with an all-zero
+ Requester Host Address should be taken as one where the
+ requester address is not specified; it does not indicate a valid
+ IPv4 address.
+
+ ICP_OP_QUERY payload format:
+
+  0                   1                   2                   3
+  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                    Requester Host Address                     |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                                                               |
+ /                      Null-Terminated URL                      /
+ /                                                               /
+ |                                                               |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ In response to an ICP_OP_QUERY, the recipient must return one of:
+ ICP_OP_HIT, ICP_OP_MISS, ICP_OP_ERR, ICP_OP_MISS_NOFETCH,
+ ICP_OP_DENIED, or ICP_OP_HIT_OBJ.
+
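Editorial sketch (not part of the committed file): an ICP_OP_QUERY
message is the 20-octet header followed by the Requester Host Address
and the null-terminated URL. The function name build_query is
hypothetical. Zero-filling the Sender Host Address and defaulting the
Requester Host Address to all zeros are assumptions, based on the text
saying the former is unused in practice and the latter is often zero
filled.

```python
import socket
import struct

ICP_OP_QUERY = 1
ICP_VERSION = 2
ICP_MAX_LENGTH = 16384


def build_query(url: str, request_number: int, options: int = 0,
                requester: str = "0.0.0.0") -> bytes:
    """Return one ICP_OP_QUERY message ready to send in a UDP datagram."""
    # Payload: 4-octet requester address, then the URL and its null terminator.
    payload = socket.inet_aton(requester) + url.encode("ascii") + b"\x00"
    length = 20 + len(payload)
    if length > ICP_MAX_LENGTH:
        raise ValueError("message would exceed 16,384 octets")
    header = struct.pack("!BBHIII4s", ICP_OP_QUERY, ICP_VERSION, length,
                         request_number, options, 0, bytes(4))
    return header + payload
```

A caller would typically pick a fresh request_number per query so that
replies can be matched to outstanding requests.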
+ ICP_OP_SECHO
+ Similar to ICP_OP_QUERY, but for use in simulating a query to an
+ origin server. When ICP is used to select the closest neighbor,
+ the origin server can be included in the algorithm by bouncing an
+ ICP_OP_SECHO message off its echo port. The payload is simply
+ the null-terminated URL.
+
+ NOTE: the echo server will not interpret the data (i.e. we could
+ send it anything). This opcode is used to tell the difference
+ between a legitimate query or response, random garbage, and an
+ echo response.
+
+ ICP_OP_DECHO
+ Similar to ICP_OP_QUERY, but for use in simulating a query to a
+ cache which does not use ICP. When ICP is used to choose the
+ closest neighbor, a non-ICP cache can be included in the algorithm
+ by bouncing an ICP_OP_DECHO message off its echo port. The
+ payload is simply the null-terminated URL.
+
+ NOTE: one problem with this approach is that while a system's echo
+ port may be functioning perfectly, the cache software may not be
+ running at all.
+
+ One of the following six ICP opcodes is sent in response to an
+ ICP_OP_QUERY message. Unless otherwise noted, the payload must be
+ the null-terminated URL string. Both the URL string and the Request
+ Number field must be exactly the same as from the ICP_OP_QUERY
+ message.
+
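Editorial sketch (not part of the committed file): a reply echoes the
query's Request Number and URL unchanged. The helper build_reply is
hypothetical. It zero-fills Options, Option Data, and Sender Host
Address for simplicity, and it covers only replies whose payload is the
bare URL, so ICP_OP_HIT_OBJ is out of scope.

```python
import struct

ICP_HEADER = struct.Struct("!BBHIII4s")  # 20-octet fixed header, network order
ICP_VERSION = 2


def build_reply(query: bytes, opcode: int) -> bytes:
    """Build a reply that echoes the query's Request Number and URL."""
    _, _, _, request_number, _, _, _ = ICP_HEADER.unpack_from(query)
    # The query payload starts with the 4-octet Requester Host Address,
    # so the URL begins 24 octets into the message.
    url_start = ICP_HEADER.size + 4
    url_end = query.find(b"\x00", url_start)
    if url_end < 0:
        raise ValueError("query URL is not null-terminated")
    payload = query[url_start:url_end + 1]  # keep the terminating null
    header = ICP_HEADER.pack(opcode, ICP_VERSION,
                             ICP_HEADER.size + len(payload),
                             request_number, 0, 0, bytes(4))
    return header + payload
```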
+ ICP_OP_HIT
+ An ICP_OP_HIT response indicates that the requested URL exists in
+ this cache and that the requester is allowed to retrieve it.
+
+ ICP_OP_MISS
+ An ICP_OP_MISS response indicates that the requested URL does not
+ exist in this cache. The querying cache may still choose to fetch
+ the URL from the replying cache.
+
+ ICP_OP_ERR
+ An ICP_OP_ERR response indicates some kind of error in parsing or
+ handling the query message (e.g. invalid URL).
+
+ ICP_OP_MISS_NOFETCH
+ An ICP_OP_MISS_NOFETCH response indicates that this cache is up,
+ but is in a state where it does not want to handle cache misses.
+ An example of such a state is during a startup phase where a cache
+ might be rebuilding its object store. A cache in such a mode may
+ wish to return ICP_OP_HIT for cache hits, but not ICP_OP_MISS for
+ misses. ICP_OP_MISS_NOFETCH essentially means "I am up and
+ running, but please don't fetch this URL from me now."
+
+ Note, ICP_OP_MISS_NOFETCH has a different meaning than
+ ICP_OP_MISS. The ICP_OP_MISS reply is an invitation to fetch the
+ URL from the replying cache (if their relationship allows it), but
+ ICP_OP_MISS_NOFETCH is a request to NOT fetch the URL from the
+ replying cache.
+
+ ICP_OP_DENIED
+ An ICP_OP_DENIED response indicates that the querying site is not
+ allowed to retrieve the named object from this cache. Caches and
+ proxies may implement complex access controls. This reply must be
+ interpreted to mean "you are not allowed to request this
+ particular URL from me at this particular time."
+
+ Caches receiving a high percentage of ICP_OP_DENIED replies are
+ probably misconfigured. Caches should track the percentage of all
+ replies which are ICP_OP_DENIED and disable a neighbor which
+ exceeds a certain threshold (e.g. 95% of 100 or more queries).
+
+ Similarly, a cache should track the percent of ICP_OP_DENIED
+ messages that are sent to a given address. If the percent of
+ denied messages exceeds a certain threshold (e.g. 95% of 100 or
+ more), the cache may choose to ignore all subsequent ICP_OP_QUERY
+ messages from that address until some sort of administrative
+ intervention occurs.
+
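Editorial sketch (not part of the committed file): the ICP_OP_DENIED
bookkeeping suggested above could be kept per neighbor as two counters.
The class DeniedTracker and its method names are hypothetical. The
defaults mirror the example figures in the text (95% of 100 or more
replies), and reading "exceeds" as a strict comparison is an assumption.

```python
ICP_OP_DENIED = 22


class DeniedTracker:
    """Hypothetical per-neighbor counter for ICP_OP_DENIED replies."""

    def __init__(self, min_replies: int = 100, threshold: float = 0.95):
        self.min_replies = min_replies
        self.threshold = threshold
        self._total = {}
        self._denied = {}

    def record(self, neighbor: str, opcode: int) -> None:
        """Count one reply received from the given neighbor."""
        self._total[neighbor] = self._total.get(neighbor, 0) + 1
        if opcode == ICP_OP_DENIED:
            self._denied[neighbor] = self._denied.get(neighbor, 0) + 1

    def should_disable(self, neighbor: str) -> bool:
        """True once enough replies were seen and the denied share is too high."""
        total = self._total.get(neighbor, 0)
        if total < self.min_replies:
            return False
        return self._denied.get(neighbor, 0) / total > self.threshold
```

The same structure could count denials sent to a querying address, which
is the second case the text describes.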
+ ICP_OP_HIT_OBJ
+ Just like an ICP_OP_HIT response, but the actual object data has
+ been included in this reply message. Many requested objects are
+ small enough that it is possible to include them in the query
+ response and avoid the need to make a subsequent HTTP request for
+ the object.
+
+ CAVEAT: ICP_OP_HIT_OBJ has some negative side effects which make
+ its use undesirable. It transfers object data without HTTP and
+ therefore bypasses the standard HTTP processing, including
+ authorization and age validation. Another negative side effect is
+ that ICP_OP_HIT_OBJ messages will often be much larger than the
+ path MTU, thereby causing fragmentation to occur on the UDP
+ packet. For these reasons, use of ICP_OP_HIT_OBJ is NOT
+ recommended.
+
+ A cache must not send an ICP_OP_HIT_OBJ unless the
+ ICP_FLAG_HIT_OBJ flag is set in the query message Options field.
+
+ ICP_OP_HIT_OBJ payload format:
+
+  0                   1                   2                   3
+  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                                                               |
+ /                      Null-Terminated URL                      /
+ /                                                               /
+ |                                                               |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |          Object Size          |                               |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
+ |                                                               |
+ /                          Object Data                          /
+ /                                                               /
+ |                                                               |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+
+ The receiving application must check to make sure it actually
+ receives Object Size octets of data. If it does not, then it
+ should treat the ICP_OP_HIT_OBJ reply as though it were a normal
+ ICP_OP_HIT.
+
+ NOTE: the Object Size field does not necessarily begin on a 32-bit
+ boundary as shown in the diagram above. It begins immediately
+ following the NULL byte of the URL string.
+
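Editorial sketch (not part of the committed file): because Object Size
follows the URL's null byte directly, it has to be read at an arbitrary
offset. The helper parse_hit_obj is hypothetical. Returning None for the
object data is this sketch's way of signalling that the reply should be
handled as a plain ICP_OP_HIT, as the text requires when fewer than
Object Size octets arrive.

```python
import struct
from typing import Optional, Tuple


def parse_hit_obj(payload: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Return (url, object_data); object_data is None on a short payload."""
    end = payload.find(b"\x00")
    if end < 0:
        raise ValueError("URL is not null-terminated")
    url = payload[:end]
    rest = payload[end + 1:]
    if len(rest) < 2:
        return url, None  # no room for the 16-bit Object Size
    # "!H" reads a 16-bit network-order integer; alignment does not matter.
    (size,) = struct.unpack_from("!H", rest)
    data = rest[2:2 + size]
    if len(data) != size:
        return url, None  # truncated object: fall back to a normal hit
    return url, data
```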
+ UNRECOGNIZED OPCODES
+ ICP messages with unrecognized or unused opcodes should be
+ ignored, i.e. no reply generated. The application may choose to
+ note the anomalous behaviour in a log file.
+
+3. ICP Option Flags
+
+ 0x80000000 ICP_FLAG_HIT_OBJ
+ This flag is set in an ICP_OP_QUERY message indicating that it is
+ okay to respond with an ICP_OP_HIT_OBJ message if the object data
+ will fit in the reply.
+
+ 0x40000000 ICP_FLAG_SRC_RTT
+ This flag is set in an ICP_OP_QUERY message indicating that the
+ requester would like the ICP reply to include the responder's
+ measured RTT to the origin server.
+
+ Upon receipt of an ICP_OP_QUERY with ICP_FLAG_SRC_RTT bit set, a
+ cache should check an internal database of RTT measurements. If
+ available, the RTT value MUST be expressed as a 16-bit integer, in
+ units of milliseconds. If unavailable, the responder may either
+ set the RTT value to zero, or clear the ICP_FLAG_SRC_RTT bit in
+ the ICP reply. The ICP reply MUST NOT be delayed while waiting
+ for the RTT measurement to occur.
+
+ This flag is set in an ICP reply message (ICP_OP_HIT, ICP_OP_MISS,
+ ICP_OP_MISS_NOFETCH, or ICP_OP_HIT_OBJ) to indicate that the low
+ 16 bits of the Option Data field contain the measured RTT to the
+ host given in the requested URL. If ICP_FLAG_SRC_RTT is clear in
+ the query then it MUST also be clear in the reply. If
+ ICP_FLAG_SRC_RTT is set in the query, then it may or may not be
+ set in the reply.
+
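Editorial sketch (not part of the committed file): the ICP_FLAG_SRC_RTT
rules can be expressed as two small functions, one for the responder and
one for the requester. The function names are hypothetical. Clamping the
RTT to 65,535 ms so it fits in 16 bits is an assumption, and the
responder here clears the flag when no measurement is available, which
is one of the two options the text allows.

```python
from typing import Optional, Tuple

ICP_FLAG_HIT_OBJ = 0x80000000
ICP_FLAG_SRC_RTT = 0x40000000


def reply_rtt_fields(query_options: int,
                     rtt_ms: Optional[int]) -> Tuple[int, int]:
    """Return (Options, Option Data) for a reply, given a cached RTT or None."""
    # The flag may only be set in the reply if it was set in the query.
    if not (query_options & ICP_FLAG_SRC_RTT) or rtt_ms is None:
        return 0, 0
    return ICP_FLAG_SRC_RTT, min(rtt_ms, 0xFFFF)


def rtt_from_reply(options: int, option_data: int) -> Optional[int]:
    """Return the RTT in milliseconds carried by a reply, or None if absent."""
    if options & ICP_FLAG_SRC_RTT:
        return option_data & 0xFFFF  # only the low 16 bits carry the RTT
    return None
```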
+4. Security Considerations
+
+ The security issues relating to ICP are discussed in the companion
+ document, RFC2187.
+
+5. References
+
+ [1] Fielding, R., et al., "Hypertext Transfer Protocol -- HTTP/1.1",
+ RFC 2068, UC Irvine, January 1997.
+
+ [2] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform Resource
+ Locators (URL)", RFC 1738, CERN, Xerox PARC, University of Minnesota,
+ December 1994.
+
+ [3] Bowman M., Danzig P., Hardy D., Manber U., Schwartz M., and
+ Wessels D., "The Harvest Information Discovery and Access System",
+ Internet Research Task Force - Resource Discovery,
+ http://harvest.transarc.com/.
+
+ [4] Wessels D., Claffy K., "ICP and the Squid Web Cache", National
+ Laboratory for Applied Network Research,
+ http://www.nlanr.net/~wessels/Papers/icp-squid.ps.gz
+
+ [5] Wessels D., "The Squid Internet Object Cache", National
+ Laboratory for Applied Network Research,
+ http://squid.nlanr.net/Squid/
+
+6. Acknowledgments
+
+ The authors wish to thank Paul A Vixie <paul@vix.com> for providing
+ excellent feedback on this document.
+
+7. Authors' Addresses
+
+ Duane Wessels
+ National Laboratory for Applied Network Research
+ 10100 Hopkins Drive
+ La Jolla, CA 92093
+
+ EMail: wessels@nlanr.net
+
+
+ K. Claffy
+ National Laboratory for Applied Network Research
+ 10100 Hopkins Drive
+ La Jolla, CA 92093
+
+ EMail: kc@nlanr.net