summaryrefslogtreecommitdiff
path: root/doc/rfc/rfc3875.txt
diff options
context:
space:
mode:
Diffstat (limited to 'doc/rfc/rfc3875.txt')
-rw-r--r--doc/rfc/rfc3875.txt2019
1 files changed, 2019 insertions, 0 deletions
diff --git a/doc/rfc/rfc3875.txt b/doc/rfc/rfc3875.txt
new file mode 100644
index 0000000..41296e1
--- /dev/null
+++ b/doc/rfc/rfc3875.txt
@@ -0,0 +1,2019 @@
+
+
+
+
+
+
+Network Working Group D. Robinson
+Request for Comments: 3875 K. Coar
+Category: Informational The Apache Software Foundation
+ October 2004
+
+
+ The Common Gateway Interface (CGI) Version 1.1
+
+Status of this Memo
+
+ This memo provides information for the Internet community. It does
+ not specify an Internet standard of any kind. Distribution of this
+ memo is unlimited.
+
+Copyright Notice
+
+ Copyright (C) The Internet Society (2004).
+
+IESG Note
+
+ This document is not a candidate for any level of Internet Standard.
+ The IETF disclaims any knowledge of the fitness of this document for
+ any purpose, and in particular notes that it has not had IETF review
+ for such things as security, congestion control or inappropriate
+ interaction with deployed protocols. The RFC Editor has chosen to
+ publish this document at its discretion. Readers of this document
+ should exercise caution in evaluating its value for implementation
+ and deployment.
+
+Abstract
+
+ The Common Gateway Interface (CGI) is a simple interface for running
+ external programs, software or gateways under an information server
+ in a platform-independent manner. Currently, the supported
+ information servers are HTTP servers.
+
+ The interface has been in use by the World-Wide Web (WWW) since 1993.
+ This specification defines the 'current practice' parameters of the
+ 'CGI/1.1' interface developed and documented at the U.S. National
+ Centre for Supercomputing Applications. This document also defines
+ the use of the CGI/1.1 interface on UNIX(R) and other, similar
+ systems.
+
+
+
+
+
+
+
+
+
+Robinson & Coar Informational [Page 1]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+Table of Contents
+
+ 1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . 4
+ 1.1. Purpose . . . . . . . . . . . . . . . . . . . . . . . . 4
+ 1.2. Requirements . . . . . . . . . . . . . . . . . . . . . . 4
+ 1.3. Specifications . . . . . . . . . . . . . . . . . . . . . 4
+ 1.4. Terminology . . . . . . . . . . . . . . . . . . . . . . 5
+
+ 2. Notational Conventions and Generic Grammar. . . . . . . . . . 5
+ 2.1. Augmented BNF . . . . . . . . . . . . . . . . . . . . . 5
+ 2.2. Basic Rules . . . . . . . . . . . . . . . . . . . . . . 6
+ 2.3. URL Encoding . . . . . . . . . . . . . . . . . . . . . . 7
+
+ 3. Invoking the Script . . . . . . . . . . . . . . . . . . . . . 8
+ 3.1. Server Responsibilities . . . . . . . . . . . . . . . . 8
+ 3.2. Script Selection . . . . . . . . . . . . . . . . . . . . 9
+ 3.3. The Script-URI . . . . . . . . . . . . . . . . . . . . . 9
+ 3.4. Execution . . . . . . . . . . . . . . . . . . . . . . . 10
+
+ 4. The CGI Request . . . . . . . . . . . . . . . . . . . . . . . 10
+ 4.1. Request Meta-Variables . . . . . . . . . . . . . . . . . 10
+ 4.1.1. AUTH_TYPE. . . . . . . . . . . . . . . . . . . . 11
+ 4.1.2. CONTENT_LENGTH . . . . . . . . . . . . . . . . . 12
+ 4.1.3. CONTENT_TYPE . . . . . . . . . . . . . . . . . . 12
+ 4.1.4. GATEWAY_INTERFACE. . . . . . . . . . . . . . . . 13
+ 4.1.5. PATH_INFO. . . . . . . . . . . . . . . . . . . . 13
+ 4.1.6. PATH_TRANSLATED. . . . . . . . . . . . . . . . . 14
+ 4.1.7. QUERY_STRING . . . . . . . . . . . . . . . . . . 15
+ 4.1.8. REMOTE_ADDR. . . . . . . . . . . . . . . . . . . 15
+ 4.1.9. REMOTE_HOST. . . . . . . . . . . . . . . . . . . 16
+ 4.1.10. REMOTE_IDENT . . . . . . . . . . . . . . . . . . 16
+ 4.1.11. REMOTE_USER. . . . . . . . . . . . . . . . . . . 16
+ 4.1.12. REQUEST_METHOD . . . . . . . . . . . . . . . . . 17
+ 4.1.13. SCRIPT_NAME. . . . . . . . . . . . . . . . . . . 17
+ 4.1.14. SERVER_NAME. . . . . . . . . . . . . . . . . . . 17
+ 4.1.15. SERVER_PORT. . . . . . . . . . . . . . . . . . . 18
+ 4.1.16. SERVER_PROTOCOL. . . . . . . . . . . . . . . . . 18
+ 4.1.17. SERVER_SOFTWARE. . . . . . . . . . . . . . . . . 19
+ 4.1.18. Protocol-Specific Meta-Variables . . . . . . . . 19
+ 4.2. Request Message-Body . . . . . . . . . . . . . . . . . . 20
+ 4.3. Request Methods . . . . . . . . . . . . . . . . . . . . 20
+ 4.3.1. GET. . . . . . . . . . . . . . . . . . . . . . . 20
+ 4.3.2. POST . . . . . . . . . . . . . . . . . . . . . . 21
+ 4.3.3. HEAD . . . . . . . . . . . . . . . . . . . . . . 21
+ 4.3.4. Protocol-Specific Methods. . . . . . . . . . . . 21
+ 4.4. The Script Command Line. . . . . . . . . . . . . . . . . 21
+
+
+
+
+
+Robinson & Coar Informational [Page 2]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ 5. NPH Scripts . . . . . . . . . . . . . . . . . . . . . . . . . 22
+ 5.1. Identification . . . . . . . . . . . . . . . . . . . . . 22
+ 5.2. NPH Response . . . . . . . . . . . . . . . . . . . . . . 22
+
+ 6. CGI Response. . . . . . . . . . . . . . . . . . . . . . . . . 23
+ 6.1. Response Handling. . . . . . . . . . . . . . . . . . . . 23
+ 6.2. Response Types . . . . . . . . . . . . . . . . . . . . . 23
+ 6.2.1. Document Response. . . . . . . . . . . . . . . . 23
+ 6.2.2. Local Redirect Response. . . . . . . . . . . . . 24
+ 6.2.3. Client Redirect Response . . . . . . . . . . . . 24
+ 6.2.4. Client Redirect Response with Document . . . . . 24
+ 6.3. Response Header Fields . . . . . . . . . . . . . . . . . 25
+ 6.3.1. Content-Type . . . . . . . . . . . . . . . . . . 25
+ 6.3.2. Location . . . . . . . . . . . . . . . . . . . . 26
+ 6.3.3. Status . . . . . . . . . . . . . . . . . . . . . 26
+ 6.3.4. Protocol-Specific Header Fields. . . . . . . . . 27
+ 6.3.5. Extension Header Fields. . . . . . . . . . . . . 27
+ 6.4. Response Message-Body. . . . . . . . . . . . . . . . . . 28
+
+ 7. System Specifications . . . . . . . . . . . . . . . . . . . . 28
+ 7.1. AmigaDOS . . . . . . . . . . . . . . . . . . . . . . . . 28
+ 7.2. UNIX . . . . . . . . . . . . . . . . . . . . . . . . . . 28
+ 7.3. EBCDIC/POSIX . . . . . . . . . . . . . . . . . . . . . . 29
+
+ 8. Implementation. . . . . . . . . . . . . . . . . . . . . . . . 29
+ 8.1. Recommendations for Servers. . . . . . . . . . . . . . . 29
+ 8.2. Recommendations for Scripts. . . . . . . . . . . . . . . 30
+
+ 9. Security Considerations . . . . . . . . . . . . . . . . . . . 30
+ 9.1. Safe Methods . . . . . . . . . . . . . . . . . . . . . . 30
+ 9.2. Header Fields Containing Sensitive Information . . . . . 31
+ 9.3. Data Privacy . . . . . . . . . . . . . . . . . . . . . . 31
+ 9.4. Information Security Model . . . . . . . . . . . . . . . 31
+ 9.5. Script Interference with the Server. . . . . . . . . . . 31
+ 9.6. Data Length and Buffering Considerations . . . . . . . . 32
+ 9.7. Stateless Processing . . . . . . . . . . . . . . . . . . 32
+ 9.8. Relative Paths . . . . . . . . . . . . . . . . . . . . . 33
+ 9.9. Non-parsed Header Output . . . . . . . . . . . . . . . . 33
+
+ 10. Acknowledgements. . . . . . . . . . . . . . . . . . . . . . . 33
+
+ 11. References. . . . . . . . . . . . . . . . . . . . . . . . . . 33
+ 11.1. Normative References. . . . . . . . . . . . . . . . . . 33
+ 11.2. Informative References. . . . . . . . . . . . . . . . . 34
+
+ 12. Authors' Addresses. . . . . . . . . . . . . . . . . . . . . . 35
+
+ 13. Full Copyright Statement. . . . . . . . . . . . . . . . . . . 36
+
+
+
+Robinson & Coar Informational [Page 3]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+1. Introduction
+
+1.1. Purpose
+
+ The Common Gateway Interface (CGI) [22] allows an HTTP [1], [4]
+ server and a CGI script to share responsibility for responding to
+ client requests. The client request comprises a Uniform Resource
+ Identifier (URI) [11], a request method and various ancillary
+ information about the request provided by the transport protocol.
+
+ The CGI defines the abstract parameters, known as meta-variables,
+ which describe a client's request. Together with a concrete
+ programmer interface this specifies a platform-independent interface
+ between the script and the HTTP server.
+
+ The server is responsible for managing connection, data transfer,
+ transport and network issues related to the client request, whereas
+ the CGI script handles the application issues, such as data access
+ and document processing.
+
+1.2. Requirements
+
+ The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL NOT',
+ 'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'MAY' and 'OPTIONAL' in this
+ document are to be interpreted as described in BCP 14, RFC 2119 [3].
+
+ An implementation is not compliant if it fails to satisfy one or more
+ of the 'must' requirements for the protocols it implements. An
+ implementation that satisfies all of the 'must' and all of the
+ 'should' requirements for its features is said to be 'unconditionally
+ compliant'; one that satisfies all of the 'must' requirements but not
+ all of the 'should' requirements for its features is said to be
+ 'conditionally compliant'.
+
+1.3. Specifications
+
+ Not all of the functions and features of the CGI are defined in the
+ main part of this specification. The following phrases are used to
+ describe the features that are not specified:
+
+ 'system-defined'
+ The feature may differ between systems, but must be the same for
+ different implementations using the same system. A system will
+ usually identify a class of operating systems. Some systems are
+ defined in section 7 of this document. New systems may be defined
+ by new specifications without revision of this document.
+
+
+
+
+
+Robinson & Coar Informational [Page 4]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ 'implementation-defined'
+ The behaviour of the feature may vary from implementation to
+ implementation; a particular implementation must document its
+ behaviour.
+
+1.4. Terminology
+
+ This specification uses many terms defined in the HTTP/1.1
+ specification [4]; however, the following terms are used here in a
+ sense which may not accord with their definitions in that document,
+ or with their common meaning.
+
+ 'meta-variable'
+ A named parameter which carries information from the server to the
+ script. It is not necessarily a variable in the operating
+ system's environment, although that is the most common
+ implementation.
+
+ 'script'
+ The software that is invoked by the server according to this
+ interface. It need not be a standalone program, but could be a
+ dynamically-loaded or shared library, or even a subroutine in the
+ server. It might be a set of statements interpreted at run-time,
+ as the term 'script' is frequently understood, but that is not a
+ requirement and within the context of this specification the term
+ has the broader definition stated.
+
+ 'server'
+ The application program that invokes the script in order to
+ service requests from the client.
+
+2. Notational Conventions and Generic Grammar
+
+2.1. Augmented BNF
+
+ All of the mechanisms specified in this document are described in
+ both prose and an augmented Backus-Naur Form (BNF) similar to that
+ used by RFC 822 [13]. Unless stated otherwise, the elements are
+ case-sensitive. This augmented BNF contains the following
+ constructs:
+
+ name = definition
+ The name of a rule and its definition are separated by the equals
+ character ('='). Whitespace is only significant in that
+ continuation lines of a definition are indented.
+
+
+
+
+
+
+Robinson & Coar Informational [Page 5]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ "literal"
+ Double quotation marks (") surround literal text, except for a
+ literal quotation mark, which is surrounded by angle-brackets ('<'
+ and '>').
+
+ rule1 | rule2
+ Alternative rules are separated by a vertical bar ('|').
+
+ (rule1 rule2 rule3)
+ Elements enclosed in parentheses are treated as a single element.
+
+ *rule
+ A rule preceded by an asterisk ('*') may have zero or more
+ occurrences. The full form is 'n*m rule' indicating at least n
+ and at most m occurrences of the rule. n and m are optional
+ decimal values with default values of 0 and infinity respectively.
+
+ [rule]
+ An element enclosed in square brackets ('[' and ']') is optional,
+ and is equivalent to '*1 rule'.
+
+ N rule
+ A rule preceded by a decimal number represents exactly N
+ occurrences of the rule. It is equivalent to 'N*N rule'.
+
+2.2. Basic Rules
+
+ This specification uses a BNF-like grammar defined in terms of
+ characters. Unlike many specifications which define the bytes
+ allowed by a protocol, here each literal in the grammar corresponds
+ to the character it represents. How these characters are represented
+ in terms of bits and bytes within a system are either system-defined
+ or specified in the particular context. The single exception is the
+ rule 'OCTET', defined below.
+
+ The following rules are used throughout this specification to
+ describe basic parsing constructs.
+
+ alpha = lowalpha | hialpha
+ lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
+ "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
+ "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
+ "y" | "z"
+ hialpha = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
+ "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
+ "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
+ "Y" | "Z"
+
+
+
+
+Robinson & Coar Informational [Page 6]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
+ "8" | "9"
+ alphanum = alpha | digit
+ OCTET = <any 8-bit byte>
+ CHAR = alpha | digit | separator | "!" | "#" | "$" |
+ "%" | "&" | "'" | "*" | "+" | "-" | "." | "`" |
+ "^" | "_" | "{" | "|" | "}" | "~" | CTL
+ CTL = <any control character>
+ SP = <space character>
+ HT = <horizontal tab character>
+ NL = <newline>
+ LWSP = SP | HT | NL
+ separator = "(" | ")" | "<" | ">" | "@" | "," | ";" | ":" |
+ "\" | <"> | "/" | "[" | "]" | "?" | "=" | "{" |
+ "}" | SP | HT
+ token = 1*<any CHAR except CTLs or separators>
+ quoted-string = <"> *qdtext <">
+ qdtext = <any CHAR except <"> and CTLs but including LWSP>
+ TEXT = <any printable character>
+
+ Note that newline (NL) need not be a single control character, but
+ can be a sequence of control characters. A system MAY define TEXT to
+ be a larger set of characters than <any CHAR excluding CTLs but
+ including LWSP>.
+
+2.3. URL Encoding
+
+ Some variables and constructs used here are described as being
+ 'URL-encoded'. This encoding is described in section 2 of RFC 2396
+ [2]. In a URL-encoded string an escape sequence consists of a
+ percent character ("%") followed by two hexadecimal digits, where the
+ two hexadecimal digits form an octet. An escape sequence represents
+ the graphic character that has the octet as its code within the
+ US-ASCII [9] coded character set, if it exists. Currently there is
+ no provision within the URI syntax to identify which character set
+ non-ASCII codes represent, so CGI handles this issue on an ad-hoc
+ basis.
+
+ Note that some unsafe (reserved) characters may have different
+ semantics when encoded. The definition of which characters are
+ unsafe depends on the context; see section 2 of RFC 2396 [2], updated
+ by RFC 2732 [7], for an authoritative treatment. These reserved
+ characters are generally used to provide syntactic structure to the
+ character string, for example as field separators. In all cases, the
+ string is first processed with regard to any reserved characters
+ present, and then the resulting data can be URL-decoded by replacing
+ "%" escape sequences by their character values.
+
+
+
+
+Robinson & Coar Informational [Page 7]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ To encode a character string, all reserved and forbidden characters
+ are replaced by the corresponding "%" escape sequences. The string
+ can then be used in assembling a URI. The reserved characters will
+ vary from context to context, but will always be drawn from this set:
+
+ reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" |
+ "," | "[" | "]"
+
+ The last two characters were added by RFC 2732 [7]. In any
+ particular context, a sub-set of these characters will be reserved;
+ the other characters from this set MUST NOT be encoded when a string
+ is URL-encoded in that context. Other basic rules used to describe
+ URI syntax are:
+
+ hex = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b"
+ | "c" | "d" | "e" | "f"
+ escaped = "%" hex hex
+ unreserved = alpha | digit | mark
+ mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
+
+3. Invoking the Script
+
+3.1. Server Responsibilities
+
+ The server acts as an application gateway. It receives the request
+ from the client, selects a CGI script to handle the request, converts
+ the client request to a CGI request, executes the script and converts
+ the CGI response into a response for the client. When processing the
+ client request, it is responsible for implementing any protocol or
+ transport level authentication and security. The server MAY also
+ function in a 'non-transparent' manner, modifying the request or
+ response in order to provide some additional service, such as media
+ type transformation or protocol reduction.
+
+ The server MUST perform translations and protocol conversions on the
+ client request data required by this specification. Furthermore, the
+ server retains its responsibility to the client to conform to the
+ relevant network protocol even if the CGI script fails to conform to
+ this specification.
+
+ If the server is applying authentication to the request, then it MUST
+ NOT execute the script unless the request passes all defined access
+ controls.
+
+
+
+
+
+
+
+
+Robinson & Coar Informational [Page 8]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+3.2. Script Selection
+
+ The server determines which CGI is script to be executed based on a
+ generic-form URI supplied by the client. This URI includes a
+ hierarchical path with components separated by "/". For any
+ particular request, the server will identify all or a leading part of
+ this path with an individual script, thus placing the script at a
+ particular point in the path hierarchy. The remainder of the path,
+ if any, is a resource or sub-resource identifier to be interpreted by
+ the script.
+
+ Information about this split of the path is available to the script
+ in the meta-variables, described below. Support for non-hierarchical
+ URI schemes is outside the scope of this specification.
+
+3.3. The Script-URI
+
+ The mapping from client request URI to choice of script is defined by
+ the particular server implementation and its configuration. The
+ server may allow the script to be identified with a set of several
+ different URI path hierarchies, and therefore is permitted to replace
+ the URI by other members of this set during processing and generation
+ of the meta-variables. The server
+
+ 1. MAY preserve the URI in the particular client request; or
+
+ 2. it MAY select a canonical URI from the set of possible values
+ for each script; or
+
+ 3. it can implement any other selection of URI from the set.
+
+ From the meta-variables thus generated, a URI, the 'Script-URI', can
+ be constructed. This MUST have the property that if the client had
+ accessed this URI instead, then the script would have been executed
+ with the same values for the SCRIPT_NAME, PATH_INFO and QUERY_STRING
+ meta-variables. The Script-URI has the structure of a generic URI as
+ defined in section 3 of RFC 2396 [2], with the exception that object
+ parameters and fragment identifiers are not permitted. The various
+ components of the Script-URI are defined by some of the
+ meta-variables (see below);
+
+ script-URI = <scheme> "://" <server-name> ":" <server-port>
+ <script-path> <extra-path> "?" <query-string>
+
+ where <scheme> is found from SERVER_PROTOCOL, <server-name>,
+ <server-port> and <query-string> are the values of the respective
+ meta-variables. The SCRIPT_NAME and PATH_INFO values, URL-encoded
+ with ";", "=" and "?" reserved, give <script-path> and <extra-path>.
+
+
+
+Robinson & Coar Informational [Page 9]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ See section 4.1.5 for more information about the PATH_INFO
+ meta-variable.
+
+ The scheme and the protocol are not identical as the scheme
+ identifies the access method in addition to the application protocol.
+ For example, a resource accessed using Transport Layer Security (TLS)
+ [14] would have a request URI with a scheme of https when using the
+ HTTP protocol [19]. CGI/1.1 provides no generic means for the script
+ to reconstruct this, and therefore the Script-URI as defined includes
+ the base protocol used. However, a script MAY make use of
+ scheme-specific meta-variables to better deduce the URI scheme.
+
+ Note that this definition also allows URIs to be constructed which
+ would invoke the script with any permitted values for the path-info
+ or query-string, by modifying the appropriate components.
+
+3.4. Execution
+
+ The script is invoked in a system-defined manner. Unless specified
+ otherwise, the file containing the script will be invoked as an
+ executable program. The server prepares the CGI request as described
+ in section 4; this comprises the request meta-variables (immediately
+ available to the script on execution) and request message data. The
+ request data need not be immediately available to the script; the
+ script can be executed before all this data has been received by the
+ server from the client. The response from the script is returned to
+ the server as described in sections 5 and 6.
+
+ In the event of an error condition, the server can interrupt or
+ terminate script execution at any time and without warning. That
+ could occur, for example, in the event of a transport failure between
+ the server and the client; so the script SHOULD be prepared to handle
+ abnormal termination.
+
+4. The CGI Request
+
+ Information about a request comes from two different sources; the
+ request meta-variables and any associated message-body.
+
+4.1. Request Meta-Variables
+
+ Meta-variables contain data about the request passed from the server
+ to the script, and are accessed by the script in a system-defined
+ manner. Meta-variables are identified by case-insensitive names;
+ there cannot be two different variables whose names differ in case
+ only. Here they are shown using a canonical representation of
+ capitals plus underscore ("_"). A particular system can define a
+ different representation.
+
+
+
+Robinson & Coar Informational [Page 10]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ meta-variable-name = "AUTH_TYPE" | "CONTENT_LENGTH" |
+ "CONTENT_TYPE" | "GATEWAY_INTERFACE" |
+ "PATH_INFO" | "PATH_TRANSLATED" |
+ "QUERY_STRING" | "REMOTE_ADDR" |
+ "REMOTE_HOST" | "REMOTE_IDENT" |
+ "REMOTE_USER" | "REQUEST_METHOD" |
+ "SCRIPT_NAME" | "SERVER_NAME" |
+ "SERVER_PORT" | "SERVER_PROTOCOL" |
+ "SERVER_SOFTWARE" | scheme |
+ protocol-var-name | extension-var-name
+ protocol-var-name = ( protocol | scheme ) "_" var-name
+ scheme = alpha *( alpha | digit | "+" | "-" | "." )
+ var-name = token
+ extension-var-name = token
+
+ Meta-variables with the same name as a scheme, and names beginning
+ with the name of a protocol or scheme (e.g., HTTP_ACCEPT) are also
+ defined. The number and meaning of these variables may change
+ independently of this specification. (See also section 4.1.18.)
+
+ The server MAY set additional implementation-defined extension meta-
+ variables, whose names SHOULD be prefixed with "X_".
+
+ This specification does not distinguish between zero-length (NULL)
+ values and missing values. For example, a script cannot distinguish
+ between the two requests http://host/script and http://host/script?
+ as in both cases the QUERY_STRING meta-variable would be NULL.
+
+ meta-variable-value = "" | 1*<TEXT, CHAR or tokens of value>
+
+ An optional meta-variable may be omitted (left unset) if its value is
+ NULL. Meta-variable values MUST be considered case-sensitive except
+ as noted otherwise. The representation of the characters in the
+ meta-variables is system-defined; the server MUST convert values to
+ that representation.
+
+4.1.1. AUTH_TYPE
+
+ The AUTH_TYPE variable identifies any mechanism used by the server to
+ authenticate the user. It contains a case-insensitive value defined
+ by the client protocol or server implementation.
+
+ For HTTP, if the client request required authentication for external
+ access, then the server MUST set the value of this variable from the
+ 'auth-scheme' token in the request Authorization header field.
+
+
+
+
+
+
+Robinson & Coar Informational [Page 11]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ AUTH_TYPE = "" | auth-scheme
+ auth-scheme = "Basic" | "Digest" | extension-auth
+ extension-auth = token
+
+ HTTP access authentication schemes are described in RFC 2617 [5].
+
+4.1.2. CONTENT_LENGTH
+
+ The CONTENT_LENGTH variable contains the size of the message-body
+ attached to the request, if any, in decimal number of octets. If no
+ data is attached, then NULL (or unset).
+
+ CONTENT_LENGTH = "" | 1*digit
+
+ The server MUST set this meta-variable if and only if the request is
+ accompanied by a message-body entity. The CONTENT_LENGTH value must
+ reflect the length of the message-body after the server has removed
+ any transfer-codings or content-codings.
+
+4.1.3. CONTENT_TYPE
+
+ If the request includes a message-body, the CONTENT_TYPE variable is
+ set to the Internet Media Type [6] of the message-body.
+
+ CONTENT_TYPE = "" | media-type
+ media-type = type "/" subtype *( ";" parameter )
+ type = token
+ subtype = token
+ parameter = attribute "=" value
+ attribute = token
+ value = token | quoted-string
+
+ The type, subtype and parameter attribute names are not
+ case-sensitive. Parameter values may be case sensitive. Media types
+ and their use in HTTP are described section 3.7 of the HTTP/1.1
+ specification [4].
+
+ There is no default value for this variable. If and only if it is
+ unset, then the script MAY attempt to determine the media type from
+ the data received. If the type remains unknown, then the script MAY
+ choose to assume a type of application/octet-stream or it may reject
+ the request with an error (as described in section 6.3.3).
+
+ Each media-type defines a set of optional and mandatory parameters.
+ This may include a charset parameter with a case-insensitive value
+ defining the coded character set for the message-body. If the
+
+
+
+
+
+Robinson & Coar Informational [Page 12]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ charset parameter is omitted, then the default value should be
+ derived according to whichever of the following rules is the first to
+ apply:
+
+ 1. There MAY be a system-defined default charset for some
+ media-types.
+
+ 2. The default for media-types of type "text" is ISO-8859-1 [4].
+
+ 3. Any default defined in the media-type specification.
+
+ 4. The default is US-ASCII.
+
+ The server MUST set this meta-variable if an HTTP Content-Type field
+ is present in the client request header. If the server receives a
+ request with an attached entity but no Content-Type header field, it
+ MAY attempt to determine the correct content type, otherwise it
+ should omit this meta-variable.
+
+4.1.4. GATEWAY_INTERFACE
+
+ The GATEWAY_INTERFACE variable MUST be set to the dialect of CGI
+ being used by the server to communicate with the script. Syntax:
+
+ GATEWAY_INTERFACE = "CGI" "/" 1*digit "." 1*digit
+
+ Note that the major and minor numbers are treated as separate
+ integers and hence each may be incremented higher than a single
+ digit. Thus CGI/2.4 is a lower version than CGI/2.13 which in turn
+ is lower than CGI/12.3. Leading zeros MUST be ignored by the script
+ and MUST NOT be generated by the server.
+
+ This document defines the 1.1 version of the CGI interface.
+
+4.1.5. PATH_INFO
+
+ The PATH_INFO variable specifies a path to be interpreted by the CGI
+ script. It identifies the resource or sub-resource to be returned by
+ the CGI script, and is derived from the portion of the URI path
+ hierarchy following the part that identifies the script itself.
+ Unlike a URI path, the PATH_INFO is not URL-encoded, and cannot
+ contain path-segment parameters. A PATH_INFO of "/" represents a
+ single void path segment.
+
+ PATH_INFO = "" | ( "/" path )
+ path = lsegment *( "/" lsegment )
+ lsegment = *lchar
+ lchar = <any TEXT or CTL except "/">
+
+
+
+Robinson & Coar Informational [Page 13]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ The value is considered case-sensitive and the server MUST preserve
+ the case of the path as presented in the request URI. The server MAY
+ impose restrictions and limitations on what values it permits for
+ PATH_INFO, and MAY reject the request with an error if it encounters
+ any values considered objectionable. That MAY include any requests
+ that would result in an encoded "/" being decoded into PATH_INFO, as
+ this might represent a loss of information to the script. Similarly,
+ treatment of non US-ASCII characters in the path is system-defined.
+
+ URL-encoded, the PATH_INFO string forms the extra-path component of
+ the Script-URI (see section 3.3) which follows the SCRIPT_NAME part
+ of that path.
+
+4.1.6. PATH_TRANSLATED
+
+ The PATH_TRANSLATED variable is derived by taking the PATH_INFO
+ value, parsing it as a local URI in its own right, and performing any
+ virtual-to-physical translation appropriate to map it onto the
+ server's document repository structure. The set of characters
+ permitted in the result is system-defined.
+
+ PATH_TRANSLATED = *<any character>
+
+ This is the file location that would be accessed by a request for
+
+ <scheme> "://" <server-name> ":" <server-port> <extra-path>
+
+ where <scheme> is the scheme for the original client request and
+ <extra-path> is a URL-encoded version of PATH_INFO, with ";", "=" and
+ "?" reserved. For example, a request such as the following:
+
+ http://somehost.com/cgi-bin/somescript/this%2eis%2epath%3binfo
+
+ would result in a PATH_INFO value of
+
+ /this.is.the.path;info
+
+ An internal URI is constructed from the scheme, server location and
+ the URL-encoded PATH_INFO:
+
+ http://somehost.com/this.is.the.path%3binfo
+
+ This would then be translated to a location in the server's document
+ repository, perhaps a filesystem path something like this:
+
+ /usr/local/www/htdocs/this.is.the.path;info
+
+ The value of PATH_TRANSLATED is the result of the translation.
+
+
+
+Robinson & Coar Informational [Page 14]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ The value is derived in this way irrespective of whether it maps to a
+ valid repository location. The server MUST preserve the case of the
+ extra-path segment unless the underlying repository supports case-
+ insensitive names. If the repository is only case-aware, case-
+ preserving, or case-blind with regard to document names, the server
+ is not required to preserve the case of the original segment through
+ the translation.
+
+ The translation algorithm the server uses to derive PATH_TRANSLATED
+ is implementation-defined; CGI scripts which use this variable may
+ suffer limited portability.
+
+ The server SHOULD set this meta-variable if the request URI includes
+ a path-info component. If PATH_INFO is NULL, then the
+ PATH_TRANSLATED variable MUST be set to NULL (or unset).
+
+4.1.7. QUERY_STRING
+
+ The QUERY_STRING variable contains a URL-encoded search or parameter
+ string; it provides information to the CGI script to affect or refine
+ the document to be returned by the script.
+
+ The URL syntax for a search string is described in section 3 of RFC
+ 2396 [2]. The QUERY_STRING value is case-sensitive.
+
+ QUERY_STRING = query-string
+ query-string = *uric
+ uric = reserved | unreserved | escaped
+
+ When parsing and decoding the query string, the details of the
+ parsing, reserved characters and support for non US-ASCII characters
+ depends on the context. For example, form submission from an HTML
+ document [18] uses application/x-www-form-urlencoded encoding, in
+ which the characters "+", "&" and "=" are reserved, and the ISO
+ 8859-1 encoding may be used for non US-ASCII characters.
+
+ The QUERY_STRING value provides the query-string part of the
+ Script-URI. (See section 3.3).
+
+ The server MUST set this variable; if the Script-URI does not include
+ a query component, the QUERY_STRING MUST be defined as an empty
+ string ("").
+
+4.1.8. REMOTE_ADDR
+
+ The REMOTE_ADDR variable MUST be set to the network address of the
+ client sending the request to the server.
+
+
+
+
+Robinson & Coar Informational [Page 15]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ REMOTE_ADDR = hostnumber
+ hostnumber = ipv4-address | ipv6-address
+ ipv4-address = 1*3digit "." 1*3digit "." 1*3digit "." 1*3digit
+ ipv6-address = hexpart [ ":" ipv4-address ]
+ hexpart = hexseq | ( [ hexseq ] "::" [ hexseq ] )
+ hexseq = 1*4hex *( ":" 1*4hex )
+
+ The format of an IPv6 address is described in RFC 3513 [15].
+
+4.1.9. REMOTE_HOST
+
+ The REMOTE_HOST variable contains the fully qualified domain name of
+ the client sending the request to the server, if available, otherwise
+ NULL. Fully qualified domain names take the form as described in
+ section 3.5 of RFC 1034 [17] and section 2.1 of RFC 1123 [12].
+ Domain names are not case sensitive.
+
+ REMOTE_HOST = "" | hostname | hostnumber
+ hostname = *( domainlabel "." ) toplabel [ "." ]
+ domainlabel = alphanum [ *alphahypdigit alphanum ]
+ toplabel = alpha [ *alphahypdigit alphanum ]
+ alphahypdigit = alphanum | "-"
+
+ The server SHOULD set this variable. If the hostname is not
+ available for performance reasons or otherwise, the server MAY
+ substitute the REMOTE_ADDR value.
+
+4.1.10. REMOTE_IDENT
+
+ The REMOTE_IDENT variable MAY be used to provide identity information
+ reported about the connection by an RFC 1413 [20] request to the
+ remote agent, if available. The server may choose not to support
+ this feature, or not to request the data for efficiency reasons, or
+ not to return available identity data.
+
+ REMOTE_IDENT = *TEXT
+
+ The data returned may be used for authentication purposes, but the
+ level of trust reposed in it should be minimal.
+
+4.1.11. REMOTE_USER
+
+ The REMOTE_USER variable provides a user identification string
+ supplied by client as part of user authentication.
+
+ REMOTE_USER = *TEXT
+
+
+
+
+
+Robinson & Coar Informational [Page 16]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ If the client request required HTTP Authentication [5] (e.g., the
+ AUTH_TYPE meta-variable is set to "Basic" or "Digest"), then the
+ value of the REMOTE_USER meta-variable MUST be set to the user-ID
+ supplied.
+
+4.1.12. REQUEST_METHOD
+
+ The REQUEST_METHOD meta-variable MUST be set to the method which
+ should be used by the script to process the request, as described in
+ section 4.3.
+
+ REQUEST_METHOD = method
+ method = "GET" | "POST" | "HEAD" | extension-method
+ extension-method = "PUT" | "DELETE" | token
+
+ The method is case sensitive. The HTTP methods are described in
+ section 5.1.1 of the HTTP/1.0 specification [1] and section 5.1.1 of
+ the HTTP/1.1 specification [4].
+
+4.1.13. SCRIPT_NAME
+
+ The SCRIPT_NAME variable MUST be set to a URI path (not URL-encoded)
+ which could identify the CGI script (rather than the script's
+ output). The syntax is the same as for PATH_INFO (section 4.1.5)
+
+ SCRIPT_NAME = "" | ( "/" path )
+
+ The leading "/" is not part of the path. It is optional if the path
+ is NULL; however, the variable MUST still be set in that case.
+
+ The SCRIPT_NAME string forms some leading part of the path component
+ of the Script-URI derived in some implementation-defined manner. No
+ PATH_INFO segment (see section 4.1.5) is included in the SCRIPT_NAME
+ value.
+
+4.1.14. SERVER_NAME
+
+ The SERVER_NAME variable MUST be set to the name of the server host
+ to which the client request is directed. It is a case-insensitive
+ hostname or network address. It forms the host part of the
+ Script-URI.
+
+ SERVER_NAME = server-name
+ server-name = hostname | ipv4-address | ( "[" ipv6-address "]" )
+
+
+
+
+
+
+
+Robinson & Coar Informational [Page 17]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ A deployed server can have more than one possible value for this
+ variable, where several HTTP virtual hosts share the same IP address.
+ In that case, the server would use the contents of the request's Host
+ header field to select the correct virtual host.
+
+4.1.15. SERVER_PORT
+
+ The SERVER_PORT variable MUST be set to the TCP/IP port number on
+ which this request is received from the client. This value is used
+ in the port part of the Script-URI.
+
+ SERVER_PORT = server-port
+ server-port = 1*digit
+
+ Note that this variable MUST be set, even if the port is the default
+ port for the scheme and could otherwise be omitted from a URI.
+
+4.1.16. SERVER_PROTOCOL
+
+ The SERVER_PROTOCOL variable MUST be set to the name and version of
+ the application protocol used for this CGI request. This MAY differ
+ from the protocol version used by the server in its communication
+ with the client.
+
+ SERVER_PROTOCOL = HTTP-Version | "INCLUDED" | extension-version
+ HTTP-Version = "HTTP" "/" 1*digit "." 1*digit
+ extension-version = protocol [ "/" 1*digit "." 1*digit ]
+ protocol = token
+
+ Here, 'protocol' defines the syntax of some of the information
+ passing between the server and the script (the 'protocol-specific'
+ features). It is not case sensitive and is usually presented in
+ upper case. The protocol is not the same as the scheme part of the
+ script URI, which defines the overall access mechanism used by the
+ client to communicate with the server. For example, a request that
+ reaches the script with a protocol of "HTTP" may have used an "https"
+ scheme.
+
+ A well-known value for SERVER_PROTOCOL which the server MAY use is
+ "INCLUDED", which signals that the current document is being included
+ as part of a composite document, rather than being the direct target
+ of the client request. The script should treat this as an HTTP/1.0
+ request.
+
+
+
+
+
+
+
+
+Robinson & Coar Informational [Page 18]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+4.1.17. SERVER_SOFTWARE
+
+ The SERVER_SOFTWARE meta-variable MUST be set to the name and version
+ of the information server software making the CGI request (and
+ running the gateway). It SHOULD be the same as the server
+ description reported to the client, if any.
+
+ SERVER_SOFTWARE = 1*( product | comment )
+ product = token [ "/" product-version ]
+ product-version = token
+ comment = "(" *( ctext | comment ) ")"
+ ctext = <any TEXT excluding "(" and ")">
+
+4.1.18. Protocol-Specific Meta-Variables
+
+ The server SHOULD set meta-variables specific to the protocol and
+ scheme for the request. Interpretation of protocol-specific
+ variables depends on the protocol version in SERVER_PROTOCOL. The
+ server MAY set a meta-variable with the name of the scheme to a
+ non-NULL value if the scheme is not the same as the protocol. The
+ presence of such a variable indicates to a script which scheme is
+ used by the request.
+
+ Meta-variables with names beginning with "HTTP_" contain values read
+ from the client request header fields, if the protocol used is HTTP.
+ The HTTP header field name is converted to upper case, has all
+ occurrences of "-" replaced with "_" and has "HTTP_" prepended to
+ give the meta-variable name. The header data can be presented as
+ sent by the client, or can be rewritten in ways which do not change
+ its semantics. If multiple header fields with the same field-name
+ are received then the server MUST rewrite them as a single value
+ having the same semantics. Similarly, a header field that spans
+ multiple lines MUST be merged onto a single line. The server MUST,
+ if necessary, change the representation of the data (for example, the
+ character set) to be appropriate for a CGI meta-variable.
+
+ The server is not required to create meta-variables for all the
+ header fields that it receives. In particular, it SHOULD remove any
+ header fields carrying authentication information, such as
+ 'Authorization'; or that are available to the script in other
+ variables, such as 'Content-Length' and 'Content-Type'. The server
+ MAY remove header fields that relate solely to client-side
+ communication issues, such as 'Connection'.
+
+
+
+
+
+
+
+
+Robinson & Coar Informational [Page 19]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+4.2. Request Message-Body
+
+ Request data is accessed by the script in a system-defined method;
+ unless defined otherwise, this will be by reading the 'standard
+ input' file descriptor or file handle.
+
+ Request-Data = [ request-body ] [ extension-data ]
+ request-body = <CONTENT_LENGTH>OCTET
+ extension-data = *OCTET
+
+ A request-body is supplied with the request if the CONTENT_LENGTH is
+ not NULL. The server MUST make at least that many bytes available
+ for the script to read. The server MAY signal an end-of-file
+ condition after CONTENT_LENGTH bytes have been read or it MAY supply
+ extension data. Therefore, the script MUST NOT attempt to read more
+ than CONTENT_LENGTH bytes, even if more data is available. However,
+ it is not obliged to read any of the data.
+
+ For non-parsed header (NPH) scripts (section 5), the server SHOULD
+ attempt to ensure that the data supplied to the script is precisely
+ as supplied by the client and is unaltered by the server.
+
+ As transfer-codings are not supported on the request-body, the server
+ MUST remove any such codings from the message-body, and recalculate
+ the CONTENT_LENGTH. If this is not possible (for example, because of
+ large buffering requirements), the server SHOULD reject the client
+ request. It MAY also remove content-codings from the message-body.
+
+4.3. Request Methods
+
+ The Request Method, as supplied in the REQUEST_METHOD meta-variable,
+ identifies the processing method to be applied by the script in
+ producing a response. The script author can choose to implement the
+ methods most appropriate for the particular application. If the
+ script receives a request with a method it does not support it SHOULD
+ reject it with an error (see section 6.3.3).
+
+4.3.1. GET
+
+ The GET method indicates that the script should produce a document
+ based on the meta-variable values. By convention, the GET method is
+ 'safe' and 'idempotent' and SHOULD NOT have the significance of
+ taking an action other than producing a document.
+
+ The meaning of the GET method may be modified and refined by
+ protocol-specific meta-variables.
+
+
+
+
+
+Robinson & Coar Informational [Page 20]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+4.3.2. POST
+
+ The POST method is used to request the script perform processing and
+ produce a document based on the data in the request message-body, in
+ addition to meta-variable values. A common use is form submission in
+ HTML [18], intended to initiate processing by the script that has a
+ permanent affect, such a change in a database.
+
+ The script MUST check the value of the CONTENT_LENGTH variable before
+ reading the attached message-body, and SHOULD check the CONTENT_TYPE
+ value before processing it.
+
+4.3.3. HEAD
+
+ The HEAD method requests the script to do sufficient processing to
+ return the response header fields, without providing a response
+ message-body. The script MUST NOT provide a response message-body
+ for a HEAD request. If it does, then the server MUST discard the
+ message-body when reading the response from the script.
+
+4.3.4. Protocol-Specific Methods
+
+ The script MAY implement any protocol-specific method, such as
+ HTTP/1.1 PUT and DELETE; it SHOULD check the value of SERVER_PROTOCOL
+ when doing so.
+
+ The server MAY decide that some methods are not appropriate or
+ permitted for a script, and may handle the methods itself or return
+ an error to the client.
+
+4.4. The Script Command Line
+
+ Some systems support a method for supplying an array of strings to
+ the CGI script. This is only used in the case of an 'indexed' HTTP
+ query, which is identified by a 'GET' or 'HEAD' request with a URI
+ query string that does not contain any unencoded "=" characters. For
+ such a request, the server SHOULD treat the query-string as a
+ search-string and parse it into words, using the rules
+
+ search-string = search-word *( "+" search-word )
+ search-word = 1*schar
+ schar = unreserved | escaped | xreserved
+ xreserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "," |
+ "$"
+
+ After parsing, each search-word is URL-decoded, optionally encoded in
+ a system-defined manner and then added to the command line argument
+ list.
+
+
+
+Robinson & Coar Informational [Page 21]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ If the server cannot create any part of the argument list, then the
+ server MUST NOT generate any command line information. For example,
+ the number of arguments may be greater than operating system or
+ server limits, or one of the words may not be representable as an
+ argument.
+
+ The script SHOULD check to see if the QUERY_STRING value contains an
+ unencoded "=" character, and SHOULD NOT use the command line
+ arguments if it does.
+
+5. NPH Scripts
+
+5.1. Identification
+
+ The server MAY support NPH (Non-Parsed Header) scripts; these are
+ scripts to which the server passes all responsibility for response
+ processing.
+
+ This specification provides no mechanism for an NPH script to be
+ identified on the basis of its output data alone. By convention,
+ therefore, any particular script can only ever provide output of one
+ type (NPH or CGI) and hence the script itself is described as an 'NPH
+ script'. A server with NPH support MUST provide an implementation-
+ defined mechanism for identifying NPH scripts, perhaps based on the
+ name or location of the script.
+
+5.2. NPH Response
+
+ There MUST be a system-defined method for the script to send data
+ back to the server or client; a script MUST always return some data.
+ Unless defined otherwise, this will be the same as for conventional
+ CGI scripts.
+
+ Currently, NPH scripts are only defined for HTTP client requests. An
+ (HTTP) NPH script MUST return a complete HTTP response message,
+ currently described in section 6 of the HTTP specifications [1], [4].
+ The script MUST use the SERVER_PROTOCOL variable to determine the
+ appropriate format for a response. It MUST also take account of any
+ generic or protocol-specific meta-variables in the request as might
+ be mandated by the particular protocol specification.
+
+ The server MUST ensure that the script output is sent to the client
+ unmodified. Note that this requires the script to use the correct
+ character set (US-ASCII [9] and ISO 8859-1 [10] for HTTP) in the
+ header fields. The server SHOULD attempt to ensure that the script
+ output is sent directly to the client, with minimal internal and no
+ transport-visible buffering.
+
+
+
+
+Robinson & Coar Informational [Page 22]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ Unless the implementation defines otherwise, the script MUST NOT
+ indicate in its response that the client can send further requests
+ over the same connection.
+
+6. CGI Response
+
+6.1. Response Handling
+
+ A script MUST always provide a non-empty response, and so there is a
+ system-defined method for it to send this data back to the server.
+ Unless defined otherwise, this will be via the 'standard output' file
+ descriptor.
+
+ The script MUST check the REQUEST_METHOD variable when processing the
+ request and preparing its response.
+
+ The server MAY implement a timeout period within which data must be
+ received from the script. If a server implementation defines such a
+ timeout and receives no data from a script within the timeout period,
+ the server MAY terminate the script process.
+
+6.2. Response Types
+
+ The response comprises a message-header and a message-body, separated
+ by a blank line. The message-header contains one or more header
+ fields. The body may be NULL.
+
+ generic-response = 1*header-field NL [ response-body ]
+
+ The script MUST return one of either a document response, a local
+ redirect response or a client redirect (with optional document)
+ response. In the response definitions below, the order of header
+ fields in a response is not significant (despite appearing so in the
+ BNF). The header fields are defined in section 6.3.
+
+ CGI-Response = document-response | local-redir-response |
+ client-redir-response | client-redirdoc-response
+
+6.2.1. Document Response
+
+ The CGI script can return a document to the user in a document
+ response, with an optional error code indicating the success status
+ of the response.
+
+ document-response = Content-Type [ Status ] *other-field NL
+ response-body
+
+
+
+
+
+Robinson & Coar Informational [Page 23]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ The script MUST return a Content-Type header field. A Status header
+ field is optional, and status 200 'OK' is assumed if it is omitted.
+ The server MUST make any appropriate modifications to the script's
+ output to ensure that the response to the client complies with the
+ response protocol version.
+
+6.2.2. Local Redirect Response
+
+ The CGI script can return a URI path and query-string
+ ('local-pathquery') for a local resource in a Location header field.
+ This indicates to the server that it should reprocess the request
+ using the path specified.
+
+ local-redir-response = local-Location NL
+
+ The script MUST NOT return any other header fields or a message-body,
+ and the server MUST generate the response that it would have produced
+ in response to a request containing the URL
+
+ scheme "://" server-name ":" server-port local-pathquery
+
+6.2.3. Client Redirect Response
+
+ The CGI script can return an absolute URI path in a Location header
+ field, to indicate to the client that it should reprocess the request
+ using the URI specified.
+
+ client-redir-response = client-Location *extension-field NL
+
+ The script MUST not provide any other header fields, except for
+ server-defined CGI extension fields. For an HTTP client request, the
+ server MUST generate a 302 'Found' HTTP response message.
+
+6.2.4. Client Redirect Response with Document
+
+ The CGI script can return an absolute URI path in a Location header
+ field together with an attached document, to indicate to the client
+ that it should reprocess the request using the URI specified.
+
+ client-redirdoc-response = client-Location Status Content-Type
+ *other-field NL response-body
+
+ The Status header field MUST be supplied and MUST contain a status
+ value of 302 'Found', or it MAY contain an extension-code, that is,
+ another valid status code that means client redirection. The server
+ MUST make any appropriate modifications to the script's output to
+ ensure that the response to the client complies with the response
+ protocol version.
+
+
+
+Robinson & Coar Informational [Page 24]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+6.3. Response Header Fields
+
+ The response header fields are either CGI or extension header fields
+ to be interpreted by the server, or protocol-specific header fields
+ to be included in the response returned to the client. At least one
+ CGI field MUST be supplied; each CGI field MUST NOT appear more than
+ once in the response. The response header fields have the syntax:
+
+ header-field = CGI-field | other-field
+ CGI-field = Content-Type | Location | Status
+ other-field = protocol-field | extension-field
+ protocol-field = generic-field
+ extension-field = generic-field
+ generic-field = field-name ":" [ field-value ] NL
+ field-name = token
+ field-value = *( field-content | LWSP )
+ field-content = *( token | separator | quoted-string )
+
+ The field-name is not case sensitive. A NULL field value is
+ equivalent to a field not being sent. Note that each header field in
+ a CGI-Response MUST be specified on a single line; CGI/1.1 does not
+ support continuation lines. Whitespace is permitted between the ":"
+ and the field-value (but not between the field-name and the ":"), and
+ also between tokens in the field-value.
+
+6.3.1. Content-Type
+
+ The Content-Type response field sets the Internet Media Type [6] of
+ the entity body.
+
+ Content-Type = "Content-Type:" media-type NL
+
+ If an entity body is returned, the script MUST supply a Content-Type
+ field in the response. If it fails to do so, the server SHOULD NOT
+ attempt to determine the correct content type. The value SHOULD be
+ sent unmodified to the client, except for any charset parameter
+ changes.
+
+ Unless it is otherwise system-defined, the default charset assumed by
+ the client for text media-types is ISO-8859-1 if the protocol is HTTP
+ and US-ASCII otherwise. Hence the script SHOULD include a charset
+ parameter. See section 3.4.1 of the HTTP/1.1 specification [4] for a
+ discussion of this issue.
+
+
+
+
+
+
+
+
+Robinson & Coar Informational [Page 25]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+6.3.2. Location
+
+ The Location header field is used to specify to the server that the
+ script is returning a reference to a document rather than an actual
+ document (see sections 6.2.3 and 6.2.4). It is either an absolute
+ URI (optionally with a fragment identifier), indicating that the
+ client is to fetch the referenced document, or a local URI path
+ (optionally with a query string), indicating that the server is to
+ fetch the referenced document and return it to the client as the
+ response.
+
+ Location = local-Location | client-Location
+ client-Location = "Location:" fragment-URI NL
+ local-Location = "Location:" local-pathquery NL
+ fragment-URI = absoluteURI [ "#" fragment ]
+ fragment = *uric
+ local-pathquery = abs-path [ "?" query-string ]
+ abs-path = "/" path-segments
+ path-segments = segment *( "/" segment )
+ segment = *pchar
+ pchar = unreserved | escaped | extra
+ extra = ":" | "@" | "&" | "=" | "+" | "$" | ","
+
+ The syntax of an absoluteURI is incorporated into this document from
+ that specified in RFC 2396 [2] and RFC 2732 [7]. A valid absoluteURI
+ always starts with the name of scheme followed by ":"; scheme names
+ start with a letter and continue with alphanumerics, "+", "-" or ".".
+ The local URI path and query must be an absolute path, and not a
+ relative path or NULL, and hence must start with a "/".
+
+ Note that any message-body attached to the request (such as for a
+ POST request) may not be available to the resource that is the target
+ of the redirect.
+
+6.3.3. Status
+
+ The Status header field contains a 3-digit integer result code that
+ indicates the level of success of the script's attempt to handle the
+ request.
+
+ Status = "Status:" status-code SP reason-phrase NL
+ status-code = "200" | "302" | "400" | "501" | extension-code
+ extension-code = 3digit
+ reason-phrase = *TEXT
+
+ Status code 200 'OK' indicates success, and is the default value
+ assumed for a document response. Status code 302 'Found' is used
+ with a Location header field and response message-body. Status code
+
+
+
+Robinson & Coar Informational [Page 26]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ 400 'Bad Request' may be used for an unknown request format, such as
+ a missing CONTENT_TYPE. Status code 501 'Not Implemented' may be
+ returned by a script if it receives an unsupported REQUEST_METHOD.
+
+ Other valid status codes are listed in section 6.1.1 of the HTTP
+ specifications [1], [4], and also the IANA HTTP Status Code Registry
+ [8] and MAY be used in addition to or instead of the ones listed
+ above. The script SHOULD check the value of SERVER_PROTOCOL before
+ using HTTP/1.1 status codes. The script MAY reject with error 405
+ 'Method Not Allowed' HTTP/1.1 requests made using a method it does
+ not support.
+
+ Note that returning an error status code does not have to mean an
+ error condition with the script itself. For example, a script that
+ is invoked as an error handler by the server should return the code
+ appropriate to the server's error condition.
+
+ The reason-phrase is a textual description of the error to be
+ returned to the client for human consumption.
+
+6.3.4. Protocol-Specific Header Fields
+
+ The script MAY return any other header fields that relate to the
+ response message defined by the specification for the SERVER_PROTOCOL
+ (HTTP/1.0 [1] or HTTP/1.1 [4]). The server MUST translate the header
+ data from the CGI header syntax to the HTTP header syntax if these
+ differ. For example, the character sequence for newline (such as
+ UNIX's US-ASCII LF) used by CGI scripts may not be the same as that
+ used by HTTP (US-ASCII CR followed by LF).
+
+ The script MUST NOT return any header fields that relate to
+ client-side communication issues and could affect the server's
+ ability to send the response to the client. The server MAY remove
+ any such header fields returned by the client. It SHOULD resolve any
+ conflicts between header fields returned by the script and header
+ fields that it would otherwise send itself.
+
+6.3.5. Extension Header Fields
+
+ There may be additional implementation-defined CGI header fields,
+ whose field names SHOULD begin with "X-CGI-". The server MAY ignore
+ (and delete) any unrecognised header fields with names beginning "X-
+ CGI-" that are received from the script.
+
+
+
+
+
+
+
+
+Robinson & Coar Informational [Page 27]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+6.4. Response Message-Body
+
+ The response message-body is an attached document to be returned to
+ the client by the server. The server MUST read all the data provided
+ by the script, until the script signals the end of the message-body
+ by way of an end-of-file condition. The message-body SHOULD be sent
+ unmodified to the client, except for HEAD requests or any required
+ transfer-codings, content-codings or charset conversions.
+
+ response-body = *OCTET
+
+7. System Specifications
+
+7.1. AmigaDOS
+
+ Meta-Variables
+ Meta-variables are passed to the script in identically named
+ environment variables. These are accessed by the DOS library
+ routine GetVar(). The flags argument SHOULD be 0. Case is
+ ignored, but upper case is recommended for compatibility with
+ case-sensitive systems.
+
+ The current working directory
+ The current working directory for the script is set to the
+ directory containing the script.
+
+ Character set
+ The US-ASCII character set [9] is used for the definition of
+ meta-variables, header fields and values; the newline (NL)
+ sequence is LF; servers SHOULD also accept CR LF as a newline.
+
+7.2. UNIX
+
+ For UNIX compatible operating systems, the following are defined:
+
+ Meta-Variables
+ Meta-variables are passed to the script in identically named
+ environment variables. These are accessed by the C library
+ routine getenv() or variable environ.
+
+ The command line
+ This is accessed using the argc and argv arguments to main(). The
+ words have any characters which are 'active' in the Bourne shell
+ escaped with a backslash.
+
+ The current working directory
+ The current working directory for the script SHOULD be set to the
+ directory containing the script.
+
+
+
+Robinson & Coar Informational [Page 28]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ Character set
+ The US-ASCII character set [9], excluding NUL, is used for the
+ definition of meta-variables, header fields and CHAR values; TEXT
+ values use ISO-8859-1. The PATH_TRANSLATED value can contain any
+ 8-bit byte except NUL. The newline (NL) sequence is LF; servers
+ should also accept CR LF as a newline.
+
+7.3. EBCDIC/POSIX
+
+ For POSIX compatible operating systems using the EBCDIC character
+ set, the following are defined:
+
+ Meta-Variables
+ Meta-variables are passed to the script in identically named
+ environment variables. These are accessed by the C library
+ routine getenv().
+
+ The command line
+ This is accessed using the argc and argv arguments to main(). The
+ words have any characters which are 'active' in the Bourne shell
+ escaped with a backslash.
+
+ The current working directory
+ The current working directory for the script SHOULD be set to the
+ directory containing the script.
+
+ Character set
+ The IBM1047 character set [21], excluding NUL, is used for the
+ definition of meta-variables, header fields, values, TEXT strings
+ and the PATH_TRANSLATED value. The newline (NL) sequence is LF;
+ servers should also accept CR LF as a newline.
+
+ media-type charset default
+ The default charset value for text (and other implementation-
+ defined) media types is IBM1047.
+
+8. Implementation
+
+8.1. Recommendations for Servers
+
+ Although the server and the CGI script need not be consistent in
+ their handling of URL paths (client URLs and the PATH_INFO data,
+ respectively), server authors may wish to impose consistency. So the
+ server implementation should specify its behaviour for the following
+ cases:
+
+ 1. define any restrictions on allowed path segments, in particular
+ whether non-terminal NULL segments are permitted;
+
+
+
+Robinson & Coar Informational [Page 29]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ 2. define the behaviour for "." or ".." path segments; i.e.,
+ whether they are prohibited, treated as ordinary path segments
+ or interpreted in accordance with the relative URL
+ specification [2];
+
+ 3. define any limits of the implementation, including limits on
+ path or search string lengths, and limits on the volume of
+ header fields the server will parse.
+
+8.2. Recommendations for Scripts
+
+ If the script does not intend processing the PATH_INFO data, then it
+ should reject the request with 404 Not Found if PATH_INFO is not
+ NULL.
+
+ If the output of a form is being processed, check that CONTENT_TYPE
+ is "application/x-www-form-urlencoded" [18] or "multipart/form-data"
+ [16]. If CONTENT_TYPE is blank, the script can reject the request
+ with a 415 'Unsupported Media Type' error, where supported by the
+ protocol.
+
+ When parsing PATH_INFO, PATH_TRANSLATED or SCRIPT_NAME the script
+ should be careful of void path segments ("//") and special path
+ segments ("." and ".."). They should either be removed from the path
+ before use in OS system calls, or the request should be rejected with
+ 404 'Not Found'.
+
+ When returning header fields, the script should try to send the CGI
+ header fields as soon as possible, and should send them before any
+ HTTP header fields. This may help reduce the server's memory
+ requirements.
+
+ Script authors should be aware that the REMOTE_ADDR and REMOTE_HOST
+ meta-variables (see sections 4.1.8 and 4.1.9) may not identify the
+ ultimate source of the request. They identify the client for the
+ immediate request to the server; that client may be a proxy, gateway,
+ or other intermediary acting on behalf of the actual source client.
+
+9. Security Considerations
+
+9.1. Safe Methods
+
+ As discussed in the security considerations of the HTTP
+ specifications [1], [4], the convention has been established that the
+ GET and HEAD methods should be 'safe' and 'idempotent' (repeated
+ requests have the same effect as a single request). See section 9.1
+ of RFC 2616 [4] for a full discussion.
+
+
+
+
+Robinson & Coar Informational [Page 30]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+9.2. Header Fields Containing Sensitive Information
+
+ Some HTTP header fields may carry sensitive information which the
+ server should not pass on to the script unless explicitly configured
+ to do so. For example, if the server protects the script by using
+ the Basic authentication scheme, then the client will send an
+ Authorization header field containing a username and password. The
+ server validates this information and so it should not pass on the
+ password via the HTTP_AUTHORIZATION meta-variable without careful
+ consideration. This also applies to the Proxy-Authorization header
+ field and the corresponding HTTP_PROXY_AUTHORIZATION meta-variable.
+
+9.3. Data Privacy
+
+ Confidential data in a request should be placed in a message-body as
+ part of a POST request, and not placed in the URI or message headers.
+ On some systems, the environment used to pass meta-variables to a
+ script may be visible to other scripts or users. In addition, many
+ existing servers, proxies and clients will permanently record the URI
+ where it might be visible to third parties.
+
+9.4. Information Security Model
+
+ For a client connection using TLS, the security model applies between
+ the client and the server, and not between the client and the script.
+ It is the server's responsibility to handle the TLS session, and thus
+ it is the server which is authenticated to the client, not the CGI
+ script.
+
+ This specification provides no mechanism for the script to
+ authenticate the server which invoked it. There is no enforced
+ integrity on the CGI request and response messages.
+
+9.5. Script Interference with the Server
+
+ The most common implementation of CGI invokes the script as a child
+ process using the same user and group as the server process. It
+ should therefore be ensured that the script cannot interfere with the
+ server process, its configuration, documents or log files.
+
+ If the script is executed by calling a function linked in to the
+ server software (either at compile-time or run-time) then precautions
+ should be taken to protect the core memory of the server, or to
+ ensure that untrusted code cannot be executed.
+
+
+
+
+
+
+
+Robinson & Coar Informational [Page 31]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+9.6. Data Length and Buffering Considerations
+
+ This specification places no limits on the length of the message-body
+ presented to the script. The script should not assume that
+ statically allocated buffers of any size are sufficient to contain
+ the entire submission at one time. Use of a fixed length buffer
+ without careful overflow checking may result in an attacker
+ exploiting 'stack-smashing' or 'stack-overflow' vulnerabilities of
+ the operating system. The script may spool large submissions to disk
+ or other buffering media, but a rapid succession of large submissions
+ may result in denial of service conditions. If the CONTENT_LENGTH of
+ a message-body is larger than resource considerations allow, scripts
+ should respond with an error status appropriate for the protocol
+ version; potentially applicable status codes include 503 'Service
+ Unavailable' (HTTP/1.0 and HTTP/1.1), 413 'Request Entity Too Large'
+ (HTTP/1.1), and 414 'Request-URI Too Large' (HTTP/1.1).
+
+ Similar considerations apply to the server's handling of the CGI
+ response from the script. There is no limit on the length of the
+ header or message-body returned by the script; the server should not
+ assume that statically allocated buffers of any size are sufficient
+ to contain the entire response.
+
+9.7. Stateless Processing
+
+ The stateless nature of the Web makes each script execution and
+ resource retrieval independent of all others even when multiple
+ requests constitute a single conceptual Web transaction. Because of
+ this, a script should not make any assumptions about the context of
+ the user-agent submitting a request. In particular, scripts should
+ examine data obtained from the client and verify that they are valid,
+ both in form and content, before allowing them to be used for
+ sensitive purposes such as input to other applications, commands, or
+ operating system services. These uses include (but are not limited
+ to) system call arguments, database writes, dynamically evaluated
+ source code, and input to billing or other secure processes. It is
+ important that applications be protected from invalid input
+ regardless of whether the invalidity is the result of user error,
+ logic error, or malicious action.
+
+ Authors of scripts involved in multi-request transactions should be
+ particularly cautious about validating the state information;
+ undesirable effects may result from the substitution of dangerous
+ values for portions of the submission which might otherwise be
+ presumed safe. Subversion of this type occurs when alterations are
+ made to data from a prior stage of the transaction that were not
+ meant to be controlled by the client (e.g., hidden HTML form
+ elements, cookies, embedded URLs, etc.).
+
+
+
+Robinson & Coar Informational [Page 32]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+9.8. Relative Paths
+
+ The server should be careful of ".." path segments in the request
+ URI. These should be removed or resolved in the request URI before
+ it is split into the script-path and extra-path. Alternatively, when
+ the extra-path is used to find the PATH_TRANSLATED, care should be
+ taken to avoid the path resolution from providing translated paths
+ outside an expected path hierarchy.
+
+9.9. Non-parsed Header Output
+
+ If a script returns a non-parsed header output, to be interpreted by
+ the client in its native protocol, then the script must address all
+ security considerations relating to that protocol.
+
+10. Acknowledgements
+
+ This work is based on the original CGI interface that arose out of
+ discussions on the 'www-talk' mailing list. In particular, Rob
+ McCool, John Franks, Ari Luotonen, George Phillips and Tony Sanders
+ deserve special recognition for their efforts in defining and
+ implementing the early versions of this interface.
+
+ This document has also greatly benefited from the comments and
+ suggestions made Chris Adie, Dave Kristol and Mike Meyer; also David
+ Morris, Jeremy Madea, Patrick McManus, Adam Donahue, Ross Patterson
+ and Harald Alvestrand.
+
+11. References
+
+11.1 Normative References
+
+ [1] Berners-Lee, T., Fielding, R. and H. Frystyk, "Hypertext
+ Transfer Protocol -- HTTP/1.0", RFC 1945, May 1996.
+
+ [2] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource
+ Identifiers (URI) : Generic Syntax", RFC 2396, August 1998.
+
+ [3] Bradner, S., "Key words for use in RFCs to Indicate Requirements
+ Levels", BCP 14, RFC 2119, March 1997.
+
+ [4] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L.,
+ Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol --
+ HTTP/1.1", RFC 2616, June 1999.
+
+ [5] Franks, J., Hallam-Baker, P., Hostetler, J., Lawrence, S.,
+ Leach, P., Luotonen, A., and L. Stewart, "HTTP Authentication:
+ Basic and Digest Access Authentication", RFC 2617, June 1999.
+
+
+
+Robinson & Coar Informational [Page 33]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ [6] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+ Extensions (MIME) Part Two: Media Types", RFC 2046, November
+ 1996.
+
+ [7] Hinden, R., Carpenter, B., and L. Masinter, "Format for Literal
+ IPv6 Addresses in URL's", RFC 2732, December 1999.
+
+ [8] "HTTP Status Code Registry",
+ http://www.iana.org/assignments/http-status-codes, IANA.
+
+ [9] "Information Systems -- Coded Character Sets -- 7-bit American
+ Standard Code for Information Interchange (7-Bit ASCII)", ANSI
+ INCITS.4-1986 (R2002).
+
+ [10] "Information technology -- 8-bit single-byte coded graphic
+ character sets -- Part 1: Latin alphabet No. 1", ISO/IEC
+ 8859-1:1998.
+
+11.2. Informative References
+
+ [11] Berners-Lee, T., "Universal Resource Identifiers in WWW: A
+ Unifying Syntax for the Expression of Names and Addresses of
+ Objects on the Network as used in the World-Wide Web", RFC 1630,
+ June 1994.
+
+ [12] Braden, R., Ed., "Requirements for Internet Hosts -- Application
+ and Support", STD 3, RFC 1123, October 1989.
+
+ [13] Crocker, D., "Standard for the Format of ARPA Internet Text
+ Messages", STD 11, RFC 822, August 1982.
+
+ [14] Dierks, T. and C. Allen, "The TLS Protocol Version 1.0", RFC
+ 2246, January 1999.
+
+ [15] Hinden R. and S. Deering, "Internet Protocol Version 6 (IPv6)
+ Addressing Architecture", RFC 3513, April 2003.
+
+ [16] Masinter, L., "Returning Values from Forms:
+ multipart/form-data", RFC 2388, August 1998.
+
+ [17] Mockapetris, P., "Domain Names - Concepts and Facilities", STD
+ 13, RFC 1034, November 1987.
+
+ [18] Raggett, D., Le Hors, A., and I. Jacobs, Eds., "HTML 4.01
+ Specification", W3C Recommendation December 1999,
+ http://www.w3.org/TR/html401/.
+
+ [19] Rescola, E. "HTTP Over TLS", RFC 2818, May 2000.
+
+
+
+Robinson & Coar Informational [Page 34]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+ [20] St. Johns, M., "Identification Protocol", RFC 1413, February
+ 1993.
+
+ [21] IBM National Language Support Reference Manual Volume 2,
+ SE09-8002-01, March 1990.
+
+ [22] "The Common Gateway Interface",
+ http://hoohoo.ncsa.uiuc.edu/cgi/, NCSA, University of Illinois.
+
+12. Authors' Addresses
+
+ David Robinson
+ The Apache Software Foundation
+
+ EMail: drtr@apache.org
+
+
+ Ken A. L. Coar
+ The Apache Software Foundation
+
+ EMail: coar@apache.org
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Robinson & Coar Informational [Page 35]
+
+RFC 3875 CGI Version 1.1 October 2004
+
+
+13. Full Copyright Statement
+
+ Copyright (C) The Internet Society (2004). This document is subject
+ to the rights, licenses and restrictions contained in BCP 78 and at
+ www.rfc-editor.org, and except as set forth therein, the authors
+ retain all their rights.
+
+ This document and the information contained herein are provided on an
+ "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
+ OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
+ ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
+ INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
+ INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
+ WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+ Intellectual Property
+
+ The IETF takes no position regarding the validity or scope of any
+ Intellectual Property Rights or other rights that might be claimed to
+ pertain to the implementation or use of the technology described in
+ this document or the extent to which any license under such rights
+ might or might not be available; nor does it represent that it has
+ made any independent effort to identify any such rights. Information
+ on the ISOC's procedures with respect to rights in ISOC Documents can
+ be found in BCP 78 and BCP 79.
+
+ Copies of IPR disclosures made to the IETF Secretariat and any
+ assurances of licenses to be made available, or the result of an
+ attempt made to obtain a general license or permission for the use of
+ such proprietary rights by implementers or users of this
+ specification can be obtained from the IETF on-line IPR repository at
+ http://www.ietf.org/ipr.
+
+ The IETF invites any interested party to bring to its attention any
+ copyrights, patents or patent applications, or other proprietary
+ rights that may cover technology that may be required to implement
+ this standard. Please address the information to the IETF at ietf-
+ ipr@ietf.org.
+
+Acknowledgement
+
+ Funding for the RFC Editor function is currently provided by the
+ Internet Society.
+
+
+
+
+
+
+
+
+Robinson & Coar Informational [Page 36]
+