author Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committer Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit 4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc7464.txt
parent ea76e11061bda059ae9f9ad130a9895cc85607db (diff)
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc7464.txt')
-rw-r--r-- doc/rfc/rfc7464.txt 451
1 files changed, 451 insertions, 0 deletions
diff --git a/doc/rfc/rfc7464.txt b/doc/rfc/rfc7464.txt
new file mode 100644
index 0000000..55fe2a2
--- /dev/null
+++ b/doc/rfc/rfc7464.txt
@@ -0,0 +1,451 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF) N. Williams
+Request for Comments: 7464 Cryptonector
+Category: Standards Track February 2015
+ISSN: 2070-1721
+
+
+ JavaScript Object Notation (JSON) Text Sequences
+
+Abstract
+
+ This document describes the JavaScript Object Notation (JSON) text
+ sequence format and associated media type "application/json-seq". A
+ JSON text sequence consists of any number of JSON texts, all encoded
+ in UTF-8, each prefixed by an ASCII Record Separator (0x1E), and each
+ ending with an ASCII Line Feed character (0x0A).
+
+Status of This Memo
+
+ This is an Internet Standards Track document.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Further information on
+ Internet Standards is available in Section 2 of RFC 5741.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ http://www.rfc-editor.org/info/rfc7464.
+
+Copyright Notice
+
+ Copyright (c) 2015 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (http://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Simplified BSD License text as described in Section 4.e of
+ the Trust Legal Provisions and are provided without warranty as
+ described in the Simplified BSD License.
+
+
+
+
+
+
+
+Williams Standards Track [Page 1]
+
+RFC 7464 JSON Text Sequences February 2015
+
+
+Table of Contents
+
+ 1. Introduction and Motivation .....................................2
+ 1.1. Conventions Used in This Document ..........................2
+ 2. JSON Text Sequence Format .......................................3
+ 2.1. JSON Text Sequence Parsing .................................3
+ 2.2. JSON Text Sequence Encoding ................................4
+ 2.3. Incomplete/Invalid JSON Texts Need Not Be Fatal ............4
+ 2.4. Top-Level Values: numbers, true, false, and null ...........5
+ 3. Security Considerations .........................................6
+ 4. IANA Considerations .............................................6
+ 5. Normative References ............................................7
+ Acknowledgements ...................................................8
+ Author's Address ...................................................8
+
+1. Introduction and Motivation
+
+ The JavaScript Object Notation (JSON) [RFC7159] is a very handy
+ serialization format. However, when serializing a large sequence of
+ values as an array, or a possibly indeterminate-length or never-
+ ending sequence of values, JSON becomes difficult to work with.
+
+ Consider a sequence of one million values, each possibly one kilobyte
+ when encoded -- roughly one gigabyte. It is often desirable to
+ process such a dataset in an incremental manner without having to
+ first read all of it before beginning to produce results.
+ Traditionally, the way to do this with JSON is to use a "streaming"
+ parser, but these are not widely available, widely used, or easy to
+ use.
+
+ This document describes the concept and format of "JSON text
+ sequences", which are specifically not JSON texts themselves but are
+ composed of (possible) JSON texts. JSON text sequences can be parsed
+ (and produced) incrementally without requiring a streaming parser
+ (or a streaming encoder).
+
+1.1. Conventions Used in This Document
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
+ "OPTIONAL" in this document are to be interpreted as described in
+ [RFC2119].
+
+2. JSON Text Sequence Format
+
+ Two different sets of ABNF rules are provided for the definition of
+ JSON text sequences: one for parsers and one for encoders. Having
+ two different sets of rules permits recovery by parsers from
+ sequences where some of the elements are truncated for whatever
+ reason. The syntax for parsers is specified in terms of octet
+ strings that are then interpreted as JSON texts, if possible. The
+ syntax for encoders, on the other hand, assumes that sequence
+ elements are not truncated.
+
+ JSON text sequences MUST use UTF-8 encoding; other encodings of JSON
+ (i.e., UTF-16 and UTF-32) MUST NOT be used.
+
+2.1. JSON Text Sequence Parsing
+
+ The ABNF [RFC5234] for the JSON text sequence parser is as given in
+ Figure 1.
+
+ input-JSON-sequence = *(1*RS possible-JSON)
+ RS = %x1E; "record separator" (RS), see RFC 20
+ ; Also known as: Unicode Character INFORMATION SEPARATOR
+ ; TWO (U+001E)
+ possible-JSON = 1*(not-RS); attempt to parse as UTF-8-encoded
+ ; JSON text (see RFC 7159)
+ not-RS = %x00-1d / %x1f-ff; any octets other than RS
+
+ Figure 1: JSON Text Sequence ABNF
+
+ In prose: a series of octet strings, each containing any octet other
+ than a record separator (RS) (0x1E) [RFC20]. All octet strings are
+ preceded by an RS byte. Each octet string in the sequence is to be
+ parsed as a JSON text in the UTF-8 encoding [RFC3629].
+
+ If parsing of such an octet string as a UTF-8-encoded JSON text
+ fails, the parser SHOULD nonetheless continue parsing the remainder
+ of the sequence. The parser can report such failures to
+ applications, which might then choose to terminate parsing of a
+ sequence. Multiple consecutive RS octets do not denote empty
+ sequence elements between them and can be ignored.
+
+ This document does not define a mechanism for reliably identifying
+ the elements of a text sequence by position (for example, when
+ sending individual elements of an array as separate sequence
+ elements). For applications where truncation is a possibility, this
+ means that intended sequence elements can be truncated and can even
+ be missing entirely; therefore, a reference to an nth element would
+ be unreliable.
+
+ There is no end of sequence indicator.
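The parsing rules of Section 2.1 can be illustrated with a minimal Python sketch. It is not part of the RFC; the names `parse_json_seq` and `_reject_constant` are hypothetical, and the sketch assumes the whole sequence is already in memory as a bytes object. It does not perform the truncation check of Section 2.4.

```python
import json

RS = b"\x1e"  # ASCII Record Separator


def _reject_constant(name):
    # Python's json module accepts NaN/Infinity by default; RFC 7159 does not.
    raise ValueError(f"non-standard JSON constant: {name}")


def parse_json_seq(data: bytes):
    """Yield (ok, value_or_raw_octets) for each element of a JSON text sequence."""
    head, *chunks = data.split(RS)
    if head:
        # Octets before the first RS are not preceded by RS: report, do not parse.
        yield False, head
    for chunk in chunks:
        if not chunk:
            # Consecutive RS octets do not denote empty elements.
            continue
        try:
            text = chunk.decode("utf-8")
            yield True, json.loads(text, parse_constant=_reject_constant)
        except (UnicodeDecodeError, ValueError):
            # A failed element is reported; parsing continues with the next one.
            yield False, chunk
```

A caller that wants to stop at the first failure can simply break out of the loop when `ok` is false; the sketch leaves that decision to the application, as the text above suggests.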
+
+2.2. JSON Text Sequence Encoding
+
+ The ABNF for the JSON text sequence encoder is given in Figure 2.
+
+ JSON-sequence = *(RS JSON-text LF)
+ RS = %x1E; see RFC 20
+ ; Also known as: Unicode Character INFORMATION SEPARATOR
+ ; TWO (U+001E)
+ LF = %x0A; "line feed" (LF), see RFC 20
+ JSON-text = <given by RFC 7159, using UTF-8 encoding>
+
+ Figure 2: JSON Text Sequence ABNF
+
+ In prose: any number of JSON texts, each encoded in UTF-8 [RFC3629],
+ each preceded by one ASCII RS character, and each followed by a line
+ feed (LF). Since RS is an ASCII control character, it may only
+ appear in JSON strings in escaped form (see [RFC7159]), and since RS
+ may not appear in JSON texts in any other form, RS unambiguously
+ delimits the start of any element in the sequence. RS is sufficient
+ to unambiguously delimit all top-level JSON value types other than
+ numbers. Following each JSON text in the sequence with an LF allows
+ detection of truncated JSON texts consisting of a number at the top-
+ level; see Section 2.4.
+
+ JSON text sequence encoders are expected to ensure that the sequence
+ elements are properly formed. When the JSON text sequence encoder
+ does the JSON text encoding, the sequence elements will naturally be
+ properly formed. When the JSON text sequence encoder accepts
+ already-encoded JSON texts, the JSON text sequence encoder ought to
+ parse them before adding them to a sequence.
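The encoder side can be sketched in a few lines of Python. This is an illustration under assumptions, not the RFC's reference code: `encode_json_seq` and `frame_encoded` are hypothetical names, and Python's standard `json` module stands in for whatever JSON encoder an implementation uses.

```python
import json

RS = b"\x1e"  # ASCII Record Separator
LF = b"\x0a"  # ASCII Line Feed


def encode_json_seq(values) -> bytes:
    """Encode each value as RS + UTF-8 JSON text + LF."""
    out = bytearray()
    for value in values:
        # json.dumps escapes control characters such as RS inside strings,
        # so a raw 0x1E octet never appears within an element.
        text = json.dumps(value, ensure_ascii=False, allow_nan=False)
        out += RS + text.encode("utf-8") + LF
    return bytes(out)


def frame_encoded(text: bytes) -> bytes:
    """Frame an already-encoded JSON text, parsing it first to validate it."""
    json.loads(text.decode("utf-8"))  # raises ValueError if malformed
    return RS + text + LF
```

For example, `encode_json_seq([{"a": 1}, 2])` produces two elements, each starting with 0x1E and ending with 0x0A.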
+
+ Note that on some systems it's possible to input RS by typing
+ "ctrl-^"; on some systems or applications, the correct sequence may be
+ "ctrl-v ctrl-^". This is helpful when constructing a sequence
+ manually with a text editor.
+
+2.3. Incomplete/Invalid JSON Texts Need Not Be Fatal
+
+ Per Section 2.1, JSON text sequence parsers should not abort when an
+ octet string contains a malformed JSON text. Instead, the JSON text
+ sequence parser should skip to the next RS. Such a situation may
+ arise in contexts where, for example, data that is appended to log
+ files is truncated by the filesystem (e.g., due to a
+ crash or administrative process termination).
+
+ Incremental JSON text parsers may be used, though of course failure
+ to parse a given text may result after first producing some
+ incremental parse results.
+
+ Sequence parsers should have an option to warn about truncated JSON
+ texts.
+
+2.4. Top-Level Values: numbers, true, false, and null
+
+ While objects, arrays, and strings are self-delimited in JSON texts,
+ numbers and the values 'true', 'false', and 'null' are not. Only
+ whitespace can delimit the latter four kinds of values.
+
+ JSON text sequences use 0x0A as a "canary" octet to detect
+ truncation.
+
+ Parsers MUST check that any JSON texts that are a top-level number,
+ or that might be 'true', 'false', or 'null', include JSON whitespace
+ (at least one byte matching the "ws" ABNF rule from [RFC7159]) after
+ that value; otherwise, the JSON-text may have been truncated. Note
+ that the LF following each JSON text matches the "ws" ABNF rule.
+
+ Parsers MUST drop JSON-text sequence elements consisting of non-self-
+ delimited top-level values that may have been truncated (that are not
+ delimited by whitespace). Parsers can report such texts as warnings
+ (including, optionally, the parsed text and/or the original octet
+ string).
+
+ For example, '<RS>123<RS>' might have been intended to carry the top-
+ level number 1234, but it got truncated. Similarly, '<RS>true<RS>'
+ might have been intended to carry the invalid text 'trueish'.
+ '<RS>truefalse<RS>' is not two top-level values, 'true', and 'false';
+ it is simply not a valid JSON text.
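The whitespace check for non-self-delimited values might look like the following Python sketch. It is illustrative only: `parse_element` and its status strings are hypothetical, and the input is assumed to be one octet string already split out between RS octets.

```python
import json

JSON_WS = b" \t\n\r"  # octets matching the "ws" rule of RFC 7159


def parse_element(chunk: bytes):
    """Return (status, value) for one octet string taken from a sequence."""
    try:
        value = json.loads(chunk.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        return "malformed", None
    # Objects, arrays, and strings are self-delimited; numbers, true,
    # false, and null are not and need trailing whitespace as a canary.
    self_delimited = isinstance(value, (dict, list, str))
    if not self_delimited and chunk[-1] not in JSON_WS:
        return "truncated", None  # dropped; a caller may log a warning
    return "ok", value
```

With this sketch, `b"123"` and `b"true"` are dropped as possibly truncated, `b"123\n"` yields the number 123, and `b"truefalse\n"` is reported as malformed.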
+
+ Implementations may produce a value when parsing '<RS>"foo"<RS>'
+ because their JSON text parser might be able to consume bytes
+ incrementally; since the JSON text in this case is a self-delimiting
+ top-level value, the parser can produce the result without consuming
+ an additional byte. Such implementations ought to skip to the next
+ RS byte, possibly reporting any intervening non-whitespace bytes.
+
+ Detection of truncation of non-self-delimited sequence elements
+ (numbers, true, false, and null) is only possible when the sequence
+ encoder produces or receives complete JSON texts. Implementations
+ where the sequence encoder is not also in charge of encoding the
+ individual JSON texts should ensure that those JSON texts are
+ complete.
+
+3. Security Considerations
+
+ All the security considerations of JSON [RFC7159] apply. This format
+ provides no cryptographic integrity protection of any kind.
+
+ As usual, parsers must operate on input that is assumed to be
+ untrusted. This means that parsers must fail gracefully in the face
+ of malicious inputs.
+
+ Note that incremental JSON text parsers can produce partial results
+ and later indicate failure to parse the remainder of a text. A
+ sequence parser that uses an incremental JSON text parser might treat
+ a sequence like '<RS>"foo"<LF>456<LF><RS>' as a sequence of one
+ element ("foo"), while a sequence parser that uses a non-incremental
+ JSON text parser might treat the same sequence as being empty. This
+ effect, and texts that fail to parse and are ignored, can be used to
+ smuggle data past sequence parsers that don't warn about JSON text
+ failures.
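The divergence described above can be reproduced with a short Python sketch. Both function names are hypothetical, and `json.JSONDecoder.raw_decode` is used here only as a stand-in for an incremental JSON text parser.

```python
import json

WS = " \t\n\r"


def strict_parse(chunk: bytes):
    """Non-incremental: the whole octet string must be a single JSON text."""
    try:
        return [json.loads(chunk.decode("utf-8"))]
    except ValueError:  # also covers UnicodeDecodeError
        return []


def lenient_parse(chunk: bytes):
    """Incremental-style: accept the first value, return any leftover text."""
    try:
        text = chunk.decode("utf-8").lstrip(WS)
        value, end = json.JSONDecoder().raw_decode(text)
    except ValueError:
        return [], ""
    return [value], text[end:].strip(WS)


element = b'"foo"\n456\n'  # the octets between the two RS bytes
values_strict = strict_parse(element)
values_lenient, leftover = lenient_parse(element)
```

Here the strict parser yields no elements while the lenient one yields "foo" and leaves "456" behind. A parser that surfaces the non-empty leftover as a warning makes this kind of smuggling visible.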
+
+ Repeated parsing and re-encoding of a JSON text sequence can result
+ in the addition (or stripping) of trailing LF bytes to (from)
+ individual sequence element JSON texts. This can break signature
+ validation. JSON has no canonical form for JSON texts, therefore
+ neither does the JSON text sequence format.
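A small Python sketch shows why re-encoding can invalidate a signature or digest computed over the octets. The helper `reencode` is hypothetical, it assumes every element is well formed, and SHA-256 is used only as an example of a byte-level integrity check.

```python
import hashlib
import json

RS, LF = b"\x1e", b"\x0a"


def reencode(data: bytes) -> bytes:
    """Parse every element and write it back as RS + JSON text + LF."""
    out = b""
    for chunk in data.split(RS):
        if chunk.strip(b" \t\n\r"):
            value = json.loads(chunk.decode("utf-8"))
            text = json.dumps(value, separators=(",", ":"))
            out += RS + text.encode("utf-8") + LF
    return out


original = RS + b'{"a":1}' + LF + LF  # one element with an extra trailing LF
copy = reencode(original)
digest_before = hashlib.sha256(original).hexdigest()
digest_after = hashlib.sha256(copy).hexdigest()
```

The two sequences carry the same JSON value, but the extra LF is lost on re-encoding, so `digest_before` and `digest_after` differ.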
+
+4. IANA Considerations
+
+ The MIME media type for JSON text sequences is application/json-seq.
+
+ Type name: application
+
+ Subtype name: json-seq
+
+ Required parameters: N/A
+
+ Optional parameters: N/A
+
+ Encoding considerations: binary
+
+ Security considerations: See RFC 7464, Section 3.
+
+ Interoperability considerations: Described herein.
+
+ Published specification: RFC 7464.
+
+ Applications that use this media type:
+
+ <https://stedolan.github.io/jq>
+ <https://github.com/mapbox/cligj>
+ <https://github.com/hildjj/json-text-sequence>
+
+ Fragment identifier considerations: N/A
+
+ Additional information:
+
+ o Deprecated alias names for this type: N/A
+
+ o Magic number(s): N/A
+
+ o File extension(s): N/A
+
+ o Macintosh file type code(s): N/A
+
+ Person & email address to contact for further information:
+
+ json@ietf.org
+
+ Intended usage: COMMON
+
+ Author: Nicolas Williams (nico@cryptonector.com)
+
+ Change controller: IETF
+
+5. Normative References
+
+ [RFC20] Cerf, V., "ASCII format for network interchange", STD 80,
+ RFC 20, October 1969,
+ <http://www.rfc-editor.org/info/rfc20>.
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119, March 1997,
+ <http://www.rfc-editor.org/info/rfc2119>.
+
+ [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
+ 10646", STD 63, RFC 3629, November 2003,
+ <http://www.rfc-editor.org/info/rfc3629>.
+
+ [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax
+ Specifications: ABNF", STD 68, RFC 5234, January 2008,
+ <http://www.rfc-editor.org/info/rfc5234>.
+
+ [RFC7159] Bray, T., Ed., "The JavaScript Object Notation (JSON) Data
+ Interchange Format", RFC 7159, March 2014,
+ <http://www.rfc-editor.org/info/rfc7159>.
+
+Acknowledgements
+
+ Phillip Hallam-Baker proposed the use of JSON text sequences for
+ logfiles and pointed out the need for resynchronization. Stephen
+ Dolan created <https://github.com/stedolan/jq>, which uses something
+ like JSON text sequences (with LF as the separator between texts on
+ output, and requiring only such whitespace as needed to disambiguate
+ on input). Carsten Bormann suggested the use of ASCII RS, and Joe
+ Hildebrand suggested the use of LF in addition to RS for
+ disambiguating top-level number values. Paul Hoffman shepherded the
+ document. Many others contributed reviews and comments on the JSON
+ Working Group mailing list.
+
+Author's Address
+
+ Nicolas Williams
+ Cryptonector, LLC
+
+ EMail: nico@cryptonector.com