From 4bfd864f10b68b71482b35c818559068ef8d5797 Mon Sep 17 00:00:00 2001
From: Thomas Voss <mail@thomasvoss.com>
Date: Wed, 27 Nov 2024 20:54:24 +0100
Subject: doc: Add RFC documents

---
 doc/rfc/rfc7049.txt | 3027 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 3027 insertions(+)
 create mode 100644 doc/rfc/rfc7049.txt

(limited to 'doc/rfc/rfc7049.txt')

diff --git a/doc/rfc/rfc7049.txt b/doc/rfc/rfc7049.txt
new file mode 100644
index 0000000..5d29907
--- /dev/null
+++ b/doc/rfc/rfc7049.txt
@@ -0,0 +1,3027 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF)                        C. Bormann
+Request for Comments: 7049                       Universitaet Bremen TZI
+Category: Standards Track                                     P. Hoffman
+ISSN: 2070-1721                                           VPN Consortium
+                                                            October 2013
+
+
+              Concise Binary Object Representation (CBOR)
+
+Abstract
+
+   The Concise Binary Object Representation (CBOR) is a data format
+   whose design goals include the possibility of extremely small code
+   size, fairly small message size, and extensibility without the need
+   for version negotiation.  These design goals make it different from
+   earlier binary serializations such as ASN.1 and MessagePack.
+
+Status of This Memo
+
+   This is an Internet Standards Track document.
+
+   This document is a product of the Internet Engineering Task Force
+   (IETF).  It represents the consensus of the IETF community.  It has
+   received public review and has been approved for publication by the
+   Internet Engineering Steering Group (IESG).  Further information on
+   Internet Standards is available in Section 2 of RFC 5741.
+
+   Information about the current status of this document, any errata,
+   and how to provide feedback on it may be obtained at
+   http://www.rfc-editor.org/info/rfc7049.
+
+Copyright Notice
+
+   Copyright (c) 2013 IETF Trust and the persons identified as the
+   document authors.  All rights reserved.
+
+   This document is subject to BCP 78 and the IETF Trust's Legal
+   Provisions Relating to IETF Documents
+   (http://trustee.ietf.org/license-info) in effect on the date of
+   publication of this document.  Please review these documents
+   carefully, as they describe your rights and restrictions with respect
+   to this document.  Code Components extracted from this document must
+   include Simplified BSD License text as described in Section 4.e of
+   the Trust Legal Provisions and are provided without warranty as
+   described in the Simplified BSD License.
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                    [Page 1]
+
+RFC 7049                          CBOR                      October 2013
+
+
+Table of Contents
+
+   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   3
+     1.1.  Objectives  . . . . . . . . . . . . . . . . . . . . . . .   4
+     1.2.  Terminology . . . . . . . . . . . . . . . . . . . . . . .   5
+   2.  Specification of the CBOR Encoding  . . . . . . . . . . . . .   6
+     2.1.  Major Types . . . . . . . . . . . . . . . . . . . . . . .   7
+     2.2.  Indefinite Lengths for Some Major Types . . . . . . . . .   9
+       2.2.1.  Indefinite-Length Arrays and Maps . . . . . . . . . .   9
+       2.2.2.  Indefinite-Length Byte Strings and Text Strings . . .  11
+     2.3.  Floating-Point Numbers and Values with No Content . . . .  12
+     2.4.  Optional Tagging of Items . . . . . . . . . . . . . . . .  14
+       2.4.1.  Date and Time . . . . . . . . . . . . . . . . . . . .  16
+       2.4.2.  Bignums . . . . . . . . . . . . . . . . . . . . . . .  16
+       2.4.3.  Decimal Fractions and Bigfloats . . . . . . . . . . .  17
+       2.4.4.  Content Hints . . . . . . . . . . . . . . . . . . . .  18
+         2.4.4.1.  Encoded CBOR Data Item  . . . . . . . . . . . . .  18
+         2.4.4.2.  Expected Later Encoding for CBOR-to-JSON
+                   Converters  . . . . . . . . . . . . . . . . . . .  18
+         2.4.4.3.  Encoded Text  . . . . . . . . . . . . . . . . . .  19
+       2.4.5.  Self-Describe CBOR  . . . . . . . . . . . . . . . . .  19
+   3.  Creating CBOR-Based Protocols . . . . . . . . . . . . . . . .  20
+     3.1.  CBOR in Streaming Applications  . . . . . . . . . . . . .  20
+     3.2.  Generic Encoders and Decoders . . . . . . . . . . . . . .  21
+     3.3.  Syntax Errors . . . . . . . . . . . . . . . . . . . . . .  21
+       3.3.1.  Incomplete CBOR Data Items  . . . . . . . . . . . . .  22
+       3.3.2.  Malformed Indefinite-Length Items . . . . . . . . . .  22
+       3.3.3.  Unknown Additional Information Values . . . . . . . .  23
+     3.4.  Other Decoding Errors . . . . . . . . . . . . . . . . . .  23
+     3.5.  Handling Unknown Simple Values and Tags . . . . . . . . .  24
+     3.6.  Numbers . . . . . . . . . . . . . . . . . . . . . . . . .  24
+     3.7.  Specifying Keys for Maps  . . . . . . . . . . . . . . . .  25
+     3.8.  Undefined Values  . . . . . . . . . . . . . . . . . . . .  26
+     3.9.  Canonical CBOR  . . . . . . . . . . . . . . . . . . . . .  26
+     3.10. Strict Mode . . . . . . . . . . . . . . . . . . . . . . .  28
+   4.  Converting Data between CBOR and JSON . . . . . . . . . . . .  29
+     4.1.  Converting from CBOR to JSON  . . . . . . . . . . . . . .  29
+     4.2.  Converting from JSON to CBOR  . . . . . . . . . . . . . .  30
+   5.  Future Evolution of CBOR  . . . . . . . . . . . . . . . . . .  31
+     5.1.  Extension Points  . . . . . . . . . . . . . . . . . . . .  32
+     5.2.  Curating the Additional Information Space . . . . . . . .  33
+   6.  Diagnostic Notation . . . . . . . . . . . . . . . . . . . . .  33
+     6.1.  Encoding Indicators . . . . . . . . . . . . . . . . . . .  34
+   7.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  35
+     7.1.  Simple Values Registry  . . . . . . . . . . . . . . . . .  35
+     7.2.  Tags Registry . . . . . . . . . . . . . . . . . . . . . .  35
+     7.3.  Media Type ("MIME Type")  . . . . . . . . . . . . . . . .  36
+     7.4.  CoAP Content-Format . . . . . . . . . . . . . . . . . . .  37
+
+
+
+Bormann & Hoffman            Standards Track                    [Page 2]
+
+RFC 7049                          CBOR                      October 2013
+
+
+     7.5.  The +cbor Structured Syntax Suffix Registration . . . . .  37
+   8.  Security Considerations . . . . . . . . . . . . . . . . . . .  38
+   9.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  38
+   10. References  . . . . . . . . . . . . . . . . . . . . . . . . .  39
+     10.1.  Normative References . . . . . . . . . . . . . . . . . .  39
+     10.2.  Informative References . . . . . . . . . . . . . . . . .  40
+   Appendix A.  Examples . . . . . . . . . . . . . . . . . . . . . .  41
+   Appendix B.  Jump Table . . . . . . . . . . . . . . . . . . . . .  45
+   Appendix C.  Pseudocode . . . . . . . . . . . . . . . . . . . . .  48
+   Appendix D.  Half-Precision . . . . . . . . . . . . . . . . . . .  50
+   Appendix E.  Comparison of Other Binary Formats to CBOR's Design
+                Objectives . . . . . . . . . . . . . . . . . . . . .  51
+     E.1.  ASN.1 DER, BER, and PER . . . . . . . . . . . . . . . . .  52
+     E.2.  MessagePack . . . . . . . . . . . . . . . . . . . . . . .  52
+     E.3.  BSON  . . . . . . . . . . . . . . . . . . . . . . . . . .  53
+     E.4.  UBJSON  . . . . . . . . . . . . . . . . . . . . . . . . .  53
+     E.5.  MSDTP: RFC 713  . . . . . . . . . . . . . . . . . . . . .  53
+     E.6.  Conciseness on the Wire . . . . . . . . . . . . . . . . .  53
+
+1.  Introduction
+
+   There are hundreds of standardized formats for binary representation
+   of structured data (also known as binary serialization formats).  Of
+   those, some are for specific domains of information, while others are
+   generalized for arbitrary data.  In the IETF, probably the best-known
+   formats in the latter category are ASN.1's BER and DER [ASN.1].
+
+   The format defined here follows some specific design goals that are
+   not well met by current formats.  The underlying data model is an
+   extended version of the JSON data model [RFC4627].  It is important
+   to note that this is not a proposal that the grammar in RFC 4627 be
+   extended in general, since doing so would cause a significant
+   backwards incompatibility with already deployed JSON documents.
+   Instead, this document simply defines its own data model that starts
+   from JSON.
+
+   Appendix E lists some existing binary formats and discusses how well
+   they do or do not fit the design objectives of the Concise Binary
+   Object Representation (CBOR).
+
+
+
+
+
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                    [Page 3]
+
+RFC 7049                          CBOR                      October 2013
+
+
+1.1.  Objectives
+
+   The objectives of CBOR, roughly in decreasing order of importance,
+   are:
+
+   1.  The representation must be able to unambiguously encode most
+       common data formats used in Internet standards.
+
+       *  It must represent a reasonable set of basic data types and
+          structures using binary encoding.  "Reasonable" here is
+          largely influenced by the capabilities of JSON, with the major
+          addition of binary byte strings.  The structures supported are
+          limited to arrays and trees; loops and lattice-style graphs
+          are not supported.
+
+       *  There is no requirement that all data formats be uniquely
+          encoded; that is, it is acceptable that the number "7" might
+          be encoded in multiple different ways.
+
+   2.  The code for an encoder or decoder must be able to be compact in
+       order to support systems with very limited memory, processor
+       power, and instruction sets.
+
+       *  An encoder and a decoder need to be implementable in a very
+          small amount of code (for example, in class 1 constrained
+          nodes as defined in [CNN-TERMS]).
+
+       *  The format should use contemporary machine representations of
+          data (for example, not requiring binary-to-decimal
+          conversion).
+
+   3.  Data must be able to be decoded without a schema description.
+
+       *  Similar to JSON, encoded data should be self-describing so
+          that a generic decoder can be written.
+
+   4.  The serialization must be reasonably compact, but data
+       compactness is secondary to code compactness for the encoder and
+       decoder.
+
+       *  "Reasonable" here is bounded by JSON as an upper bound in
+          size, and by implementation complexity maintaining a lower
+          bound.  Using either general compression schemes or extensive
+          bit-fiddling violates the complexity goals.
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                    [Page 4]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   5.  The format must be applicable to both constrained nodes and high-
+       volume applications.
+
+       *  This means it must be reasonably frugal in CPU usage for both
+          encoding and decoding.  This is relevant both for constrained
+          nodes and for potential usage in applications with a very high
+          volume of data.
+
+   6.  The format must support all JSON data types for conversion to and
+       from JSON.
+
+       *  It must support a reasonable level of conversion as long as
+          the data represented is within the capabilities of JSON.  It
+          must be possible to define a unidirectional mapping towards
+          JSON for all types of data.
+
+   7.  The format must be extensible, and the extended data must be
+       decodable by earlier decoders.
+
+       *  The format is designed for decades of use.
+
+       *  The format must support a form of extensibility that allows
+          fallback so that a decoder that does not understand an
+          extension can still decode the message.
+
+       *  The format must be able to be extended in the future by later
+          IETF standards.
+
+1.2.  Terminology
+
+   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+   document are to be interpreted as described in RFC 2119, BCP 14
+   [RFC2119] and indicate requirement levels for compliant CBOR
+   implementations.
+
+   The term "byte" is used in its now-customary sense as a synonym for
+   "octet".  All multi-byte values are encoded in network byte order
+   (that is, most significant byte first, also known as "big-endian").
+
+   This specification makes use of the following terminology:
+
+   Data item:  A single piece of CBOR data.  The structure of a data
+      item may contain zero, one, or more nested data items.  The term
+      is used both for the data item in representation format and for
+      the abstract idea that can be derived from that by a decoder.
+
+
+
+
+
+Bormann & Hoffman            Standards Track                    [Page 5]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   Decoder:  A process that decodes a CBOR data item and makes it
+      available to an application.  Formally speaking, a decoder
+      contains a parser to break up the input using the syntax rules of
+      CBOR, as well as a semantic processor to prepare the data in a
+      form suitable to the application.
+
+   Encoder:  A process that generates the representation format of a
+      CBOR data item from application information.
+
+   Data Stream:  A sequence of zero or more data items, not further
+      assembled into a larger containing data item.  The independent
+      data items that make up a data stream are sometimes also referred
+      to as "top-level data items".
+
+   Well-formed:  A data item that follows the syntactic structure of
+      CBOR.  A well-formed data item uses the initial bytes and the byte
+      strings and/or data items that are implied by their values as
+      defined in CBOR and is not followed by extraneous data.
+
+   Valid:  A data item that is well-formed and also follows the semantic
+      restrictions that apply to CBOR data items.
+
+   Stream decoder:  A process that decodes a data stream and makes each
+      of the data items in the sequence available to an application as
+      they are received.
+
+   Where bit arithmetic or data types are explained, this document uses
+   the notation familiar from the programming language C, except that
+   "**" denotes exponentiation.  Similar to the "0x" notation for
+   hexadecimal numbers, numbers in binary notation are prefixed with
+   "0b".  Underscores can be added to such a number solely for
+   readability, so 0b00100001 (0x21) might be written 0b001_00001 to
+   emphasize the desired interpretation of the bits in the byte; in this
+   case, it is split into three bits and five bits.
+
+2.  Specification of the CBOR Encoding
+
+   A CBOR-encoded data item is structured and encoded as described in
+   this section.  The encoding is summarized in Table 5.
+
+   The initial byte of each data item contains both information about
+   the major type (the high-order 3 bits, described in Section 2.1) and
+   additional information (the low-order 5 bits).  When the value of the
+   additional information is less than 24, it is directly used as a
+   small unsigned integer.  When it is 24 to 27, the additional bytes
+   for a variable-length integer immediately follow; the values 24 to 27
+   of the additional information specify that its length is a 1-, 2-,
+   4-, or 8-byte unsigned integer, respectively.  Additional information
+
+
+
+Bormann & Hoffman            Standards Track                    [Page 6]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   value 31 is used for indefinite-length items, described in
+   Section 2.2.  Additional information values 28 to 30 are reserved for
+   future expansion.
+
+   In all additional information values, the resulting integer is
+   interpreted depending on the major type.  It may represent the actual
+   data: for example, in integer types, the resulting integer is used
+   for the value itself.  It may instead supply length information: for
+   example, in byte strings it gives the length of the byte string data
+   that follows.
+
+   A CBOR decoder implementation can be based on a jump table with all
+   256 defined values for the initial byte (Table 5).  A decoder in a
+   constrained implementation can instead use the structure of the
+   initial byte and following bytes for more compact code (see
+   Appendix C for a rough impression of how this could look).
+
+2.1.  Major Types
+
+   The following lists the major types and the additional information
+   and other bytes associated with the type.
+
+   Major type 0:  an unsigned integer.  The 5-bit additional information
+      is either the integer itself (for additional information values 0
+      through 23) or the length of additional data.  Additional
+      information 24 means the value is represented in an additional
+      uint8_t, 25 means a uint16_t, 26 means a uint32_t, and 27 means a
+      uint64_t.  For example, the integer 10 is denoted as the one byte
+      0b000_01010 (major type 0, additional information 10).  The
+      integer 500 would be 0b000_11001 (major type 0, additional
+      information 25) followed by the two bytes 0x01f4, which is 500 in
+      decimal.
+
+   Major type 1:  a negative integer.  The encoding follows the rules
+      for unsigned integers (major type 0), except that the value is
+      then -1 minus the encoded unsigned integer.  For example, the
+      integer -500 would be 0b001_11001 (major type 1, additional
+      information 25) followed by the two bytes 0x01f3, which is 499 in
+      decimal.
+
+   Major type 2:  a byte string.  The string's length in bytes is
+      represented following the rules for positive integers (major type
+      0).  For example, a byte string whose length is 5 would have an
+      initial byte of 0b010_00101 (major type 2, additional information
+      5 for the length), followed by 5 bytes of binary content.  A byte
+      string whose length is 500 would have 3 initial bytes of
+
+
+
+
+
+Bormann & Hoffman            Standards Track                    [Page 7]
+
+RFC 7049                          CBOR                      October 2013
+
+
+      0b010_11001 (major type 2, additional information 25 to indicate a
+      two-byte length) followed by the two bytes 0x01f4 for a length of
+      500, followed by 500 bytes of binary content.
+
+   Major type 3:  a text string, specifically a string of Unicode
+      characters that is encoded as UTF-8 [RFC3629].  The format of this
+      type is identical to that of byte strings (major type 2), that is,
+      as with major type 2, the length gives the number of bytes.  This
+      type is provided for systems that need to interpret or display
+      human-readable text, and allows the differentiation between
+      unstructured bytes and text that has a specified repertoire and
+      encoding.  In contrast to formats such as JSON, the Unicode
+      characters in this type are never escaped.  Thus, a newline
+      character (U+000A) is always represented in a string as the byte
+      0x0a, and never as the bytes 0x5c6e (the characters "\" and "n")
+      or as 0x5c7530303061 (the characters "\", "u", "0", "0", "0", and
+      "a").
+
+   Major type 4:  an array of data items.  Arrays are also called lists,
+      sequences, or tuples.  The array's length follows the rules for
+      byte strings (major type 2), except that the length denotes the
+      number of data items, not the length in bytes that the array takes
+      up.  Items in an array do not need to all be of the same type.
+      For example, an array that contains 10 items of any type would
+      have an initial byte of 0b100_01010 (major type of 4, additional
+      information of 10 for the length) followed by the 10 remaining
+      items.
+
+   Major type 5:  a map of pairs of data items.  Maps are also called
+      tables, dictionaries, hashes, or objects (in JSON).  A map is
+      comprised of pairs of data items, each pair consisting of a key
+      that is immediately followed by a value.  The map's length follows
+      the rules for byte strings (major type 2), except that the length
+      denotes the number of pairs, not the length in bytes that the map
+      takes up.  For example, a map that contains 9 pairs would have an
+      initial byte of 0b101_01001 (major type of 5, additional
+      information of 9 for the number of pairs) followed by the 18
+      remaining items.  The first item is the first key, the second item
+      is the first value, the third item is the second key, and so on.
+      A map that has duplicate keys may be well-formed, but it is not
+      valid, and thus it causes indeterminate decoding; see also
+      Section 3.7.
+
+   Major type 6:  optional semantic tagging of other major types.  See
+      Section 2.4.
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                    [Page 8]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   Major type 7:  floating-point numbers and simple data types that need
+      no content, as well as the "break" stop code.  See Section 2.3.
+
+   These eight major types lead to a simple table showing which of the
+   256 possible values for the initial byte of a data item are used
+   (Table 5).
+
+   In major types 6 and 7, many of the possible values are reserved for
+   future specification.  See Section 7 for more information on these
+   values.
+
+2.2.  Indefinite Lengths for Some Major Types
+
+   Four CBOR items (arrays, maps, byte strings, and text strings) can be
+   encoded with an indefinite length using additional information value
+   31.  This is useful if the encoding of the item needs to begin before
+   the number of items inside the array or map, or the total length of
+   the string, is known.  (The application of this is often referred to
+   as "streaming" within a data item.)
+
+   Indefinite-length arrays and maps are dealt with differently than
+   indefinite-length byte strings and text strings.
+
+2.2.1.  Indefinite-Length Arrays and Maps
+
+   Indefinite-length arrays and maps are simply opened without
+   indicating the number of data items that will be included in the
+   array or map, using the additional information value of 31.  The
+   initial major type and additional information byte is followed by the
+   elements of the array or map, just as they would be in other arrays
+   or maps.  The end of the array or map is indicated by encoding a
+   "break" stop code in a place where the next data item would normally
+   have been included.  The "break" is encoded with major type 7 and
+   additional information value 31 (0b111_11111) but is not itself a
+   data item: it is just a syntactic feature to close the array or map.
+   That is, the "break" stop code comes after the last item in the array
+   or map, and it cannot occur anywhere else in place of a data item.
+   In this way, indefinite-length arrays and maps look identical to
+   other arrays and maps except for beginning with the additional
+   information value 31 and ending with the "break" stop code.
+
+   Arrays and maps with indefinite lengths allow any number of items
+   (for arrays) and key/value pairs (for maps) to be given before the
+   "break" stop code.  There is no restriction against nesting
+   indefinite-length array or map items.  A "break" only terminates a
+   single item, so nested indefinite-length items need exactly as many
+   "break" stop codes as there are type bytes starting an indefinite-
+   length item.
+
+
+
+Bormann & Hoffman            Standards Track                    [Page 9]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   For example, assume an encoder wants to represent the abstract array
+   [1, [2, 3], [4, 5]].  The definite-length encoding would be
+   0x8301820203820405:
+
+   83        -- Array of length 3
+      01     -- 1
+      82     -- Array of length 2
+         02  -- 2
+         03  -- 3
+      82     -- Array of length 2
+         04  -- 4
+         05  -- 5
+
+   Indefinite-length encoding could be applied independently to each of
+   the three arrays encoded in this data item, as required, leading to
+   representations such as:
+
+   0x9f018202039f0405ffff
+   9F        -- Start indefinite-length array
+      01     -- 1
+      82     -- Array of length 2
+         02  -- 2
+         03  -- 3
+      9F     -- Start indefinite-length array
+         04  -- 4
+         05  -- 5
+         FF  -- "break" (inner array)
+      FF     -- "break" (outer array)
+
+
+   0x9f01820203820405ff
+   9F        -- Start indefinite-length array
+      01     -- 1
+      82     -- Array of length 2
+         02  -- 2
+         03  -- 3
+      82     -- Array of length 2
+         04  -- 4
+         05  -- 5
+      FF     -- "break"
+
+
+
+
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 10]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   0x83018202039f0405ff
+   83        -- Array of length 3
+      01     -- 1
+      82     -- Array of length 2
+         02  -- 2
+         03  -- 3
+      9F     -- Start indefinite-length array
+         04  -- 4
+         05  -- 5
+         FF  -- "break"
+
+
+   0x83019f0203ff820405
+   83        -- Array of length 3
+      01     -- 1
+      9F     -- Start indefinite-length array
+         02  -- 2
+         03  -- 3
+         FF  -- "break"
+      82     -- Array of length 2
+         04  -- 4
+         05  -- 5
+
+
+   An example of an indefinite-length map (that happens to have two
+   key/value pairs) might be:
+
+   0xbf6346756ef563416d7421ff
+   BF           -- Start indefinite-length map
+      63        -- First key, UTF-8 string length 3
+         46756e --   "Fun"
+      F5        -- First value, true
+      63        -- Second key, UTF-8 string length 3
+         416d74 --   "Amt"
+      21        -- -2
+      FF        -- "break"
+
+2.2.2.  Indefinite-Length Byte Strings and Text Strings
+
+   Indefinite-length byte strings and text strings are actually a
+   concatenation of zero or more definite-length byte or text strings
+   ("chunks") that are together treated as one contiguous string.
+   Indefinite-length strings are opened with the major type and
+   additional information value of 31, but what follows are a series of
+   byte or text strings that have definite lengths (the chunks).  The
+   end of the series of chunks is indicated by encoding the "break" stop
+   code (0b111_11111) in a place where the next chunk in the series
+   would occur.  The contents of the chunks are concatenated together,
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 11]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   and the overall length of the indefinite-length string will be the
+   sum of the lengths of all of the chunks.  In summary, an indefinite-
+   length string is encoded similarly to how an indefinite-length array
+   of its chunks would be encoded, except that the major type of the
+   indefinite-length string is that of a (text or byte) string and
+   matches the major types of its chunks.
+
+   For indefinite-length byte strings, every data item (chunk) between
+   the indefinite-length indicator and the "break" MUST be a definite-
+   length byte string item; if the parser sees any item type other than
+   a byte string before it sees the "break", it is an error.
+
+   For example, assume the sequence:
+
+   0b010_11111 0b010_00100 0xaabbccdd 0b010_00011 0xeeff99 0b111_11111
+
+   5F              -- Start indefinite-length byte string
+      44           -- Byte string of length 4
+         aabbccdd  -- Bytes content
+      43           -- Byte string of length 3
+         eeff99    -- Bytes content
+      FF           -- "break"
+
+   After decoding, this results in a single byte string with seven
+   bytes: 0xaabbccddeeff99.
+
+   Text strings with indefinite lengths act the same as byte strings
+   with indefinite lengths, except that all their chunks MUST be
+   definite-length text strings.  Note that this implies that the bytes
+   of a single UTF-8 character cannot be spread between chunks: a new
+   chunk can only be started at a character boundary.
+
+2.3.  Floating-Point Numbers and Values with No Content
+
+   Major type 7 is for two types of data: floating-point numbers and
+   "simple values" that do not need any content.  Each value of the
+   5-bit additional information in the initial byte has its own separate
+   meaning, as defined in Table 1.  Like the major types for integers,
+   items of this major type do not carry content data; all the
+   information is in the initial bytes.
+
+
+
+
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 12]
+
+RFC 7049                          CBOR                      October 2013
+
+
+    +-------------+--------------------------------------------------+
+    | 5-Bit Value | Semantics                                        |
+    +-------------+--------------------------------------------------+
+    | 0..23       | Simple value (value 0..23)                       |
+    |             |                                                  |
+    | 24          | Simple value (value 32..255 in following byte)   |
+    |             |                                                  |
+    | 25          | IEEE 754 Half-Precision Float (16 bits follow)   |
+    |             |                                                  |
+    | 26          | IEEE 754 Single-Precision Float (32 bits follow) |
+    |             |                                                  |
+    | 27          | IEEE 754 Double-Precision Float (64 bits follow) |
+    |             |                                                  |
+    | 28-30       | (Unassigned)                                     |
+    |             |                                                  |
+    | 31          | "break" stop code for indefinite-length items    |
+    +-------------+--------------------------------------------------+
+
+        Table 1: Values for Additional Information in Major Type 7
+
+   As with all other major types, the 5-bit value 24 signifies a single-
+   byte extension: it is followed by an additional byte to represent the
+   simple value.  (To minimize confusion, only the values 32 to 255 are
+   used.)  This maintains the structure of the initial bytes: as for the
+   other major types, the length of these always depends on the
+   additional information in the first byte.  Table 2 lists the values
+   assigned and available for simple types.
+
+                       +---------+-----------------+
+                       | Value   | Semantics       |
+                       +---------+-----------------+
+                       | 0..19   | (Unassigned)    |
+                       |         |                 |
+                       | 20      | False           |
+                       |         |                 |
+                       | 21      | True            |
+                       |         |                 |
+                       | 22      | Null            |
+                       |         |                 |
+                       | 23      | Undefined value |
+                       |         |                 |
+                       | 24..31  | (Reserved)      |
+                       |         |                 |
+                       | 32..255 | (Unassigned)    |
+                       +---------+-----------------+
+
+                          Table 2: Simple Values
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 13]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   The 5-bit values of 25, 26, and 27 are for 16-bit, 32-bit, and 64-bit
+   IEEE 754 binary floating-point values.  These floating-point values
+   are encoded in the additional bytes of the appropriate size.  (See
+   Appendix D for some information about 16-bit floating point.)
+
+2.4.  Optional Tagging of Items
+
+   In CBOR, a data item can optionally be preceded by a tag to give it
+   additional semantics while retaining its structure.  The tag is major
+   type 6, and represents an integer number as indicated by the tag's
+   integer value; the (sole) data item is carried as content data.  If a
+   tag requires structured data, this structure is encoded into the
+   nested data item.  The definition of a tag usually restricts what
+   kinds of nested data item or items can be carried by a tag.
+
+   The initial bytes of the tag follow the rules for positive integers
+   (major type 0).  The tag is followed by a single data item of any
+   type.  For example, assume that a byte string of length 12 is marked
+   with a tag to indicate it is a positive bignum (Section 2.4.2).  This
+   would be marked as 0b110_00010 (major type 6, additional information
+   2 for the tag) followed by 0b010_01100 (major type 2, additional
+   information of 12 for the length) followed by the 12 bytes of the
+   bignum.
+
+   Decoders do not need to understand tags, and thus tags may be of
+   little value in applications where the implementation creating a
+   particular CBOR data item and the implementation decoding that stream
+   know the semantic meaning of each item in the data flow.  Their
+   primary purpose in this specification is to define common data types
+   such as dates.  A secondary purpose is to allow optional tagging when
+   the decoder is a generic CBOR decoder that might be able to benefit
+   from hints about the content of items.  Understanding the semantic
+   tags is optional for a decoder; it can just jump over the initial
+   bytes of the tag and interpret the tagged data item itself.
+
+   A tag always applies to the item that is directly followed by it.
+   Thus, if tag A is followed by tag B, which is followed by data item
+   C, tag A applies to the result of applying tag B on data item C.
+   That is, a tagged item is a data item consisting of a tag and a
+   value.  The content of the tagged item is the data item (the value)
+   that is being tagged.
+
+   IANA maintains a registry of tag values as described in Section 7.2.
+   Table 3 provides a list of initial values, with definitions in the
+   rest of this section.
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 14]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   +--------------+------------------+---------------------------------+
+   | Tag          | Data Item        | Semantics                       |
+   +--------------+------------------+---------------------------------+
+   | 0            | UTF-8 string     | Standard date/time string; see  |
+   |              |                  | Section 2.4.1                   |
+   |              |                  |                                 |
+   | 1            | multiple         | Epoch-based date/time; see      |
+   |              |                  | Section 2.4.1                   |
+   |              |                  |                                 |
+   | 2            | byte string      | Positive bignum; see Section    |
+   |              |                  | 2.4.2                           |
+   |              |                  |                                 |
+   | 3            | byte string      | Negative bignum; see Section    |
+   |              |                  | 2.4.2                           |
+   |              |                  |                                 |
+   | 4            | array            | Decimal fraction; see Section   |
+   |              |                  | 2.4.3                           |
+   |              |                  |                                 |
+   | 5            | array            | Bigfloat; see Section 2.4.3     |
+   |              |                  |                                 |
+   | 6..20        | (Unassigned)     | (Unassigned)                    |
+   |              |                  |                                 |
+   | 21           | multiple         | Expected conversion to          |
+   |              |                  | base64url encoding; see         |
+   |              |                  | Section 2.4.4.2                 |
+   |              |                  |                                 |
+   | 22           | multiple         | Expected conversion to base64   |
+   |              |                  | encoding; see Section 2.4.4.2   |
+   |              |                  |                                 |
+   | 23           | multiple         | Expected conversion to base16   |
+   |              |                  | encoding; see Section 2.4.4.2   |
+   |              |                  |                                 |
+   | 24           | byte string      | Encoded CBOR data item; see     |
+   |              |                  | Section 2.4.4.1                 |
+   |              |                  |                                 |
+   | 25..31       | (Unassigned)     | (Unassigned)                    |
+   |              |                  |                                 |
+   | 32           | UTF-8 string     | URI; see Section 2.4.4.3        |
+   |              |                  |                                 |
+   | 33           | UTF-8 string     | base64url; see Section 2.4.4.3  |
+   |              |                  |                                 |
+   | 34           | UTF-8 string     | base64; see Section 2.4.4.3     |
+   |              |                  |                                 |
+   | 35           | UTF-8 string     | Regular expression; see         |
+   |              |                  | Section 2.4.4.3                 |
+   |              |                  |                                 |
+   | 36           | UTF-8 string     | MIME message; see Section       |
+   |              |                  | 2.4.4.3                         |
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 15]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   |              |                  |                                 |
+   | 37..55798    | (Unassigned)     | (Unassigned)                    |
+   |              |                  |                                 |
+   | 55799        | multiple         | Self-describe CBOR; see         |
+   |              |                  | Section 2.4.5                   |
+   |              |                  |                                 |
+   | 55800+       | (Unassigned)     | (Unassigned)                    |
+   +--------------+------------------+---------------------------------+
+
+                         Table 3: Values for Tags
+
+2.4.1.  Date and Time
+
+   Tag value 0 is for date/time strings that follow the standard format
+   described in [RFC3339], as refined by Section 3.3 of [RFC4287].
+
+   Tag value 1 is for numerical representation of seconds relative to
+   1970-01-01T00:00Z in UTC time.  (For the non-negative values that the
+   Portable Operating System Interface (POSIX) defines, the number of
+   seconds is counted in the same way as for POSIX "seconds since the
+   epoch" [TIME_T].)  The tagged item can be a positive or negative
+   integer (major types 0 and 1), or a floating-point number (major type
+   7 with additional information 25, 26, or 27).  Note that the number
+   can be negative (time before 1970-01-01T00:00Z) and, if a floating-
+   point number, indicate fractional seconds.
+
+2.4.2.  Bignums
+
+   Bignums are integers that do not fit into the basic integer
+   representations provided by major types 0 and 1.  They are encoded as
+   a byte string data item, which is interpreted as an unsigned integer
+   n in network byte order.  For tag value 2, the value of the bignum is
+   n.  For tag value 3, the value of the bignum is -1 - n.  Decoders
+   that understand these tags MUST be able to decode bignums that have
+   leading zeroes.
+
+   For example, the number 18446744073709551616 (2**64) is represented
+   as 0b110_00010 (major type 6, tag 2), followed by 0b010_01001 (major
+   type 2, length 9), followed by 0x010000000000000000 (one byte 0x01
+   and eight bytes 0x00).  In hexadecimal:
+
+   C2                        -- Tag 2
+      29                     -- Byte string of length 9
+         010000000000000000  -- Bytes content
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 16]
+
+RFC 7049                          CBOR                      October 2013
+
+
+2.4.3.  Decimal Fractions and Bigfloats
+
+   Decimal fractions combine an integer mantissa with a base-10 scaling
+   factor.  They are most useful if an application needs the exact
+   representation of a decimal fraction such as 1.1 because there is no
+   exact representation for many decimal fractions in binary floating
+   point.
+
+   Bigfloats combine an integer mantissa with a base-2 scaling factor.
+   They are binary floating-point values that can exceed the range or
+   the precision of the three IEEE 754 formats supported by CBOR
+   (Section 2.3).  Bigfloats may also be used by constrained
+   applications that need some basic binary floating-point capability
+   without the need for supporting IEEE 754.
+
+   A decimal fraction or a bigfloat is represented as a tagged array
+   that contains exactly two integer numbers: an exponent e and a
+   mantissa m.  Decimal fractions (tag 4) use base-10 exponents; the
+   value of a decimal fraction data item is m*(10**e).  Bigfloats (tag
+   5) use base-2 exponents; the value of a bigfloat data item is
+   m*(2**e).  The exponent e MUST be represented in an integer of major
+   type 0 or 1, while the mantissa also can be a bignum (Section 2.4.2).
+
+   An example of a decimal fraction is that the number 273.15 could be
+   represented as 0b110_00100 (major type of 6 for the tag, additional
+   information of 4 for the type of tag), followed by 0b100_00010 (major
+   type of 4 for the array, additional information of 2 for the length
+   of the array), followed by 0b001_00001 (major type of 1 for the first
+   integer, additional information of 1 for the value of -2), followed
+   by 0b000_11001 (major type of 0 for the second integer, additional
+   information of 25 for a two-byte value), followed by
+   0b0110101010110011 (27315 in two bytes).  In hexadecimal:
+
+   C4             -- Tag 4
+      82          -- Array of length 2
+         21       -- -2
+         19 6ab3  -- 27315
+
+   An example of a bigfloat is that the number 1.5 could be represented
+   as 0b110_00101 (major type of 6 for the tag, additional information
+   of 5 for the type of tag), followed by 0b100_00010 (major type of 4
+   for the array, additional information of 2 for the length of the
+   array), followed by 0b001_00000 (major type of 1 for the first
+   integer, additional information of 0 for the value of -1), followed
+   by 0b000_00011 (major type of 0 for the second integer, additional
+   information of 3 for the value of 3).  In hexadecimal:
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 17]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   C5             -- Tag 5
+      82          -- Array of length 2
+         20       -- -1
+         03       -- 3
+
+   Decimal fractions and bigfloats provide no representation of
+   Infinity, -Infinity, or NaN; if these are needed in place of a
+   decimal fraction or bigfloat, the IEEE 754 half-precision
+   representations from Section 2.3 can be used.  For constrained
+   applications, where there is a choice between representing a specific
+   number as an integer and as a decimal fraction or bigfloat (such as
+   when the exponent is small and non-negative), there is a quality-of-
+   implementation expectation that the integer representation is used
+   directly.
+
+2.4.4.  Content Hints
+
+   The tags in this section are for content hints that might be used by
+   generic CBOR processors.
+
+2.4.4.1.  Encoded CBOR Data Item
+
+   Sometimes it is beneficial to carry an embedded CBOR data item that
+   is not meant to be decoded immediately at the time the enclosing data
+   item is being parsed.  Tag 24 (CBOR data item) can be used to tag the
+   embedded byte string as a data item encoded in CBOR format.
+
+2.4.4.2.  Expected Later Encoding for CBOR-to-JSON Converters
+
+   Tags 21 to 23 indicate that a byte string might require a specific
+   encoding when interoperating with a text-based representation.  These
+   tags are useful when an encoder knows that the byte string data it is
+   writing is likely to be later converted to a particular JSON-based
+   usage.  That usage specifies that some strings are encoded as base64,
+   base64url, and so on.  The encoder uses byte strings instead of doing
+   the encoding itself to reduce the message size, to reduce the code
+   size of the encoder, or both.  The encoder does not know whether or
+   not the converter will be generic, and therefore wants to say what it
+   believes is the proper way to convert binary strings to JSON.
+
+   The data item tagged can be a byte string or any other data item.  In
+   the latter case, the tag applies to all of the byte string data items
+   contained in the data item, except for those contained in a nested
+   data item tagged with an expected conversion.
+
+   These three tag types suggest conversions to three of the base data
+   encodings defined in [RFC4648].  For base64url encoding, padding is
+   not used (see Section 3.2 of RFC 4648); that is, all trailing equals
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 18]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   signs ("=") are removed from the base64url-encoded string.  Later
+   tags might be defined for other data encodings of RFC 4648 or for
+   other ways to encode binary data in strings.
+
+2.4.4.3.  Encoded Text
+
+   Some text strings hold data that have formats widely used on the
+   Internet, and sometimes those formats can be validated and presented
+   to the application in appropriate form by the decoder.  There are
+   tags for some of these formats.
+
+   o  Tag 32 is for URIs, as defined in [RFC3986];
+
+   o  Tags 33 and 34 are for base64url- and base64-encoded text strings,
+      as defined in [RFC4648];
+
+   o  Tag 35 is for regular expressions in Perl Compatible Regular
+      Expressions (PCRE) / JavaScript syntax [ECMA262].
+
+   o  Tag 36 is for MIME messages (including all headers), as defined in
+      [RFC2045];
+
+   Note that tags 33 and 34 differ from 21 and 22 in that the data is
+   transported in base-encoded form for the former and in raw byte
+   string form for the latter.
+
+2.4.5.  Self-Describe CBOR
+
+   In many applications, it will be clear from the context that CBOR is
+   being employed for encoding a data item.  For instance, a specific
+   protocol might specify the use of CBOR, or a media type is indicated
+   that specifies its use.  However, there may be applications where
+   such context information is not available, such as when CBOR data is
+   stored in a file and disambiguating metadata is not in use.  Here, it
+   may help to have some distinguishing characteristics for the data
+   itself.
+
+   Tag 55799 is defined for this purpose.  It does not impart any
+   special semantics on the data item that follows; that is, the
+   semantics of a data item tagged with tag 55799 is exactly identical
+   to the semantics of the data item itself.
+
+   The serialization of this tag is 0xd9d9f7, which appears not to be in
+   use as a distinguishing mark for frequently used file types.  In
+   particular, it is not a valid start of a Unicode text in any Unicode
+   encoding if followed by a valid CBOR data item.
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 19]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   For instance, a decoder might be able to parse both CBOR and JSON.
+   Such a decoder would need to mechanically distinguish the two
+   formats.  An easy way for an encoder to help the decoder would be to
+   tag the entire CBOR item with tag 55799, the serialization of which
+   will never be found at the beginning of a JSON text.
+
+3.  Creating CBOR-Based Protocols
+
+   Data formats such as CBOR are often used in environments where there
+   is no format negotiation.  A specific design goal of CBOR is to not
+   need any included or assumed schema: a decoder can take a CBOR item
+   and decode it with no other knowledge.
+
+   Of course, in real-world implementations, the encoder and the decoder
+   will have a shared view of what should be in a CBOR data item.  For
+   example, an agreed-to format might be "the item is an array whose
+   first value is a UTF-8 string, second value is an integer, and
+   subsequent values are zero or more floating-point numbers" or "the
+   item is a map that has byte strings for keys and contains at least
+   one pair whose key is 0xab01".
+
+   This specification puts no restrictions on CBOR-based protocols.  An
+   encoder can be capable of encoding as many or as few types of values
+   as is required by the protocol in which it is used; a decoder can be
+   capable of understanding as many or as few types of values as is
+   required by the protocols in which it is used.  This lack of
+   restrictions allows CBOR to be used in extremely constrained
+   environments.
+
+   This section discusses some considerations in creating CBOR-based
+   protocols.  It is advisory only and explicitly excludes any language
+   from RFC 2119 other than words that could be interpreted as "MAY" in
+   the sense of RFC 2119.
+
+3.1.  CBOR in Streaming Applications
+
+   In a streaming application, a data stream may be composed of a
+   sequence of CBOR data items concatenated back-to-back.  In such an
+   environment, the decoder immediately begins decoding a new data item
+   if data is found after the end of a previous data item.
+
+   Not all of the bytes making up a data item may be immediately
+   available to the decoder; some decoders will buffer additional data
+   until a complete data item can be presented to the application.
+   Other decoders can present partial information about a top-level data
+   item to an application, such as the nested data items that could
+   already be decoded, or even parts of a byte string that hasn't
+   completely arrived yet.
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 20]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   Note that some applications and protocols will not want to use
+   indefinite-length encoding.  Using indefinite-length encoding allows
+   an encoder to not need to marshal all the data for counting, but it
+   requires a decoder to allocate increasing amounts of memory while
+   waiting for the end of the item.  This might be fine for some
+   applications but not others.
+
+3.2.  Generic Encoders and Decoders
+
+   A generic CBOR decoder can decode all well-formed CBOR data and
+   present them to an application.  CBOR data is well-formed if it uses
+   the initial bytes, as well as the byte strings and/or data items that
+   are implied by their values, in the manner defined by CBOR, and no
+   extraneous data follows (Appendix C).
+
+   Even though CBOR attempts to minimize these cases, not all well-
+   formed CBOR data is valid: for example, the format excludes simple
+   values below 32 that are encoded with an extension byte.  Also,
+   specific tags may make semantic constraints that may be violated,
+   such as by including a tag in a bignum tag or by following a byte
+   string within a date tag.  Finally, the data may be invalid, such as
+   invalid UTF-8 strings or date strings that do not conform to
+   [RFC3339].  There is no requirement that generic encoders and
+   decoders make unnatural choices for their application interface to
+   enable the processing of invalid data.  Generic encoders and decoders
+   are expected to forward simple values and tags even if their specific
+   codepoints are not registered at the time the encoder/decoder is
+   written (Section 3.5).
+
+   Generic decoders provide ways to present well-formed CBOR values,
+   both valid and invalid, to an application.  The diagnostic notation
+   (Section 6) may be used to present well-formed CBOR values to humans.
+
+   Generic encoders provide an application interface that allows the
+   application to specify any well-formed value, including simple values
+   and tags unknown to the encoder.
+
+3.3.  Syntax Errors
+
+   A decoder encountering a CBOR data item that is not well-formed
+   generally can choose to completely fail the decoding (issue an error
+   and/or stop processing altogether), substitute the problematic data
+   and data items using a decoder-specific convention that clearly
+   indicates there has been a problem, or take some other action.
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 21]
+
+RFC 7049                          CBOR                      October 2013
+
+
+3.3.1.  Incomplete CBOR Data Items
+
+   The representation of a CBOR data item has a specific length,
+   determined by its initial bytes and by the structure of any data
+   items enclosed in the data items.  If less data is available, this
+   can be treated as a syntax error.  A decoder may also implement
+   incremental parsing, that is, decode the data item as far as it is
+   available and present the data found so far (such as in an event-
+   based interface), with the option of continuing the decoding once
+   further data is available.
+
+   Examples of incomplete data items include:
+
+   o  A decoder expects a certain number of array or map entries but
+      instead encounters the end of the data.
+
+   o  A decoder processes what it expects to be the last pair in a map
+      and comes to the end of the data.
+
+   o  A decoder has just seen a tag and then encounters the end of the
+      data.
+
+   o  A decoder has seen the beginning of an indefinite-length item but
+      encounters the end of the data before it sees the "break" stop
+      code.
+
+3.3.2.  Malformed Indefinite-Length Items
+
+   Examples of malformed indefinite-length data items include:
+
+   o  Within an indefinite-length byte string or text, a decoder finds
+      an item that is not of the appropriate major type before it finds
+      the "break" stop code.
+
+   o  Within an indefinite-length map, a decoder encounters the "break"
+      stop code immediately after reading a key (the value is missing).
+
+   Another error is finding a "break" stop code at a point in the data
+   where there is no immediately enclosing (unclosed) indefinite-length
+   item.
+
+
+
+
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 22]
+
+RFC 7049                          CBOR                      October 2013
+
+
+3.3.3.  Unknown Additional Information Values
+
+   At the time of writing, some additional information values are
+   unassigned and reserved for future versions of this document (see
+   Section 5.2).  Since the overall syntax for these additional
+   information values is not yet defined, a decoder that sees an
+   additional information value that it does not understand cannot
+   continue parsing.
+
+3.4.  Other Decoding Errors
+
+   A CBOR data item may be syntactically well-formed but present a
+   problem with interpreting the data encoded in it in the CBOR data
+   model.  Generally speaking, a decoder that finds a data item with
+   such a problem might issue a warning, might stop processing
+   altogether, might handle the error and make the problematic value
+   available to the application as such, or take some other type of
+   action.
+
+   Such problems might include:
+
+   Duplicate keys in a map:  Generic decoders (Section 3.2) make data
+      available to applications using the native CBOR data model.  That
+      data model includes maps (key-value mappings with unique keys),
+      not multimaps (key-value mappings where multiple entries can have
+      the same key).  Thus, a generic decoder that gets a CBOR map item
+      that has duplicate keys will decode to a map with only one
+      instance of that key, or it might stop processing altogether.  On
+      the other hand, a "streaming decoder" may not even be able to
+      notice (Section 3.7).
+
+   Inadmissible type on the value following a tag:  Tags (Section 2.4)
+      specify what type of data item is supposed to follow the tag; for
+      example, the tags for positive or negative bignums are supposed to
+      be put on byte strings.  A decoder that decodes the tagged data
+      item into a native representation (a native big integer in this
+      example) is expected to check the type of the data item being
+      tagged.  Even decoders that don't have such native representations
+      available in their environment may perform the check on those tags
+      known to them and react appropriately.
+
+   Invalid UTF-8 string:  A decoder might or might not want to verify
+      that the sequence of bytes in a UTF-8 string (major type 3) is
+      actually valid UTF-8 and react appropriately.
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 23]
+
+RFC 7049                          CBOR                      October 2013
+
+
+3.5.  Handling Unknown Simple Values and Tags
+
+   A decoder that comes across a simple value (Section 2.3) that it does
+   not recognize, such as a value that was added to the IANA registry
+   after the decoder was deployed or a value that the decoder chose not
+   to implement, might issue a warning, might stop processing
+   altogether, might handle the error by making the unknown value
+   available to the application as such (as is expected of generic
+   decoders), or take some other type of action.
+
+   A decoder that comes across a tag (Section 2.4) that it does not
+   recognize, such as a tag that was added to the IANA registry after
+   the decoder was deployed or a tag that the decoder chose not to
+   implement, might issue a warning, might stop processing altogether,
+   might handle the error and present the unknown tag value together
+   with the contained data item to the application (as is expected of
+   generic decoders), might ignore the tag and simply present the
+   contained data item only to the application, or take some other type
+   of action.
+
+3.6.  Numbers
+
+   For the purposes of this specification, all number representations
+   for the same numeric value are equivalent.  This means that an
+   encoder can encode a floating-point value of 0.0 as the integer 0.
+   It, however, also means that an application that expects to find
+   integer values only might find floating-point values if the encoder
+   decides these are desirable, such as when the floating-point value is
+   more compact than a 64-bit integer.
+
+   An application or protocol that uses CBOR might restrict the
+   representations of numbers.  For instance, a protocol that only deals
+   with integers might say that floating-point numbers may not be used
+   and that decoders of that protocol do not need to be able to handle
+   floating-point numbers.  Similarly, a protocol or application that
+   uses CBOR might say that decoders need to be able to handle either
+   type of number.
+
+   CBOR-based protocols should take into account that different language
+   environments pose different restrictions on the range and precision
+   of numbers that are representable.  For example, the JavaScript
+   number system treats all numbers as floating point, which may result
+   in silent loss of precision in decoding integers with more than 53
+   significant bits.  A protocol that uses numbers should define its
+   expectations on the handling of non-trivial numbers in decoders and
+   receiving applications.
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 24]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   A CBOR-based protocol that includes floating-point numbers can
+   restrict which of the three formats (half-precision, single-
+   precision, and double-precision) are to be supported.  For an
+   integer-only application, a protocol may want to completely exclude
+   the use of floating-point values.
+
+   A CBOR-based protocol designed for compactness may want to exclude
+   specific integer encodings that are longer than necessary for the
+   application, such as to save the need to implement 64-bit integers.
+   There is an expectation that encoders will use the most compact
+   integer representation that can represent a given value.  However, a
+   compact application should accept values that use a longer-than-
+   needed encoding (such as encoding "0" as 0b000_11101 followed by two
+   bytes of 0x00) as long as the application can decode an integer of
+   the given size.
+
+3.7.  Specifying Keys for Maps
+
+   The encoding and decoding applications need to agree on what types of
+   keys are going to be used in maps.  In applications that need to
+   interwork with JSON-based applications, keys probably should be
+   limited to UTF-8 strings only; otherwise, there has to be a specified
+   mapping from the other CBOR types to Unicode characters, and this
+   often leads to implementation errors.  In applications where keys are
+   numeric in nature and numeric ordering of keys is important to the
+   application, directly using the numbers for the keys is useful.
+
+   If multiple types of keys are to be used, consideration should be
+   given to how these types would be represented in the specific
+   programming environments that are to be used.  For example, in
+   JavaScript objects, a key of integer 1 cannot be distinguished from a
+   key of string "1".  This means that, if integer keys are used, the
+   simultaneous use of string keys that look like numbers needs to be
+   avoided.  Again, this leads to the conclusion that keys should be of
+   a single CBOR type.
+
+   Decoders that deliver data items nested within a CBOR data item
+   immediately on decoding them ("streaming decoders") often do not keep
+   the state that is necessary to ascertain uniqueness of a key in a
+   map.  Similarly, an encoder that can start encoding data items before
+   the enclosing data item is completely available ("streaming encoder")
+   may want to reduce its overhead significantly by relying on its data
+   source to maintain uniqueness.
+
+   A CBOR-based protocol should make an intentional decision about what
+   to do when a receiving application does see multiple identical keys
+   in a map.  The resulting rule in the protocol should respect the CBOR
+   data model: it cannot prescribe a specific handling of the entries
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 25]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   with the identical keys, except that it might have a rule that having
+   identical keys in a map indicates a malformed map and that the
+   decoder has to stop with an error.  Duplicate keys are also
+   prohibited by CBOR decoders that are using strict mode
+   (Section 3.10).
+
+   The CBOR data model for maps does not allow ascribing semantics to
+   the order of the key/value pairs in the map representation.
+   Thus, it would be a very bad practice to define a CBOR-based protocol
+   in such a way that changing the key/value pair order in a map would
+   change the semantics, apart from trivial aspects (cache usage, etc.).
+   (A CBOR-based protocol can prescribe a specific order of
+   serialization, such as for canonicalization.)
+
+   Applications for constrained devices that have maps with 24 or fewer
+   frequently used keys should consider using small integers (and those
+   with up to 48 frequently used keys should consider also using small
+   negative integers) because the keys can then be encoded in a single
+   byte.
+
+3.8.  Undefined Values
+
+   In some CBOR-based protocols, the simple value (Section 2.3) of
+   Undefined might be used by an encoder as a substitute for a data item
+   with an encoding problem, in order to allow the rest of the enclosing
+   data items to be encoded without harm.
+
+3.9.  Canonical CBOR
+
+   Some protocols may want encoders to only emit CBOR in a particular
+   canonical format; those protocols might also have the decoders check
+   that their input is canonical.  Those protocols are free to define
+   what they mean by a canonical format and what encoders and decoders
+   are expected to do.  This section lists some suggestions for such
+   protocols.
+
+   If a protocol considers "canonical" to mean that two encoder
+   implementations starting with the same input data will produce the
+   same CBOR output, the following four rules would suffice:
+
+   o  Integers must be as small as possible.
+
+      *  0 to 23 and -1 to -24 must be expressed in the same byte as the
+         major type;
+
+      *  24 to 255 and -25 to -256 must be expressed only with an
+         additional uint8_t;
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 26]
+
+RFC 7049                          CBOR                      October 2013
+
+
+      *  256 to 65535 and -257 to -65536 must be expressed only with an
+         additional uint16_t;
+
+      *  65536 to 4294967295 and -65537 to -4294967296 must be expressed
+         only with an additional uint32_t.
+
+   o  The expression of lengths in major types 2 through 5 must be as
+      short as possible.  The rules for these lengths follow the above
+      rule for integers.
+
+   o  The keys in every map must be sorted lowest value to highest.
+      Sorting is performed on the bytes of the representation of the key
+      data items without paying attention to the 3/5 bit splitting for
+      major types.  (Note that this rule allows maps that have keys of
+      different types, even though that is probably a bad practice that
+      could lead to errors in some canonicalization implementations.)
+      The sorting rules are:
+
+      *  If two keys have different lengths, the shorter one sorts
+         earlier;
+
+      *  If two keys have the same length, the one with the lower value
+         in (byte-wise) lexical order sorts earlier.
+
+   o  Indefinite-length items must be made into definite-length items.
+
+   If a protocol allows for IEEE floats, then additional
+   canonicalization rules might need to be added.  One example rule
+   might be to have all floats start as a 64-bit float, then do a test
+   conversion to a 32-bit float; if the result is the same numeric
+   value, use the shorter value and repeat the process with a test
+   conversion to a 16-bit float.  (This rule selects 16-bit float for
+   positive and negative Infinity as well.)  Also, there are many
+   representations for NaN.  If NaN is an allowed value, it must always
+   be represented as 0xf97e00.
+
+   CBOR tags present additional considerations for canonicalization.
+   The absence or presence of tags in a canonical format is determined
+   by the optionality of the tags in the protocol.  In a CBOR-based
+   protocol that allows optional tagging anywhere, the canonical format
+   must not allow them.  In a protocol that requires tags in certain
+   places, the tag needs to appear in the canonical format.  A CBOR-
+   based protocol that uses canonicalization might instead say that all
+   tags that appear in a message must be retained regardless of whether
+   they are optional.
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 27]
+
+RFC 7049                          CBOR                      October 2013
+
+
+3.10.  Strict Mode
+
+   Some areas of application of CBOR do not require canonicalization
+   (Section 3.9) but may require that different decoders reach the same
+   (semantically equivalent) results, even in the presence of
+   potentially malicious data.  This can be required if one application
+   (such as a firewall or other protecting entity) makes a decision
+   based on the data that another application, which independently
+   decodes the data, relies on.
+
+   Normally, it is the responsibility of the sender to avoid ambiguously
+   decodable data.  However, the sender might be an attacker specially
+   making up CBOR data such that it will be interpreted differently by
+   different decoders in an attempt to exploit that as a vulnerability.
+   Generic decoders used in applications where this might be a problem
+   need to support a strict mode in which it is also the responsibility
+   of the receiver to reject ambiguously decodable data.  It is expected
+   that firewalls and other security systems that decode CBOR will only
+   decode in strict mode.
+
+   A decoder in strict mode will reliably reject any data that could be
+   interpreted by other decoders in different ways.  It will reliably
+   reject data items with syntax errors (Section 3.3).  It will also
+   expend the effort to reliably detect other decoding errors
+   (Section 3.4).  In particular, a strict decoder needs to have an API
+   that reports an error (and does not return data) for a CBOR data item
+   that contains any of the following:
+
+   o  a map (major type 5) that has more than one entry with the same
+      key
+
+   o  a tag that is used on a data item of the incorrect type
+
+   o  a data item that is incorrectly formatted for the type given to
+      it, such as invalid UTF-8 or data that cannot be interpreted with
+      the specific tag that it has been tagged with
+
+   A decoder in strict mode can do one of two things when it encounters
+   a tag or simple value that it does not recognize:
+
+   o  It can report an error (and not return data).
+
+   o  It can emit the unknown item (type, value, and, for tags, the
+      decoded tagged data item) to the application calling the decoder
+      with an indication that the decoder did not recognize that tag or
+      simple value.
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 28]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   The latter approach, which is also appropriate for non-strict
+   decoders, supports forward compatibility with newly registered tags
+   and simple values without the requirement to update the encoder at
+   the same time as the calling application.  (For this, the API for the
+   decoder needs to have a way to mark unknown items so that the calling
+   application can handle them in a manner appropriate for the program.)
+
+   Since some of this processing may have an appreciable cost (in
+   particular with duplicate detection for maps), support of strict mode
+   is not a requirement placed on all CBOR decoders.
+
+   Some encoders will rely on their applications to provide input data
+   in such a way that unambiguously decodable CBOR results.  A generic
+   encoder also may want to provide a strict mode where it reliably
+   limits its output to unambiguously decodable CBOR, independent of
+   whether or not its application is providing API-conformant data.
+
+4.  Converting Data between CBOR and JSON
+
+   This section gives non-normative advice about converting between CBOR
+   and JSON.  Implementations of converters are free to use whichever
+   advice here they want.
+
+   It is worth noting that a JSON text is a sequence of characters, not
+   an encoded sequence of bytes, while a CBOR data item consists of
+   bytes, not characters.
+
+4.1.  Converting from CBOR to JSON
+
+   Most of the types in CBOR have direct analogs in JSON.  However, some
+   do not, and someone implementing a CBOR-to-JSON converter has to
+   consider what to do in those cases.  The following non-normative
+   advice deals with these by converting them to a single substitute
+   value, such as a JSON null.
+
+   o  An integer (major type 0 or 1) becomes a JSON number.
+
+   o  A byte string (major type 2) that is not embedded in a tag that
+      specifies a proposed encoding is encoded in base64url without
+      padding and becomes a JSON string.
+
+   o  A UTF-8 string (major type 3) becomes a JSON string.  Note that
+      JSON requires escaping certain characters (RFC 4627, Section 2.5):
+      quotation mark (U+0022), reverse solidus (U+005C), and the "C0
+      control characters" (U+0000 through U+001F).  All other characters
+      are copied unchanged into the JSON UTF-8 string.
+
+   o  An array (major type 4) becomes a JSON array.
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 29]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   o  A map (major type 5) becomes a JSON object.  This is possible
+      directly only if all keys are UTF-8 strings.  A converter might
+      also convert other keys into UTF-8 strings (such as by converting
+      integers into strings containing their decimal representation);
+      however, doing so introduces a danger of key collision.
+
+   o  False (major type 7, additional information 20) becomes a JSON
+      false.
+
+   o  True (major type 7, additional information 21) becomes a JSON
+      true.
+
+   o  Null (major type 7, additional information 22) becomes a JSON
+      null.
+
+   o  A floating-point value (major type 7, additional information 25
+      through 27) becomes a JSON number if it is finite (that is, it can
+      be represented in a JSON number); if the value is non-finite (NaN,
+      or positive or negative Infinity), it is represented by the
+      substitute value.
+
+   o  Any other simple value (major type 7, any additional information
+      value not yet discussed) is represented by the substitute value.
+
+   o  A bignum (major type 6, tag value 2 or 3) is represented by
+      encoding its byte string in base64url without padding and becomes
+      a JSON string.  For tag value 3 (negative bignum), a "~" (ASCII
+      tilde) is inserted before the base-encoded value.  (The conversion
+      to a binary blob instead of a number is to prevent a likely
+      numeric overflow for the JSON decoder.)
+
+   o  A byte string with an encoding hint (major type 6, tag value 21
+      through 23) is encoded as described and becomes a JSON string.
+
+   o  For all other tags (major type 6, any other tag value), the
+      embedded CBOR item is represented as a JSON value; the tag value
+      is ignored.
+
+   o  Indefinite-length items are made definite before conversion.
+
+4.2.  Converting from JSON to CBOR
+
+   All JSON values, once decoded, directly map into one or more CBOR
+   values.  As with any kind of CBOR generation, decisions have to be
+   made with respect to number representation.  In a suggested
+   conversion:
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 30]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   o  JSON numbers without fractional parts (integer numbers) are
+      represented as integers (major types 0 and 1, possibly major type
+      6 tag value 2 and 3), choosing the shortest form; integers longer
+      than an implementation-defined threshold (which is usually either
+      32 or 64 bits) may instead be represented as floating-point
+      values.  (If the JSON was generated from a JavaScript
+      implementation, its precision is already limited to 53 bits
+      maximum.)
+
+   o  Numbers with fractional parts are represented as floating-point
+      values.  Preferably, the shortest exact floating-point
+      representation is used; for instance, 1.5 is represented in a
+      16-bit floating-point value (not all implementations will be
+      capable of efficiently finding the minimum form, though).  There
+      may be an implementation-defined limit to the precision that will
+      affect the precision of the represented values.  Decimal
+      representation should only be used if that is specified in a
+      protocol.
+
+   CBOR has been designed to generally provide a more compact encoding
+   than JSON.  One implementation strategy that might come to mind is to
+   perform a JSON-to-CBOR encoding in place in a single buffer.  This
+   strategy would need to carefully consider a number of pathological
+   cases, such as that some strings represented with no or very few
+   escapes and longer (or much longer) than 255 bytes may expand when
+   encoded as UTF-8 strings in CBOR.  Similarly, a few of the binary
+   floating-point representations might cause expansion from some short
+   decimal representations (1.1, 1e9) in JSON.  This may be hard to get
+   right, and any ensuing vulnerabilities may be exploited by an
+   attacker.
+
+5.  Future Evolution of CBOR
+
+   Successful protocols evolve over time.  New ideas appear,
+   implementation platforms improve, related protocols are developed and
+   evolve, and new requirements from applications and protocols are
+   added.  Facilitating protocol evolution is therefore an important
+   design consideration for any protocol development.
+
+   For protocols that will use CBOR, CBOR provides some useful
+   mechanisms to facilitate their evolution.  Best practices for this
+   are well known, particularly from JSON format development of JSON-
+   based protocols.  Therefore, such best practices are outside the
+   scope of this specification.
+
+   However, facilitating the evolution of CBOR itself is very well
+   within its scope.  CBOR is designed to both provide a stable basis
+   for development of CBOR-based protocols and to be able to evolve.
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 31]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   Since a successful protocol may live for decades, CBOR needs to be
+   designed for decades of use and evolution.  This section provides
+   some guidance for the evolution of CBOR.  It is necessarily more
+   subjective than other parts of this document.  It is also necessarily
+   incomplete, lest it turn into a textbook on protocol development.
+
+5.1.  Extension Points
+
+   In a protocol design, opportunities for evolution are often included
+   in the form of extension points.  For example, there may be a
+   codepoint space that is not fully allocated from the outset, and the
+   protocol is designed to tolerate and embrace implementations that
+   start using more codepoints than initially allocated.
+
+   Sizing the codepoint space may be difficult because the range
+   required may be hard to predict.  An attempt should be made to make
+   the codepoint space large enough so that it can slowly be filled over
+   the intended lifetime of the protocol.
+
+   CBOR has three major extension points:
+
+   o  the "simple" space (values in major type 7).  Of the 24 efficient
+      (and 224 slightly less efficient) values, only a small number have
+      been allocated.  Implementations receiving an unknown simple data
+      item may be able to process it as such, given that the structure
+      of the value is indeed simple.  The IANA registry in Section 7.1
+      is the appropriate way to address the extensibility of this
+      codepoint space.
+
+   o  the "tag" space (values in major type 6).  Again, only a small
+      part of the codepoint space has been allocated, and the space is
+      abundant (although the early numbers are more efficient than the
+      later ones).  Implementations receiving an unknown tag can choose
+      to simply ignore it or to process it as an unknown tag wrapping
+      the following data item.  The IANA registry in Section 7.2 is the
+      appropriate way to address the extensibility of this codepoint
+      space.
+
+   o  the "additional information" space.  An implementation receiving
+      an unknown additional information value has no way to continue
+      parsing, so allocating codepoints to this space is a major step.
+      There are also very few codepoints left.
+
+
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 32]
+
+RFC 7049                          CBOR                      October 2013
+
+
+5.2.  Curating the Additional Information Space
+
+   The human mind is sometimes drawn to filling in little perceived gaps
+   to make something neat.  We expect the remaining gaps in the
+   codepoint space for the additional information values to be an
+   attractor for new ideas, just because they are there.
+
+   The present specification does not manage the additional information
+   codepoint space by an IANA registry.  Instead, allocations out of
+   this space can only be done by updating this specification.
+
+   For an additional information value of n >= 24, the size of the
+   additional data typically is 2**(n-24) bytes.  Therefore, additional
+   information values 28 and 29 should be viewed as candidates for
+   128-bit and 256-bit quantities, in case a need arises to add them to
+   the protocol.  Additional information value 30 is then the only
+   additional information value available for general allocation, and
+   there should be a very good reason for allocating it before assigning
+   it through an update of this protocol.
+
+6.  Diagnostic Notation
+
+   CBOR is a binary interchange format.  To facilitate documentation and
+   debugging, and in particular to facilitate communication between
+   entities cooperating in debugging, this section defines a simple
+   human-readable diagnostic notation.  All actual interchange always
+   happens in the binary format.
+
+   Note that this truly is a diagnostic format; it is not meant to be
+   parsed.  Therefore, no formal definition (as in ABNF) is given in
+   this document.  (Implementers looking for a text-based format for
+   representing CBOR data items in configuration files may also want to
+   consider YAML [YAML].)
+
+   The diagnostic notation is loosely based on JSON as it is defined in
+   RFC 4627, extending it where needed.
+
+   The notation borrows the JSON syntax for numbers (integer and
+   floating point), True (>true<), False (>false<), Null (>null<), UTF-8
+   strings, arrays, and maps (maps are called objects in JSON; the
+   diagnostic notation extends JSON here by allowing any data item in
+   the key position).  Undefined is written >undefined< as in
+   JavaScript.  The non-finite floating-point numbers Infinity,
+   -Infinity, and NaN are written exactly as in this sentence (this is
+   also a way they can be written in JavaScript, although JSON does not
+   allow them).  A tagged item is written as an integer number for the
+   tag followed by the item in parentheses; for instance, an RFC 3339
+   (ISO 8601) date could be notated as:
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 33]
+
+RFC 7049                          CBOR                      October 2013
+
+
+      0("2013-03-21T20:04:00Z")
+
+   or the equivalent relative time as
+
+      1(1363896240)
+
+   Byte strings are notated in one of the base encodings, without
+   padding, enclosed in single quotes, prefixed by >h< for base16, >b32<
+   for base32, >h32< for base32hex, >b64< for base64 or base64url (the
+   actual encodings do not overlap, so the string remains unambiguous).
+   For example, the byte string 0x12345678 could be written h'12345678',
+   b32'CI2FM6A', or b64'EjRWeA'.
+
+   Unassigned simple values are given as "simple()" with the appropriate
+   integer in the parentheses.  For example, "simple(42)" indicates
+   major type 7, value 42.
+
+6.1.  Encoding Indicators
+
+   Sometimes it is useful to indicate in the diagnostic notation which
+   of several alternative representations were actually used; for
+   example, a data item written >1.5< by a diagnostic decoder might have
+   been encoded as a half-, single-, or double-precision float.
+
+   The convention for encoding indicators is that anything starting with
+   an underscore and all following characters that are alphanumeric or
+   underscore, is an encoding indicator, and can be ignored by anyone
+   not interested in this information.  Encoding indicators are always
+   optional.
+
+   A single underscore can be written after the opening brace of a map
+   or the opening bracket of an array to indicate that the data item was
+   represented in indefinite-length format.  For example, [_ 1, 2]
+   contains an indicator that an indefinite-length representation was
+   used to represent the data item [1, 2].
+
+   An underscore followed by a decimal digit n indicates that the
+   preceding item (or, for arrays and maps, the item starting with the
+   preceding bracket or brace) was encoded with an additional
+   information value of 24+n.  For example, 1.5_1 is a half-precision
+   floating-point number, while 1.5_3 is encoded as double precision.
+   This encoding indicator is not shown in Appendix A.  (Note that the
+   encoding indicator "_" is thus an abbreviation of the full form "_7",
+   which is not used.)
+
+   As a special case, byte and text strings of indefinite length can be
+   notated in the form (_ h'0123', h'4567') and (_ "foo", "bar").
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 34]
+
+RFC 7049                          CBOR                      October 2013
+
+
+7.  IANA Considerations
+
+   IANA has created two registries for new CBOR values.  The registries
+   are separate, that is, not under an umbrella registry, and follow the
+   rules in [RFC5226].  IANA has also assigned a new MIME media type and
+   an associated Constrained Application Protocol (CoAP) Content-Format
+   entry.
+
+7.1.  Simple Values Registry
+
+   IANA has created the "Concise Binary Object Representation (CBOR)
+   Simple Values" registry.  The initial values are shown in Table 2.
+
+   New entries in the range 0 to 19 are assigned by Standards Action.
+   It is suggested that these Standards Actions allocate values starting
+   with the number 16 in order to reserve the lower numbers for
+   contiguous blocks (if any).
+
+   New entries in the range 32 to 255 are assigned by Specification
+   Required.
+
+7.2.  Tags Registry
+
+   IANA has created the "Concise Binary Object Representation (CBOR)
+   Tags" registry.  The initial values are shown in Table 3.
+
+   New entries in the range 0 to 23 are assigned by Standards Action.
+   New entries in the range 24 to 255 are assigned by Specification
+   Required.  New entries in the range 256 to 18446744073709551615 are
+   assigned by First Come First Served.  The template for registration
+   requests is:
+
+   o  Data item
+
+   o  Semantics (short form)
+
+   In addition, First Come First Served requests should include:
+
+   o  Point of contact
+
+   o  Description of semantics (URL)
+      This description is optional; the URL can point to something like
+      an Internet-Draft or a web page.
+
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 35]
+
+RFC 7049                          CBOR                      October 2013
+
+
+7.3.  Media Type ("MIME Type")
+
+   The Internet media type [RFC6838] for CBOR data is application/cbor.
+
+   Type name: application
+
+   Subtype name: cbor
+
+   Required parameters: n/a
+
+   Optional parameters: n/a
+
+   Encoding considerations:  binary
+
+   Security considerations:  See Section 8 of this document
+
+   Interoperability considerations: n/a
+
+   Published specification: This document
+
+   Applications that use this media type:  None yet, but it is expected
+      that this format will be deployed in protocols and applications.
+
+   Additional information:
+      Magic number(s): n/a
+      File extension(s): .cbor
+      Macintosh file type code(s): n/a
+
+   Person & email address to contact for further information:
+      Carsten Bormann
+      cabo@tzi.org
+
+   Intended usage: COMMON
+
+   Restrictions on usage: none
+
+   Author:
+      Carsten Bormann <cabo@tzi.org>
+
+   Change controller:
+      The IESG <iesg@ietf.org>
+
+
+
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 36]
+
+RFC 7049                          CBOR                      October 2013
+
+
+7.4.  CoAP Content-Format
+
+   Media Type: application/cbor
+
+   Encoding: -
+
+   Id: 60
+
+   Reference: [RFC7049]
+
+7.5.  The +cbor Structured Syntax Suffix Registration
+
+   Name: Concise Binary Object Representation (CBOR)
+
+   +suffix: +cbor
+
+   References: [RFC7049]
+
+   Encoding Considerations: CBOR is a binary format.
+
+   Interoperability Considerations: n/a
+
+   Fragment Identifier Considerations:
+      The syntax and semantics of fragment identifiers specified for
+      +cbor SHOULD be as specified for "application/cbor".  (At
+      publication of this document, there is no fragment identification
+      syntax defined for "application/cbor".)
+
+      The syntax and semantics for fragment identifiers for a specific
+      "xxx/yyy+cbor" SHOULD be processed as follows:
+
+      For cases defined in +cbor, where the fragment identifier resolves
+      per the +cbor rules, then process as specified in +cbor.
+
+      For cases defined in +cbor, where the fragment identifier does not
+      resolve per the +cbor rules, then process as specified in
+      "xxx/yyy+cbor".
+
+      For cases not defined in +cbor, then process as specified in
+      "xxx/yyy+cbor".
+
+   Security Considerations:  See Section 8 of this document
+
+   Contact:
+      Apps Area Working Group (apps-discuss@ietf.org)
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 37]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   Author/Change Controller:
+      The Apps Area Working Group.
+      The IESG has change control over this registration.
+
+8.  Security Considerations
+
+   A network-facing application can exhibit vulnerabilities in its
+   processing logic for incoming data.  Complex parsers are well known
+   as a likely source of such vulnerabilities, such as the ability to
+   remotely crash a node, or even remotely execute arbitrary code on it.
+   CBOR attempts to narrow the opportunities for introducing such
+   vulnerabilities by reducing parser complexity, by giving the entire
+   range of encodable values a meaning where possible.
+
+   Resource exhaustion attacks might attempt to lure a decoder into
+   allocating very big data items (strings, arrays, maps) or exhaust the
+   stack depth by setting up deeply nested items.  Decoders need to have
+   appropriate resource management to mitigate these attacks.  (Items
+   for which very large sizes are given can also attempt to exploit
+   integer overflow vulnerabilities.)
+
+   Applications where a CBOR data item is examined by a gatekeeper
+   function and later used by a different application may exhibit
+   vulnerabilities when multiple interpretations of the data item are
+   possible.  For example, an attacker could make use of duplicate keys
+   in maps and precision issues in numbers to make the gatekeeper base
+   its decisions on a different interpretation than the one that will be
+   used by the second application.  Protocols that are used in a
+   security context should be defined in such a way that these multiple
+   interpretations are reliably reduced to a single one.  To facilitate
+   this, encoder and decoder implementations used in such contexts
+   should provide at least one strict mode of operation (Section 3.10).
+
+9.  Acknowledgements
+
+   CBOR was inspired by MessagePack.  MessagePack was developed and
+   promoted by Sadayuki Furuhashi ("frsyuki").  This reference to
+   MessagePack is solely for attribution; CBOR is not intended as a
+   version of or replacement for MessagePack, as it has different design
+   goals and requirements.
+
+   The need for functionality beyond the original MessagePack
+   Specification became obvious to many people at about the same time
+   around the year 2012.  BinaryPack is a minor derivation of
+   MessagePack that was developed by Eric Zhang for the binaryjs
+   project.  A similar, but different, extension was made by Tim Caswell
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 38]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   for his msgpack-js and msgpack-js-browser projects.  Many people have
+   contributed to the recent discussion about extending MessagePack to
+   separate text string representation from byte string representation.
+
+   The encoding of the additional information in CBOR was inspired by
+   the encoding of length information designed by Klaus Hartke for CoAP.
+
+   This document also incorporates suggestions made by many people,
+   notably Dan Frost, James Manger, Joe Hildebrand, Keith Moore, Matthew
+   Lepinski, Nico Williams, Phillip Hallam-Baker, Ray Polk, Tim Bray,
+   Tony Finch, Tony Hansen, and Yaron Sheffer.
+
+10.  References
+
+10.1.  Normative References
+
+   [ECMA262]  European Computer Manufacturers Association, "ECMAScript
+              Language Specification 5.1 Edition", ECMA Standard
+              ECMA-262, June 2011, <http://www.ecma-international.org/
+              publications/files/ecma-st/ECMA-262.pdf>.
+
+   [RFC2045]  Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+              Extensions (MIME) Part One: Format of Internet Message
+              Bodies", RFC 2045, November 1996.
+
+   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
+              Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+   [RFC3339]  Klyne, G., Ed. and C. Newman, "Date and Time on the
+              Internet: Timestamps", RFC 3339, July 2002.
+
+   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
+              10646", STD 63, RFC 3629, November 2003.
+
+   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
+              Resource Identifier (URI): Generic Syntax", STD 66, RFC
+              3986, January 2005.
+
+   [RFC4287]  Nottingham, M., Ed. and R. Sayre, Ed., "The Atom
+              Syndication Format", RFC 4287, December 2005.
+
+   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
+              Encodings", RFC 4648, October 2006.
+
+   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
+              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
+              May 2008.
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 39]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   [TIME_T]   The Open Group Base Specifications, "Vol. 1: Base
+              Definitions, Issue 7", Section 4.15 'Seconds Since the
+              Epoch', IEEE Std 1003.1, 2013 Edition, 2013,
+              <http://pubs.opengroup.org/onlinepubs/9699919799/
+              basedefs/V1_chap04.html#tag_04_15>.
+
+10.2.  Informative References
+
+   [ASN.1]    International Telecommunication Union, "Information
+              Technology -- ASN.1 encoding rules: Specification of Basic
+              Encoding Rules (BER), Canonical Encoding Rules (CER) and
+              Distinguished Encoding Rules (DER)", ITU-T Recommendation
+              X.690, 1994.
+
+   [BSON]     Various, "BSON - Binary JSON", 2013,
+              <http://bsonspec.org/>.
+
+   [CNN-TERMS]
+              Bormann, C., Ersue, M., and A. Keranen, "Terminology for
+              Constrained Node Networks", Work in Progress, July 2013.
+
+   [MessagePack]
+              Furuhashi, S., "MessagePack", 2013, <http://msgpack.org/>.
+
+   [RFC0713]  Haverty, J., "MSDTP-Message Services Data Transmission
+              Protocol", RFC 713, April 1976.
+
+   [RFC4627]  Crockford, D., "The application/json Media Type for
+              JavaScript Object Notation (JSON)", RFC 4627, July 2006.
+
+   [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
+              Specifications and Registration Procedures", BCP 13, RFC
+              6838, January 2013.
+
+   [UBJSON]   The Buzz Media, "Universal Binary JSON Specification",
+              2013, <http://ubjson.org/>.
+
+   [YAML]     Ben-Kiki, O., Evans, C., and I. Net, "YAML Ain't Markup
+              Language (YAML[TM]) Version 1.2", 3rd Edition, October
+              2009, <http://www.yaml.org/spec/1.2/spec.html>.
+
+
+
+
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 40]
+
+RFC 7049                          CBOR                      October 2013
+
+
+Appendix A.  Examples
+
+   The following table provides some CBOR-encoded values in hexadecimal
+   (right column), together with diagnostic notation for these values
+   (left column).  Note that the string "\u00fc" is one form of
+   diagnostic notation for a UTF-8 string containing the single Unicode
+   character U+00FC, LATIN SMALL LETTER U WITH DIAERESIS (u umlaut).
+   Similarly, "\u6c34" is a UTF-8 string in diagnostic notation with a
+   single character U+6C34 (CJK UNIFIED IDEOGRAPH-6C34, often
+   representing "water"), and "\ud800\udd51" is a UTF-8 string in
+   diagnostic notation with a single character U+10151 (GREEK ACROPHONIC
+   ATTIC FIFTY STATERS).  (Note that all these single-character strings
+   could also be represented in native UTF-8 in diagnostic notation,
+   just not in an ASCII-only specification like the present one.)  In
+   the diagnostic notation provided for bignums, their intended numeric
+   value is shown as a decimal number (such as 18446744073709551616)
+   instead of showing a tagged byte string (such as
+   2(h'010000000000000000')).
+
+   +------------------------------+------------------------------------+
+   | Diagnostic                   | Encoded                            |
+   +------------------------------+------------------------------------+
+   | 0                            | 0x00                               |
+   |                              |                                    |
+   | 1                            | 0x01                               |
+   |                              |                                    |
+   | 10                           | 0x0a                               |
+   |                              |                                    |
+   | 23                           | 0x17                               |
+   |                              |                                    |
+   | 24                           | 0x1818                             |
+   |                              |                                    |
+   | 25                           | 0x1819                             |
+   |                              |                                    |
+   | 100                          | 0x1864                             |
+   |                              |                                    |
+   | 1000                         | 0x1903e8                           |
+   |                              |                                    |
+   | 1000000                      | 0x1a000f4240                       |
+   |                              |                                    |
+   | 1000000000000                | 0x1b000000e8d4a51000               |
+   |                              |                                    |
+   | 18446744073709551615         | 0x1bffffffffffffffff               |
+   |                              |                                    |
+   | 18446744073709551616         | 0xc249010000000000000000           |
+   |                              |                                    |
+   | -18446744073709551616        | 0x3bffffffffffffffff               |
+   |                              |                                    |
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 41]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   | -18446744073709551617        | 0xc349010000000000000000           |
+   |                              |                                    |
+   | -1                           | 0x20                               |
+   |                              |                                    |
+   | -10                          | 0x29                               |
+   |                              |                                    |
+   | -100                         | 0x3863                             |
+   |                              |                                    |
+   | -1000                        | 0x3903e7                           |
+   |                              |                                    |
+   | 0.0                          | 0xf90000                           |
+   |                              |                                    |
+   | -0.0                         | 0xf98000                           |
+   |                              |                                    |
+   | 1.0                          | 0xf93c00                           |
+   |                              |                                    |
+   | 1.1                          | 0xfb3ff199999999999a               |
+   |                              |                                    |
+   | 1.5                          | 0xf93e00                           |
+   |                              |                                    |
+   | 65504.0                      | 0xf97bff                           |
+   |                              |                                    |
+   | 100000.0                     | 0xfa47c35000                       |
+   |                              |                                    |
+   | 3.4028234663852886e+38       | 0xfa7f7fffff                       |
+   |                              |                                    |
+   | 1.0e+300                     | 0xfb7e37e43c8800759c               |
+   |                              |                                    |
+   | 5.960464477539063e-8         | 0xf90001                           |
+   |                              |                                    |
+   | 0.00006103515625             | 0xf90400                           |
+   |                              |                                    |
+   | -4.0                         | 0xf9c400                           |
+   |                              |                                    |
+   | -4.1                         | 0xfbc010666666666666               |
+   |                              |                                    |
+   | Infinity                     | 0xf97c00                           |
+   |                              |                                    |
+   | NaN                          | 0xf97e00                           |
+   |                              |                                    |
+   | -Infinity                    | 0xf9fc00                           |
+   |                              |                                    |
+   | Infinity                     | 0xfa7f800000                       |
+   |                              |                                    |
+   | NaN                          | 0xfa7fc00000                       |
+   |                              |                                    |
+   | -Infinity                    | 0xfaff800000                       |
+   |                              |                                    |
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 42]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   | Infinity                     | 0xfb7ff0000000000000               |
+   |                              |                                    |
+   | NaN                          | 0xfb7ff8000000000000               |
+   |                              |                                    |
+   | -Infinity                    | 0xfbfff0000000000000               |
+   |                              |                                    |
+   | false                        | 0xf4                               |
+   |                              |                                    |
+   | true                         | 0xf5                               |
+   |                              |                                    |
+   | null                         | 0xf6                               |
+   |                              |                                    |
+   | undefined                    | 0xf7                               |
+   |                              |                                    |
+   | simple(16)                   | 0xf0                               |
+   |                              |                                    |
+   | simple(24)                   | 0xf818                             |
+   |                              |                                    |
+   | simple(255)                  | 0xf8ff                             |
+   |                              |                                    |
+   | 0("2013-03-21T20:04:00Z")    | 0xc074323031332d30332d32315432303a |
+   |                              | 30343a30305a                       |
+   |                              |                                    |
+   | 1(1363896240)                | 0xc11a514b67b0                     |
+   |                              |                                    |
+   | 1(1363896240.5)              | 0xc1fb41d452d9ec200000             |
+   |                              |                                    |
+   | 23(h'01020304')              | 0xd74401020304                     |
+   |                              |                                    |
+   | 24(h'6449455446')            | 0xd818456449455446                 |
+   |                              |                                    |
+   | 32("http://www.example.com") | 0xd82076687474703a2f2f7777772e6578 |
+   |                              | 616d706c652e636f6d                 |
+   |                              |                                    |
+   | h''                          | 0x40                               |
+   |                              |                                    |
+   | h'01020304'                  | 0x4401020304                       |
+   |                              |                                    |
+   | ""                           | 0x60                               |
+   |                              |                                    |
+   | "a"                          | 0x6161                             |
+   |                              |                                    |
+   | "IETF"                       | 0x6449455446                       |
+   |                              |                                    |
+   | "\"\\"                       | 0x62225c                           |
+   |                              |                                    |
+   | "\u00fc"                     | 0x62c3bc                           |
+   |                              |                                    |
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 43]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   | "\u6c34"                     | 0x63e6b0b4                         |
+   |                              |                                    |
+   | "\ud800\udd51"               | 0x64f0908591                       |
+   |                              |                                    |
+   | []                           | 0x80                               |
+   |                              |                                    |
+   | [1, 2, 3]                    | 0x83010203                         |
+   |                              |                                    |
+   | [1, [2, 3], [4, 5]]          | 0x8301820203820405                 |
+   |                              |                                    |
+   | [1, 2, 3, 4, 5, 6, 7, 8, 9,  | 0x98190102030405060708090a0b0c0d0e |
+   | 10, 11, 12, 13, 14, 15, 16,  | 0f101112131415161718181819         |
+   | 17, 18, 19, 20, 21, 22, 23,  |                                    |
+   | 24, 25]                      |                                    |
+   |                              |                                    |
+   | {}                           | 0xa0                               |
+   |                              |                                    |
+   | {1: 2, 3: 4}                 | 0xa201020304                       |
+   |                              |                                    |
+   | {"a": 1, "b": [2, 3]}        | 0xa26161016162820203               |
+   |                              |                                    |
+   | ["a", {"b": "c"}]            | 0x826161a161626163                 |
+   |                              |                                    |
+   | {"a": "A", "b": "B", "c":    | 0xa5616161416162614261636143616461 |
+   | "C", "d": "D", "e": "E"}     | 4461656145                         |
+   |                              |                                    |
+   | (_ h'0102', h'030405')       | 0x5f42010243030405ff               |
+   |                              |                                    |
+   | (_ "strea", "ming")          | 0x7f657374726561646d696e67ff       |
+   |                              |                                    |
+   | [_ ]                         | 0x9fff                             |
+   |                              |                                    |
+   | [_ 1, [2, 3], [_ 4, 5]]      | 0x9f018202039f0405ffff             |
+   |                              |                                    |
+   | [_ 1, [2, 3], [4, 5]]        | 0x9f01820203820405ff               |
+   |                              |                                    |
+   | [1, [2, 3], [_ 4, 5]]        | 0x83018202039f0405ff               |
+   |                              |                                    |
+   | [1, [_ 2, 3], [4, 5]]        | 0x83019f0203ff820405               |
+   |                              |                                    |
+   | [_ 1, 2, 3, 4, 5, 6, 7, 8,   | 0x9f0102030405060708090a0b0c0d0e0f |
+   | 9, 10, 11, 12, 13, 14, 15,   | 101112131415161718181819ff         |
+   | 16, 17, 18, 19, 20, 21, 22,  |                                    |
+   | 23, 24, 25]                  |                                    |
+   |                              |                                    |
+   | {_ "a": 1, "b": [_ 2, 3]}    | 0xbf61610161629f0203ffff           |
+   |                              |                                    |
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 44]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   | ["a", {_ "b": "c"}]          | 0x826161bf61626163ff               |
+   |                              |                                    |
+   | {_ "Fun": true, "Amt": -2}   | 0xbf6346756ef563416d7421ff         |
+   +------------------------------+------------------------------------+
+
+               Table 4: Examples of Encoded CBOR Data Items
+
+Appendix B.  Jump Table
+
+   For brevity, this jump table does not show initial bytes that are
+   reserved for future extension.  It also only shows a selection of the
+   initial bytes that can be used for optional features.  (All unsigned
+   integers are in network byte order.)
+
+   +-----------------+-------------------------------------------------+
+   | Byte            | Structure/Semantics                             |
+   +-----------------+-------------------------------------------------+
+   | 0x00..0x17      | Integer 0x00..0x17 (0..23)                      |
+   |                 |                                                 |
+   | 0x18            | Unsigned integer (one-byte uint8_t follows)     |
+   |                 |                                                 |
+   | 0x19            | Unsigned integer (two-byte uint16_t follows)    |
+   |                 |                                                 |
+   | 0x1a            | Unsigned integer (four-byte uint32_t follows)   |
+   |                 |                                                 |
+   | 0x1b            | Unsigned integer (eight-byte uint64_t follows)  |
+   |                 |                                                 |
+   | 0x20..0x37      | Negative integer -1-0x00..-1-0x17 (-1..-24)     |
+   |                 |                                                 |
+   | 0x38            | Negative integer -1-n (one-byte uint8_t for n   |
+   |                 | follows)                                        |
+   |                 |                                                 |
+   | 0x39            | Negative integer -1-n (two-byte uint16_t for n  |
+   |                 | follows)                                        |
+   |                 |                                                 |
+   | 0x3a            | Negative integer -1-n (four-byte uint32_t for n |
+   |                 | follows)                                        |
+   |                 |                                                 |
+   | 0x3b            | Negative integer -1-n (eight-byte uint64_t for  |
+   |                 | n follows)                                      |
+   |                 |                                                 |
+   | 0x40..0x57      | byte string (0x00..0x17 bytes follow)           |
+   |                 |                                                 |
+   | 0x58            | byte string (one-byte uint8_t for n, and then n |
+   |                 | bytes follow)                                   |
+   |                 |                                                 |
+   | 0x59            | byte string (two-byte uint16_t for n, and then  |
+   |                 | n bytes follow)                                 |
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 45]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   |                 |                                                 |
+   | 0x5a            | byte string (four-byte uint32_t for n, and then |
+   |                 | n bytes follow)                                 |
+   |                 |                                                 |
+   | 0x5b            | byte string (eight-byte uint64_t for n, and     |
+   |                 | then n bytes follow)                            |
+   |                 |                                                 |
+   | 0x5f            | byte string, byte strings follow, terminated by |
+   |                 | "break"                                         |
+   |                 |                                                 |
+   | 0x60..0x77      | UTF-8 string (0x00..0x17 bytes follow)          |
+   |                 |                                                 |
+   | 0x78            | UTF-8 string (one-byte uint8_t for n, and then  |
+   |                 | n bytes follow)                                 |
+   |                 |                                                 |
+   | 0x79            | UTF-8 string (two-byte uint16_t for n, and then |
+   |                 | n bytes follow)                                 |
+   |                 |                                                 |
+   | 0x7a            | UTF-8 string (four-byte uint32_t for n, and     |
+   |                 | then n bytes follow)                            |
+   |                 |                                                 |
+   | 0x7b            | UTF-8 string (eight-byte uint64_t for n, and    |
+   |                 | then n bytes follow)                            |
+   |                 |                                                 |
+   | 0x7f            | UTF-8 string, UTF-8 strings follow, terminated  |
+   |                 | by "break"                                      |
+   |                 |                                                 |
+   | 0x80..0x97      | array (0x00..0x17 data items follow)            |
+   |                 |                                                 |
+   | 0x98            | array (one-byte uint8_t for n, and then n data  |
+   |                 | items follow)                                   |
+   |                 |                                                 |
+   | 0x99            | array (two-byte uint16_t for n, and then n data |
+   |                 | items follow)                                   |
+   |                 |                                                 |
+   | 0x9a            | array (four-byte uint32_t for n, and then n     |
+   |                 | data items follow)                              |
+   |                 |                                                 |
+   | 0x9b            | array (eight-byte uint64_t for n, and then n    |
+   |                 | data items follow)                              |
+   |                 |                                                 |
+   | 0x9f            | array, data items follow, terminated by "break" |
+   |                 |                                                 |
+   | 0xa0..0xb7      | map (0x00..0x17 pairs of data items follow)     |
+   |                 |                                                 |
+   | 0xb8            | map (one-byte uint8_t for n, and then n pairs   |
+   |                 | of data items follow)                           |
+   |                 |                                                 |
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 46]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   | 0xb9            | map (two-byte uint16_t for n, and then n pairs  |
+   |                 | of data items follow)                           |
+   |                 |                                                 |
+   | 0xba            | map (four-byte uint32_t for n, and then n pairs |
+   |                 | of data items follow)                           |
+   |                 |                                                 |
+   | 0xbb            | map (eight-byte uint64_t for n, and then n      |
+   |                 | pairs of data items follow)                     |
+   |                 |                                                 |
+   | 0xbf            | map, pairs of data items follow, terminated by  |
+   |                 | "break"                                         |
+   |                 |                                                 |
+   | 0xc0            | Text-based date/time (data item follows; see    |
+   |                 | Section 2.4.1)                                  |
+   |                 |                                                 |
+   | 0xc1            | Epoch-based date/time (data item follows; see   |
+   |                 | Section 2.4.1)                                  |
+   |                 |                                                 |
+   | 0xc2            | Positive bignum (data item "byte string"        |
+   |                 | follows)                                        |
+   |                 |                                                 |
+   | 0xc3            | Negative bignum (data item "byte string"        |
+   |                 | follows)                                        |
+   |                 |                                                 |
+   | 0xc4            | Decimal Fraction (data item "array" follows;    |
+   |                 | see Section 2.4.3)                              |
+   |                 |                                                 |
+   | 0xc5            | Bigfloat (data item "array" follows; see        |
+   |                 | Section 2.4.3)                                  |
+   |                 |                                                 |
+   | 0xc6..0xd4      | (tagged item)                                   |
+   |                 |                                                 |
+   | 0xd5..0xd7      | Expected Conversion (data item follows; see     |
+   |                 | Section 2.4.4.2)                                |
+   |                 |                                                 |
+   | 0xd8..0xdb      | (more tagged items, 1/2/4/8 bytes and then a    |
+   |                 | data item follow)                               |
+   |                 |                                                 |
+   | 0xe0..0xf3      | (simple value)                                  |
+   |                 |                                                 |
+   | 0xf4            | False                                           |
+   |                 |                                                 |
+   | 0xf5            | True                                            |
+   |                 |                                                 |
+   | 0xf6            | Null                                            |
+   |                 |                                                 |
+   | 0xf7            | Undefined                                       |
+   |                 |                                                 |
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 47]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   | 0xf8            | (simple value, one byte follows)                |
+   |                 |                                                 |
+   | 0xf9            | Half-Precision Float (two-byte IEEE 754)        |
+   |                 |                                                 |
+   | 0xfa            | Single-Precision Float (four-byte IEEE 754)     |
+   |                 |                                                 |
+   | 0xfb            | Double-Precision Float (eight-byte IEEE 754)    |
+   |                 |                                                 |
+   | 0xff            | "break" stop code                               |
+   +-----------------+-------------------------------------------------+
+
+                   Table 5: Jump Table for Initial Byte
+
+Appendix C.  Pseudocode
+
+   The well-formedness of a CBOR item can be checked by the pseudocode
+   in Figure 1.  The data is well-formed if and only if:
+
+   o  the pseudocode does not "fail";
+
+   o  after execution of the pseudocode, no bytes are left in the input
+      (except in streaming applications)
+
+   The pseudocode has the following prerequisites:
+
+   o  take(n) reads n bytes from the input data and returns them as a
+      byte string.  If n bytes are no longer available, take(n) fails.
+
+   o  uint() converts a byte string into an unsigned integer by
+      interpreting the byte string in network byte order.
+
+   o  Arithmetic works as in C.
+
+   o  All variables are unsigned integers of sufficient range.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 48]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   well_formed (breakable = false) {
+     // process initial bytes
+     ib = uint(take(1));
+     mt = ib >> 5;
+     val = ai = ib & 0x1f;
+     switch (ai) {
+       case 24: val = uint(take(1)); break;
+       case 25: val = uint(take(2)); break;
+       case 26: val = uint(take(4)); break;
+       case 27: val = uint(take(8)); break;
+       case 28: case 29: case 30: fail();
+       case 31:
+         return well_formed_indefinite(mt, breakable);
+     }
+     // process content
+     switch (mt) {
+       // case 0, 1, 7 do not have content; just use val
+       case 2: case 3: take(val); break; // bytes/UTF-8
+       case 4: for (i = 0; i < val; i++) well_formed(); break;
+       case 5: for (i = 0; i < val*2; i++) well_formed(); break;
+       case 6: well_formed(); break;     // 1 embedded data item
+     }
+     return mt;                    // finite data item
+   }
+
+   well_formed_indefinite(mt, breakable) {
+     switch (mt) {
+       case 2: case 3:
+         while ((it = well_formed(true)) != -1)
+           if (it != mt)           // need finite embedded
+             fail();               //    of same type
+         break;
+       case 4: while (well_formed(true) != -1); break;
+       case 5: while (well_formed(true) != -1) well_formed(); break;
+       case 7:
+         if (breakable)
+           return -1;              // signal break out
+         else fail();              // no enclosing indefinite
+       default: fail();            // wrong mt
+     }
+     return 0;                     // no break out
+   }
+
+              Figure 1: Pseudocode for Well-Formedness Check
+
+   Note that the remaining complexity of a complete CBOR decoder is
+   about presenting data that has been parsed to the application in an
+   appropriate form.
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 49]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   Major types 0 and 1 are designed in such a way that they can be
+   encoded in C from a signed integer without actually doing an if-then-
+   else for positive/negative (Figure 2).  This uses the fact that
+   (-1-n), the transformation for major type 1, is the same as ~n
+   (bitwise complement) in C unsigned arithmetic; ~n can then be
+   expressed as (-1)^n for the negative case, while 0^n leaves n
+   unchanged for non-negative.  The sign of a number can be converted to
+   -1 for negative and 0 for non-negative (0 or positive) by arithmetic-
+   shifting the number by one bit less than the bit length of the number
+   (for example, by 63 for 64-bit numbers).
+
+   void encode_sint(int64_t n) {
+     uint64t ui = n >> 63;    // extend sign to whole length
+     mt = ui & 0x20;          // extract major type
+     ui ^= n;                 // complement negatives
+     if (ui < 24)
+       *p++ = mt + ui;
+     else if (ui < 256) {
+       *p++ = mt + 24;
+       *p++ = ui;
+     } else
+          ...
+
+            Figure 2: Pseudocode for Encoding a Signed Integer
+
+Appendix D.  Half-Precision
+
+   As half-precision floating-point numbers were only added to IEEE 754
+   in 2008, today's programming platforms often still only have limited
+   support for them.  It is very easy to include at least decoding
+   support for them even without such support.  An example of a small
+   decoder for half-precision floating-point numbers in the C language
+   is shown in Figure 3.  A similar program for Python is in Figure 4;
+   this code assumes that the 2-byte value has already been decoded as
+   an (unsigned short) integer in network byte order (as would be done
+   by the pseudocode in Appendix C).
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 50]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   #include <math.h>
+
+   double decode_half(unsigned char *halfp) {
+     int half = (halfp[0] << 8) + halfp[1];
+     int exp = (half >> 10) & 0x1f;
+     int mant = half & 0x3ff;
+     double val;
+     if (exp == 0) val = ldexp(mant, -24);
+     else if (exp != 31) val = ldexp(mant + 1024, exp - 25);
+     else val = mant == 0 ? INFINITY : NAN;
+     return half & 0x8000 ? -val : val;
+   }
+
+               Figure 3: C Code for a Half-Precision Decoder
+
+   import struct
+   from math import ldexp
+
+   def decode_single(single):
+       return struct.unpack("!f", struct.pack("!I", single))[0]
+
+   def decode_half(half):
+       valu = (half & 0x7fff) << 13 | (half & 0x8000) << 16
+       if ((half & 0x7c00) != 0x7c00):
+           return ldexp(decode_single(valu), 112)
+       return decode_single(valu | 0x7f800000)
+
+            Figure 4: Python Code for a Half-Precision Decoder
+
+Appendix E.  Comparison of Other Binary Formats to CBOR's Design
+             Objectives
+
+   The proposal for CBOR follows a history of binary formats that is as
+   long as the history of computers themselves.  Different formats have
+   had different objectives.  In most cases, the objectives of the
+   format were never stated, although they can sometimes be implied by
+   the context where the format was first used.  Some formats were meant
+   to be universally usable, although history has proven that no binary
+   format meets the needs of all protocols and applications.
+
+   CBOR differs from many of these formats due to it starting with a set
+   of objectives and attempting to meet just those.  This section
+   compares a few of the dozens of formats with CBOR's objectives in
+   order to help the reader decide if they want to use CBOR or a
+   different format for a particular protocol or application.
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 51]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   Note that the discussion here is not meant to be a criticism of any
+   format: to the best of our knowledge, no format before CBOR was meant
+   to cover CBOR's objectives in the priority we have assigned them.  A
+   brief recap of the objectives from Section 1.1 is:
+
+   1.  unambiguous encoding of most common data formats from Internet
+       standards
+
+   2.  code compactness for encoder or decoder
+
+   3.  no schema description needed
+
+   4.  reasonably compact serialization
+
+   5.  applicability to constrained and unconstrained applications
+
+   6.  good JSON conversion
+
+   7.  extensibility
+
+E.1.  ASN.1 DER, BER, and PER
+
+   [ASN.1] has many serializations.  In the IETF, DER and BER are the
+   most common.  The serialized output is not particularly compact for
+   many items, and the code needed to decode numeric items can be
+   complex on a constrained device.
+
+   Few (if any) IETF protocols have adopted one of the several variants
+   of Packed Encoding Rules (PER).  There could be many reasons for
+   this, but one that is commonly stated is that PER makes use of the
+   schema even for parsing the surface structure of the data stream,
+   requiring significant tool support.  There are different versions of
+   the ASN.1 schema language in use, which has also hampered adoption.
+
+E.2.  MessagePack
+
+   [MessagePack] is a concise, widely implemented counted binary
+   serialization format, similar in many properties to CBOR, although
+   somewhat less regular.  While the data model can be used to represent
+   JSON data, MessagePack has also been used in many remote procedure
+   call (RPC) applications and for long-term storage of data.
+
+   MessagePack has been essentially stable since it was first published
+   around 2011; it has not yet had a transition.  The evolution of
+   MessagePack is impeded by an imperative to maintain complete
+   backwards compatibility with existing stored data, while only few
+   bytecodes are still available for extension.  Repeated requests over
+   the years from the MessagePack user community to separate out binary
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 52]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   and text strings in the encoding recently have led to an extension
+   proposal that would leave MessagePack's "raw" data ambiguous between
+   its usages for binary and text data.  The extension mechanism for
+   MessagePack remains unclear.
+
+E.3.  BSON
+
+   [BSON] is a data format that was developed for the storage of JSON-
+   like maps (JSON objects) in the MongoDB database.  Its major
+   distinguishing feature is the capability for in-place update,
+   foregoing a compact representation.  BSON uses a counted
+   representation except for map keys, which are null-byte terminated.
+   While BSON can be used for the representation of JSON-like objects on
+   the wire, its specification is dominated by the requirements of the
+   database application and has become somewhat baroque.  The status of
+   how BSON extensions will be implemented remains unclear.
+
+E.4.  UBJSON
+
+   [UBJSON] has a design goal to make JSON faster and somewhat smaller,
+   using a binary format that is limited to exactly the data model JSON
+   uses.  Thus, there is expressly no intention to support, for example,
+   binary data; however, there is a "high-precision number", expressed
+   as a character string in JSON syntax.  UBJSON is not optimized for
+   code compactness, and its type byte coding is optimized for human
+   recognition and not for compact representation of native types such
+   as small integers.  Although UBJSON is mostly counted, it provides a
+   reserved "unknown-length" value to support streaming of arrays and
+   maps (JSON objects).  Within these containers, UBJSON also has a
+   "Noop" type for padding.
+
+E.5.  MSDTP: RFC 713
+
+   Message Services Data Transmission (MSDTP) is a very early example of
+   a compact message format; it is described in [RFC0713], written in
+   1976.  It is included here for its historical value, not because it
+   was ever widely used.
+
+E.6.  Conciseness on the Wire
+
+   While CBOR's design objective of code compactness for encoders and
+   decoders is a higher priority than its objective of conciseness on
+   the wire, many people focus on the wire size.  Table 6 shows some
+   encoding examples for the simple nested array [1, [2, 3]]; where some
+   form of indefinite-length encoding is supported by the encoding,
+   [_ 1, [2, 3]] (indefinite length on the outer array) is also shown.
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 53]
+
+RFC 7049                          CBOR                      October 2013
+
+
+   +---------------+-------------------------+-------------------------+
+   | Format        | [1, [2, 3]]             | [_ 1, [2, 3]]           |
+   +---------------+-------------------------+-------------------------+
+   | RFC 713       | c2 05 81 c2 02 82 83    |                         |
+   |               |                         |                         |
+   | ASN.1 BER     | 30 0b 02 01 01 30 06 02 | 30 80 02 01 01 30 06 02 |
+   |               | 01 02 02 01 03          | 01 02 02 01 03 00 00    |
+   |               |                         |                         |
+   | MessagePack   | 92 01 92 02 03          |                         |
+   |               |                         |                         |
+   | BSON          | 22 00 00 00 10 30 00 01 |                         |
+   |               | 00 00 00 04 31 00 13 00 |                         |
+   |               | 00 00 10 30 00 02 00 00 |                         |
+   |               | 00 10 31 00 03 00 00 00 |                         |
+   |               | 00 00                   |                         |
+   |               |                         |                         |
+   | UBJSON        | 61 02 42 01 61 02 42 02 | 61 ff 42 01 61 02 42 02 |
+   |               | 42 03                   | 42 03 45                |
+   |               |                         |                         |
+   | CBOR          | 82 01 82 02 03          | 9f 01 82 02 03 ff       |
+   +---------------+-------------------------+-------------------------+
+
+           Table 6: Examples for Different Levels of Conciseness
+
+Authors' Addresses
+
+   Carsten Bormann
+   Universitaet Bremen TZI
+   Postfach 330440
+   D-28359 Bremen
+   Germany
+
+   Phone: +49-421-218-63921
+   EMail: cabo@tzi.org
+
+
+   Paul Hoffman
+   VPN Consortium
+
+   EMail: paul.hoffman@vpnc.org
+
+
+
+
+
+
+
+
+
+
+
+Bormann & Hoffman            Standards Track                   [Page 54]
+
-- 
cgit v1.2.3