summaryrefslogtreecommitdiff
path: root/doc/rfc/rfc7845.txt
diff options
context:
space:
mode:
authorThomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committerThomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
treee3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc7845.txt
parentea76e11061bda059ae9f9ad130a9895cc85607db (diff)
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc7845.txt')
-rw-r--r--doc/rfc/rfc7845.txt1963
1 files changed, 1963 insertions, 0 deletions
diff --git a/doc/rfc/rfc7845.txt b/doc/rfc/rfc7845.txt
new file mode 100644
index 0000000..9037080
--- /dev/null
+++ b/doc/rfc/rfc7845.txt
@@ -0,0 +1,1963 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF) T. Terriberry
+Request for Comments: 7845 Mozilla Corporation
+Updates: 5334 R. Lee
+Category: Standards Track Voicetronix
+ISSN: 2070-1721 R. Giles
+ Mozilla Corporation
+ April 2016
+
+
+ Ogg Encapsulation for the Opus Audio Codec
+
+Abstract
+
+ This document defines the Ogg encapsulation for the Opus interactive
+ speech and audio codec. This allows data encoded in the Opus format
+ to be stored in an Ogg logical bitstream.
+
+Status of This Memo
+
+ This is an Internet Standards Track document.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Further information on
+ Internet Standards is available in Section 2 of RFC 5741.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ http://www.rfc-editor.org/info/rfc7845.
+
+Copyright Notice
+
+ Copyright (c) 2016 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (http://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Simplified BSD License text as described in Section 4.e of
+ the Trust Legal Provisions and are provided without warranty as
+ described in the Simplified BSD License.
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 1]
+
+RFC 7845 Ogg Opus April 2016
+
+
+Table of Contents
+
+ 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
+ 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3
+ 3. Packet Organization . . . . . . . . . . . . . . . . . . . . . 4
+ 4. Granule Position . . . . . . . . . . . . . . . . . . . . . . 6
+ 4.1. Repairing Gaps in Real-Time Streams . . . . . . . . . . . 7
+ 4.2. Pre-skip . . . . . . . . . . . . . . . . . . . . . . . . 9
+ 4.3. PCM Sample Position . . . . . . . . . . . . . . . . . . . 9
+ 4.4. End Trimming . . . . . . . . . . . . . . . . . . . . . . 10
+ 4.5. Restrictions on the Initial Granule Position . . . . . . 10
+ 4.6. Seeking and Pre-roll . . . . . . . . . . . . . . . . . . 11
+ 5. Header Packets . . . . . . . . . . . . . . . . . . . . . . . 12
+ 5.1. Identification Header . . . . . . . . . . . . . . . . . . 12
+ 5.1.1. Channel Mapping . . . . . . . . . . . . . . . . . . . 16
+ 5.2. Comment Header . . . . . . . . . . . . . . . . . . . . . 22
+ 5.2.1. Tag Definitions . . . . . . . . . . . . . . . . . . . 25
+ 6. Packet Size Limits . . . . . . . . . . . . . . . . . . . . . 26
+ 7. Encoder Guidelines . . . . . . . . . . . . . . . . . . . . . 27
+ 7.1. LPC Extrapolation . . . . . . . . . . . . . . . . . . . . 28
+ 7.2. Continuous Chaining . . . . . . . . . . . . . . . . . . . 28
+ 8. Security Considerations . . . . . . . . . . . . . . . . . . . 29
+ 9. Content Type . . . . . . . . . . . . . . . . . . . . . . . . 30
+ 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 31
+ 11. References . . . . . . . . . . . . . . . . . . . . . . . . . 32
+ 11.1. Normative References . . . . . . . . . . . . . . . . . . 32
+ 11.2. Informative References . . . . . . . . . . . . . . . . . 33
+ Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 34
+ Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 35
+
+1. Introduction
+
+ The IETF Opus codec is a low-latency audio codec optimized for both
+ voice and general-purpose audio. See [RFC6716] for technical
+ details. This document defines the encapsulation of Opus in a
+ continuous, logical Ogg bitstream [RFC3533]. Ogg encapsulation
+ provides Opus with a long-term storage format supporting all of the
+ essential features, including metadata, fast and accurate seeking,
+ corruption detection, recapture after errors, low overhead, and the
+ ability to multiplex Opus with other codecs (including video) with
+ minimal buffering. It also provides a live streamable format capable
+ of delivery over a reliable stream-oriented transport, without
+ requiring all the data (or even the total length of the data)
+ up-front, in a form that is identical to the on-disk storage format.
+
+ Ogg bitstreams are made up of a series of "pages", each of which
+ contains data from one or more "packets". Pages are the fundamental
+ unit of multiplexing in an Ogg stream. Each page is associated with
+
+
+
+Terriberry, et al. Standards Track [Page 2]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ a particular logical stream and contains a capture pattern and
+ checksum, flags to mark the beginning and end of the logical stream,
+ and a "granule position" that represents an absolute position in the
+ stream, to aid seeking. A single page can contain up to 65,025
+ octets of packet data from up to 255 different packets. Packets can
+ be split arbitrarily across pages and continued from one page to the
+ next (allowing packets much larger than would fit on a single page).
+ Each page contains "lacing values" that indicate how the data is
+ partitioned into packets, allowing a demultiplexer (demuxer) to
+ recover the packet boundaries without examining the encoded data. A
+ packet is said to "complete" on a page when the page contains the
+ final lacing value corresponding to that packet.
+
+ This encapsulation defines the contents of the packet data, including
+ the necessary headers, the organization of those packets into a
+ logical stream, and the interpretation of the codec-specific granule
+ position field. It does not attempt to describe or specify the
+ existing Ogg container format. Readers unfamiliar with the basic
+ concepts mentioned above are encouraged to review the details in
+ [RFC3533].
+
+2. Terminology
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
+ "OPTIONAL" in this document are to be interpreted as described in
+ [RFC2119].
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 3]
+
+RFC 7845 Ogg Opus April 2016
+
+
+3. Packet Organization
+
+ An Ogg Opus stream is organized as follows (see Figure 1 for an
+ example).
+
+ Page 0 Pages 1 ... n Pages (n+1) ...
+ +------------+ +---+ +---+ ... +---+ +-----------+ +---------+ +--
+ | | | | | | | | | | | | |
+ |+----------+| |+-----------------+| |+-------------------+ +-----
+ |||ID Header|| || Comment Header || ||Audio Data Packet 1| | ...
+ |+----------+| |+-----------------+| |+-------------------+ +-----
+ | | | | | | | | | | | | |
+ +------------+ +---+ +---+ ... +---+ +-----------+ +---------+ +--
+ ^ ^ ^
+ | | |
+ | | Mandatory Page Break
+ | |
+ | ID header is contained on a single page
+ |
+ 'Beginning Of Stream'
+
+ Figure 1: Example Packet Organization for a Logical Ogg Opus Stream
+
+ There are two mandatory header packets. The first packet in the
+ logical Ogg bitstream MUST contain the identification (ID) header,
+ which uniquely identifies a stream as Opus audio. The format of this
+ header is defined in Section 5.1. It is placed alone (without any
+ other packet data) on the first page of the logical Ogg bitstream and
+ completes on that page. This page has its 'beginning of stream' flag
+ set.
+
+ The second packet in the logical Ogg bitstream MUST contain the
+ comment header, which contains user-supplied metadata. The format of
+ this header is defined in Section 5.2. It MAY span multiple pages,
+ beginning on the second page of the logical stream. However many
+ pages it spans, the comment header packet MUST finish the page on
+ which it completes.
+
+ All subsequent pages are audio data pages, and the Ogg packets they
+ contain are audio data packets. Each audio data packet contains one
+ Opus packet for each of N different streams, where N is typically one
+ for mono or stereo, but MAY be greater than one for multichannel
+ audio. The value N is specified in the ID header (see
+ Section 5.1.1), and is fixed over the entire length of the logical
+ Ogg bitstream.
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 4]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ The first (N - 1) Opus packets, if any, are packed one after another
+ into the Ogg packet, using the self-delimiting framing from
+ Appendix B of [RFC6716]. The remaining Opus packet is packed at the
+ end of the Ogg packet using the regular, undelimited framing from
+ Section 3 of [RFC6716]. All of the Opus packets in a single Ogg
+ packet MUST be constrained to have the same duration. An
+ implementation of this specification SHOULD treat any Opus packet
+ whose duration is different from that of the first Opus packet in an
+ Ogg packet as if it were a malformed Opus packet with an invalid
+ Table Of Contents (TOC) sequence.
+
+ The TOC sequence at the beginning of each Opus packet indicates the
+ coding mode, audio bandwidth, channel count, duration (frame size),
+ and number of frames per packet, as described in Section 3.1
+ of [RFC6716]. The coding mode is one of SILK, Hybrid, or Constrained
+ Energy Lapped Transform (CELT). The combination of coding mode,
+ audio bandwidth, and frame size is referred to as the configuration
+ of an Opus packet.
+
+ Packets are placed into Ogg pages in order until the end of stream.
+ Audio data packets might span page boundaries. The first audio data
+ page could have the 'continued packet' flag set (indicating the first
+ audio data packet is continued from a previous page) if, for example,
+ it was a live stream joined mid-broadcast, with the headers pasted on
+ the front. If a page has the 'continued packet' flag set and one of
+ the following conditions is also true:
+
+ o the previous page with packet data does not end in a continued
+ packet (does not end with a lacing value of 255) OR
+
+ o the page sequence numbers are not consecutive,
+
+ then a demuxer MUST NOT attempt to decode the data for the first
+ packet on the page unless the demuxer has some special knowledge that
+ would allow it to interpret this data despite the missing pieces. An
+ implementation MUST treat a zero-octet audio data packet as if it
+ were a malformed Opus packet as described in Section 3.4
+ of [RFC6716].
+
+ A logical stream ends with a page with the 'end of stream' flag set,
+ but implementations need to be prepared to deal with truncated
+ streams that do not have a page marked 'end of stream'. There is no
+ reason for the final packet on the last page to be a continued
+ packet, i.e., for the final lacing value to be 255. However,
+ demuxers might encounter such streams, possibly as the result of a
+ transfer that did not complete or of corruption. If a packet
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 5]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ continues onto a subsequent page (i.e., when the page ends with a
+ lacing value of 255) and one of the following conditions is also
+ true:
+
+ o the next page with packet data does not have the 'continued
+ packet' flag set, OR
+
+ o there is no next page with packet data, OR
+
+ o the page sequence numbers are not consecutive,
+
+ then a demuxer MUST NOT attempt to decode the data from that packet
+ unless the demuxer has some special knowledge that would allow it to
+ interpret this data despite the missing pieces. There MUST NOT be
+ any more pages in an Opus logical bitstream after a page marked 'end
+ of stream'.
+
+4. Granule Position
+
+ The granule position MUST be zero for the ID header page and the page
+ where the comment header completes. That is, the first page in the
+ logical stream and the last header page before the first audio data
+ page both have a granule position of zero.
+
+ The granule position of an audio data page encodes the total number
+ of PCM samples in the stream up to and including the last fully
+ decodable sample from the last packet completed on that page. The
+ granule position of the first audio data page will usually be larger
+ than zero, as described in Section 4.5.
+
+ A page that is entirely spanned by a single packet (that completes on
+ a subsequent page) has no granule position, and the granule position
+ field is set to the special value '-1' in two's complement.
+
+ The granule position of an audio data page is in units of PCM audio
+ samples at a fixed rate of 48 kHz (per channel; a stereo stream's
+ granule position does not increment at twice the speed of a mono
+ stream). It is possible to run an Opus decoder at other sampling
+ rates, but all Opus packets encode samples at a sampling rate that
+ evenly divides 48 kHz. Therefore, the value in the granule position
+ field always counts samples assuming a 48 kHz decoding rate, and the
+ rest of this specification makes the same assumption.
+
+ The duration of an Opus packet as defined in [RFC6716] can be any
+ multiple of 2.5 ms, up to a maximum of 120 ms. This duration is
+ encoded in the TOC sequence at the beginning of each packet. The
+ number of samples returned by a decoder corresponds to this duration
+ exactly, even for the first few packets. For example, a 20 ms packet
+
+
+
+Terriberry, et al. Standards Track [Page 6]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ fed to a decoder running at 48 kHz will always return 960 samples. A
+ demuxer can parse the TOC sequence at the beginning of each Ogg
+ packet to work backwards or forwards from a packet with a known
+ granule position (i.e., the last packet completed on some page) in
+ order to assign granule positions to every packet, or even every
+ individual sample. The one exception is the last page in the stream,
+ as described below.
+
+ All other pages with completed packets after the first MUST have a
+ granule position equal to the number of samples contained in packets
+ that complete on that page plus the granule position of the most
+ recent page with completed packets. This guarantees that a demuxer
+ can assign individual packets the same granule position when working
+ forwards as when working backwards. For this to work, there cannot
+ be any gaps.
+
+4.1. Repairing Gaps in Real-Time Streams
+
+ In order to support capturing a real-time stream that has lost or not
+ transmitted packets, a multiplexer (muxer) SHOULD emit packets that
+ explicitly request the use of Packet Loss Concealment (PLC) in place
+ of the missing packets. Implementations that fail to do so still
+ MUST NOT increment the granule position for a page by anything other
+ than the number of samples contained in packets that actually
+ complete on that page.
+
+ Only gaps that are a multiple of 2.5 ms are repairable, as these are
+ the only durations that can be created by packet loss or
+ discontinuous transmission. Muxers need not handle other gap sizes.
+ Creating the necessary packets involves synthesizing a TOC byte
+ (defined in Section 3.1 of [RFC6716]) -- and whatever additional
+ internal framing is needed -- to indicate the packet duration for
+ each stream. The actual length of each missing Opus frame inside the
+ packet is zero bytes, as defined in Section 3.2.1 of [RFC6716].
+
+ Zero-byte frames MAY be packed into packets using any of codes 0, 1,
+ 2, or 3. When successive frames have the same configuration, the
+ higher code packings reduce overhead. Likewise, if the TOC
+ configuration matches, the muxer MAY further combine the empty frames
+ with previous or subsequent nonzero-length frames (using code 2 or
+ variable bitrate (VBR) code 3).
+
+ [RFC6716] does not impose any requirements on the PLC, but this
+ section outlines choices that are expected to have a positive
+ influence on most PLC implementations, including the reference
+ implementation. Synthesized TOC sequences SHOULD maintain the same
+ mode, audio bandwidth, channel count, and frame size as the previous
+ packet (if any). This is the simplest and usually the most well-
+
+
+
+Terriberry, et al. Standards Track [Page 7]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ tested case for the PLC to handle and it covers all losses that do
+ not include a configuration switch, as defined in Section 4.5
+ of [RFC6716].
+
+ When a previous packet is available, keeping the audio bandwidth and
+ channel count the same allows the PLC to provide maximum continuity
+ in the concealment data it generates. However, if the size of the
+ gap is not a multiple of the most recent frame size, then the frame
+ size will have to change for at least some frames. Such changes
+ SHOULD be delayed as long as possible to simplify things for PLC
+ implementations.
+
+ As an example, a 95 ms gap could be encoded as nineteen 5 ms frames
+ in two bytes with a single constant bitrate (CBR) code 3 packet. If
+ the previous frame size was 20 ms, using four 20 ms frames followed
+ by three 5 ms frames requires 4 bytes (plus an extra byte of Ogg
+ lacing overhead), but allows the PLC to use its well-tested steady
+ state behavior for as long as possible. The total bitrate of the
+ latter approach, including Ogg overhead, is about 0.4 kbps, so the
+ impact on file size is minimal.
+
+ Changing modes is discouraged, since this causes some decoder
+ implementations to reset their PLC state. However, SILK and Hybrid
+ mode frames cannot fill gaps that are not a multiple of 10 ms. If
+ switching to CELT mode is needed to match the gap size, a muxer
+ SHOULD do so at the end of the gap to allow the PLC to function for
+ as long as possible.
+
+ In the example above, if the previous frame was a 20 ms SILK mode
+ frame, the better solution is to synthesize a packet describing four
+ 20 ms SILK frames, followed by a packet with a single 10 ms SILK
+ frame, and finally a packet with a 5 ms CELT frame, to fill the 95 ms
+ gap. This also requires four bytes to describe the synthesized
+ packet data (two bytes for a CBR code 3 and one byte each for two
+ code 0 packets) but three bytes of Ogg lacing overhead are needed to
+ mark the packet boundaries. At 0.6 kbps, this is still a minimal
+ bitrate impact over a naive, low-quality solution.
+
+ Since medium-band audio is an option only in the SILK mode, wideband
+ frames SHOULD be generated if switching from that configuration to
+ CELT mode, to ensure that any PLC implementation that does try to
+ migrate state between the modes will be able to preserve all of the
+ available audio bandwidth.
+
+
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 8]
+
+RFC 7845 Ogg Opus April 2016
+
+
+4.2. Pre-skip
+
+ There is some amount of latency introduced during the decoding
+ process, to allow for overlap in the CELT mode, stereo mixing in the
+ SILK mode, and resampling. The encoder might have introduced
+ additional latency through its own resampling and analysis (though
+ the exact amount is not specified). Therefore, the first few samples
+ produced by the decoder do not correspond to real input audio, but
+ are instead composed of padding inserted by the encoder to compensate
+ for this latency. These samples need to be stored and decoded, as
+ Opus is an asymptotically convergent predictive codec, meaning the
+ decoded contents of each frame depend on the recent history of
+ decoder inputs. However, a player will want to skip these samples
+ after decoding them.
+
+ A 'pre-skip' field in the ID header (see Section 5.1) signals the
+ number of samples that SHOULD be skipped (decoded but discarded) at
+ the beginning of the stream, though some specific applications might
+ have a reason for looking at that data. This amount need not be a
+ multiple of 2.5 ms, MAY be smaller than a single packet, or MAY span
+ the contents of several packets. These samples are not valid audio.
+
+ For example, if the first Opus frame uses the CELT mode, it will
+ always produce 120 samples of windowed overlap-add data. However,
+ the overlap data is initially all zeros (since there is no prior
+ frame), meaning this cannot, in general, accurately represent the
+ original audio. The SILK mode requires additional delay to account
+ for its analysis and resampling latency. The encoder delays the
+ original audio to avoid this problem.
+
+ The 'pre-skip' field MAY also be used to perform sample-accurate
+ cropping of already encoded streams. In this case, a value of at
+ least 3840 samples (80 ms) provides sufficient history to the decoder
+ that it will have converged before the stream's output begins.
+
+4.3. PCM Sample Position
+
+ The PCM sample position is determined from the granule position using
+ the following formula:
+
+ 'PCM sample position' = 'granule position' - 'pre-skip'
+
+ For example, if the granule position of the first audio data page is
+ 59,971, and the pre-skip is 11,971, then the PCM sample position of
+ the last decoded sample from that page is 48,000.
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 9]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ This can be converted into a playback time using the following
+ formula:
+
+ 'PCM sample position'
+ 'playback time' = ---------------------
+ 48000.0
+
+ The initial PCM sample position before any samples are played is
+ normally '0'. In this case, the PCM sample position of the first
+ audio sample to be played starts at '1', because it marks the time on
+ the clock _after_ that sample has been played, and a stream that is
+ exactly one second long has a final PCM sample position of '48000',
+ as in the example here.
+
+ Vorbis streams use a granule position smaller than the number of
+ audio samples contained in the first audio data page to indicate that
+ some of those samples are trimmed from the output (see
+ [VORBIS-TRIM]). However, to do so, Vorbis requires that the first
+ audio data page contains exactly two packets, in order to allow the
+ decoder to perform PCM position adjustments before needing to return
+ any PCM data. Opus uses the pre-skip mechanism for this purpose
+ instead, since the encoder might introduce more than a single
+ packet's worth of latency, and since very large packets in streams
+ with a very large number of channels might not fit on a single page.
+
+4.4. End Trimming
+
+ The page with the 'end of stream' flag set MAY have a granule
+ position that indicates the page contains less audio data than would
+ normally be returned by decoding up through the final packet. This
+ is used to end the stream somewhere other than an even frame
+ boundary. The granule position of the most recent audio data page
+ with completed packets is used to make this determination, or '0' is
+ used if there were no previous audio data pages with a completed
+ packet. The difference between these granule positions indicates how
+ many samples to keep after decoding the packets that completed on the
+ final page. The remaining samples are discarded. The number of
+ discarded samples SHOULD be no larger than the number decoded from
+ the last packet.
+
+4.5. Restrictions on the Initial Granule Position
+
+ The granule position of the first audio data page with a completed
+ packet MAY be larger than the number of samples contained in packets
+ that complete on that page. However, it MUST NOT be smaller, unless
+ that page has the 'end of stream' flag set. Allowing a granule
+ position larger than the number of samples allows the beginning of a
+ stream to be cropped or a live stream to be joined without rewriting
+
+
+
+Terriberry, et al. Standards Track [Page 10]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ the granule position of all the remaining pages. This means that the
+ PCM sample position just before the first sample to be played MAY be
+ larger than '0'. Synchronization when multiplexing with other
+ logical streams still uses the PCM sample position relative to '0' to
+ compute sample times. This does not affect the behavior of pre-skip:
+ exactly 'pre-skip' samples SHOULD be skipped from the beginning of
+ the decoded output, even if the initial PCM sample position is
+ greater than zero.
+
+ On the other hand, a granule position that is smaller than the number
+ of decoded samples prevents a demuxer from working backwards to
+ assign each packet or each individual sample a valid granule
+ position, since granule positions are non-negative. An
+ implementation MUST treat any stream as invalid if the granule
+ position is smaller than the number of samples contained in packets
+ that complete on the first audio data page with a completed packet,
+ unless that page has the 'end of stream' flag set. It MAY defer this
+ action until it decodes the last packet completed on that page.
+
+ If that page has the 'end of stream' flag set, a demuxer MUST treat
+ any stream as invalid if its granule position is smaller than the
+ 'pre-skip' amount. This would indicate that there are more samples
+ to be skipped from the initial decoded output than exist in the
+ stream. If the granule position is smaller than the number of
+ decoded samples produced by the packets that complete on that page,
+ then a demuxer MUST use an initial granule position of '0', and can
+ work forwards from '0' to timestamp individual packets. If the
+ granule position is larger than the number of decoded samples
+ available, then the demuxer MUST still work backwards as described
+ above, even if the 'end of stream' flag is set, to determine the
+ initial granule position, and thus the initial PCM sample position.
+ Both of these will be greater than '0' in this case.
+
+4.6. Seeking and Pre-roll
+
+ Seeking in Ogg files is best performed using a bisection search for a
+ page whose granule position corresponds to a PCM position at or
+ before the seek target. With appropriately weighted bisection,
+ accurate seeking can be performed in just one or two bisections on
+ average, even in multi-gigabyte files. See [SEEKING] for an example
+ of general implementation guidance.
+
+ When seeking within an Ogg Opus stream, an implementation SHOULD
+ start decoding (and discarding the output) at least 3840 samples
+ (80 ms) prior to the seek target in order to ensure that the output
+ audio is correct by the time it reaches the seek target. This
+ "pre-roll" is separate from, and unrelated to, the pre-skip used at
+ the beginning of the stream. If the point 80 ms prior to the seek
+
+
+
+Terriberry, et al. Standards Track [Page 11]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ target comes before the initial PCM sample position, an
+ implementation SHOULD start decoding from the beginning of the
+ stream, applying pre-skip as normal, regardless of whether the pre-
+ skip is larger or smaller than 80 ms, and then continue to discard
+ samples to reach the seek target (if any).
+
+5. Header Packets
+
+ An Ogg Opus logical stream contains exactly two mandatory header
+ packets: an identification header and a comment header.
+
+5.1. Identification Header
+
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | 'O' | 'p' | 'u' | 's' |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | 'H' | 'e' | 'a' | 'd' |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Version = 1 | Channel Count | Pre-skip |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Input Sample Rate (Hz) |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Output Gain (Q7.8 in dB) | Mapping Family| |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ :
+ | |
+ : Optional Channel Mapping Table... :
+ | |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ Figure 2: ID Header Packet
+
+ The fields in the identification (ID) header have the following
+ meaning:
+
+ 1. Magic Signature:
+
+ This is an 8-octet (64-bit) field that allows codec
+ identification and is human readable. It contains, in order, the
+ magic numbers:
+
+ 0x4F 'O'
+
+ 0x70 'p'
+
+ 0x75 'u'
+
+
+
+
+Terriberry, et al. Standards Track [Page 12]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ 0x73 's'
+
+ 0x48 'H'
+
+ 0x65 'e'
+
+ 0x61 'a'
+
+ 0x64 'd'
+
+ Starting with "Op" helps distinguish it from audio data packets,
+ as this is an invalid TOC sequence.
+
+ 2. Version (8 bits, unsigned):
+
+ The version number MUST always be '1' for this version of the
+ encapsulation specification. Implementations SHOULD treat
+ streams where the upper four bits of the version number match
+ that of a recognized specification as backwards compatible with
+ that specification. That is, the version number can be split
+ into "major" and "minor" version sub-fields, with changes to the
+ minor sub-field (in the lower four bits) signaling compatible
+ changes. For example, an implementation of this specification
+ SHOULD accept any stream with a version number of '15' or less,
+ and SHOULD assume any stream with a version number '16' or
+ greater is incompatible. The initial version '1' was chosen to
+ keep implementations from relying on this octet as a null
+ terminator for the "OpusHead" string.
+
+ 3. Output Channel Count 'C' (8 bits, unsigned):
+
+ This is the number of output channels. This might be different
+ than the number of encoded channels, which can change on a
+ packet-by-packet basis. This value MUST NOT be zero. The
+ maximum allowable value depends on the channel mapping family,
+ and might be as large as 255. See Section 5.1.1 for details.
+
+ 4. Pre-skip (16 bits, unsigned, little endian):
+
+ This is the number of samples (at 48 kHz) to discard from the
+ decoder output when starting playback, and also the number to
+ subtract from a page's granule position to calculate its PCM
+ sample position. When cropping the beginning of existing Ogg
+ Opus streams, a pre-skip of at least 3,840 samples (80 ms) is
+ RECOMMENDED to ensure complete convergence in the decoder.
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 13]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ 5. Input Sample Rate (32 bits, unsigned, little endian):
+
+ This is the sample rate of the original input (before encoding),
+ in Hz. This field is _not_ the sample rate to use for playback
+ of the encoded data.
+
+ Opus can switch between internal audio bandwidths of 4, 6, 8, 12,
+ and 20 kHz. Each packet in the stream can have a different audio
+ bandwidth. Regardless of the audio bandwidth, the reference
+ decoder supports decoding any stream at a sample rate of 8, 12,
+ 16, 24, or 48 kHz. The original sample rate of the audio passed
+ to the encoder is not preserved by the lossy compression.
+
+ An Ogg Opus player SHOULD select the playback sample rate
+ according to the following procedure:
+
+ 1. If the hardware supports 48 kHz playback, decode at 48 kHz.
+
+ 2. Otherwise, if the hardware's highest available sample rate is
+ a supported rate, decode at this sample rate.
+
+ 3. Otherwise, if the hardware's highest available sample rate is
+ less than 48 kHz, decode at the next higher Opus supported
+ rate above the highest available hardware rate and resample.
+
+ 4. Otherwise, decode at 48 kHz and resample.
+
+ However, the 'input sample rate' field allows the muxer to pass
+ the sample rate of the original input stream as metadata. This
+ is useful when the user requires the output sample rate to match
+ the input sample rate. For example, when not playing the output,
+ an implementation writing PCM format samples to disk might choose
+ to resample the audio back to the original input sample rate to
+ reduce surprise to the user, who might reasonably expect to get
+ back a file with the same sample rate.
+
+ A value of zero indicates "unspecified". Muxers SHOULD write the
+ actual input sample rate or zero, but implementations that do
+ something with this field SHOULD take care to behave sanely if
+ given crazy values (e.g., do not actually upsample the output to
+ 10 MHz if requested). Implementations SHOULD support input
+ sample rates between 8 kHz and 192 kHz (inclusive). Rates
+ outside this range MAY be ignored by falling back to the default
+ rate of 48 kHz instead.
+
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 14]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ 6. Output Gain (16 bits, signed, little endian):
+
+ This is a gain to be applied when decoding. It is 20*log10 of
+ the factor by which to scale the decoder output to achieve the
+ desired playback volume, stored in a 16-bit, signed, two's
+ complement fixed-point value with 8 fractional bits (i.e.,
+ Q7.8 [Q-NOTATION]).
+
+ To apply the gain, an implementation could use the following:
+
+ sample *= pow(10, output_gain/(20.0*256))
+
+ where 'output_gain' is the raw 16-bit value from the header.
+
+ Players and media frameworks SHOULD apply it by default. If a
+ player chooses to apply any volume adjustment or gain
+ modification, such as the R128_TRACK_GAIN (see Section 5.2), the
+ adjustment MUST be applied in addition to this output gain in
+ order to achieve playback at the normalized volume.
+
+ A muxer SHOULD set this field to zero, and instead apply any gain
+ prior to encoding, when this is possible and does not conflict
+ with the user's wishes. A nonzero output gain indicates the gain
+ was adjusted after encoding, or that a user wished to adjust the
+ gain for playback while preserving the ability to recover the
+ original signal amplitude.
+
+ Although the output gain has enormous range (+/- 128 dB, enough
+ to amplify inaudible sounds to the threshold of physical pain),
+ most applications can only reasonably use a small portion of this
+ range around zero. The large range serves in part to ensure that
+ gain can always be losslessly transferred between OpusHead and
+ R128 gain tags (see below) without saturating.
+
+ 7. Channel Mapping Family (8 bits, unsigned):
+
+ This octet indicates the order and semantic meaning of the output
+ channels.
+
+ Each currently specified value of this octet indicates a mapping
+ family, which defines a set of allowed channel counts, and the
+ ordered set of channel names for each allowed channel count. The
+ details are described in Section 5.1.1.
+
+ 8. Channel Mapping Table:
+
+ This table defines the mapping from encoded streams to output
+ channels. Its contents are specified in Section 5.1.1.
+
+
+
+Terriberry, et al. Standards Track [Page 15]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ All fields in the ID headers are REQUIRED, except for 'channel
+ mapping table', which MUST be omitted when the channel mapping family
+ is 0, but is REQUIRED otherwise. Implementations SHOULD treat a
+ stream as invalid if it contains an ID header that does not have
+ enough data for these fields, even if it contain a valid 'magic
+ signature'. Future versions of this specification, even backwards-
+ compatible versions, might include additional fields in the ID
+ header. If an ID header has a compatible major version, but a larger
+ minor version, an implementation MUST NOT treat it as invalid for
+ containing additional data not specified here, provided it still
+ completes on the first page.
+
+5.1.1. Channel Mapping
+
+ An Ogg Opus stream allows mapping one number of Opus streams (N) to a
+ possibly larger number of decoded channels (M + N) to yet another
+ number of output channels (C), which might be larger or smaller than
+ the number of decoded channels. The order and meaning of these
+ channels are defined by a channel mapping, which consists of the
+ 'channel mapping family' octet and, for channel mapping families
+ other than family 0, a 'channel mapping table', as illustrated
+ in Figure 3.
+
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+
+ | Stream Count |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Coupled Count | Channel Mapping... :
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ Figure 3: Channel Mapping Table
+
+ The fields in the channel mapping table have the following meaning:
+
+ 1. Stream Count 'N' (8 bits, unsigned):
+
+ This is the total number of streams encoded in each Ogg packet.
+ This value is necessary to correctly parse the packed Opus
+ packets inside an Ogg packet, as described in Section 3. This
+ value MUST NOT be zero, as without at least one Opus packet with
+ a valid TOC sequence, a demuxer cannot recover the duration of an
+ Ogg packet.
+
+ For channel mapping family 0, this value defaults to 1, and is
+ not coded.
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 16]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ 2. Coupled Stream Count 'M' (8 bits, unsigned):
+
+ This is the number of streams whose decoders are to be configured
+ to produce two channels (stereo). This MUST be no larger than
+ the total number of streams, N.
+
+ Each packet in an Opus stream has an internal channel count of 1
+ or 2, which can change from packet to packet. This is selected
+ by the encoder depending on the bitrate and the audio being
+ encoded. The original channel count of the audio passed to the
+ encoder is not necessarily preserved by the lossy compression.
+
+ Regardless of the internal channel count, any Opus stream can be
+ decoded as mono (a single channel) or stereo (two channels) by
+ appropriate initialization of the decoder. The 'coupled stream
+ count' field indicates that the decoders for the first M Opus
+ streams are to be initialized for stereo (two-channel) output,
+ and the remaining (N - M) decoders are to be initialized for mono
+ (a single channel) only. The total number of decoded channels,
+ (M + N), MUST be no larger than 255, as there is no way to index
+ more channels than that in the channel mapping.
+
+ For channel mapping family 0, this value defaults to (C - 1)
+ (i.e., 0 for mono and 1 for stereo), and is not coded.
+
+ 3. Channel Mapping (8*C bits):
+
+ This contains one octet per output channel, indicating which
+ decoded channel is to be used for each one. Let 'index' be the
+ value of this octet for a particular output channel. This value
+ MUST either be smaller than (M + N) or be the special value 255.
+ If 'index' is less than 2*M, the output MUST be taken from
+ decoding stream ('index'/2) as stereo and selecting the left
+ channel if 'index' is even, and the right channel if 'index' is
+ odd. If 'index' is 2*M or larger, but less than 255, the output
+ MUST be taken from decoding stream ('index' - M) as mono. If
+ 'index' is 255, the corresponding output channel MUST contain
+ pure silence.
+
+ The number of output channels, C, is not constrained to match the
+ number of decoded channels (M + N). A single index value MAY
+ appear multiple times, i.e., the same decoded channel might be
+ mapped to multiple output channels. Some decoded channels might
+ not be assigned to any output channel, as well.
+
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 17]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ For channel mapping family 0, the first index defaults to 0, and
+ if C == 2, the second index defaults to 1. Neither index is
+ coded.
+
+ After producing the output channels, the channel mapping family
+ determines the semantic meaning of each one. There are three defined
+ mapping families in this specification.
+
+5.1.1.1. Channel Mapping Family 0
+
+ Allowed numbers of channels: 1 or 2. RTP mapping. This is the same
+ channel interpretation as [RFC7587].
+
+ o 1 channel: monophonic (mono).
+
+ o 2 channels: stereo (left, right).
+
+ Special mapping: This channel mapping family also indicates that the
+ content consists of a single Opus stream that is stereo if and only
+ if C == 2, with stream index 0 mapped to output channel 0 (mono, or
+ left channel) and stream index 1 mapped to output channel 1 (right
+ channel) if stereo. When the 'channel mapping family' octet has this
+ value, the channel mapping table MUST be omitted from the ID header
+ packet.
+
+5.1.1.2. Channel Mapping Family 1
+
+ Allowed numbers of channels: 1...8. Vorbis channel order (see
+ below).
+
+ Each channel is assigned to a speaker location in a conventional
+ surround arrangement. Specific locations depend on the number of
+ channels, and are given below in order of the corresponding channel
+ indices.
+
+ o 1 channel: monophonic (mono).
+
+ o 2 channels: stereo (left, right).
+
+ o 3 channels: linear surround (left, center, right).
+
+ o 4 channels: quadraphonic (front left, front right, rear left,
+ rear right).
+
+ o 5 channels: 5.0 surround (front left, front center, front right,
+ rear left, rear right).
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 18]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ o 6 channels: 5.1 surround (front left, front center, front right,
+ rear left, rear right, LFE).
+
+ o 7 channels: 6.1 surround (front left, front center, front right,
+ side left, side right, rear center, LFE).
+
+ o 8 channels: 7.1 surround (front left, front center, front right,
+ side left, side right, rear left, rear right, LFE).
+
+ This set of surround options and speaker location orderings is the
+ same as those used by the Vorbis codec [VORBIS-MAPPING]. The
+ ordering is different from the one used by the WAVE
+ [WAVE-MULTICHANNEL] and Free Lossless Audio Codec (FLAC) [FLAC]
+ formats, so correct ordering requires permutation of the output
+ channels when decoding to or encoding from those formats. "LFE" here
+ refers to a Low Frequency Effects channel, often mapped to a
+ subwoofer with no particular spatial position. Implementations
+ SHOULD identify "side" or "rear" speaker locations with "surround"
+ and "back" as appropriate when interfacing with audio formats or
+ systems that prefer that terminology.
+
+5.1.1.3. Channel Mapping Family 255
+
+ Allowed numbers of channels: 1...255. No defined channel meaning.
+
+ Channels are unidentified. General-purpose players SHOULD NOT
+ attempt to play these streams. Offline implementations MAY
+ deinterleave the output into separate PCM files, one per channel.
+ Implementations SHOULD NOT produce output for channels mapped to
+ stream index 255 (pure silence) unless they have no other way to
+ indicate the index of non-silent channels.
+
+5.1.1.4. Undefined Channel Mappings
+
+ The remaining channel mapping families (2...254) are reserved. A
+ demuxer implementation encountering a reserved 'channel mapping
+ family' value SHOULD act as though the value is 255.
+
+5.1.1.5. Downmixing
+
+ An Ogg Opus player MUST support any valid channel mapping with a
+ channel mapping family of 0 or 1, even if the number of channels does
+ not match the physically connected audio hardware. Players SHOULD
+ perform channel mixing to increase or reduce the number of channels
+ as needed.
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 19]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ Implementations MAY use the matrices in Figures 4 through 9 to
+ implement downmixing from multichannel files using channel mapping
+ family 1 (Section 5.1.1.2), which are known to give acceptable
+ results for stereo. Matrices for 3 and 4 channels are normalized so
+ each coefficient row sums to 1 to avoid clipping. For 5 or more
+ channels, they are normalized to 2 as a compromise between clipping
+ and dynamic range reduction.
+
+ In these matrices the front-left and front-right channels are
+ generally passed through directly. When a surround channel is split
+ between both the left and right stereo channels, coefficients are
+ chosen so their squares sum to 1, which helps preserve the perceived
+ intensity. Rear channels are mixed more diffusely or attenuated to
+ maintain focus on the front channels.
+
+ L output = ( 0.585786 * left + 0.414214 * center )
+ R output = ( 0.414214 * center + 0.585786 * right )
+
+ Exact coefficient values are 1 and 1/sqrt(2), multiplied by
+ 1/(1 + 1/sqrt(2)) for normalization.
+
+ Figure 4: Stereo Downmix Matrix for the
+ Linear Surround Channel Mapping
+
+ / \ / \ / FL \
+ | L output | | 0.422650 0.000000 0.366025 0.211325 | | FR |
+ | R output | = | 0.000000 0.422650 0.211325 0.366025 | | RL |
+ \ / \ / \ RR /
+
+ Exact coefficient values are 1, sqrt(3)/2 and 1/2, multiplied by
+ 1/(1 + sqrt(3)/2 + 1/2) for normalization.
+
+ Figure 5: Stereo Downmix Matrix for the Quadraphonic Channel Mapping
+
+ / FL \
+ / \ / \ | FC |
+ | L | | 0.650802 0.460186 0.000000 0.563611 0.325401 | | FR |
+ | R | = | 0.000000 0.460186 0.650802 0.325401 0.563611 | | RL |
+ \ / \ / | RR |
+ \ /
+
+ Exact coefficient values are 1, 1/sqrt(2), sqrt(3)/2 and 1/2,
+ multiplied by 2/(1 + 1/sqrt(2) + sqrt(3)/2 + 1/2) for normalization.
+
+ Figure 6: Stereo Downmix Matrix for the 5.0 Surround Mapping
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 20]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ /FL \
+ / \ / \ |FC |
+ |L| | 0.529067 0.374107 0.000000 0.458186 0.264534 0.374107 | |FR |
+ |R| = | 0.000000 0.374107 0.529067 0.264534 0.458186 0.374107 | |RL |
+ \ / \ / |RR |
+ \LFE/
+ Exact coefficient values are 1, 1/sqrt(2), sqrt(3)/2 and 1/2,
+ multiplied by 2/(1 + 1/sqrt(2) + sqrt(3)/2 + 1/2 + 1/sqrt(2)) for
+ normalization.
+
+ Figure 7: Stereo Downmix Matrix for the 5.1 Surround Mapping
+
+ / \
+ | 0.455310 0.321953 0.000000 0.394310 0.227655 0.278819 0.321953 |
+ | 0.000000 0.321953 0.455310 0.227655 0.394310 0.278819 0.321953 |
+ \ /
+
+ Exact coefficient values are 1, 1/sqrt(2), sqrt(3)/2, 1/2 and
+ sqrt(3)/2/sqrt(2), multiplied by 2/(1 + 1/sqrt(2) + sqrt(3)/2 + 1/2 +
+ sqrt(3)/2/sqrt(2) + 1/sqrt(2)) for normalization. The coefficients
+ are in the same order as in Section 5.1.1.2 and the matrices above.
+
+ Figure 8: Stereo Downmix Matrix for the 6.1 Surround Mapping
+
+ / \
+ | .388631 .274804 .000000 .336565 .194316 .336565 .194316 .274804 |
+ | .000000 .274804 .388631 .194316 .336565 .194316 .336565 .274804 |
+ \ /
+
+ Exact coefficient values are 1, 1/sqrt(2), sqrt(3)/2 and 1/2,
+ multiplied by 2/(2 + 2/sqrt(2) + sqrt(3)) for normalization. The
+ coefficients are in the same order as in Section 5.1.1.2 and the
+ matrices above.
+
+ Figure 9: Stereo Downmix Matrix for the 7.1 Surround Mapping
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 21]
+
+RFC 7845 Ogg Opus April 2016
+
+
+5.2. Comment Header
+
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | 'O' | 'p' | 'u' | 's' |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | 'T' | 'a' | 'g' | 's' |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Vendor String Length |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | |
+ : Vendor String... :
+ | |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | User Comment List Length |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | User Comment #0 String Length |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | |
+ : User Comment #0 String... :
+ | |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | User Comment #1 String Length |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ : :
+
+ Figure 10: Comment Header Packet
+
+ The comment header consists of a 64-bit 'magic signature' field,
+ followed by data in the same format as the [VORBIS-COMMENT] header
+ used in Ogg Vorbis, except (like Ogg Theora and Speex) the final
+ 'framing bit' specified in the Vorbis specification is not present.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 22]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ 1. Magic Signature:
+
+ This is an 8-octet (64-bit) field that allows codec
+ identification and is human readable. It contains, in order, the
+ magic numbers:
+
+ 0x4F 'O'
+
+ 0x70 'p'
+
+ 0x75 'u'
+
+ 0x73 's'
+
+ 0x54 'T'
+
+ 0x61 'a'
+
+ 0x67 'g'
+
+ 0x73 's'
+
+ Starting with "Op" helps distinguish it from audio data packets,
+ as this is an invalid TOC sequence.
+
+ 2. Vendor String Length (32 bits, unsigned, little endian):
+
+ This field gives the length of the following vendor string, in
+ octets. It MUST NOT indicate that the vendor string is longer
+ than the rest of the packet.
+
+ 3. Vendor String (variable length, UTF-8 vector):
+
+ This is a simple human-readable tag for vendor information,
+ encoded as a UTF-8 string [RFC3629]. No terminating null octet
+ is necessary.
+
+ This tag is intended to identify the codec encoder and
+ encapsulation implementations, for tracing differences in
+ technical behavior. User-facing applications can use the
+ 'ENCODER' user comment tag to identify themselves.
+
+
+
+
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 23]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ 4. User Comment List Length (32 bits, unsigned, little endian):
+
+ This field indicates the number of user-supplied comments. It
+ MAY indicate there are zero user-supplied comments, in which case
+ there are no additional fields in the packet. It MUST NOT
+ indicate that there are so many comments that the comment string
+ lengths would require more data than is available in the rest of
+ the packet.
+
+ 5. User Comment #i String Length (32 bits, unsigned, little endian):
+
+ This field gives the length of the following user comment string,
+ in octets. There is one for each user comment indicated by the
+ 'user comment list length' field. It MUST NOT indicate that the
+ string is longer than the rest of the packet.
+
+ 6. User Comment #i String (variable length, UTF-8 vector):
+
+ This field contains a single user comment encoded as a UTF-8
+ string [RFC3629]. There is one for each user comment indicated
+ by the 'user comment list length' field.
+
+ The 'vendor string length' and 'user comment list length' fields are
+ REQUIRED, and implementations SHOULD treat a stream as invalid if it
+ contains a comment header that does not have enough data for these
+ fields, or that does not contain enough data for the corresponding
+ vendor string or user comments they describe. Making this check
+ before allocating the associated memory to contain the data helps
+ prevent a possible Denial-of-Service (DoS) attack from small comment
+ headers that claim to contain strings longer than the entire packet
+ or more user comments than could possibly fit in the packet.
+
+ Immediately following the user comment list, the comment header MAY
+ contain zero-padding or other binary data that is not specified here.
+ If the least-significant bit of the first byte of this data is 1,
+ then editors SHOULD preserve the contents of this data when updating
+ the tags, but if this bit is 0, all such data MAY be treated as
+ padding, and truncated or discarded as desired. This allows informal
+ experimentation with the format of this binary data until it can be
+ specified later.
+
+ The comment header can be arbitrarily large and might be spread over
+ a large number of Ogg pages. Implementations MUST avoid attempting
+ to allocate excessive amounts of memory when presented with a very
+ large comment header. To accomplish this, implementations MAY treat
+ a stream as invalid if it has a comment header larger than
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 24]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ 125,829,120 octets (120 MB), and MAY ignore individual comments that
+ are not fully contained within the first 61,440 octets of the comment
+ header.
+
+5.2.1. Tag Definitions
+
+ The user comment strings follow the NAME=value format described by
+ [VORBIS-COMMENT] with the same recommended tag names: ARTIST, TITLE,
+ DATE, ALBUM, and so on.
+
+ Two new comment tags are introduced here:
+
+ First, an optional gain for track normalization:
+
+ R128_TRACK_GAIN=-573
+
+ representing the volume shift needed to normalize the track's volume
+ during isolated playback, in random shuffle, and so on. The gain is
+ a Q7.8 fixed-point number in dB, as in the ID header's 'output gain'
+ field. This tag is similar to the REPLAYGAIN_TRACK_GAIN tag in
+ Vorbis [REPLAY-GAIN], except that the normal volume reference is the
+ [EBU-R128] standard.
+
+ Second, an optional gain for album normalization:
+
+ R128_ALBUM_GAIN=111
+
+ representing the volume shift needed to normalize the overall volume
+ when played as part of a particular collection of tracks. The gain
+ is also a Q7.8 fixed-point number in dB, as in the ID header's
+ 'output gain' field. The values '-573' and '111' given here are just
+ examples.
+
+ An Ogg Opus stream MUST NOT have more than one of each of these tags,
+ and, if present, their values MUST be an integer from -32768 to
+ 32767, inclusive, represented in ASCII as a base 10 number with no
+ whitespace. A leading '+' or '-' character is valid. Leading zeros
+ are also permitted, but the value MUST be represented by no more than
+ 6 characters. Other non-digit characters MUST NOT be present.
+
+ If present, R128_TRACK_GAIN and R128_ALBUM_GAIN MUST correctly
+ represent the R128 normalization gain relative to the 'output gain'
+ field specified in the ID header. If a player chooses to make use of
+ the R128_TRACK_GAIN tag or the R128_ALBUM_GAIN tag, it MUST apply
+ those gains _in addition_ to the 'output gain' value. If a tool
+ modifies the ID header's 'output gain' field, it MUST also update or
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 25]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ remove the R128_TRACK_GAIN and R128_ALBUM_GAIN comment tags if
+ present. A muxer SHOULD place the gain it wants other tools to use
+ by default into the 'output gain' field, and not the comment tag.
+
+ To avoid confusion with multiple normalization schemes, an Opus
+ comment header SHOULD NOT contain any of the REPLAYGAIN_TRACK_GAIN,
+ REPLAYGAIN_TRACK_PEAK, REPLAYGAIN_ALBUM_GAIN, or
+ REPLAYGAIN_ALBUM_PEAK tags, unless they are only to be used in some
+ context where there is guaranteed to be no such confusion.
+ [EBU-R128] normalization is preferred to the earlier REPLAYGAIN
+ schemes because of its clear definition and adoption by industry.
+ Peak normalizations are difficult to calculate reliably for lossy
+ codecs because of variation in excursion heights due to decoder
+ differences. In the authors' investigations, they were not applied
+ consistently or broadly enough to merit inclusion here.
+
+6. Packet Size Limits
+
+ Technically, valid Opus packets can be arbitrarily large due to the
+ padding format, although the amount of non-padding data they can
+ contain is bounded. These packets might be spread over a similarly
+ enormous number of Ogg pages. When encoding, implementations SHOULD
+ limit the use of padding in audio data packets to no more than is
+ necessary to make a VBR stream CBR, unless they have no reasonable
+ way to determine what is necessary. Demuxers SHOULD treat audio data
+ packets as invalid (treat them as if they were malformed Opus packets
+ with an invalid TOC sequence) if they are larger than 61,440 octets
+ per Opus stream, unless they have a specific reason for allowing
+ extra padding. Such packets necessarily contain more padding than
+ needed to make a stream CBR. Demuxers MUST avoid attempting to
+ allocate excessive amounts of memory when presented with a very large
+ packet. Demuxers MAY treat audio data packets as invalid or
+ partially process them if they are larger than 61,440 octets in an
+ Ogg Opus stream with channel mapping families 0 or 1. Demuxers MAY
+ treat audio data packets as invalid or partially process them in any
+ Ogg Opus stream if the packet is larger than 61,440 octets and also
+ larger than 7,680 octets per Opus stream. The presence of an
+ extremely large packet in the stream could indicate a memory
+ exhaustion attack or stream corruption.
+
+ In an Ogg Opus stream, the largest possible valid packet that does
+ not use padding has a size of (61,298*N - 2) octets. With
+ 255 streams, this is 15,630,988 octets and can span up to 61,298 Ogg
+ pages, all but one of which will have a granule position of -1. This
+ is, of course, a very extreme packet, consisting of 255 streams, each
+ containing 120 ms of audio encoded as 2.5 ms frames, each frame using
+ the maximum possible number of octets (1275) and stored in the least
+
+
+
+
+Terriberry, et al. Standards Track [Page 26]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ efficient manner allowed (a VBR code 3 Opus packet). Even in such a
+ packet, most of the data will be zeros as 2.5 ms frames cannot
+ actually use all 1275 octets.
+
+ The largest packet consisting of entirely useful data is
+ (15,326*N - 2) octets. This corresponds to 120 ms of audio encoded
+ as 10 ms frames in either SILK or Hybrid mode, but at a data rate of
+ over 1 Mbps, which makes little sense for the quality achieved.
+
+ A more reasonable limit is (7,664*N - 2) octets. This corresponds to
+ 120 ms of audio encoded as 20 ms stereo CELT mode frames, with a
+ total bitrate just under 511 kbps (not counting the Ogg encapsulation
+ overhead). For channel mapping family 1, N = 8 provides a reasonable
+ upper bound, as it allows for each of the 8 possible output channels
+ to be decoded from a separate stereo Opus stream. This gives a size
+ of 61,310 octets, which is rounded up to a multiple of 1,024 octets
+ to yield the audio data packet size of 61,440 octets that any
+ implementation is expected to be able to process successfully.
+
+7. Encoder Guidelines
+
+ When encoding Opus streams, Ogg muxers SHOULD take into account the
+ algorithmic delay of the Opus encoder.
+
+ In encoders derived from the reference implementation [RFC6716], the
+ number of samples can be queried with
+
+ opus_encoder_ctl(encoder_state, OPUS_GET_LOOKAHEAD(&delay_samples));
+
+ To achieve good quality in the very first samples of a stream,
+ implementations MAY use linear predictive coding (LPC) extrapolation
+ to generate at least 120 extra samples at the beginning to avoid the
+ Opus encoder having to encode a discontinuous signal. For more
+ information on linear prediction, see [LINEAR-PREDICTION]. For an
+ input file containing 'length' samples, the implementation SHOULD set
+ the 'pre-skip' header value to (delay_samples + extra_samples),
+ encode at least (length + delay_samples + extra_samples) samples, and
+ set the granule position of the last page to
+ (length + delay_samples + extra_samples). This ensures that the
+ encoded file has the same duration as the original, with no time
+ offset. The best way to pad the end of the stream is to also use LPC
+ extrapolation, but zero-padding is also acceptable.
+
+
+
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 27]
+
+RFC 7845 Ogg Opus April 2016
+
+
+7.1. LPC Extrapolation
+
+ The first step in LPC extrapolation is to compute linear prediction
+ coefficients [LPC-SAMPLE]. When extending the end of the signal,
+ order-N (typically with N ranging from 8 to 40) LPC analysis is
+ performed on a window near the end of the signal. The last N samples
+ are used as memory to an infinite impulse response (IIR) filter.
+
+ The filter is then applied on a zero input to extrapolate the end of
+ the signal. Let 'a(k)' be the kth LPC coefficient and 'x(n)' be the
+ nth sample of the signal. Each new sample past the end of the signal
+ is computed as
+
+ N
+ ---
+ x(n) = \ a(k)*x(n - k)
+ /
+ ---
+ k = 1
+
+ The process is repeated independently for each channel. It is
+ possible to extend the beginning of the signal by applying the same
+ process backward in time. When extending the beginning of the
+ signal, it is best to apply a "fade in" to the extrapolated signal,
+ e.g., by multiplying it by a half-Hanning window [HANNING].
+
+7.2. Continuous Chaining
+
+ In some applications, such as Internet radio, it is desirable to cut
+ a long stream into smaller chains, e.g., so the comment header can be
+ updated. This can be done simply by separating the input streams
+ into segments and encoding each segment independently. The drawback
+ of this approach is that it creates a small discontinuity at the
+ boundary due to the lossy nature of Opus. A muxer MAY avoid this
+ discontinuity by using the following procedure:
+
+ 1. Encode the last frame of the first segment as an independent
+ frame by turning off all forms of inter-frame prediction.
+ De-emphasis is allowed.
+
+ 2. Set the granule position of the last page to a point near the end
+ of the last frame.
+
+ 3. Begin the second segment with a copy of the last frame of the
+ first segment.
+
+ 4. Set the 'pre-skip' value of the second stream in such a way as to
+ properly join the two streams.
+
+
+
+Terriberry, et al. Standards Track [Page 28]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ 5. Continue the encoding process normally from there, without any
+ reset to the encoder.
+
+ In encoders derived from the reference implementation, inter-frame
+ prediction can be turned off by calling
+
+ opus_encoder_ctl(encoder_state, OPUS_SET_PREDICTION_DISABLED(1));
+
+ For best results, this implementation requires that prediction be
+ explicitly enabled again before resuming normal encoding, even after
+ a reset.
+
+8. Security Considerations
+
+ Implementations of the Opus codec need to take appropriate security
+ considerations into account, as outlined in [RFC4732]. This is just
+ as much a problem for the container as it is for the codec itself.
+ Malicious payloads and/or input streams can be used to attack codec
+ implementations. Implementations MUST NOT overrun their allocated
+ memory nor consume excessive resources when decoding payloads or
+ processing input streams. Although problems in encoding applications
+ are typically rarer, this still applies to a muxer, as
+ vulnerabilities would allow an attacker to attack transcoding
+ gateways.
+
+ Header parsing code contains the most likely area for potential
+ overruns. It is important for implementations to ensure their
+ buffers contain enough data for all of the required fields before
+ attempting to read it (for example, for all of the channel map data
+ in the ID header). Implementations would do well to validate the
+ indices of the channel map, also, to ensure they meet all of the
+ restrictions outlined in Section 5.1.1, in order to avoid attempting
+ to read data from channels that do not exist.
+
+ To avoid excessive resource usage, we advise implementations to be
+ especially wary of streams that might cause them to process far more
+ data than was actually transmitted. For example, a relatively small
+ comment header may contain values for the string lengths or user
+ comment list length that imply that it is many gigabytes in size.
+ Even computing the size of the required buffer could overflow a
+ 32-bit integer, and actually attempting to allocate such a buffer
+ before verifying it would be a reasonable size is a bad idea. After
+ reading the user comment list length, implementations might wish to
+ verify that the header contains at least the minimum amount of data
+ for that many comments (4 additional octets per comment, to indicate
+ each has a length of zero) before proceeding any further, again
+ taking care to avoid overflow in these calculations. If allocating
+
+
+
+
+Terriberry, et al. Standards Track [Page 29]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ an array of pointers to point at these strings, the size of the
+ pointers may be larger than 4 octets, potentially requiring a
+ separate overflow check.
+
+ Another bug in this class we have observed more than once involves
+ the handling of invalid data at the end of a stream. Often,
+ implementations will seek to the end of a stream to locate the last
+ timestamp in order to compute its total duration. If they do not
+ find a valid capture pattern and Ogg page from the desired logical
+ stream, they will back up and try again. If care is not taken to
+ avoid re-scanning data that was already scanned, this search can
+ quickly devolve into something with a complexity that is quadratic in
+ the amount of invalid data.
+
+ In general, when seeking, implementations will wish to be cautious
+ about the effects of invalid granule position values and ensure all
+ algorithms will continue to make progress and eventually terminate,
+ even if these are missing or out of order.
+
+ Like most other container formats, Ogg Opus streams SHOULD NOT be
+ used with insecure ciphers or cipher modes that are vulnerable to
+ known-plaintext attacks. Elements such as the Ogg page capture
+ pattern and the 'magic signature' fields in the ID header and the
+ comment header all have easily predictable values, in addition to
+ various elements of the codec data itself.
+
+9. Content Type
+
+ An "Ogg Opus file" consists of one or more sequentially multiplexed
+ segments, each containing exactly one Ogg Opus stream. The
+ RECOMMENDED mime-type for Ogg Opus files is "audio/ogg".
+
+ If more specificity is desired, one MAY indicate the presence of Opus
+ streams using the codecs parameter defined in [RFC6381] and
+ [RFC5334], e.g.,
+
+ audio/ogg; codecs=opus
+
+ for an Ogg Opus file.
+
+ The RECOMMENDED filename extension for Ogg Opus files is '.opus'.
+
+ When Opus is concurrently multiplexed with other streams in an Ogg
+ container, one SHOULD use one of the "audio/ogg", "video/ogg", or
+ "application/ogg" mime-types, as defined in [RFC5334]. Such streams
+ are not strictly "Ogg Opus files" as described above, since they
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 30]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ contain more than a single Opus stream per sequentially multiplexed
+ segment, e.g., video or multiple audio tracks. In such cases, the
+ '.opus' filename extension is NOT RECOMMENDED.
+
+ In either case, this document updates [RFC5334] to add "opus" as a
+ codecs parameter value with char[8]: 'OpusHead' as Codec Identifier.
+
+10. IANA Considerations
+
+ Per this document, IANA has updated the "Media Types" registry by
+ adding .opus as a file extension for "audio/ogg" and adding itself as
+ a reference alongside [RFC5334] for "audio/ogg", "video/ogg", and
+ "application/ogg" Media Types.
+
+ This document defines a new registry "Opus Channel Mapping Families"
+ to indicate how the semantic meanings of the channels in a multi-
+ channel Opus stream are described. IANA has created a new namespace
+ of "Opus Channel Mapping Families". This registry is listed on the
+ IANA Matrix. Modifications to this registry follow the
+ "Specification Required" registration policy as defined in [RFC5226].
+ Each registry entry consists of a Channel Mapping Family Number,
+ which is specified in decimal in the range 0 to 255, inclusive, and a
+ Reference (or list of references). Each Reference must point to
+ sufficient documentation to describe what information is coded in the
+ Opus identification header for this channel mapping family, how a
+ demuxer determines the stream count ('N') and coupled stream count
+ ('M') from this information, and how it determines the proper
+ interpretation of each of the decoded channels.
+
+ This document defines three initial assignments for this registry.
+
+ +-------+---------------------------+
+ | Value | Reference |
+ +-------+---------------------------+
+ | 0 | RFC 7845, Section 5.1.1.1 |
+ | | |
+ | 1 | RFC 7845, Section 5.1.1.2 |
+ | | |
+ | 255 | RFC 7845, Section 5.1.1.3 |
+ +-------+---------------------------+
+
+ The designated expert will determine if the Reference points to a
+ specification that meets the requirements for permanence and ready
+ availability laid out in [RFC5226] and whether it specifies the
+ information described above with sufficient clarity to allow
+ interoperable implementations.
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 31]
+
+RFC 7845 Ogg Opus April 2016
+
+
+11. References
+
+11.1. Normative References
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119,
+ DOI 10.17487/RFC2119, March 1997,
+ <https://www.rfc-editor.org/info/rfc2119>.
+
+ [RFC3533] Pfeiffer, S., "The Ogg Encapsulation Format Version 0",
+ RFC 3533, DOI 10.17487/RFC3533, May 2003,
+ <https://www.rfc-editor.org/info/rfc3533>.
+
+ [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO
+ 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
+ 2003, <https://www.rfc-editor.org/info/rfc3629>.
+
+ [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an
+ IANA Considerations Section in RFCs", BCP 26, RFC 5226,
+ DOI 10.17487/RFC5226, May 2008,
+ <https://www.rfc-editor.org/info/rfc5226>.
+
+ [RFC5334] Goncalves, I., Pfeiffer, S., and C. Montgomery, "Ogg Media
+ Types", RFC 5334, DOI 10.17487/RFC5334, September 2008,
+ <https://www.rfc-editor.org/info/rfc5334>.
+
+ [RFC6381] Gellens, R., Singer, D., and P. Frojdh, "The 'Codecs' and
+ 'Profiles' Parameters for "Bucket" Media Types", RFC 6381,
+ DOI 10.17487/RFC6381, August 2011,
+ <https://www.rfc-editor.org/info/rfc6381>.
+
+ [RFC6716] Valin, JM., Vos, K., and T. Terriberry, "Definition of the
+ Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
+ September 2012, <https://www.rfc-editor.org/info/rfc6716>.
+
+ [EBU-R128]
+ EBU Technical Committee, "Loudness Recommendation EBU
+ R128", August 2011, <https://tech.ebu.ch/loudness>.
+
+ [VORBIS-COMMENT]
+ Montgomery, C., "Ogg Vorbis I Format Specification:
+ Comment Field and Header Specification", July 2002,
+ <https://www.xiph.org/vorbis/doc/v-comment.html>.
+
+
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 32]
+
+RFC 7845 Ogg Opus April 2016
+
+
+11.2. Informative References
+
+ [RFC4732] Handley, M., Ed., Rescorla, E., Ed., and IAB, "Internet
+ Denial-of-Service Considerations", RFC 4732,
+ DOI 10.17487/RFC4732, December 2006,
+ <https://www.rfc-editor.org/info/rfc4732>.
+
+ [RFC7587] Spittka, J., Vos, K., and JM. Valin, "RTP Payload Format
+ for the Opus Speech and Audio Codec", RFC 7587,
+ DOI 10.17487/RFC7587, June 2015,
+ <https://www.rfc-editor.org/info/rfc7587>.
+
+ [FLAC] Coalson, J., "FLAC - Free Lossless Audio Codec Format
+ Description", January 2008,
+ <https://xiph.org/flac/format.html>.
+
+ [HANNING] Wikipedia, "Hann window", February 2016,
+ <https://en.wikipedia.org/w/index.php?title=Window_functio
+ n&oldid=703074467#Hann_.28Hanning.29_window>.
+
+ [LINEAR-PREDICTION]
+ Wikipedia, "Linear Predictive Coding", October 2015,
+ <https://en.wikipedia.org/w/
+ index.php?title=Linear_predictive_coding&oldid=687498962>.
+
+ [LPC-SAMPLE]
+ Degener, J. and C. Bormann, "Autocorrelation LPC coeff
+ generation algorithm (Vorbis source code)", November 1994,
+ <https://svn.xiph.org/trunk/vorbis/lib/lpc.c>.
+
+ [Q-NOTATION]
+ Wikipedia, "Q (number format)", December 2015,
+ <https://en.wikipedia.org/w/
+ index.php?title=Q_%28number_format%29&oldid=697252615>.
+
+ [REPLAY-GAIN]
+ Parker, C. and M. Leese, "VorbisComment: Replay Gain",
+ June 2009,
+ <https://wiki.xiph.org/VorbisComment#Replay_Gain>.
+
+ [SEEKING] Pfeiffer, S., Parker, C., and G. Maxwell, "Granulepos
+ Encoding and How Seeking Really Works", May 2012,
+ <https://wiki.xiph.org/Seeking>.
+
+
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 33]
+
+RFC 7845 Ogg Opus April 2016
+
+
+ [VORBIS-MAPPING]
+ Montgomery, C., "The Vorbis I Specification, Section 4.3.9
+ Output Channel Order", January 2010,
+ <https://www.xiph.org/vorbis/doc/
+ Vorbis_I_spec.html#x1-810004.3.9>.
+
+ [VORBIS-TRIM]
+ Montgomery, C., "The Vorbis I Specification, Appendix A:
+ Embedding Vorbis into an Ogg stream", November 2008,
+ <https://xiph.org/vorbis/doc/
+ Vorbis_I_spec.html#x1-132000A.2>.
+
+ [WAVE-MULTICHANNEL]
+ Microsoft Corporation, "Multiple Channel Audio Data and
+ WAVE Files", March 2007,
+ <https://msdn.microsoft.com/en-us/windows/hardware/
+ gg463006.aspx>.
+
+Acknowledgments
+
+ Thanks to Ben Campbell, Joel M. Halpern, Mark Harris, Greg Maxwell,
+ Christopher "Monty" Montgomery, Jean-Marc Valin, Stephan Wenger, and
+ Mo Zanaty for their valuable contributions to this document.
+ Additional thanks to Andrew D'Addesio, Greg Maxwell, and Vincent
+ Penquerc'h for their feedback based on early implementations.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 34]
+
+RFC 7845 Ogg Opus April 2016
+
+
+Authors' Addresses
+
+ Timothy B. Terriberry
+ Mozilla Corporation
+ 331 E. Evelyn Ave.
+ Mountain View, CA 94041
+ United States
+
+ Phone: +1 650 903-0800
+ Email: tterribe@xiph.org
+
+
+ Ron Lee
+ Voicetronix
+ 246 Pulteney Street, Level 1
+ Adelaide, SA 5000
+ Australia
+
+ Phone: +61 8 8232 9112
+ Email: ron@debian.org
+
+
+ Ralph Giles
+ Mozilla Corporation
+ 163 West Hastings Street
+ Vancouver, BC V6B 1H5
+ Canada
+
+ Phone: +1 778 785 1540
+ Email: giles@xiph.org
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Terriberry, et al. Standards Track [Page 35]
+