Diffstat (limited to 'doc/rfc/rfc7587.txt')
-rw-r--r-- doc/rfc/rfc7587.txt 1011
1 files changed, 1011 insertions, 0 deletions
diff --git a/doc/rfc/rfc7587.txt b/doc/rfc/rfc7587.txt
new file mode 100644
index 0000000..a4d1140
--- /dev/null
+++ b/doc/rfc/rfc7587.txt
@@ -0,0 +1,1011 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF)                        J. Spittka
+Request for Comments: 7587
+Category: Standards Track                                         K. Vos
+ISSN: 2070-1721                                                  vocTone
+                                                               JM. Valin
+                                                                 Mozilla
+                                                               June 2015
+
+
+ RTP Payload Format for the Opus Speech and Audio Codec
+
+Abstract
+
+ This document defines the Real-time Transport Protocol (RTP) payload
+ format for packetization of Opus-encoded speech and audio data
+ necessary to integrate the codec in the most compatible way. It also
+ provides an applicability statement for the use of Opus over RTP.
+ Further, it describes media type registrations for the RTP payload
+ format.
+
+Status of This Memo
+
+ This is an Internet Standards Track document.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Further information on
+ Internet Standards is available in Section 2 of RFC 5741.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ http://www.rfc-editor.org/info/rfc7587.
+
+Copyright Notice
+
+ Copyright (c) 2015 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (http://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Simplified BSD License text as described in Section 4.e of
+ the Trust Legal Provisions and are provided without warranty as
+ described in the Simplified BSD License.
+
+
+
+Spittka, et al. Standards Track [Page 1]
+
+RFC 7587 RTP Payload Format for Opus June 2015
+
+
+Table of Contents
+
+ 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2
+ 2. Conventions, Definitions, and Acronyms Used in This Document 3
+ 3. Opus Codec . . . . . . . . . . . . . . . . . . . . . . . . . 4
+ 3.1. Network Bandwidth . . . . . . . . . . . . . . . . . . . . 4
+ 3.1.1. Recommended Bitrate . . . . . . . . . . . . . . . . . 4
+ 3.1.2. Variable versus Constant Bitrate . . . . . . . . . . 4
+ 3.1.3. Discontinuous Transmission (DTX) . . . . . . . . . . 5
+ 3.2. Complexity . . . . . . . . . . . . . . . . . . . . . . . 6
+ 3.3. Forward Error Correction (FEC) . . . . . . . . . . . . . 6
+ 3.4. Stereo Operation . . . . . . . . . . . . . . . . . . . . 6
+ 4. Opus RTP Payload Format . . . . . . . . . . . . . . . . . . . 7
+ 4.1. RTP Header Usage . . . . . . . . . . . . . . . . . . . . 7
+ 4.2. Payload Structure . . . . . . . . . . . . . . . . . . . . 7
+ 5. Congestion Control . . . . . . . . . . . . . . . . . . . . . 8
+ 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 9
+ 6.1. Opus Media Type Registration . . . . . . . . . . . . . . 9
+ 7. SDP Considerations . . . . . . . . . . . . . . . . . . . . . 12
+ 7.1. SDP Offer/Answer Considerations . . . . . . . . . . . . . 13
+ 7.2. Declarative SDP Considerations for Opus . . . . . . . . . 15
+ 8. Security Considerations . . . . . . . . . . . . . . . . . . . 15
+ 9. References . . . . . . . . . . . . . . . . . . . . . . . . . 16
+ 9.1. Normative References . . . . . . . . . . . . . . . . . . 16
+ 9.2. Informative References . . . . . . . . . . . . . . . . . 17
+ Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . 18
+ Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 18
+
+1. Introduction
+
+ Opus [RFC6716] is a speech and audio codec developed within the IETF
+ Internet Wideband Audio Codec working group. The codec has a very
+ low algorithmic delay, and it is highly scalable in terms of audio
+ bandwidth, bitrate, and complexity. Further, it provides different
+ modes to efficiently encode speech signals as well as music signals,
+ thus making it the codec of choice for various applications using the
+ Internet or similar networks.
+
+ This document defines the Real-time Transport Protocol (RTP)
+ [RFC3550] payload format for packetization of Opus-encoded speech and
+ audio data necessary to integrate Opus in the most compatible way.
+ It also provides an applicability statement for the use of Opus over
+ RTP. Further, it describes media type registrations for the RTP
+ payload format.
+
+2. Conventions, Definitions, and Acronyms Used in This Document
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in [RFC2119].
+
+ audio bandwidth: The range of audio frequencies being coded
+
+ CBR: Constant bitrate
+
+ CPU: Central Processing Unit
+
+ DTX: Discontinuous Transmission
+
+ FEC: Forward Error Correction
+
+ IP: Internet Protocol
+
+ samples: Speech or audio samples (per channel)
+
+ SDP: Session Description Protocol
+
+ SSRC: Synchronization source
+
+ VBR: Variable bitrate
+
+ Throughout this document, we refer to the following definitions:
+
+ +--------------+----------------+-----------------+-----------------+
+ | Abbreviation | Name | Audio Bandwidth | Sampling Rate |
+ | | | (Hz) | (Hz) |
+ +--------------+----------------+-----------------+-----------------+
+ | NB | Narrowband | 0 - 4000 | 8000 |
+ | | | | |
+ | MB | Mediumband | 0 - 6000 | 12000 |
+ | | | | |
+ | WB | Wideband | 0 - 8000 | 16000 |
+ | | | | |
+ | SWB | Super-wideband | 0 - 12000 | 24000 |
+ | | | | |
+ | FB | Fullband | 0 - 20000 | 48000 |
+ +--------------+----------------+-----------------+-----------------+
+
+ Table 1: Audio Bandwidth Naming
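For implementers, Table 1 can be carried as a small lookup table. The following Python sketch is illustrative only: the names OPUS_BANDWIDTHS and widest_bandwidth are hypothetical and are not defined by this document, and the policy of choosing the widest entry that fits a given sampling rate is an assumption made for the example.

```python
# Table 1 as data: abbreviation -> (name, upper audio band edge in Hz,
# sampling rate in Hz). The names used here are hypothetical.
OPUS_BANDWIDTHS = {
    "NB": ("Narrowband", 4000, 8000),
    "MB": ("Mediumband", 6000, 12000),
    "WB": ("Wideband", 8000, 16000),
    "SWB": ("Super-wideband", 12000, 24000),
    "FB": ("Fullband", 20000, 48000),
}


def widest_bandwidth(max_rate_hz: int) -> str:
    """Return the widest Table 1 entry whose sampling rate fits max_rate_hz."""
    fitting = [
        abbr
        for abbr, (_name, _edge, rate) in OPUS_BANDWIDTHS.items()
        if rate <= max_rate_hz
    ]
    if not fitting:
        raise ValueError("rate is below the narrowband sampling rate (8000 Hz)")
    # Pick the entry with the highest sampling rate among those that fit.
    return max(fitting, key=lambda abbr: OPUS_BANDWIDTHS[abbr][2])
```

For example, a device limited to 44100 Hz playback would map to "SWB" under this policy, since the fullband sampling rate of 48000 Hz does not fit.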
+
+3. Opus Codec
+
+ Opus encodes speech signals as well as general audio signals. Two
+ different modes can be chosen, a voice mode or an audio mode, to
+ allow the most efficient coding depending on the type of the input
+ signal, the sampling frequency of the input signal, and the intended
+ application.
+
+ The voice mode allows efficient encoding of voice signals at lower
+ bitrates while the audio mode is optimized for general audio signals
+ at medium and higher bitrates.
+
+ Opus is highly scalable in terms of audio bandwidth, bitrate, and
+ complexity. Further, Opus allows transmitting stereo signals with
+ in-band signaling in the bitstream.
+
+3.1. Network Bandwidth
+
+ Opus supports bitrates from 6 kbit/s to 510 kbit/s. The bitrate can
+ be changed dynamically within that range. All other parameters being
+ equal, higher bitrates result in higher audio quality.
+
+3.1.1. Recommended Bitrate
+
+ For a frame size of 20 ms, these are the bitrate "sweet spots" for
+ Opus in various configurations:
+
+ o 8-12 kbit/s for NB speech,
+
+ o 16-20 kbit/s for WB speech,
+
+ o 28-40 kbit/s for FB speech,
+
+ o 48-64 kbit/s for FB mono music, and
+
+ o 64-128 kbit/s for FB stereo music.
+
+3.1.2. Variable versus Constant Bitrate
+
+ For the same average bitrate, variable bitrate (VBR) can achieve
+ higher audio quality than constant bitrate (CBR). For the majority
+ of voice transmission applications, VBR is the best choice. One
+ reason for choosing CBR is the potential information leak that
+ _might_ occur when encrypting the compressed stream. See [RFC6562]
+ for guidelines on when VBR is appropriate for encrypted audio
+ communications. In the case where an existing VBR stream needs to be
+ converted to CBR for security reasons, the Opus padding mechanism
+ described in [RFC6716] is the RECOMMENDED way to achieve padding
+ because the RTP padding bit is unencrypted.
+
+ The bitrate can be adjusted at any point in time. To avoid
+ congestion, the average bitrate SHOULD NOT exceed the available
+ network bandwidth. If no target bitrate is specified, the bitrates
+ specified in Section 3.1.1 are RECOMMENDED.
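The bitrate guidance above, together with the sweet spots of Section 3.1.1, can be sketched as a small selection routine. This is a minimal sketch under stated assumptions: the function name pick_target_bitrate, the table keys, and the policy of using the top of the sweet spot unless the network estimate is lower are all hypothetical choices for illustration.

```python
OPUS_MIN_BPS = 6_000    # lowest bitrate supported by Opus (Section 3.1)
OPUS_MAX_BPS = 510_000  # highest bitrate supported by Opus (Section 3.1)

# Section 3.1.1 sweet spots for 20 ms frames, in bit/s.
SWEET_SPOTS = {
    ("NB", "speech"): (8_000, 12_000),
    ("WB", "speech"): (16_000, 20_000),
    ("FB", "speech"): (28_000, 40_000),
    ("FB", "mono music"): (48_000, 64_000),
    ("FB", "stereo music"): (64_000, 128_000),
}


def pick_target_bitrate(bandwidth: str, content: str, available_bps: int) -> int:
    """Choose an average bitrate that does not exceed the network estimate.

    Assumed policy: aim for the top of the sweet spot, but never above the
    available bandwidth or the Opus maximum.
    """
    if available_bps < OPUS_MIN_BPS:
        raise ValueError("available bandwidth is below the Opus minimum")
    _low, high = SWEET_SPOTS[(bandwidth, content)]
    return min(high, available_bps, OPUS_MAX_BPS)
```

With a generous network estimate, wideband speech settles at 20000 bit/s; when the estimate drops below the sweet spot, the estimate itself becomes the target.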
+
+3.1.3. Discontinuous Transmission (DTX)
+
+ Opus can, as described in Section 3.1.2, be operated with a variable
+ bitrate. In that case, the encoder will automatically reduce the
+ bitrate for certain input signals, like periods of silence. When
+ using continuous transmission, it will reduce the bitrate when the
+ characteristics of the input signal permit, but it will never
+ interrupt the transmission to the receiver. Therefore, the received
+ signal will maintain the same high level of audio quality over the
+ full duration of a transmission while minimizing the average bitrate
+ over time.
+
+ In cases where the bitrate of Opus needs to be reduced even further
+ or in cases where only constant bitrate is available, the Opus
+ encoder can use Discontinuous Transmission (DTX), where parts of the
+ encoded signal that correspond to periods of silence in the input
+ speech or audio signal are not transmitted to the receiver. A
+ receiver can distinguish between DTX and packet loss by looking for
+ gaps in the sequence number, as described by Section 4.1
+ of [RFC3551].
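The distinction between DTX and packet loss described above can be sketched as follows. The function name and return labels are hypothetical, and the sketch assumes that packets have already been reordered and de-duplicated, and that the duration of the previous packet is known in units of the 48000 Hz RTP clock.

```python
def classify_arrival(prev_seq: int, prev_ts: int, prev_duration: int,
                     seq: int, ts: int) -> str:
    """Classify the interval between two consecutively received packets.

    prev_duration is the duration of the previous packet in 48 kHz ticks.
    Sequence numbers are 16 bits wide and timestamps 32 bits wide, so the
    differences are taken modulo their field sizes.
    """
    seq_step = (seq - prev_seq) & 0xFFFF
    ts_step = (ts - prev_ts) & 0xFFFFFFFF
    if seq_step != 1:
        return "loss"        # a gap in the sequence numbers
    if ts_step > prev_duration:
        return "dtx"         # no sequence gap, but the sender paused
    return "continuous"
```

A timestamp jump with consecutive sequence numbers therefore indicates that the sender deliberately stopped transmitting, while a sequence number gap indicates that packets went missing in transit.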
+
+ On the receiving side, the non-transmitted parts will be handled by a
+ frame loss concealment unit in the Opus decoder, which generates a
+ comfort noise signal to replace the non-transmitted parts of the
+ speech or audio signal. Using Comfort Noise as defined in [RFC3389]
+ with Opus is discouraged. The transmitter MUST drop whole frames
+ only, based on the size of the last transmitted frame, to ensure
+ successive RTP timestamps differ by a multiple of 120 and to allow
+ the receiver to use whole frames for concealment.
+
+ DTX can be used with both variable and constant bitrate. It will
+ have a slightly lower speech or audio quality than continuous
+ transmission. Therefore, using continuous transmission is
+ RECOMMENDED unless constraints on available network bandwidth are
+ severe.
+
+3.2. Complexity
+
+ Complexity of the encoder can be scaled to optimize for CPU resources
+ in real time, mostly as a trade-off between audio quality and
+ bitrate. Also, different modes of Opus have different complexity.
+
+3.3. Forward Error Correction (FEC)
+
+ The voice mode of Opus allows for embedding in-band Forward Error
+ Correction (FEC) data into the Opus bitstream. This FEC scheme adds
+ redundant information about the previous packet (N-1) to the current
+ output packet N. For each frame, the encoder decides whether to use
+ FEC based on (1) an externally provided estimate of the channel's
+ packet loss rate; (2) an externally provided estimate of the
+ channel's capacity; (3) the sensitivity of the audio or speech signal
+ to packet loss; and (4) whether the receiving decoder has indicated
+ it can take advantage of in-band FEC information. The decision to
+ send in-band FEC information is entirely controlled by the encoder;
+ therefore, no special precautions for the payload have to be taken.
+
+ On the receiving side, the decoder can take advantage of this
+ additional information when it loses a packet and the next packet is
+ available. In order to use the FEC data, the jitter buffer needs to
+ provide access to payloads with the FEC data. Instead of performing
+ loss concealment for a missing packet, the receiver can then
+ configure its decoder to decode the FEC data from the next packet.
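The receiver-side choice described above can be sketched as a decision function over the jitter buffer contents. The name plan_decode, the action labels, and the representation of the jitter buffer as a dictionary keyed by sequence number are assumptions made for this example.

```python
def plan_decode(received: dict, seq: int) -> tuple:
    """Decide how to produce audio for sequence number seq.

    received maps RTP sequence numbers to Opus payloads (a stand-in for a
    jitter buffer). Returns an (action, payload) pair.
    """
    if seq in received:
        return ("decode", received[seq])
    next_seq = (seq + 1) & 0xFFFF  # sequence numbers wrap at 16 bits
    if next_seq in received:
        # The packet is missing, but its successor may carry FEC data for it.
        return ("decode_fec", received[next_seq])
    # Neither the packet nor its successor is available.
    return ("conceal", None)
```

The sketch cannot know whether the next packet actually contains FEC data, since that decision belongs to the encoder; a real decoder is expected to fall back to loss concealment when it does not. In the libopus reference API, the "decode_fec" action would roughly correspond to calling opus_decode on the next packet with the decode_fec argument set.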
+
+ Any compliant Opus decoder is capable of ignoring FEC information
+ when it is not needed, so encoding with FEC cannot cause
+ interoperability problems. However, if FEC cannot be used on the
+ receiving side, then FEC SHOULD NOT be used, as it leads to an
+ inefficient usage of network resources. Decoder support for FEC
+ SHOULD be indicated at the time a session is set up.
+
+3.4. Stereo Operation
+
+ Opus allows for transmission of stereo audio signals. This operation
+ is signaled in-band in the Opus bitstream and no special arrangement
+ is needed in the payload format. An Opus decoder is capable of
+ handling a stereo encoding, but an application might only be capable
+ of consuming a single audio channel.
+
+ If a decoder cannot take advantage of the benefits of a stereo
+ signal, this SHOULD be indicated at the time a session is set up. In
+ that case, the sending side SHOULD NOT send stereo signals as it
+ leads to an inefficient usage of network resources.
+
+4. Opus RTP Payload Format
+
+ The payload format for Opus consists of the RTP header and Opus
+ payload data.
+
+4.1. RTP Header Usage
+
+ The format of the RTP header is specified in [RFC3550]. The use of
+ the fields of the RTP header by the Opus payload format is consistent
+ with that specification.
+
+ The payload length of Opus is an integer number of octets; therefore,
+ no padding is necessary. The payload MAY be padded by an integer
+ number of octets according to [RFC3550], although the Opus internal
+ padding is preferred.
+
+ The timestamp, sequence number, and marker bit (M) of the RTP header
+ are used in accordance with Section 4.1 of [RFC3551].
+
+ The RTP payload type for Opus is to be assigned dynamically.
+
+ The receiving side MUST be prepared to receive duplicate RTP packets.
+ The receiver MUST provide at most one of those payloads to the Opus
+ decoder for decoding, and it MUST discard the others.
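One possible way to meet the duplicate-handling requirement is a bounded record of recently seen packets. The class below is a hypothetical sketch: the name DuplicateFilter, the window size, and the choice to key on the (SSRC, sequence number) pair are assumptions, and a production receiver would normally fold this check into its jitter buffer.

```python
class DuplicateFilter:
    """Remember recently seen (SSRC, sequence number) pairs."""

    def __init__(self, window: int = 512):
        self.window = window
        self._seen = {}  # dicts keep insertion order; used as a bounded set

    def accept(self, ssrc: int, seq: int) -> bool:
        """Return True for the first copy of a packet, False for repeats."""
        key = (ssrc, seq & 0xFFFF)
        if key in self._seen:
            return False  # duplicate: discard, do not feed the decoder
        self._seen[key] = None
        if len(self._seen) > self.window:
            # Forget the oldest entry so memory use stays bounded.
            del self._seen[next(iter(self._seen))]
        return True
```

Only packets for which accept returns True would be handed to the Opus decoder. The window has to be small enough that a sequence number reused after wraparound is not mistaken for a duplicate.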
+
+ Opus supports 5 different audio bandwidths, which can be adjusted
+ during a stream. The RTP timestamp is incremented with a 48000 Hz
+ clock rate for all modes of Opus and all sampling rates. The unit
+ for the timestamp is samples per single (mono) channel. The RTP
+ timestamp corresponds to the sample time of the first encoded sample
+ in the encoded frame. For data encoded with sampling rates other
+ than 48000 Hz, the sampling rate has to be adjusted to 48000 Hz.
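The adjustment to the 48000 Hz clock is a simple scaling, sketched below. The function name is hypothetical; the sketch relies on the fact that every sampling rate in Table 1 divides 48000 evenly.

```python
RTP_CLOCK_HZ = 48000  # RTP clock rate used for all Opus modes


def samples_to_rtp_ticks(samples: int, sampling_rate_hz: int) -> int:
    """Convert a per-channel sample count into RTP timestamp units."""
    if RTP_CLOCK_HZ % sampling_rate_hz != 0:
        raise ValueError("sampling rate does not divide the RTP clock rate")
    return samples * (RTP_CLOCK_HZ // sampling_rate_hz)
```

For example, a 20 ms frame holds 320 samples at 16000 Hz and 160 samples at 8000 Hz; both advance the RTP timestamp by 960.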
+
+4.2. Payload Structure
+
+ The Opus encoder can output encoded frames representing 2.5, 5, 10,
+ 20, 40, or 60 ms of speech or audio data. Further, an arbitrary
+ number of frames can be combined into a packet, up to a maximum
+ packet duration representing 120 ms of speech or audio data. The
+ grouping of one or more Opus frames into a single Opus packet is
+ defined in Section 3 of [RFC6716]. An RTP payload MUST contain
+ exactly one Opus packet as defined by that document.
+
+ Figure 1 shows the structure combined with the RTP header.
+
+ +----------+--------------+
+ |RTP Header| Opus Payload |
+ +----------+--------------+
+
+ Figure 1: Packet Structure with RTP Header
+
+ Table 2 shows supported frame sizes in milliseconds of encoded speech
+ or audio data for the speech and audio modes (Mode) and sampling
+ rates (fs) of Opus, and it shows how the timestamp is incremented for
+ packetization (ts incr). If the Opus encoder outputs multiple
+ encoded frames into a single packet, the timestamp increment is the
+ sum of the increments for the individual frames.
+
+ +---------+-----------------+-----+-----+-----+-----+------+------+
+ | Mode | fs | 2.5 | 5 | 10 | 20 | 40 | 60 |
+ +---------+-----------------+-----+-----+-----+-----+------+------+
+ | ts incr | all | 120 | 240 | 480 | 960 | 1920 | 2880 |
+ | | | | | | | | |
+ | voice | NB/MB/WB/SWB/FB | x | x | o | o | o | o |
+ | | | | | | | | |
+ | audio | NB/WB/SWB/FB | o | o | o | o | x | x |
+ +---------+-----------------+-----+-----+-----+-----+------+------+
+
+ Table 2: Supported Opus frame sizes and timestamp increments are
+ marked with an o. Unsupported ones are marked with an x.
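The timestamp arithmetic of Table 2 can be sketched as follows. The names are hypothetical, and the sketch only checks the constraints stated in this section: the per-mode frame sizes of Table 2 and the 120 ms limit on packet duration.

```python
from fractions import Fraction

RTP_TICKS_PER_MS = 48  # 48000 Hz clock
MAX_PACKET_MS = 120    # maximum packet duration (Section 4.2)

# Frame sizes in ms that Table 2 marks with an "o" for each mode.
SUPPORTED_FRAME_MS = {
    "voice": {Fraction(10), Fraction(20), Fraction(40), Fraction(60)},
    "audio": {Fraction(5, 2), Fraction(5), Fraction(10), Fraction(20)},
}


def frame_increment(mode: str, frame_ms) -> int:
    """Timestamp increment for one frame, per the "ts incr" row of Table 2."""
    size = Fraction(frame_ms)  # exact arithmetic, so 2.5 ms is not rounded
    if size not in SUPPORTED_FRAME_MS[mode]:
        raise ValueError(f"{frame_ms} ms frames are not supported in {mode} mode")
    return int(size * RTP_TICKS_PER_MS)


def packet_increment(mode: str, frame_sizes_ms) -> int:
    """Timestamp increment for a packet: the sum over its frames."""
    total_ms = sum(Fraction(ms) for ms in frame_sizes_ms)
    if not 0 < total_ms <= MAX_PACKET_MS:
        raise ValueError("packet duration must be in the range (0, 120] ms")
    return sum(frame_increment(mode, ms) for ms in frame_sizes_ms)
```

A packet holding three 20 ms voice frames advances the timestamp by 2880, and the largest permitted packet (120 ms) advances it by 5760.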
+
+5. Congestion Control
+
+ The target bitrate of Opus can be adjusted at any point in time, thus
+ allowing efficient congestion control. Furthermore, the amount of
+ encoded speech or audio data encoded in a single packet can be used
+ for congestion control, since the transmission rate is inversely
+ proportional to the packet duration. A lower packet transmission
+ rate reduces the amount of header overhead, but at the same time
+ increases latency and loss sensitivity, so it ought to be used with
+ care.
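The trade-off between packet duration and header overhead can be made concrete with a short calculation. The sketch below assumes 40 octets of headers per packet (20 for IPv4, 8 for UDP, and 12 for a fixed RTP header without CSRC entries or extensions); that figure is an assumption for the example, not a value taken from this document.

```python
def header_overhead_bps(packet_ms: float, header_bytes: int = 40) -> float:
    """Bitrate consumed by per-packet headers at a given packet duration."""
    if packet_ms <= 0:
        raise ValueError("packet duration must be positive")
    packets_per_second = 1000.0 / packet_ms
    return header_bytes * 8 * packets_per_second
```

Under this assumption, 20 ms packets spend 16000 bit/s on headers, comparable to the recommended bitrate for WB speech, while 60 ms packets spend about 5333 bit/s at the cost of higher latency and greater sensitivity to loss.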
+
+ Since UDP does not provide congestion control, applications that use
+ RTP over UDP SHOULD implement their own congestion control above the
+ UDP layer [RFC5405]. Work in the RMCAT working group [rmcat]
+ describes the interactions and conceptual interfaces necessary
+ between the application components that relate to congestion control,
+ including the RTP layer, the higher-level media codec control layer,
+ and the lower-level transport interface, as well as components
+ dedicated to congestion control functions.
+
+6. IANA Considerations
+
+ One media subtype (audio/opus) has been defined and registered as
+ described in the following section.
+
+6.1. Opus Media Type Registration
+
+ Media type registration is done according to [RFC6838] and [RFC4855].
+
+ Type name: audio
+
+ Subtype name: opus
+
+ Required parameters:
+
+ rate: the RTP timestamp is incremented with a 48000 Hz clock rate
+ for all modes of Opus and all sampling rates. For data encoded
+ with sampling rates other than 48000 Hz, the sampling rate has to
+ be adjusted to 48000 Hz.
+
+ Optional parameters:
+
+ maxplaybackrate: a hint about the maximum output sampling rate that
+ the receiver is capable of rendering in Hz. The decoder MUST be
+ capable of decoding any audio bandwidth, but, due to hardware
+ limitations, only signals up to the specified sampling rate can be
+ played back. Sending signals with higher audio bandwidth results
+ in higher than necessary network usage and encoding complexity, so
+ an encoder SHOULD NOT encode frequencies above the audio bandwidth
+ specified by maxplaybackrate. This parameter can take any value
+ between 8000 and 48000, although commonly the value will match one
+ of the Opus bandwidths (Table 1). By default, the receiver is
+ assumed to have no limitations, i.e., 48000.
+
+ sprop-maxcapturerate: a hint about the maximum input sampling rate
+ that the sender is likely to produce. This is not a guarantee
+ that the sender will never send any higher bandwidth (e.g., it
+ could send a prerecorded prompt that uses a higher bandwidth), but
+ it indicates to the receiver that frequencies above this maximum
+ can safely be discarded. This parameter is useful to avoid
+ wasting receiver resources by operating the audio processing
+ pipeline (e.g., echo cancellation) at a higher rate than
+ necessary. This parameter can take any value between 8000 and
+ 48000, although commonly the value will match one of the Opus
+ bandwidths (Table 1). By default, the sender is assumed to have
+ no limitations, i.e., 48000.
+
+ maxptime: the maximum duration of media represented by a packet
+ (according to Section 6 of [RFC4566]) that a decoder wants to
+ receive, in milliseconds rounded up to the next full integer
+ value. Possible values are 3, 5, 10, 20, 40, 60, or an arbitrary
+ multiple of an Opus frame size rounded up to the next full integer
+ value, up to a maximum value of 120, as defined in Section 4. If
+ no value is specified, the default is 120.
+
+ ptime: the preferred duration of media represented by a packet
+ (according to Section 6 of [RFC4566]) that a decoder wants to
+ receive, in milliseconds rounded up to the next full integer
+ value. Possible values are 3, 5, 10, 20, 40, 60, or an arbitrary
+ multiple of an Opus frame size rounded up to the next full integer
+ value, up to a maximum value of 120, as defined in Section 4. If
+ no value is specified, the default is 20.
+
+ maxaveragebitrate: specifies the maximum average receive bitrate of
+ a session in bits per second (bit/s). The actual value of the
+ bitrate can vary, as it is dependent on the characteristics of the
+ media in a packet. Note that the maximum average bitrate MAY be
+ modified dynamically during a session. Any positive integer is
+ allowed, but values outside the range 6000 to 510000 SHOULD be
+ ignored. If no value is specified, the maximum value specified in
+ Section 3.1.1 for the corresponding mode of Opus and corresponding
+ maxplaybackrate is the default.
+
+ stereo: specifies whether the decoder prefers receiving stereo or
+ mono signals. Possible values are 1 and 0, where 1 specifies that
+ stereo signals are preferred, and 0 specifies that only mono
+ signals are preferred. Independent of the stereo parameter, every
+ receiver MUST be able to receive and decode stereo signals, but
+ sending stereo signals to a receiver that signaled a preference
+ for mono signals may result in higher than necessary network
+ utilization and encoding complexity. If no value is specified,
+ the default is 0 (mono).
+
+ sprop-stereo: specifies whether the sender is likely to produce
+ stereo audio. Possible values are 1 and 0, where 1 specifies that
+ stereo signals are likely to be sent, and 0 specifies that the
+ sender will likely only send mono. This is not a guarantee that
+ the sender will never send stereo audio (e.g., it could send a
+ prerecorded prompt that uses stereo), but it indicates to the
+ receiver that the received signal can be safely downmixed to mono.
+ This parameter is useful to avoid wasting receiver resources by
+ operating the audio processing pipeline (e.g., echo cancellation)
+ in stereo when not necessary. If no value is specified, the
+ default is 0 (mono).
+
+ cbr: specifies if the decoder prefers the use of a constant bitrate
+ versus a variable bitrate. Possible values are 1 and 0, where 1
+ specifies constant bitrate, and 0 specifies variable bitrate. If
+ no value is specified, the default is 0 (vbr). When cbr is 1, the
+ maximum average bitrate can still change, e.g., to adapt to
+ changing network conditions.
+
+ useinbandfec: specifies that the decoder has the capability to take
+ advantage of the Opus in-band FEC. Possible values are 1 and 0.
+ Providing 0 when FEC cannot be used on the receiving side is
+ RECOMMENDED. If no value is specified, useinbandfec is assumed to
+ be 0. This parameter is only a preference, and the receiver MUST
+ be able to process packets that include FEC information, even if
+ it means the FEC part is discarded.
+
+ usedtx: specifies if the decoder prefers the use of DTX. Possible
+ values are 1 and 0. If no value is specified, the default is 0.
+
+ Encoding considerations:
+
+ The Opus media type is framed and consists of binary data
+ according to Section 4.8 of [RFC6838].
+
+ Security considerations:
+
+ See Section 8 of this document.
+
+ Interoperability considerations: none
+
+ Published specification: RFC 7587
+
+ Applications that use this media type:
+
+ Any application that requires the transport of speech or audio
+ data can use this media type. Examples include, but are not limited
+ to, audio and video conferencing, Voice over IP, and media
+ streaming.
+
+ Fragment identifier considerations: N/A
+
+ Person & email address to contact for further information:
+
+ SILK Support, silksupport@skype.net
+
+ Jean-Marc Valin, jmvalin@jmvalin.ca
+
+ Intended usage: COMMON
+
+ Restrictions on usage:
+
+ For transfer over RTP, the RTP payload format (Section 4 of this
+ document) SHALL be used.
+
+ Authors:
+
+ Julian Spittka, jspittka@gmail.com
+
+ Koen Vos, koenvos74@gmail.com
+
+ Jean-Marc Valin, jmvalin@jmvalin.ca
+
+ Change controller: IETF Payload working group delegated from the IESG
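The rounding rule in the "ptime" and "maxptime" definitions above can be illustrated with a short computation. The sketch assumes that every permitted packet duration is a multiple of the shortest Opus frame size (2.5 ms), which follows from the frame sizes listed in Section 4.2; the function names are hypothetical.

```python
import math
from fractions import Fraction

SHORTEST_FRAME_MS = Fraction(5, 2)  # 2.5 ms
MAX_PACKET_MS = 120


def ptime_for(duration_ms) -> int:
    """Round a packet duration in ms up to the next full integer."""
    return math.ceil(Fraction(duration_ms))


def valid_ptime_values() -> list:
    """All integer values that a multiple of 2.5 ms up to 120 ms rounds to."""
    count = int(MAX_PACKET_MS / SHORTEST_FRAME_MS)  # 48 multiples
    return sorted({ptime_for(SHORTEST_FRAME_MS * k) for k in range(1, count + 1)})
```

A 2.5 ms packet is therefore signaled as 3 and a 7.5 ms packet as 8, while a value such as 4 corresponds to no Opus packet duration.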
+
+7. SDP Considerations
+
+ The information described in the media type specification has a
+ specific mapping to fields in the Session Description Protocol (SDP)
+ [RFC4566], which is commonly used to describe RTP sessions. When SDP
+ is used to specify sessions employing Opus, the mapping is as
+ follows:
+
+ o The media type ("audio") goes in SDP "m=" as the media name.
+
+ o The media subtype ("opus") goes in SDP "a=rtpmap" as the encoding
+ name. The RTP clock rate in "a=rtpmap" MUST be 48000, and the
+ number of channels MUST be 2.
+
+ o The OPTIONAL media type parameters "ptime" and "maxptime" are
+ mapped to "a=ptime" and "a=maxptime" attributes, respectively, in
+ the SDP.
+
+ o The OPTIONAL media type parameters "maxaveragebitrate",
+ "maxplaybackrate", "stereo", "cbr", "useinbandfec", and "usedtx",
+ when present, MUST be included in the "a=fmtp" attribute in the
+ SDP, expressed as a media type string in the form of a semicolon-
+ separated list of parameter=value pairs (e.g.,
+ maxplaybackrate=48000). They MUST NOT be specified in an SSRC-
+ specific "fmtp" source-level attribute (as defined in Section 6.3
+ of [RFC5576]).
+
+ o The OPTIONAL media type parameters "sprop-maxcapturerate" and
+ "sprop-stereo" MAY be mapped to the "a=fmtp" SDP attribute by
+ copying them directly from the media type parameter string as part
+ of the semicolon-separated list of parameter=value pairs (e.g.,
+ sprop-stereo=1). These same OPTIONAL media type parameters MAY
+ also be specified using an SSRC-specific "fmtp" source-level
+ attribute as described in Section 6.3 of [RFC5576]. They MAY be
+ specified in both places, in which case the parameter in the
+ source-level attribute overrides the one found on the "a=fmtp"
+ line. The value of any parameter that is not specified in a
+ source-level source attribute MUST be taken from the "a=fmtp"
+ line, if it is present there.
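The precedence rule for the "sprop-" parameters in the last bullet can be sketched as a merge of two parameter sets. The function name and the representation of each "fmtp" line as a dictionary are assumptions made for this example.

```python
SPROP_PARAMS = ("sprop-maxcapturerate", "sprop-stereo")


def effective_sprop(media_level: dict, source_level: dict) -> dict:
    """Combine "a=fmtp" values with SSRC-specific source-level values.

    A source-level value overrides the media-level one; a parameter that is
    missing at the source level is taken from the media level if present.
    """
    result = {}
    for name in SPROP_PARAMS:
        if name in source_level:
            result[name] = source_level[name]
        elif name in media_level:
            result[name] = media_level[name]
    return result
```

Parameters other than the two "sprop-" parameters are deliberately left out of the merge, since they are not permitted in the source-level attribute.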
+
+ Below are some examples of SDP session descriptions for Opus:
+
+ Example 1: Standard mono session with 48000 Hz clock rate
+
+ m=audio 54312 RTP/AVP 101
+ a=rtpmap:101 opus/48000/2
+
+ Example 2: 16000 Hz maximum playback and capture sampling rate,
+ maximum packet size of 40 ms, recommended packet size of 40 ms,
+ maximum average bitrate of 20000 bit/s, prefers to receive stereo but
+ only plans to send mono, FEC is desired, DTX is not desired
+
+ m=audio 54312 RTP/AVP 101
+ a=rtpmap:101 opus/48000/2
+ a=fmtp:101 maxplaybackrate=16000; sprop-maxcapturerate=16000;
+ maxaveragebitrate=20000; stereo=1; useinbandfec=1; usedtx=0
+ a=ptime:40
+ a=maxptime:40
+
+ Example 3: Two-way full-band stereo preferred
+
+ m=audio 54312 RTP/AVP 101
+ a=rtpmap:101 opus/48000/2
+ a=fmtp:101 stereo=1; sprop-stereo=1
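An "a=fmtp" line such as those in the examples above can be parsed into a parameter set with the defaults from Section 6.1 applied. The following is a minimal sketch under stated assumptions: the function name parse_opus_fmtp is hypothetical, the attribute is expected on a single unfolded line, and values are only checked for being non-negative integers.

```python
# Defaults from Section 6.1. "maxaveragebitrate" has no fixed default and is
# therefore left out of the result when it is absent or out of range.
KNOWN_DEFAULTS = {
    "maxplaybackrate": 48000,
    "sprop-maxcapturerate": 48000,
    "stereo": 0,
    "sprop-stereo": 0,
    "cbr": 0,
    "useinbandfec": 0,
    "usedtx": 0,
}
KNOWN_PARAMS = set(KNOWN_DEFAULTS) | {"maxaveragebitrate"}


def parse_opus_fmtp(line: str) -> dict:
    """Parse "a=fmtp:<pt> name=value; name=value" and apply defaults."""
    head, _, body = line.strip().partition(" ")
    if not head.startswith("a=fmtp:"):
        raise ValueError("not an a=fmtp attribute")
    params = dict(KNOWN_DEFAULTS)
    for item in body.split(";"):
        name, sep, value = item.partition("=")
        name, value = name.strip(), value.strip()
        if not sep or name not in KNOWN_PARAMS or not value.isdigit():
            continue  # unknown or malformed parameters are ignored
        params[name] = int(value)
    rate = params.get("maxaveragebitrate")
    if rate is not None and not 6000 <= rate <= 510000:
        del params["maxaveragebitrate"]  # out-of-range values are ignored
    return params
```

Applied to the "a=fmtp" line of Example 2, the result contains maxplaybackrate=16000 and stereo=1 from the line itself, plus the defaults sprop-stereo=0 and cbr=0. The "ptime" and "maxptime" values travel in their own attributes and are not handled here.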
+
+7.1. SDP Offer/Answer Considerations
+
+ When using the offer/answer procedure described in [RFC3264] to
+ negotiate the use of Opus, the following considerations apply:
+
+ o Opus supports several clock rates. For signaling purposes, only
+ the highest, i.e., 48000, is used. The actual clock rate of the
+ corresponding media is signaled inside the payload and is not
+ restricted by this payload format description. The decoder MUST
+ be capable of decoding every received clock rate. An example is
+ shown below:
+
+ m=audio 54312 RTP/AVP 100
+ a=rtpmap:100 opus/48000/2
+
+ o The "ptime" and "maxptime" parameters are unidirectional receive-
+ only parameters and typically will not compromise
+ interoperability; however, some values might cause application
+ performance to suffer. [RFC3264] defines the SDP offer/answer
+ handling of the "ptime" parameter. The "maxptime" parameter MUST
+ be handled in the same way.
+
+ o The "maxplaybackrate" parameter is a unidirectional receive-only
+ parameter that reflects limitations of the local receiver. When
+ sending to a single destination, a sender MUST NOT use an audio
+ bandwidth higher than necessary to make full use of audio sampled
+ at a sampling rate of "maxplaybackrate". Gateways or senders that
+ are sending the same encoded audio to multiple destinations SHOULD
+ NOT use an audio bandwidth higher than necessary to represent
+ audio sampled at "maxplaybackrate", as this would lead to
+ inefficient use of network resources. The "maxplaybackrate"
+ parameter does not affect interoperability. Also, this parameter
+ SHOULD NOT be used to adjust the audio bandwidth as a function of
+ the bitrate, as this is the responsibility of the Opus encoder
+ implementation.
+
+ o The "maxaveragebitrate" parameter is a unidirectional receive-only
+ parameter that reflects limitations of the local receiver. The
+ sender on the other side MUST NOT send with an average bitrate
+ higher than "maxaveragebitrate" as it might overload the network
+ and/or receiver. The "maxaveragebitrate" parameter typically will
+ not compromise interoperability; however, some values might cause
+ application performance to suffer and ought to be set with care.
+
+ o The "sprop-maxcapturerate" and "sprop-stereo" parameters are
+ unidirectional sender-only parameters that reflect limitations of
+ the sender side. They allow the receiver to set up a reduced-
+ complexity audio processing pipeline if the sender is not planning
+ to use the full range of Opus's capabilities. Neither
+ "sprop-maxcapturerate" nor "sprop-stereo" affects interoperability,
+ and the receiver MUST be capable of receiving any signal.
+
+ o The "stereo" parameter is a unidirectional receive-only parameter.
+ When sending to a single destination, a sender MUST NOT use stereo
+ when "stereo" is 0. Gateways or senders that are sending the same
+ encoded audio to multiple destinations SHOULD NOT use stereo when
+ "stereo" is 0, as this would lead to inefficient use of network
+ resources. The "stereo" parameter does not affect
+ interoperability.
+
+ o The "cbr" parameter is a unidirectional receive-only parameter.
+
+ o The "useinbandfec" parameter is a unidirectional receive-only
+ parameter.
+
+ o The "usedtx" parameter is a unidirectional receive-only parameter.
+
+ o Any unknown parameter in an offer MUST be ignored by the receiver
+ and MUST be removed from the answer.
+
+ The Opus parameters in an SDP offer/answer exchange are completely
+ orthogonal, and there is no relationship between the SDP offer and
+ the answer.
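The receive-only parameters in this section constrain what a sender transmits to a single destination. The sketch below derives sender-side limits from the remote party's parameters; the function name, the result keys, and the use of half the playback rate (capped at the fullband edge of Table 1) as the audio bandwidth limit are assumptions made for this example.

```python
FULLBAND_EDGE_HZ = 20000  # upper audio band edge of FB in Table 1


def sender_limits(remote: dict) -> dict:
    """Derive encoder limits from the remote receiver's parameters.

    remote holds the integer parameters the remote side signaled; missing
    entries take the defaults from Section 6.1.
    """
    playback_rate = remote.get("maxplaybackrate", 48000)
    return {
        # Stereo is only permitted when the receiver asked for it.
        "max_channels": 2 if remote.get("stereo", 0) == 1 else 1,
        # Audio above half the playback rate cannot be rendered.
        "max_audio_bandwidth_hz": min(playback_rate // 2, FULLBAND_EDGE_HZ),
        # None means the receiver stated no explicit limit.
        "max_average_bitrate": remote.get("maxaveragebitrate"),
        "prefer_cbr": remote.get("cbr", 0) == 1,
        "fec_useful": remote.get("useinbandfec", 0) == 1,
        "prefer_dtx": remote.get("usedtx", 0) == 1,
    }
```

These are upper limits and preferences rather than required settings: a sender remains free to send mono to a receiver that signaled stereo=1, for instance.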
+
+7.2. Declarative SDP Considerations for Opus
+
+ For declarative use of SDP such as in the Session Announcement
+ Protocol (SAP) [RFC2974] and the Real Time Streaming Protocol (RTSP)
+ [RFC2326] for Opus, the following needs to be considered:
+
+ o The values for "maxptime", "ptime", "maxplaybackrate", and
+ "maxaveragebitrate" ought to be selected carefully to ensure that
+ a reasonable performance can be achieved for the participants of a
+ session.
+
+ o The values for "maxptime" and "ptime" of the payload format
+ configuration are recommendations by the decoding side to ensure
+ the best performance for the decoder.
+
+ o All other parameters of the payload format configuration are
+ declarative and a participant MUST use the configurations that are
+ provided for the session. More than one configuration can be
+ provided if necessary by declaring multiple RTP payload types;
+ however, the number of types ought to be kept small.
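+
+ As a non-normative illustration, a declarative description offering
+ more than one Opus configuration could be assembled as sketched
+ below. The function name, port number, payload type numbers, and
+ parameter values are hypothetical. The "opus/48000/2" mapping follows
+ the "a=rtpmap" rules given earlier in this document, and the "; "
+ separator follows the style of this document's SDP examples.
+

```python
# Non-normative sketch; names and values are illustrative assumptions.

def opus_media_section(port, configs, ptime=None, maxptime=None):
    """Build the SDP lines of one audio media section.

    configs maps each dynamic RTP payload type to a dict of "a=fmtp"
    parameters; every entry declares one Opus configuration.
    """
    payload_types = sorted(configs)
    lines = ["m=audio %d RTP/AVP %s"
             % (port, " ".join(str(pt) for pt in payload_types))]
    for pt in payload_types:
        # Opus is always signaled with a 48000 Hz clock and 2 channels.
        lines.append("a=rtpmap:%d opus/48000/2" % pt)
        params = configs[pt]
        if params:
            fmtp = "; ".join("%s=%s" % (name, value)
                             for name, value in sorted(params.items()))
            lines.append("a=fmtp:%d %s" % (pt, fmtp))
    # "ptime" and "maxptime" are media-level attributes; in declarative
    # use they act as recommendations from the decoding side.
    if ptime is not None:
        lines.append("a=ptime:%d" % ptime)
    if maxptime is not None:
        lines.append("a=maxptime:%d" % maxptime)
    return lines


section = opus_media_section(
    54312,
    {101: {"maxplaybackrate": "48000", "stereo": "1"},
     102: {"maxplaybackrate": "16000"}},
    ptime=20,
    maxptime=40,
)
print("\r\n".join(section))
```

+
+ Two payload types are used here only to show the mechanism; in
+ keeping with the guidance above, the number of declared types ought
+ to stay small.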
+
+8. Security Considerations
+
+ Use of VBR is subject to the security considerations in [RFC6562].
+
+ RTP packets using the payload format defined in this specification
+ are subject to the security considerations discussed in the RTP
+ specification [RFC3550] and in any applicable RTP profile such as
+ RTP/AVP [RFC3551], RTP/AVPF [RFC4585], RTP/SAVP [RFC3711], or
+ RTP/SAVPF [RFC5124]. However, as "Securing the RTP Framework: Why RTP
+ Does Not Mandate a Single Media Security Solution" [RFC7202]
+ discusses, it is not an RTP payload format's responsibility to
+ discuss or mandate what solutions are used to meet the basic security
+ goals like confidentiality, integrity, and source authenticity for
+ RTP in general. This responsibility lies with anyone using RTP in an
+ application. They can find guidance on available security mechanisms
+ and important considerations in "Options for Securing RTP Sessions"
+ [RFC7201]. Applications SHOULD use one or more appropriate strong
+ security mechanisms.
+
+ This payload format and the Opus encoding do not exhibit any
+ significant non-uniformity in the receiver-end computational load and
+ thus are unlikely to pose a denial-of-service threat due to the
+ receipt of pathological datagrams.
+
+9. References
+
+9.1. Normative References
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119,
+ DOI 10.17487/RFC2119, March 1997,
+ <http://www.rfc-editor.org/info/rfc2119>.
+
+ [RFC2326] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time
+ Streaming Protocol (RTSP)", RFC 2326,
+ DOI 10.17487/RFC2326, April 1998,
+ <http://www.rfc-editor.org/info/rfc2326>.
+
+ [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
+ with Session Description Protocol (SDP)", RFC 3264,
+ DOI 10.17487/RFC3264, June 2002,
+ <http://www.rfc-editor.org/info/rfc3264>.
+
+ [RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for
+ Comfort Noise (CN)", RFC 3389, DOI 10.17487/RFC3389,
+ September 2002, <http://www.rfc-editor.org/info/rfc3389>.
+
+ [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
+ Jacobson, "RTP: A Transport Protocol for Real-Time
+ Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
+ July 2003, <http://www.rfc-editor.org/info/rfc3550>.
+
+ [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
+ Video Conferences with Minimal Control", STD 65, RFC 3551,
+ DOI 10.17487/RFC3551, July 2003,
+ <http://www.rfc-editor.org/info/rfc3551>.
+
+ [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
+ Norrman, "The Secure Real-time Transport Protocol (SRTP)",
+ RFC 3711, DOI 10.17487/RFC3711, March 2004,
+ <http://www.rfc-editor.org/info/rfc3711>.
+
+ [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
+ Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
+ July 2006, <http://www.rfc-editor.org/info/rfc4566>.
+
+ [RFC4855] Casner, S., "Media Type Registration of RTP Payload
+ Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007,
+ <http://www.rfc-editor.org/info/rfc4855>.
+
+ [RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific
+ Media Attributes in the Session Description Protocol
+ (SDP)", RFC 5576, DOI 10.17487/RFC5576, June 2009,
+ <http://www.rfc-editor.org/info/rfc5576>.
+
+ [RFC6562] Perkins, C. and JM. Valin, "Guidelines for the Use of
+ Variable Bit Rate Audio with Secure RTP", RFC 6562,
+ DOI 10.17487/RFC6562, March 2012,
+ <http://www.rfc-editor.org/info/rfc6562>.
+
+ [RFC6716] Valin, JM., Vos, K., and T. Terriberry, "Definition of the
+ Opus Audio Codec", RFC 6716, DOI 10.17487/RFC6716,
+ September 2012, <http://www.rfc-editor.org/info/rfc6716>.
+
+ [RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type
+ Specifications and Registration Procedures", BCP 13,
+ RFC 6838, DOI 10.17487/RFC6838, January 2013,
+ <http://www.rfc-editor.org/info/rfc6838>.
+
+9.2. Informative References
+
+ [RFC2974] Handley, M., Perkins, C., and E. Whelan, "Session
+ Announcement Protocol", RFC 2974, DOI 10.17487/RFC2974,
+ October 2000, <http://www.rfc-editor.org/info/rfc2974>.
+
+ [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
+ "Extended RTP Profile for Real-time Transport Control
+ Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
+ DOI 10.17487/RFC4585, July 2006,
+ <http://www.rfc-editor.org/info/rfc4585>.
+
+ [RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for
+ Real-time Transport Control Protocol (RTCP)-Based Feedback
+ (RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February
+ 2008, <http://www.rfc-editor.org/info/rfc5124>.
+
+ [RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines
+ for Application Designers", BCP 145, RFC 5405,
+ DOI 10.17487/RFC5405, November 2008,
+ <http://www.rfc-editor.org/info/rfc5405>.
+
+ [RFC7201] Westerlund, M. and C. Perkins, "Options for Securing RTP
+ Sessions", RFC 7201, DOI 10.17487/RFC7201, April 2014,
+ <http://www.rfc-editor.org/info/rfc7201>.
+
+ [RFC7202] Perkins, C. and M. Westerlund, "Securing the RTP
+ Framework: Why RTP Does Not Mandate a Single Media
+ Security Solution", RFC 7202, DOI 10.17487/RFC7202, April
+ 2014, <http://www.rfc-editor.org/info/rfc7202>.
+
+ [rmcat] "RTP Media Congestion Avoidance Techniques (rmcat) Documents",
+ <https://datatracker.ietf.org/wg/rmcat/documents/>.
+
+Acknowledgements
+
+ Many people have made useful comments and suggestions contributing to
+ this document. In particular, we would like to thank Tina le Grand,
+ Cullen Jennings, Jonathan Lennox, Gregory Maxwell, Colin Perkins, Jan
+ Skoglund, Timothy B. Terriberry, Martin Thompson, Justin Uberti,
+ Magnus Westerlund, and Mo Zanaty.
+
+Authors' Addresses
+
+ Julian Spittka
+
+ Email: jspittka@gmail.com
+
+
+ Koen Vos
+ vocTone
+
+ Email: koenvos74@gmail.com
+
+
+ Jean-Marc Valin
+ Mozilla
+ 331 E. Evelyn Avenue
+ Mountain View, CA 94041
+ United States
+
+ Email: jmvalin@jmvalin.ca