author Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committer Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit 4bfd864f10b68b71482b35c818559068ef8d5797
tree e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc5404.txt
parent ea76e11061bda059ae9f9ad130a9895cc85607db
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc5404.txt')
-rw-r--r-- doc/rfc/rfc5404.txt 1515
1 files changed, 1515 insertions, 0 deletions
diff --git a/doc/rfc/rfc5404.txt b/doc/rfc/rfc5404.txt
new file mode 100644
index 0000000..11f6f8f
--- /dev/null
+++ b/doc/rfc/rfc5404.txt
@@ -0,0 +1,1515 @@
+
+
+
+
+
+
+Network Working Group M. Westerlund
+Request for Comments: 5404 I. Johansson
+Category: Standards Track Ericsson AB
+ January 2009
+
+
+ RTP Payload Format for G.719
+
+Status of This Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Copyright Notice
+
+ Copyright (c) 2008 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents (http://trustee.ietf.org/
+ license-info) in effect on the date of publication of this document.
+ Please review these documents carefully, as they describe your rights
+ and restrictions with respect to this document.
+
+Abstract
+
+ This document specifies the payload format for packetization of the
+ G.719 full-band codec encoded audio signals into the Real-time
+ Transport Protocol (RTP). The payload format supports transmission
+ of multiple channels, multiple frames per payload, and interleaving.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Westerlund & Johansson Standards Track [Page 1]
+
+RFC 5404 RTP Payload Format for G.719 January 2009
+
+
+Table of Contents
+
+ 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
+ 2. Definitions and Conventions . . . . . . . . . . . . . . . . . 3
+ 3. G.719 Description . . . . . . . . . . . . . . . . . . . . . . 3
+ 4. Payload Format Capabilities . . . . . . . . . . . . . . . . . 4
+ 4.1. Multi-Rate Encoding and Rate Adaptation . . . . . . . . . 4
+ 4.2. Support for Multi-Channel Sessions . . . . . . . . . . . . 5
+ 4.3. Robustness against Packet Loss . . . . . . . . . . . . . . 5
+ 4.3.1. Use of Forward Error Correction (FEC) . . . . . . . . 5
+ 4.3.2. Use of Frame Interleaving . . . . . . . . . . . . . . 6
+ 5. Payload Format . . . . . . . . . . . . . . . . . . . . . . . . 7
+ 5.1. RTP Header Usage . . . . . . . . . . . . . . . . . . . . . 8
+ 5.2. Payload Structure . . . . . . . . . . . . . . . . . . . . 8
+ 5.2.1. Basic ToC Element . . . . . . . . . . . . . . . . . . 9
+ 5.3. Basic Mode . . . . . . . . . . . . . . . . . . . . . . . . 10
+ 5.4. Interleaved Mode . . . . . . . . . . . . . . . . . . . . . 10
+ 5.5. Audio Data . . . . . . . . . . . . . . . . . . . . . . . . 11
+ 5.6. Implementation Considerations . . . . . . . . . . . . . . 12
+ 5.6.1. Receiving Redundant Frames . . . . . . . . . . . . . . 12
+ 5.6.2. Interleaving . . . . . . . . . . . . . . . . . . . . . 12
+ 5.6.3. Decoding Validation . . . . . . . . . . . . . . . . . 13
+ 6. Payload Examples . . . . . . . . . . . . . . . . . . . . . . . 13
+ 6.1. 3 Mono Frames with 2 Different Bitrates . . . . . . . . . 13
+ 6.2. 2 Stereo Frame-Blocks of the Same Bitrate . . . . . . . . 14
+ 6.3. 4 Mono Frames Interleaved . . . . . . . . . . . . . . . . 15
+ 7. Payload Format Parameters . . . . . . . . . . . . . . . . . . 16
+ 7.1. Media Type Definition . . . . . . . . . . . . . . . . . . 16
+ 7.2. Mapping to SDP . . . . . . . . . . . . . . . . . . . . . . 19
+ 7.2.1. Offer/Answer Considerations . . . . . . . . . . . . . 19
+ 7.2.2. Declarative SDP Considerations . . . . . . . . . . . . 22
+ 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 23
+ 9. Congestion Control . . . . . . . . . . . . . . . . . . . . . . 23
+ 10. Security Considerations . . . . . . . . . . . . . . . . . . . 24
+ 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 25
+ 12. References . . . . . . . . . . . . . . . . . . . . . . . . . . 25
+ 12.1. Normative References . . . . . . . . . . . . . . . . . . . 25
+ 12.2. Informative References . . . . . . . . . . . . . . . . . . 26
+
+1. Introduction
+
+ This document specifies the payload format for packetization of the
+ G.719 full-band (FB) codec encoded audio signals into the Real-time
+ Transport Protocol (RTP) [RFC3550]. The payload format supports
+ transmission of multiple channels, multiple frames per payload, and
+ packet loss robustness methods using redundancy or interleaving.
+
+ This document starts with conventions, a brief description of the
+ codec, and the payload format's capabilities. The payload format is
+ specified in Section 5. Examples can be found in Section 6. The
+ media type and its mappings to the Session Description Protocol (SDP)
+ and usage in SDP offer/answer are then specified. The document ends
+ with considerations regarding congestion control and security.
+
+2. Definitions and Conventions
+
+ The term "frame-block" is used in this document to describe the time-
+ synchronized set of audio frames in a multi-channel audio session.
+ In particular, in an N-channel session, a frame-block will contain N
+ audio frames, one from each of the channels, and all N audio frames
+ represent exactly the same time period.
+
+ This document contains depictions of bit fields. The most
+ significant bit is always leftmost in the figure on each row and has
+ the lowest enumeration. For fields that are depicted over multiple
+ rows, the upper row is more significant than the next.
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in RFC 2119 [RFC2119].
+
+3. G.719 Description
+
+ The ITU-T G.719 full-band codec is a transform coder based on
+ Modulated Lapped Transform (MLT). G.719 is a low-complexity full-
+ bandwidth codec for conversational speech and audio coding. The
+ encoder input and decoder output are sampled at 48 kHz. The codec
+ covers the full audio bandwidth from 20 Hz to 20 kHz and encodes
+ speech, music, and general audio content at rates from 32 kbit/s up
+ to 128 kbit/s. The codec operates on 20-ms frames and has an
+ algorithmic delay of 40 ms.
+
+ The codec provides excellent quality for speech, music, and other
+ types of audio. Some of the applications for which this coder is
+ suitable are:
+
+ o Real-time communications such as video conferencing and telephony
+
+ o Streaming audio
+
+ o Archival and messaging
+
+ The encoding and decoding algorithm can change the bitrate at any
+ 20-ms frame boundary. The encoder receives the audio sampled at 48
+ kHz. The support of other sampling rates is possible by re-sampling
+ the input signal to the codec's sampling rate, i.e., 48 kHz; however,
+ this functionality is not part of the standard.
+
+ The encoding is performed on equally sized frames. For each frame,
+ the encoder decides between two encoding modes, a transient mode and
+ a stationary mode. The decision is based on statistics derived from
+ the input signal. The stationary mode uses a long MLT that leads to
+ a spectrum of 960 coefficients, while the transient encoding mode
+ uses a short MLT (higher time resolution transform) that results in 4
+ spectra (4 x 240 = 960 coefficients). The encoding of the spectrum
+ is done in two steps. First, the spectral envelope is computed,
+ quantized, and Huffman encoded. The envelope is computed on a non-
+ uniform frequency subdivision. From the coded spectral envelope, a
+ weighted spectral envelope is derived and is used for bit allocation;
+ this process is also repeated at the decoder. Thus, only the
+ spectral envelope is transmitted. The output of the bit allocation
+ is used in order to quantize the spectra. In addition, for
+ stationary frames, the encoder estimates the noise level.
+ The decoder applies the reverse operation upon reception of the bit
+ stream. The non-coded coefficients (i.e., no bits allocated) are
+ replaced by entries of a noise codebook that is built based on the
+ decoded coefficients.
+
+4. Payload Format Capabilities
+
+ This payload format has a number of capabilities, and this section
+ discusses them in some detail.
+
+4.1. Multi-Rate Encoding and Rate Adaptation
+
+ G.719 supports a multi-rate encoding capability that allows the
+ encoding rate to vary on a per-frame basis. This enables support
+ for bitrate adaptation and congestion control. The possibility to
+ aggregate multiple audio frames into a single RTP payload is another
+ dimension of adaptation. The RTP and payload format overhead can
+ thus be reduced by the aggregation at the cost of increased delay and
+ reduced packet-loss robustness.
+
+4.2. Support for Multi-Channel Sessions
+
+ The RTP payload format defined in this document supports multi-
+ channel audio content (e.g., stereophonic or surround audio
+ sessions). Although the G.719 codec itself does not support encoding
+ of multi-channel audio content into a single bit stream, it can be
+ used to separately encode and decode each of the individual channels.
+ To transport (or store) the separately encoded multi-channel content,
+ the audio frames for all channels that are framed and encoded for the
+ same 20-ms period are logically collected in a "frame-block".
+
+ At the session setup, out-of-band signaling must be used to indicate
+ the number of channels in the payload type. The order of the audio
+ frames within the frame-block depends on the number of the channels
+ and follows the definition in Section 4.1 of the RTP/AVP profile
+ [RFC3551]. When using SDP for signaling, the number of channels is
+ specified in the rtpmap attribute.
+
+4.3. Robustness against Packet Loss
+
+ The payload format supports several means, including forward error
+ correction (FEC) and frame interleaving, to increase robustness
+ against packet loss.
+
+4.3.1. Use of Forward Error Correction (FEC)
+
+ Generic forward error correction within RTP is defined, for example,
+ in RFC 5109 [RFC5109]. Audio redundancy coding is defined in RFC
+ 2198 [RFC2198]. Either scheme can be used to add redundant
+ information to the RTP packet stream and make it more resilient to
+ packet losses, at the expense of a higher bitrate. Please see either
+ of the RFCs for a discussion of the implications of the higher
+ bitrate to network congestion.
+
+ In addition to these media-unaware mechanisms, this memo specifies a
+ G.719-specific form of audio redundancy coding, which may be
+ beneficial in terms of packetization overhead. Conceptually,
+ previously transmitted transport frames are aggregated together with
+ new ones. A sliding window can be used to group the frames to be
+ sent in each payload. However, irregular or non-consecutive patterns
+ are also possible by inserting NO_DATA frames between primary and
+ redundant transmissions. Figure 1 below shows an example.
+
+ --+--------+--------+--------+--------+--------+--------+--------+--
+   | f(n-2) | f(n-1) |  f(n)  | f(n+1) | f(n+2) | f(n+3) | f(n+4) |
+ --+--------+--------+--------+--------+--------+--------+--------+--
+
+   <---- p(n-1) ---->
+            <----- p(n) ----->
+                     <---- p(n+1) ---->
+                              <---- p(n+2) ---->
+                                       <---- p(n+3) ---->
+                                                <---- p(n+4) ---->
+
+ Figure 1: An example of redundant transmission
+
+ Here, each frame is retransmitted once in the following RTP payload
+ packet. f(n-2)...f(n+4) denote a sequence of audio frames, and
+ p(n-1)...p(n+4) a sequence of payload packets.
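+
+ As a non-normative illustration, the sliding-window grouping of
+ Figure 1 can be sketched in Python as follows. The function name
+ redundant_payloads is hypothetical, and the sketch assumes exactly
+ one redundant copy per frame, as in the figure.
+

```python
def redundant_payloads(frames):
    """Group frames so that each payload carries the previous frame
    (redundant copy) followed by the new frame (primary copy)."""
    payloads = []
    previous = None
    for frame in frames:
        # The very first payload has no earlier frame to repeat.
        group = [frame] if previous is None else [previous, frame]
        payloads.append(group)
        previous = frame
    return payloads
```
+
+ With three input frames, the second and third payloads each repeat
+ the frame that was primary in the preceding payload.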
+
+ The mechanism described does not strictly require signaling at the
+ session setup. However, signaling has been defined to allow the
+ sender to voluntarily bound the buffering and delay requirements. If
+ nothing is signaled, the use of this mechanism is allowed and
+ unbounded. For a certain timestamp, the receiver may receive
+ multiple copies of a frame containing encoded audio data, even at
+ different encoding rates. The cost of this scheme is bandwidth and
+ the receiver delay necessary to allow the redundant copy to arrive.
+
+ This redundancy scheme provides a functionality similar to the one
+ described in RFC 2198, but it works only if both original frames and
+ redundant representations are G.719 frames. When the use of other
+ media coding schemes is desirable, one has to resort to RFC 2198.
+
+ The sender is responsible for selecting an appropriate amount of
+ redundancy based on feedback about the channel conditions, e.g., in
+ the RTP Control Protocol (RTCP) [RFC3550] receiver reports. The
+ sender is also responsible for avoiding congestion, which may be
+ exacerbated by redundancy (see Section 9 for more details).
+
+4.3.2. Use of Frame Interleaving
+
+ To decrease protocol overhead, the payload design allows several
+ audio transport frames to be encapsulated into a single RTP packet.
+ One of the drawbacks of such an approach is that in the case of
+ packet loss, several consecutive frames are lost. Consecutive frame
+ loss normally renders error concealment less efficient and usually
+ causes clearly audible and annoying distortions in the reconstructed
+ audio. Interleaving of transport frames can improve the audio
+ quality in such cases by distributing the consecutive losses into a
+ number of isolated frame losses, which are easier to conceal.
+
+ However, interleaving and bundling several frames per payload also
+ increases end-to-end delay and sets higher buffering requirements.
+ Therefore, interleaving is not appropriate for all use cases or
+ devices. Streaming applications should most likely be able to
+ exploit interleaving to improve audio quality in lossy transmission
+ conditions.
+
+ Note that this payload design supports the use of frame interleaving
+ as an option. The usage of this feature needs to be negotiated in
+ the session setup.
+
+ The interleaving supported by this format is rather flexible. For
+ example, a continuous pattern can be defined, as depicted in
+ Figure 2.
+
+ --+--------+--------+--------+--------+--------+--------+--------+--
+ | f(n-2) | f(n-1) | f(n) | f(n+1) | f(n+2) | f(n+3) | f(n+4) |
+ --+--------+--------+--------+--------+--------+--------+--------+--
+
+ [ p(n) ]
+ [ p(n+1) ] [ p(n+1) ]
+ [ p(n+2) ] [ p(n+2) ]
+ [ p(n+3) ]
+ [ p(n+4) ]
+
+ Figure 2: An example of interleaving pattern that has constant delay
+
+ In Figure 2, the consecutive frames, denoted f(n-2) to f(n+4), are
+ aggregated into packets p(n) to p(n+4), each packet carrying two
+ frames. This approach provides an interleaving pattern that allows
+ for constant delay in both the interleaving and de-interleaving
+ processes. The de-interleaving buffer needs to have room for at
+ least three frames, including the one that is ready to be consumed.
+ The storage space for three frames is needed, for example, when f(n)
+ is the next frame to be decoded: since frame f(n) was received in
+ packet p(n+2), which also carried frame f(n+3), both these frames are
+ stored in the buffer. Furthermore, frame f(n+1) received in the
+ previous packet, p(n+1), is also in the de-interleaving buffer. Note
+ also that in this example the buffer occupancy varies: when frame
+ f(n+1) is the next one to be decoded, there are only two frames,
+ f(n+1) and f(n+3), in the buffer.
+
+5. Payload Format
+
+ The main purpose of the payload design for G.719 is to exploit the
+ full potential of the codec with as little overhead as possible.
+ In the design, both basic and interleaved modes have
+ been included, as the codec is suitable both for conversational and
+ other low-delay applications as well as streaming, where more delay
+ is acceptable.
+
+ The main structural difference between the basic and interleaved
+ modes is the extension of the table of contents entries with frame
+ displacement fields in the interleaved mode. The basic mode supports
+ aggregation of multiple consecutive frames in a payload. The
+ interleaved mode supports aggregation of multiple frames that are
+ non-consecutive in time. In both modes, it is possible to have
+ frames encoded with different frame types in the same payload.
+
+ The payload format also supports the usage of G.719 for carrying
+ multi-channel content using one discrete encoder per channel all
+ using the same bitrate. In this case, a complete frame-block with
+ data from all channels is included in the RTP payload. The data is
+ the concatenation of all the encoded audio frames in the order
+ specified for that number of included channels. Also, interleaving
+ is done on complete frame-blocks rather than on individual audio
+ frames.
+
+5.1. RTP Header Usage
+
+ The RTP timestamp corresponds to the sampling instant of the first
+ sample encoded for the first frame-block in the packet. The
+ timestamp clock frequency SHALL be 48000 Hz. The timestamp is also
+ used to recover the correct decoding order of the frame-blocks.
+
+ The RTP header marker bit (M) SHALL be set to 1 whenever the first
+ frame-block carried in the packet is the first frame-block in a
+ talkspurt (see definition of the talkspurt in Section 4.1 of
+ [RFC3551]). For all other packets, the marker bit SHALL be set to
+ zero (M=0).
+
+ The assignment of an RTP payload type for the format defined in this
+ memo is outside the scope of this document. The RTP profiles in use
+ currently mandate binding the payload type dynamically for this
+ payload format. This is basically necessary because the payload type
+ expresses the configuration of the payload itself, i.e., basic or
+ interleaved mode, and the number of channels carried.
+
+ The remaining RTP header fields are used as specified in [RFC3550].
+
+5.2. Payload Structure
+
+ The payload consists of one or more table of contents (ToC) entries
+ followed by the audio data corresponding to the ToC entries. The
+ following sections describe both the basic mode and the interleaved
+ mode. Each ToC entry MUST be padded to a byte boundary to ensure
+ octet alignment. The rules regarding maximum payload size given in
+ Section 3.2 of [RFC5405] SHOULD be followed.
+
+5.2.1. Basic ToC Element
+
+ All the different formats and modes in this document use a common
+ basic ToC that may be extended in the different options described
+ below.
+
+  0 1 2 3 4 5 6 7
+ +-+-+-+-+-+-+-+-+
+ |F|    L    |R|R|
+ +-+-+-+-+-+-+-+-+
+
+ Figure 3: Basic ToC element
+
+ F (1 bit): If set to 1, indicates that this ToC entry is followed by
+ another ToC entry; if set to zero, indicates that this ToC entry
+ is the last one in the ToC.
+
+ L (5 bits): A field that gives the frame length of each individual
+ frame within the frame-block.
+
+ L        length(bytes)
+ ============================
+ 0        0 NO_DATA
+ 1-7      N/A (reserved)
+ 8-22     80+10*(L-8)
+ 23-27    240+20*(L-23)
+ 28-31    N/A (reserved)
+
+ Figure 4: How to map L values to frame lengths
+
+ L=0 (NO_DATA) is used to indicate an empty frame, which is useful
+ if frames are missing (e.g., at re-packetization), or to insert
+ gaps when sending redundant frames together with primary frames in
+ the same payload.
+ The value ranges [1..7] and [28..31], inclusive, are reserved for
+ future use in this document version; if these values occur in a
+ ToC, the entire packet SHOULD be treated as invalid and discarded.
+ A few examples are given below where the frame size and the
+ corresponding codec bitrate are computed based on the value L.
+
+ L     Bytes    Codec Bitrate(kbps)
+ ===================================
+ 8     80       32
+ 9     90       36
+ 10    100      40
+ 12    120      48
+ 16    160      64
+ 22    220      88
+ 23    240      96
+ 25    280      112
+ 27    320      128
+
+ Figure 5: Examples of L values and corresponding frame lengths
+
+ This encoding yields a granularity of 4 kbps between 32 and 88
+ kbps and a granularity of 8 kbps between 88 and 128 kbps with a
+ defined range of 32-128 kbps for the codec data.
+
+ R (2 bits): Reserved bits. SHALL be set to zero on sending and
+ SHALL be ignored on reception.
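+
+ The mapping of Figure 4 and the bitrates of Figure 5 can be checked
+ with a short, non-normative Python sketch. The function names are
+ hypothetical; the only inputs taken from this document are the L
+ ranges and the 20-ms frame duration.
+

```python
FRAME_MS = 20  # G.719 frame duration in milliseconds


def frame_length_bytes(l_value: int) -> int:
    """Map the 5-bit L field to a frame length in bytes (Figure 4)."""
    if l_value == 0:
        return 0  # NO_DATA
    if 8 <= l_value <= 22:
        return 80 + 10 * (l_value - 8)
    if 23 <= l_value <= 27:
        return 240 + 20 * (l_value - 23)
    # 1..7 and 28..31 are reserved in this version of the format.
    raise ValueError(f"reserved or out-of-range L value: {l_value}")


def bitrate_kbps(l_value: int) -> int:
    """Codec bitrate implied by L: frame bits divided by 20 ms."""
    return frame_length_bytes(l_value) * 8 // FRAME_MS
```
+
+ Raising an error for reserved values mirrors the recommendation to
+ treat such packets as invalid.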
+
+5.3. Basic Mode
+
+ The basic ToC element shown in Figure 3 is followed by a 1-octet
+ field for the number of frame-blocks (#frames) to form the ToC entry.
+ The frame-blocks field tells how many frame-blocks of the same length
+ the ToC entry relates to.
+
+  0 1 2 3 4 5 6 7
+ +-+-+-+-+-+-+-+-+
+ |    #frames    |
+ +-+-+-+-+-+-+-+-+
+
+ Figure 6: Number of frame-blocks field
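+
+ A basic-mode ToC, i.e., a sequence of 2-octet entries ending with
+ the entry whose F bit is zero, could be parsed along the following
+ lines. This is a non-normative sketch; parse_basic_toc is a
+ hypothetical name, and no check of reserved L values is made here.
+

```python
from typing import List, Tuple


def parse_basic_toc(payload: bytes) -> Tuple[List[Tuple[int, int]], int]:
    """Return ([(L, number of frame-blocks), ...], offset of audio data)."""
    entries = []
    pos = 0
    while True:
        if pos + 2 > len(payload):
            raise ValueError("truncated ToC")
        first, count = payload[pos], payload[pos + 1]
        follow = first >> 7            # F bit (most significant bit)
        l_value = (first >> 2) & 0x1F  # 5-bit L field
        # The two low-order R bits are ignored on reception.
        entries.append((l_value, count))
        pos += 2
        if not follow:
            return entries, pos
```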
+
+5.4. Interleaved Mode
+
+ The basic ToC is followed by a 1-octet field for the number of frame-
+ blocks (#frames) and then the DIS fields to form a ToC entry in
+ interleaved mode. The frame-blocks field tells how many frame-blocks
+ of the same length the ToC relates to. The DIS fields, one for each
+ frame-block indicated by the #frames field, express the interleaving
+ distance between audio frames carried in the payload. If necessary
+ to achieve octet alignment, a 4-bit padding is added.
+
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |    #frames    | DIS1  |  ...  | DISi  |  ...  | DISn  | Padd  |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ Figure 7: Number of frame-block + interleave fields
+
+ DIS1...DISn (4 bits each): A list of n (n=#frames) displacement
+ fields indicating the displacement of the i-th (i=1..n) audio
+ frame-block relative to the preceding frame-block in the payload,
+ in units of 20-ms long audio frame-blocks. The 4-bit unsigned
+ integer displacement values may be between zero and 15, indicating
+ the number of audio frame-blocks in decoding order between the
+ (i-1)-th and the i-th frame-block in the payload. Note that for the
+ first ToC entry of the payload, the value of DIS1 is meaningless.
+ It SHALL be set to zero by a sender and SHALL be ignored by a
+ receiver. This frame-block's location in the decoding order is
+ uniquely defined by the RTP timestamp. Note that for subsequent
+ ToC entries DIS1 indicates the number of frames between the last
+ frame of the previous group and the first frame of this group.
+
+ Padd (4 bits): To ensure octet alignment, 4 padding bits SHALL be
+ included at the end of the ToC entry in case there is an odd
+ number of frame-blocks in the group referenced by this ToC entry.
+ These bits SHALL be set to zero and SHALL be ignored by the
+ receiver. If a group containing an even number of frames is
+ referenced by this ToC entry, these padding bits SHALL NOT be
+ included in the payload.
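+
+ The layout of one interleaved-mode ToC entry, including the 4-bit
+ padding for an odd number of frame-blocks, can be illustrated with
+ the following non-normative Python sketch. The function name and
+ its return tuple are hypothetical choices made for this example.
+

```python
def parse_interleaved_entry(buf: bytes, pos: int = 0):
    """Parse one interleaved-mode ToC entry starting at buf[pos].

    Returns (follow, l_value, dis_values, next_pos).
    """
    if pos + 2 > len(buf):
        raise ValueError("truncated ToC entry")
    first, count = buf[pos], buf[pos + 1]
    follow = bool(first >> 7)
    l_value = (first >> 2) & 0x1F
    # Each DIS field is 4 bits; an odd count is padded to a whole octet.
    n_dis_octets = (count + 1) // 2
    dis_octets = buf[pos + 2 : pos + 2 + n_dis_octets]
    if len(dis_octets) < n_dis_octets:
        raise ValueError("truncated DIS fields")
    dis_values = []
    for i in range(count):
        octet = dis_octets[i // 2]
        # Even-numbered fields sit in the high nibble, odd ones in the low.
        dis_values.append(octet >> 4 if i % 2 == 0 else octet & 0x0F)
    return follow, l_value, dis_values, pos + 2 + n_dis_octets
```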
+
+5.5. Audio Data
+
+ The audio data part follows the table of contents. All the octets
+ comprising an audio frame SHALL be appended to the payload as a unit.
+ For each frame-block, the audio frames are concatenated in the order
+ indicated by the table in Section 4.1 of [RFC3551] for the number of
+ channels configured for the payload type in use. So the first
+ channel (leftmost) indicated comes first followed by the next
+ channel. The audio frame-blocks are packetized in increasing
+ timestamp order within each group of frame-blocks (per ToC entry),
+ i.e., oldest frame-block first. The groups of frame-blocks are
+ packetized in the same order as their corresponding ToC entries.
+
+ The audio frames are specified in ITU recommendation [ITU-T-G719].
+
+ The G.719 bit stream is split into a sequence of octets and
+ transmitted in order from the leftmost (most significant (MSB)) bit
+ to the rightmost (least significant (LSB)) bit.
+
+5.6. Implementation Considerations
+
+ An application implementing this payload format MUST understand all
+ the payload parameters specified in this specification. Any mapping
+ of the parameters to a signaling protocol MUST support all
+ parameters. So an implementation of this payload format in an
+ application using SDP is required to understand all the payload
+ parameters in their SDP-mapped form. This requirement ensures that
+ an implementation always can decide whether it is capable of
+ communicating when the communicating entities support this version of
+ the specification.
+
+ Basic mode SHALL be implemented and the interleaved mode SHOULD be
+ implemented. The implementation burden of both is rather small, and
+ supporting both ensures interoperability. However, interleaving is
+ not mandated as it has limited applicability for conversational
+ applications that require tight delay boundaries.
+
+5.6.1. Receiving Redundant Frames
+
+ The reception of redundant audio frames, i.e., more than one audio
+ frame from the same source for the same time slot, MUST be supported
+ by the implementation. In the case that the receiver gets multiple
+ audio frames in different bitrates for the same time slot, it is
+ RECOMMENDED that the receiver keeps the one with the highest bitrate.
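+
+ One possible way to follow this recommendation is sketched below
+ (non-normative). The sketch assumes that frames arrive as
+ (timestamp, frame octets) pairs and uses the frame length as a
+ stand-in for the bitrate, which holds because every G.719 frame
+ covers 20 ms. The name select_frames is hypothetical.
+

```python
def select_frames(received):
    """Keep, for each timestamp, the longest (highest-bitrate) frame."""
    best = {}
    for timestamp, frame in received:
        # A zero-length NO_DATA frame never replaces real audio data.
        if timestamp not in best or len(frame) > len(best[timestamp]):
            best[timestamp] = frame
    return best
```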
+
+5.6.2. Interleaving
+
+ The use of interleaving requires further considerations. As
+ presented in the example in Section 4.3.2, a given interleaving
+ pattern requires a certain amount of the de-interleaving buffer.
+ This buffer space, expressed in a number of transport frame slots, is
+ indicated by the "interleaving" media type parameter. The number of
+ frame slots needed can be converted into actual memory requirements
+ by considering the 320 bytes per frame used by the highest bitrate of
+ G.719.
+
+ The information about the frame buffer size is not always sufficient
+ to determine when it is appropriate to start consuming frames from
+ the interleaving buffer. Additional information is needed when the
+ interleaving pattern changes. The "int-delay" media type parameter
+ is defined to convey this information. It allows a sender to
+ indicate the minimal media time that needs to be present in the
+ buffer before the decoder can start consuming frames from the buffer.
+ Because the sender has full control over the interleaving pattern, it
+ can calculate this value. In certain cases (for example, if joining
+ a multicast session with interleaving mid-session), a receiver may
+ initially receive only part of the packets in the interleaving
+ pattern. This initial partial reception (in frame sequence order) of
+ frames can yield too few frames for acceptable quality from the audio
+ decoding. This problem also arises when using encryption for access
+ control, and the receiver does not have the previous key. Although
+ G.719 is robust and thus tolerant to a high random frame erasure
+ rate, it would have difficulties handling consecutive frame losses at
+ startup. Thus, some special implementation considerations are
+ described.
+
+ In order to handle this type of startup efficiently, decoding can
+ start provided that:
+
+ 1. There are at least two consecutive frames available.
+
+ 2. At least half of the frames are available in the time period
+ between the point where decoding was planned to start and the
+ most forward received frame.
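+
+ The two startup conditions can be expressed as a small check. The
+ following Python sketch is non-normative and rests on two
+ assumptions that this document does not state explicitly: both
+ conditions must hold together, and the two consecutive frames may
+ lie anywhere between the planned start and the most forward received
+ frame. Frames are identified by hypothetical integer indices.
+

```python
def can_start_decoding(received, start):
    """received: set of buffered frame indices; start: planned first index."""
    window = {index for index in received if index >= start}
    if not window:
        return False
    newest = max(window)
    # Condition 1: at least two consecutive frames are available.
    two_consecutive = any(index + 1 in window for index in window)
    # Condition 2: at least half of the frames from start to newest exist.
    span = newest - start + 1
    at_least_half = 2 * len(window) >= span
    return two_consecutive and at_least_half
```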
+
+ After receiving a number of packets, in the worst case as many
+ packets as the interleaving pattern covers, the previously described
+ effects disappear and normal decoding is resumed. Similar issues
+ arise when a receiver leaves a session or has lost access to the
+ stream. If the receiver leaves the session, this would be a minor
+ issue since playout is normally stopped. The sender can avoid this
+ type of problem in many sessions by starting and ending interleaving
+ patterns correctly when risks of losses occur. One such example is a
+ key-change done for access control to encrypted streams. If only
+ some keys are provided to clients and there is a risk they will
+ receive content for which they do not have the key, it is recommended
+ that interleaving patterns do not overlap key changes.
+
+5.6.3. Decoding Validation
+
+ If the receiver finds a mismatch between the size of a received
+ payload and the size indicated by the ToC of the payload, the
+ receiver SHOULD discard the packet. This is recommended because
+ decoding a frame parsed from a payload based on erroneous ToC data
+ could severely degrade the audio quality.
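+
+ The size check can be sketched as follows (non-normative). The
+ sketch assumes the ToC has already been parsed into pairs of
+ (frame length in bytes, number of frame-blocks) and that the number
+ of channels is known from session setup; the function name is
+ hypothetical.
+

```python
def payload_size_matches(payload_len, toc_len, entries, channels):
    """Compare the received payload size with the size the ToC implies."""
    audio_len = sum(
        frame_len * n_frame_blocks * channels
        for frame_len, n_frame_blocks in entries
    )
    return payload_len == toc_len + audio_len
```
+
+ A receiver following the recommendation above would discard the
+ packet whenever this check returns False.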
+
+6. Payload Examples
+
+ A few examples to highlight the payload format follow.
+
+6.1. 3 Mono Frames with 2 Different Bitrates
+
+ The first example is a payload consisting of 3 mono frames where the
+ first 2 frames correspond to a bitrate of 32 kbps (80 bytes/frame)
+ and the last is 48 kbps (120 bytes/frame).
+
+ The first 32 bits are ToC fields.
+ Bit 0 is '1' as another ToC field follows.
+ Bits 1..5 are '01000' = 80 bytes/frame.
+ Bits 8..15 are '00000010' = 2 frame-blocks with 80 bytes/frame.
+ Bit 16 is '0', no more ToC follows.
+ Bits 17..21 are '01100' = 120 bytes/frame.
+ Bits 24..31 are '00000001' = 1 frame-block with 120 bytes/frame.
+
+  0                   1                   2                   3
+  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |1|0 1 0 0 0|0 0|0 0 0 0 0 0 1 0|0|0 1 1 0 0|0 0|0 0 0 0 0 0 0 1|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |d(0)                       frame 1                             |
+ .                                                               .
+ |                                                         d(639)|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |d(0)                       frame 2                             |
+ .                                                               .
+ |                                                         d(639)|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |d(0)                       frame 3                             |
+ .                                                               .
+ |                                                         d(959)|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
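+
+ The four ToC octets of this example can be reproduced with a short,
+ non-normative Python sketch. The helper basic_toc_entry is a
+ hypothetical name; it packs the F bit, the L field, two zero R bits,
+ and the #frames octet as described in Sections 5.2.1 and 5.3.
+

```python
def basic_toc_entry(l_value: int, n_frame_blocks: int, more_follows: bool) -> bytes:
    """Build one 2-octet basic-mode ToC entry."""
    if not 0 <= l_value <= 31:
        raise ValueError("L is a 5-bit field")
    if not 0 <= n_frame_blocks <= 255:
        raise ValueError("#frames is a 1-octet field")
    # F occupies the most significant bit, L the next five, R bits stay zero.
    first_octet = (int(more_follows) << 7) | (l_value << 2)
    return bytes([first_octet, n_frame_blocks])


# Two frames of 80 bytes (L=8), then one frame of 120 bytes (L=12).
toc = basic_toc_entry(8, 2, True) + basic_toc_entry(12, 1, False)
```
+
+ The resulting octets are a0 02 30 01 in hexadecimal, which matches
+ the first 32 bits shown in the figure.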
+
+6.2. 2 Stereo Frame-Blocks of the Same Bitrate
+
+ The second example is a payload consisting of 2 stereo frame-blocks
+ that correspond to a bitrate of 32 kbps (80 bytes/frame) per channel.
+ The receiver calculates the number of audio frames in the payload by
+ multiplying the value of the "channels" parameter (2) with the
+ #frames field value (2), which gives 4 audio frames.
+
+ The first 16 bits are the ToC field.
+ Bit 0 is '0' as no ToC field follows.
+ Bits 1..5 are '01000' = 80 bytes/frame.
+ Bits 8..15 are '00000010' = 2 frame-blocks with 80 bytes/frame.
+
+  0                   1                   2                   3
+  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |0|0 1 0 0 0|0 0|0 0 0 0 0 0 1 0|  d(0)       frame 1 left ch.  |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ .                                                               .
+ |                         d(639)|  d(0)       frame 1 right ch. |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ .                                                               .
+ |                         d(639)|  d(0)       frame 2 left ch.  |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ .                                                               .
+ |                         d(639)|  d(0)       frame 2 right ch. |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |                         d(639)|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
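+
+ The frame count rule of this example ("channels" times #frames) can
+ be sketched as below. The function name split_frame_blocks is
+ hypothetical, and the sketch assumes a single ToC entry and the
+ channel order shown in the figure (left before right).
+

```python
from typing import List


def split_frame_blocks(data: bytes, frame_bytes: int, blocks: int,
                       channels: int) -> List[List[bytes]]:
    """Split audio data into frame-blocks, each a list of per-channel frames."""
    total = frame_bytes * blocks * channels  # channels x #frames frames
    if len(data) != total:
        raise ValueError(f"expected {total} bytes, got {len(data)}")
    frames = [data[i:i + frame_bytes] for i in range(0, total, frame_bytes)]
    # Consecutive frames belong to the same frame-block, one per channel.
    return [frames[b * channels:(b + 1) * channels] for b in range(blocks)]


# Example 6.2: 2 frame-blocks, 2 channels, 80 bytes/frame -> 4 frames.
audio = bytes(range(80)) * 4  # placeholder data, not real G.719 frames
frame_blocks = split_frame_blocks(audio, frame_bytes=80, blocks=2, channels=2)
left, right = frame_blocks[0]
print(len(frame_blocks), len(left), len(right))  # 2 80 80
```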
+
+6.3. 4 Mono Frames Interleaved
+
+ The third example is a payload consisting of 4 mono frames that
+ correspond to a bitrate of 32 kbps (80 bytes/frame) interleaved. A
+ pattern of interleaving for constant delay when aggregating 4 frames
+ is used in the example below. The actual packet illustrated is
+ packet n, while the previous and following packets' frame-block
+ content is shown to illustrate the pattern.
+
+ Packet n-3: 1, 6, 11, 16
+ Packet n-2: 5, 10, 15, 20
+ Packet n-1: 9, 14, 19, 24
+ Packet n: 13, 18, 23, 28
+ Packet n+1: 17, 22, 27, 32
+ Packet n+2: 21, 26, 31, 36
+
+ The first 32 bits are the ToC field (16 bits) and the four DIS fields.
+ Bit 0 is '0' as there is no ToC field following.
+ Bits 1..5 are '01000' = 80 bytes/frame.
+ Bits 8..15 are '00000100' = 4 frame-blocks with 80 bytes/frame.
+ Bits 16..19 are '0000' = DIS1 (0).
+ Bits 20..23 are '0100' = DIS2 (4).
+ Bits 24..27 are '0100' = DIS3 (4).
+ Bits 28..31 are '0100' = DIS4 (4).
+
+  0                   1                   2                   3
+  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |0|0 1 0 0 0|0 0|0 0 0 0 0 1 0 0|0 0 0 0|0 1 0 0|0 1 0 0|0 1 0 0|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | d(0)                       frame 13                           |
+ .                                                               .
+ |                                                         d(639)|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | d(0)                       frame 18                           |
+ .                                                               .
+ |                                                         d(639)|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | d(0)                       frame 23                           |
+ .                                                               .
+ |                                                         d(639)|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | d(0)                       frame 28                           |
+ .                                                               .
+ |                                                         d(639)|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
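+
+ The interleaving pattern tabulated in this example can be
+ reproduced with the short sketch below. The helper names
+ packet_frames and dis_fields are hypothetical. The per-packet advance
+ of 4 and the in-packet stride of 5 are read directly off the table
+ in this example; the sketch only extracts the DIS values and does
+ not interpret them.
+

```python
from typing import List


def packet_frames(k: int, first: int = 13, per_packet: int = 4,
                  stride: int = 5) -> List[int]:
    """Frame numbers carried by packet n+k in the pattern tabulated above."""
    start = first + per_packet * k
    return [start + stride * j for j in range(per_packet)]


def dis_fields(payload: bytes, count: int) -> List[int]:
    """Read `count` 4-bit DIS values that follow the 16-bit ToC field."""
    values = []
    for i in range(count):
        byte = payload[2 + i // 2]
        # Even index: high nibble. Odd index: low nibble.
        values.append(byte >> 4 if i % 2 == 0 else byte & 0x0F)
    return values


# The first 32 bits of packet n as drawn in the figure.
header = bytes([0b00100000, 0b00000100, 0b00000100, 0b01000100])
print(dis_fields(header, header[1]))  # [0, 4, 4, 4]
print([packet_frames(k) for k in (-3, 0, 2)])
# [[1, 6, 11, 16], [13, 18, 23, 28], [21, 26, 31, 36]]
```

+ With this pattern no frame number is carried by more than one
+ packet, which is what allows every packet to hold exactly 4 frames
+ at a constant delay.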
+
+7. Payload Format Parameters
+
+ This RTP payload format is identified using the media type
+ audio/G719, which is registered in accordance with [RFC4855] and
+ uses the template of [RFC4288].
+
+7.1. Media Type Definition
+
+ The media type for the G.719 codec is allocated from the IETF tree
+ since G.719 has the potential to become a widely used audio codec in
+ general Voice over IP (VoIP), teleconferencing, and streaming
+ applications. This media type registration covers real-time transfer
+ via RTP.
+
+ Note, any unspecified parameter MUST be ignored by the receiver to
+ ensure that additional parameters can be added in any future revision
+ of this specification.
+
+ Type name: audio
+
+ Subtype name: G719
+
+ Required parameters: none
+
+ Optional parameters:
+
+ interleaving: Indicates that interleaved mode SHALL be used for the
+ payload. The parameter specifies the number of frame-block slots
+ available in a de-interleaving buffer (including the frame that is
+ ready to be consumed) for each source. Its value is equal to one
+ plus the maximum number of frames that can precede any frame in
+ transmission order and follow the frame in RTP timestamp order.
+ The value MUST be greater than zero. If this parameter is not
+ present, interleaved mode SHALL NOT be used.
+
+ int-delay: The minimal media time delay in milliseconds that is
+ needed to avoid underrun in the de-interleaving buffer before
+ starting decoding, i.e., the difference in RTP timestamp ticks
+ between the earliest and latest audio frame present in the de-
+ interleaving buffer expressed in milliseconds. The value is a
+ stream property and provided per source. The allowed values are
+ zero to the largest value expressible by an unsigned 16-bit
+ integer (65535). Please note that in practice, the largest value
+ that can be used is equal to the declared size of the interleaving
+ buffer of the receiver. If the value for some reason is larger
+ than the receiver buffer declared by or for the receiver, this
+ value defaults to the size of the receiver buffer. For sources
+ for which this value hasn't been provided, the value defaults to
+ the size of the receiver buffer. The format is a comma-separated
+ list of synchronization source (SSRC) ":" delay in ms pairs, which
+ in ABNF [RFC5234] is expressed as:
+
+ int-delay = "int-delay=" source-delay *("," source-delay)
+
+ source-delay = SSRC ":" delay-value
+
+ SSRC = 1*8HEXDIG ; The 32-bit SSRC encoded in hex format
+
+ delay-value = 1*5DIGIT ; The delay value in milliseconds
+
+ Example: int-delay=ABCD1234:1000,4321DCB:640
+
+ NOTE: No white space is allowed in the parameter before the end of
+ all the value pairs.
+
+ max-red: The maximum duration in milliseconds that elapses between
+ the primary (first) transmission of a frame and any redundant
+ transmission that the sender will use. This parameter allows a
+ receiver to have a bounded delay when redundancy is used. Allowed
+ values are between zero (no redundancy will be used) and 65535.
+ If the parameter is omitted, no limitation on the use of
+ redundancy is present.
+
+ channels: The number of audio channels. The possible values (1-6)
+ and their respective channel order are specified in Section 4.1 of
+ [RFC3551]. If omitted, it has the default value of 1.
+
+ CBR: Constant Bitrate (CBR) indicates the exact codec bitrate in
+ bits per second (not including the overhead from packetization,
+ RTP header, or lower layers) that the codec MUST use. "CBR" is to
+ be used when the dynamic rate cannot be supported (one case is,
+ e.g., gateway to H.320). "CBR" is mostly used for gateways to
+ circuit-switched networks. Therefore, the "CBR" is the rate not
+ including any FEC as specified in Section 4.3.1. If FEC is to be
+ used, the "b=" parameter MUST be used to allow the extra bitrate
+ needed to send the redundant information. It is RECOMMENDED that
+ this parameter is only used when necessary to establish a working
+ communication. The usage of this parameter has implications for
+ congestion control that need to be considered; see Section 9.
+
+ ptime: see [RFC4566].
+
+ maxptime: see [RFC4566].
+
+ Encoding considerations: This media type is framed and binary; see
+ Section 4.8 of [RFC4288].
+
+ Security considerations: See Section 10 of RFC 5404.
+
+ Interoperability considerations: The support of the Interleaving
+ mode is not mandatory and needs to be negotiated. See Section 7.2
+ for how to do that for SDP-based protocols.
+
+ Published specification: RFC 5404
+
+ Applications that use this media type: Real-time audio applications
+ like Voice over IP and teleconference, and multi-media streaming.
+
+ Additional information: none
+
+ Person & email address to contact for further information:
+ Ingemar Johansson
+ <ingemar.s.johansson@ericsson.com>
+
+ Intended usage: COMMON
+
+ Restrictions on usage: This media type depends on RTP framing, and
+ hence is only defined for transfer via RTP [RFC3550]. Transport
+ within other framing protocols is not defined at this time.
+
+ Author:
+ Ingemar Johansson <ingemar.s.johansson@ericsson.com>
+ Magnus Westerlund <magnus.westerlund@ericsson.com>
+
+ Change controller: IETF Audio/Video Transport working group
+ delegated from the IESG.
+
+ Additionally, note that file storage of G.719-encoded audio in ISO
+ base media file format is specified in Annex A of [ITU-T-G719].
+ Thus, media file formats such as MP4 (audio/mp4 or video/mp4)
+ [RFC4337] and 3GP (audio/3GPP and video/3GPP) [RFC3839] can contain
+ G.719-encoded audio.
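+
+ As an illustration of the "int-delay" value syntax defined in this
+ section, the sketch below parses the comma-separated SSRC ":" delay
+ pairs. The function name parse_int_delay is hypothetical. It
+ operates on the parameter value only (the text after the parameter
+ name) and rejects white space, as required by the NOTE in the
+ parameter definition.
+

```python
import re
from typing import Dict

# source-delay = SSRC ":" delay-value, with SSRC = 1*8HEXDIG and
# delay-value = 1*5DIGIT, following the ABNF in this section.
_PAIR = re.compile(r"([0-9A-Fa-f]{1,8}):([0-9]{1,5})")


def parse_int_delay(value: str) -> Dict[int, int]:
    """Map each SSRC to its delay in milliseconds."""
    delays = {}
    for pair in value.split(","):
        match = _PAIR.fullmatch(pair)
        if match is None:
            raise ValueError(f"malformed source-delay pair: {pair!r}")
        delay = int(match.group(2))
        if delay > 65535:  # largest unsigned 16-bit value
            raise ValueError(f"delay out of range: {delay}")
        delays[int(match.group(1), 16)] = delay
    return delays


delays = parse_int_delay("ABCD1234:1000,4321DCB:640")
# delays maps 0xABCD1234 to 1000 and 0x04321DCB to 640
```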
+
+7.2. Mapping to SDP
+
+ The information carried in the media type specification has a
+ specific mapping to fields in the Session Description Protocol (SDP)
+ [RFC4566], which is commonly used to describe RTP sessions. When SDP
+ is used to specify sessions employing the G.719 codec, the mapping is
+ as follows:
+
+ o The media type ("audio") goes in SDP "m=" as the media name.
+
+ o The media subtype (payload format name) goes in SDP "a=rtpmap" as
+ the encoding name. The RTP clock rate in "a=rtpmap" MUST be
+ 48000, and the encoding parameter "channels" (Section 7.1) MUST
+ either be explicitly set to N or omitted, implying a default value
+ of 1. The values of N that are allowed are specified in Section
+ 4.1 in [RFC3551].
+
+ o The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and
+ "a=maxptime" attributes, respectively.
+
+ o Any remaining parameters go in the SDP "a=fmtp" attribute by
+ copying them directly from the media type parameter string as a
+ semicolon-separated list of parameter=value pairs.
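+
+ The mapping rules listed above can be sketched as a small helper
+ that emits the media-level SDP lines. The function name g719_sdp,
+ the port 49170, and the payload type 97 are hypothetical values
+ chosen for illustration; only the attribute placement follows the
+ rules of this section.
+

```python
from typing import Dict, List, Optional


def g719_sdp(port: int, payload_type: int, channels: int = 1,
             fmtp: Optional[Dict[str, str]] = None,
             ptime: Optional[int] = None,
             maxptime: Optional[int] = None) -> List[str]:
    """Build the SDP media-level lines for one G.719 payload type."""
    # Clock rate is always 48000; "channels" is omitted when it is 1.
    encoding = "G719/48000" + (f"/{channels}" if channels != 1 else "")
    lines = [f"m=audio {port} RTP/AVP {payload_type}",
             f"a=rtpmap:{payload_type} {encoding}"]
    if fmtp:
        # Remaining parameters: semicolon-separated parameter=value pairs.
        params = ";".join(f"{name}={value}" for name, value in fmtp.items())
        lines.append(f"a=fmtp:{payload_type} {params}")
    if ptime is not None:
        lines.append(f"a=ptime:{ptime}")
    if maxptime is not None:
        lines.append(f"a=maxptime:{maxptime}")
    return lines


sdp = g719_sdp(49170, 97, channels=2,
               fmtp={"interleaving": "4", "max-red": "0"},
               ptime=20, maxptime=80)
print("\n".join(sdp))
# m=audio 49170 RTP/AVP 97
# a=rtpmap:97 G719/48000/2
# a=fmtp:97 interleaving=4;max-red=0
# a=ptime:20
# a=maxptime:80
```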
+
+7.2.1. Offer/Answer Considerations
+
+ The following considerations apply when using SDP offer/answer
+ procedures to negotiate the use of G.719 payload in RTP:
+
+ o Each combination of the RTP payload transport format configuration
+ parameters ("interleaving" and "channels") is unique in its bit
+ pattern and not compatible with any other combination. When
+ creating an offer in an application desiring to use the more
+ advanced features (interleaving or more than one channel), the
+ offerer is RECOMMENDED to also offer a payload type containing
+ only the configuration with a single channel. If multiple
+ configurations are of interest to the application, they may all be
+ offered; however, care should be taken not to offer too many
+ payload types. An SDP answerer MUST include, in the SDP answer
+ for a payload type, the following parameters unmodified from the
+ SDP offer (unless it removes the payload type): "interleaving" and
+ "channels". However, the value of the "interleaving" parameter
+ MAY be changed. The SDP offerer and answerer MUST generate G.719
+ packets as described by these parameters.
+
+ o The "interleaving" and "int-delay" parameters' values have a
+ specific relationship that needs to be considered. It also
+ depends on the directionality of the streams and their delivery
+ method. The high-level explanation that can be understood from
+ the definition is that the value of "interleaving" declares the
+ size of the receiver buffer, while "int-delay" is a stream
+ property provided by the sender to inform how much buffer space it
+ in practice is using for the stream it sends.
+
+ * For media streams that are sent over multicast, the value of
+ "interleaving" SHALL NOT be changed by the answerer. It shall
+ either be accepted or the payload type deleted. The value of
+ the "int-delay" parameter is a stream property and provided by
+ the offer/answer agent that intends to send media with this
+ payload type, and for each stream coming from that agent (one
+ or more). The value MUST be between zero and what corresponds
+ to the buffer size declared by the value of the "interleaving"
+ parameter.
+
+ * For unicast streams that the offerer declares as sendonly, the
+ value of the "interleaving" parameter is the size that the
+ answerer is RECOMMENDED to use by the offerer. The answerer
+ MAY change it to any allowed value. The "int-delay" parameter
+ value will be the one the offerer intends to use unless the
+ answerer reduces the value of the "interleaving" parameter
+ below what is needed for that "int-delay" value. If the
+ "interleaving" value in the answer is smaller than the offer's
+ "int-delay" value, the "int-delay" value is per default reduced
+ to be corresponding to the "interleaving" value. If the
+ offerer is not satisfied with this, it will need to perform
+ another round of offer/answer. As the answerer will not send
+ any media, it doesn't include any "int-delay" in the answer.
+
+ * For unicast streams that the offerer declares as recvonly, the
+ value of "interleaving" in the offer will be the offerer's size
+ of the interleaving buffer. The answerer indicates its
+ preferred size of the interleaving buffer for any future round
+ of offer/answer. The offerer will not provide any "int-delay"
+ parameter as it is not sending any media. The answerer is
+ recommended to include in its answer an "int-delay" parameter
+ to declare what the property is for the stream it is going to
+ send. The answerer is expected to be capable of selecting a
+ valid parameter value that is between zero and the declared
+ maximum number of slots in the de-interleaving buffer.
+
+ * For unicast streams that the offerer declares as sendrecv
+ streams, the value of the "interleaving" parameter in the offer
+ will be the offerer's size of the interleaving buffer. The
+ answerer will in the answer indicate the size of its actual
+ interleaving buffer. It is recommended that this value is at
+ least as big as the offer's. The offerer is recommended to
+ include an "int-delay" parameter that is selected based on the
+ answerer having at least as much interleaving space as the
+ offerer unless something else is known. As the answerer's
+ interleaving buffer size is not yet known, this may fail, in
+ which case the default rule is to downgrade the value of the
+ "int-delay" to correspond to the full size of the answerer's
+ interleaving buffer. If the offerer isn't satisfied with this,
+ it will need to initiate another round of offer/answer. The
+ answerer is recommended in its answer to include an "int-delay"
+ parameter to declare what the property is for the stream(s) it
+ is going to send. The answerer is expected to be capable of
+ selecting a valid parameter value that is between zero and the
+ declared maximum number of slots in the de-interleaving buffer.
+
+ o In most cases, the parameters "maxptime" and "ptime" will not
+ affect interoperability; however, the setting of the parameters
+ can affect the performance of the application. The SDP offer/
+ answer handling of the "ptime" parameter is described in
+ [RFC3264]. The "maxptime" parameter MUST be handled in the same
+ way.
+
+ o The parameter "max-red" is a stream property parameter. For
+ sendonly or sendrecv unicast media streams, the parameter declares
+ the limitation on redundancy that the stream sender will use. For
+ recvonly streams, it indicates the desired value for the stream
+ sent to the receiver. The answerer MAY change the value, but is
+ RECOMMENDED to use the same limitation as the offer declares. In
+ the case of multicast, the offerer MAY declare a limitation; this
+ SHALL be answered using the same value. A media sender using this
+ payload format is RECOMMENDED to always include the "max-red"
+ parameter. This information is likely to simplify the media
+ stream handling in the receiver. This is especially true if no
+ redundancy will be used, in which case "max-red" is set to zero.
+
+ o Any unknown parameter in an offer SHALL be removed in the answer.
+
+ o The "b=" SDP parameter SHOULD be used to negotiate the maximum
+ bandwidth to be used for the audio stream. The offerer may offer
+ a maximum rate and the answer may contain a lower rate. If no
+ "b=" parameter is present in the offer or answer, it implies a
+ rate up to 128 kbps.
+
+ o The parameter "CBR" is a receiver capability; i.e., only receivers
+ that really require a constant bitrate should use it. Usage of
+ this parameter has a negative impact on the possibility to perform
+ congestion control; see Section 9. For recvonly and sendrecv
+ streams, it indicates the desired constant bitrate that the
+ receiver wants to accept. A sender MUST be able to send a
+ constant bitrate stream since it is a subset of the variable
+ bitrate capability. If the offer includes this parameter, the
+ answerer MUST send G.719 audio at the constant bitrate if it is
+ within the allowed session bitrate ("b=" parameter). If the
+ answerer cannot support the stated CBR, this payload type must be
+ refused in the answer. The answerer SHOULD only include this
+ parameter if the answerer itself requires to receive at a constant
+ bitrate, even if the offer did not include the "CBR" parameter.
+ In this case, the offerer SHALL send at the constant bitrate, but
+ SHALL be able to accept media at a variable bitrate. An answerer
+ is RECOMMENDED to use the same CBR as in the offer, as symmetric
+ usage is more likely to work. If both sides require a particular
+ CBR, there is the possibility of communication failure when one or
+ both sides can't transmit the requested rate. In this case, the
+ agent detecting this issue will have to perform a second round of
+ offer/answer to try to find another working configuration or end
+ the established session. In case the offer contained a "CBR"
+ parameter but the answer does not, then the offerer is free to
+ transmit at any rate to the answerer, but the answerer is
+ restricted to the declared rate.
+
+7.2.2. Declarative SDP Considerations
+
+ In declarative usage, like SDP in the Real Time Streaming Protocol
+ (RTSP) [RFC2326] or the Session Announcement Protocol (SAP)
+ [RFC2974], the parameters SHALL be interpreted as follows:
+
+ o The payload format configuration parameters ("interleaving" and
+ "channels") are all declarative, and a participant MUST use the
+ configuration(s) that is provided for the session. More than one
+ configuration may be provided if necessary by declaring multiple
+ RTP payload types; however, the number of types should be kept
+ small.
+
+ o It might not be possible to know the SSRC values that are going to
+ be used by the sources at the time of sending the SDP. This is
+ not a major issue as the size of the interleaving buffer can be
+ tailored towards the values that are actually going to be used,
+ thus ensuring that the default values for "int-delay" are not
+ resulting in too much extra buffering.
+
+ o Any "maxptime" and "ptime" values should be selected with care to
+ ensure that the session's participants can achieve reasonable
+ performance.
+
+ o The parameter "CBR" if included applies to all RTP streams using
+ that payload type for which a particular CBR is declared. Usage
+ of this parameter has a negative impact on the possibility to
+ perform congestion control; see Section 9.
+
+8. IANA Considerations
+
+ One media type (audio/G719) has been defined and registered in the
+ media types registry; see Section 7.1.
+
+9. Congestion Control
+
+ The general congestion control considerations for transporting RTP
+ data apply; see RTP [RFC3550] and any applicable RTP profile like AVP
+ [RFC3551]. However, the multi-rate capability of G.719 audio coding
+ provides a mechanism that may help to control congestion, since the
+ bandwidth demand can be adjusted (within the limits of the codec) by
+ selecting a different encoding bitrate.
+
+ The number of frames encapsulated in each RTP payload highly
+ influences the overall bandwidth of the RTP stream due to header
+ overhead constraints. Packetizing more frames in each RTP payload
+ can reduce the number of packets sent and hence the header overhead,
+ at the expense of increased delay and reduced error robustness. If
+ forward error correction (FEC) is used, the amount of FEC-induced
+ redundancy needs to be regulated such that the use of FEC itself does
+ not cause a congestion problem. In other words, a sender SHALL NOT
+ increase the total bitrate when adding redundancy in response to
+ packet loss, and needs instead to adjust it down in accordance with
+ the congestion control algorithm being run. Thus, when adding
+ redundancy, the media bitrate will need to be reduced to provide room
+ for the redundancy.
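+
+ The effect of aggregation on header overhead can be made concrete
+ with a small calculation. The sketch below assumes IPv4 without
+ options (20 bytes), UDP (8 bytes), an RTP header without CSRC
+ entries or extensions (12 bytes), a single 2-byte ToC field, and
+ 20 ms frames (80 bytes per frame at 32 kbps). The function name
+ stream_bitrate is hypothetical.
+

```python
IPV4_UDP_RTP_HEADER = 20 + 8 + 12  # bytes; assumes no IP options, no CSRCs
TOC_BYTES = 2                      # one ToC field, non-interleaved mode
FRAME_MS = 20                      # 80 bytes/frame at 32 kbps -> 20 ms


def stream_bitrate(frame_bytes: int, frames_per_packet: int) -> float:
    """Total on-the-wire bitrate in bits per second for mono audio."""
    packet_bytes = (IPV4_UDP_RTP_HEADER + TOC_BYTES
                    + frame_bytes * frames_per_packet)
    packet_ms = frames_per_packet * FRAME_MS
    return packet_bytes * 8 * 1000 / packet_ms


for n in (1, 2, 4):
    print(n, stream_bitrate(80, n))
# 1 48800.0
# 2 40400.0
# 4 36200.0
```

+ Under these assumptions, going from 1 to 4 frames per packet lowers
+ the total rate of a 32 kbps stream from 48.8 kbps to 36.2 kbps, at
+ the cost of 60 ms of additional packetization delay.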
+
+ The "CBR" signaling parameter allows a receiver to lock down an RTP
+ payload type to use a single encoding rate. As this prevents the
+ codec rate from being lowered when congestion is experienced, the
+ sender is constrained to either change the packetization or abort the
+ transmission. Since these responses to congestion are severely
+ limited, implementations SHOULD NOT use the "CBR" parameter unless
+ they are interacting with a device that cannot support a variable
+ bitrate (e.g., a gateway to H.320 systems). When using CBR mode, a
+ receiver MUST monitor the packet loss rate to ensure congestion is
+ not caused, following the guidelines in Section 2 of RFC 3551.
+
+10. Security Considerations
+
+ RTP packets using the payload format defined in this specification
+ are subject to the security considerations discussed in the RTP
+ specification [RFC3550] and in any applicable RTP profile. The main
+ security considerations for the RTP packet carrying the RTP payload
+ format defined within this memo are confidentiality, integrity, and
+ source authenticity. Confidentiality is achieved by encryption of
+ the RTP payload. Integrity of the RTP packets is achieved through a
+ suitable cryptographic integrity protection mechanism. Such a
+ cryptographic system may also allow the authentication of the source
+ of the payload. A suitable security mechanism for this RTP payload
+ format should provide confidentiality, integrity protection, and at
+ least source authentication capable of determining if an RTP packet
+ is from a member of the RTP session.
+
+ Note that the appropriate mechanism to provide security to RTP and
+ payloads following this memo may vary. It is dependent on the
+ application, the transport, and the signaling protocol employed.
+ Therefore, a single mechanism is not sufficient, although if
+ suitable, usage of the Secure Real-time Transport Protocol (SRTP)
+ [RFC3711] is recommended. Other mechanisms that may be used are
+ IPsec [RFC4301] and Transport Layer Security (TLS) [RFC5246] (RTP
+ over TCP); other alternatives may exist.
+
+ The use of interleaving in conjunction with encryption can have a
+ negative impact on confidentiality for a short period of time.
+ Consider the following packets (in brackets) containing frame numbers
+ as indicated: {10, 14, 18}, {13, 17, 21}, {16, 20, 24} (a popular
+ continuous diagonal interleaving pattern). The originator wishes to
+ deny some participants the ability to hear material starting at time
+ 16. Simply changing the key on the packet with the timestamp at or
+ after 16, and denying that new key to those participants, does not
+ achieve this; frames 17, 18, and 21 have been supplied in prior
+ packets under the prior key, and error concealment may make the audio
+ intelligible at least as far as frame 18 or 19, and possibly further.
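+
+ The leak in this example can be checked mechanically. The sketch
+ below assumes that the key change takes effect at the first packet
+ whose earliest frame is at or after the cutoff; the function name
+ frames_sent_before_rekey is hypothetical.
+

```python
from typing import List

packets = [[10, 14, 18], [13, 17, 21], [16, 20, 24]]
cutoff = 16  # material from this time on is meant to be withheld


def frames_sent_before_rekey(packets: List[List[int]], cutoff: int) -> List[int]:
    """Frames at or after `cutoff` that were sent under the prior key."""
    leaked = []
    for frames in packets:
        if min(frames) >= cutoff:
            break  # assumed point of the key change
        leaked.extend(f for f in frames if f >= cutoff)
    return sorted(leaked)


print(frames_sent_before_rekey(packets, cutoff))  # [17, 18, 21]
```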
+
+ This RTP payload format and its media decoder do not exhibit any
+ significant non-uniformity in the receiver-side computational
+ complexity for packet processing, and thus are unlikely to pose a
+ denial-of-service threat due to the receipt of pathological data.
+ Nor does the RTP payload format contain any active content.
+
+11. Acknowledgements
+
+ The authors would like to thank Roni Even and Anisse Taleb for their
+ help with this document. We would also like to thank the people who
+ have provided feedback: Colin Perkins, Mark Baker, and Stephen
+ Botzko.
+
+12. References
+
+12.1. Normative References
+
+ [ITU-T-G719] ITU-T, "Specification : ITU-T G.719 extension for 20
+ kHz fullband audio", April 2008.
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+ [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer
+ Model with Session Description Protocol (SDP)",
+ RFC 3264, June 2002.
+
+ [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
+ Jacobson, "RTP: A Transport Protocol for Real-Time
+ Applications", STD 64, RFC 3550, July 2003.
+
+ [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio
+ and Video Conferences with Minimal Control", STD 65,
+ RFC 3551, July 2003.
+
+ [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP:
+ Session Description Protocol", RFC 4566, July 2006.
+
+ [RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
+ Specifications: ABNF", STD 68, RFC 5234, January 2008.
+
+ [RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage
+ Guidelines for Application Designers", BCP 145,
+ RFC 5405, November 2008.
+
+12.2. Informative References
+
+ [RFC2198] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
+ Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
+ Parisis, "RTP Payload for Redundant Audio Data",
+ RFC 2198, September 1997.
+
+ [RFC2326] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time
+ Streaming Protocol (RTSP)", RFC 2326, April 1998.
+
+ [RFC2974] Handley, M., Perkins, C., and E. Whelan, "Session
+ Announcement Protocol", RFC 2974, October 2000.
+
+ [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and
+ K. Norrman, "The Secure Real-time Transport Protocol
+ (SRTP)", RFC 3711, March 2004.
+
+ [RFC3839] Castagno, R. and D. Singer, "MIME Type Registrations
+ for 3rd Generation Partnership Project (3GPP)
+ Multimedia files", RFC 3839, July 2004.
+
+ [RFC4288] Freed, N. and J. Klensin, "Media Type Specifications
+ and Registration Procedures", BCP 13, RFC 4288,
+ December 2005.
+
+ [RFC4301] Kent, S. and K. Seo, "Security Architecture for the
+ Internet Protocol", RFC 4301, December 2005.
+
+ [RFC4337] Lim, Y. and D. Singer, "MIME Type Registration for
+ MPEG-4", RFC 4337, March 2006.
+
+ [RFC4855] Casner, S., "Media Type Registration of RTP Payload
+ Formats", RFC 4855, February 2007.
+
+ [RFC5109] Li, A., "RTP Payload Format for Generic Forward Error
+ Correction", RFC 5109, December 2007.
+
+ [RFC5246] Dierks, T. and E. Rescorla, "The Transport Layer
+ Security (TLS) Protocol Version 1.2", RFC 5246,
+ August 2008.
+
+
+Authors' Addresses
+
+ Magnus Westerlund
+ Ericsson AB
+ Torshamnsgatan 21-23
+ SE-164 83 Stockholm
+ SWEDEN
+
+ Phone: +46 10 7190000
+ EMail: magnus.westerlund@ericsson.com
+
+ Ingemar Johansson
+ Ericsson AB
+ Laboratoriegrand 11
+ SE-971 28 Lulea
+ SWEDEN
+
+ Phone: +46 10 7190000
+ EMail: ingemar.s.johansson@ericsson.com
+