Diffstat (limited to 'doc/rfc/rfc4598.txt')
-rw-r--r-- | doc/rfc/rfc4598.txt | 955 |
1 files changed, 955 insertions, 0 deletions
diff --git a/doc/rfc/rfc4598.txt b/doc/rfc/rfc4598.txt new file mode 100644 index 0000000..8397d1c --- /dev/null +++ b/doc/rfc/rfc4598.txt @@ -0,0 +1,955 @@ + + + + + + +Network Working Group B. Link +Request for Comments: 4598 Dolby Laboratories +Category: Standards Track July 2006 + + + Real-time Transport Protocol (RTP) + Payload Format for Enhanced AC-3 (E-AC-3) Audio + +Status of This Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2006). + +Abstract + + This document describes a Real-time Transport Protocol (RTP) payload + format for transporting Enhanced AC-3 (E-AC-3) encoded audio data. + E-AC-3 is a high-quality, multichannel audio coding format and is an + extension of the AC-3 audio coding format, which is used in US High- + Definition Television (HDTV), DVD, cable and satellite television, + and other media. E-AC-3 is an optional audio format in US and world + wide digital television and high-definition DVD formats. The RTP + payload format as presented in this document includes support for + data fragmentation. + + + + + + + + + + + + + + + + + + + + +Link Standards Track [Page 1] + +RFC 4598 RTP Payload Format for E-AC-3-Audio July 2006 + + +Table of Contents + + 1. Introduction ....................................................2 + 2. Overview of Enhanced-AC-3 .......................................3 + 2.1. E-AC-3 Bit Stream ..........................................5 + 2.1.1. Sync Frames and Audio Blocks ........................5 + 2.1.2. Programs and Substreams .............................6 + 2.1.3. Frame Sets ..........................................7 + 3. 
RTP E-AC-3 Header Fields ........................................7 + 4. RTP E-AC-3 Payload Format .......................................8 + 4.1. Payload Specific Header ....................................8 + 4.2. Fragmentation of E-AC-3 Frames .............................9 + 4.3. Concatenation of E-AC-3 Frames .............................9 + 4.4. Carriage of AC-3 Frames ...................................10 + 5. Types and Names ................................................10 + 5.1. Media Type Registration ...................................10 + 5.2. SDP Usage .................................................13 + 6. Security Considerations ........................................14 + 7. Congestion Control .............................................15 + 8. IANA Considerations ............................................15 + 9. References .....................................................15 + 9.1. Normative References ......................................15 + 9.2. Informative References ....................................16 + +1. Introduction + + The Enhanced AC-3 (E-AC-3) [ETSI] audio coding system is built on a + foundation of AC-3. It is an enhancement and extension to AC-3, + which is an existing audio coding standard commonly used for DVD, + broadcast, cable, and satellite television content. E-AC-3 is + designed to enable operation at both higher and lower data rates than + AC-3, provide expanded channel configurations, and provide greater + flexibility for carriage of multiple audio program elements. The + relationship between E-AC-3 and AC-3 provides for low-loss, low-cost + conversion between the two and makes E-AC-3 especially suitable in + applications that require compatibility with the existing broadcast- + reception and audio/video decoding infrastructure. Dolby Digital + Plus is a branded version of Enhanced AC-3. 
+ + E-AC-3 has been standardized within both the European + Telecommunications Standards Institute (ETSI) and the Advanced + Television Systems Committee (ATSC). It is an optional audio format + for use in US (ATSC) and Digital Video Broadcasting (DVB) television + transmission. It is also a required audio format for use in the High + Definition (HD)-DVD optical-storage media format and included in the + Blu-ray Disc format. + + + + + +Link Standards Track [Page 2] + +RFC 4598 RTP Payload Format for E-AC-3-Audio July 2006 + + + There is a need to stream E-AC-3 content over IP networks. E-AC-3 is + primarily used in audio-for-video applications, so RTP serves well as + a transport solution with its mechanism for synchronizing streams. + Applications for streaming E-AC-3 include Internet Protocol + television (IPTV), video on demand, interactive features of next + generation DVD formats, and transfer of movies across a home network. + + Section 2 gives a brief overview of the E-AC-3 algorithm. Section 3 + specifies values for fields in the RTP header, and Section 4 + specifies the E-AC-3 payload format, itself. Section 5 discusses + media types and Session Description Protocol (SDP) usage. Security + considerations are covered in Section 6, congestion control in + Section 7, and IANA considerations in Section 8. + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in [RFC2119]. + +2. Overview of Enhanced-AC-3 + + Enhanced AC-3 (E-AC-3) is a frequency-domain perceptual audio coding + system. Time blocks of an audio signal are converted from the time + domain to the frequency domain by a transform (the Modified Discrete + Cosine Transform (MDCT)) so that a model of the human auditory + perceptual system can be applied. In this domain, quantization noise + can be constrained to specific frequency regions. 
The perceptual + model predicts in which frequency regions the auditory system will be + least able to detect the quantization noise from data rate reduction. + A more detailed technical description of E-AC-3 can be found in + [2004AES]. + + E-AC-3 is built upon a foundation of AC-3. More background on AC-3 + can be found in the AC-3 specification [ETSI], a technical paper + [1994AES], and the AC-3 RTP payload format [RFC4184]. The frame + structure and meta-data of AC-3 are maintained. E-AC-3 content is + not directly compatible with AC-3 decoders, but it can be converted + to the AC-3 format to provide compatibility with existing decoders. + Because AC-3 is the foundation of E-AC-3, conversion between the two + formats can be done in a way that minimizes the degradations + associated with tandem coding. In addition, the computational cost + of the conversion is reduced compared to a full decode and re-encode. + + E-AC-3 exploits psychoacoustic phenomena that cause a significant + fraction of the information contained in a typical audio signal to be + inaudible. Substantial data reduction occurs via the removal of + inaudible information contained in an audio stream. Source coding + techniques are further used to reduce the data rate. + + + + +Link Standards Track [Page 3] + +RFC 4598 RTP Payload Format for E-AC-3-Audio July 2006 + + + Like most perceptual coders, E-AC-3 operates in the frequency domain. + A 512-point MDCT transform is taken with 50% overlap, providing 256 + new frequency samples. Frequency samples are then converted to + exponents and mantissas. Exponents are differentially encoded. + Mantissas are allocated a varying number of bits depending on the + audibility of the spectral components associated with them. + Audibility is determined via a masking curve. Bits for mantissas are + allocated from a global bit pool. 
+ + E-AC-3 adds new coding tools, such as a longer filter bank, vector + quantization, and spectral extension, to provide greater data + efficiency and to operate at lower data rates than AC-3. In the + other direction, an expanded bit stream syntax and new frame + constraints permit operation at higher data rates than AC-3. The + E-AC-3 syntax also allows a larger number of audio channels in one + bit stream. E-AC-3 operates at data rates from 32 kbps to 6.144 Mbps + and at three sampling rates: 32 kHz, 44.1 kHz, and 48 kHz. + + E-AC-3 supports the carriage of multiple programs and the carriage of + programs with more than a baseline of 5.1 audio channels. Both of + these extensions beyond AC-3 are accomplished by time multiplexing + additional data with baseline data. In the case of multiple + programs, frames with data for the programs are interleaved. In the + case of more than 5.1 channels, frames from substreams carrying the + extra channels are interleaved with the independent substream that + carries a 5.1-channel compatible mix. Both of these forms of + multiplexing can occur in the same bit stream. In other words, + mixing multiple programs, some or all with more than 5.1 channels, is + permitted. + + Additional channel capacity is enabled by adding substreams to a + program. One primary substream, called the "independent substream", + is required for each program. This substream carries a self- + contained mix of the audio, using a maximum of 5.1 channels, which + makes its channel configuration compatible with AC-3. Then, + additional, optional substreams are used in the program to carry + additional channels. The data for each additional channel carries an + indication of whether that channel provides data for an additional + speaker location or replacement data for one of the speaker locations + already defined by a previous substream. For example, one common + 7.1-channel format uses three front channels and four surround + channels. 
+   It is packaged with a primary substream, which contains a
+   5.1-channel downmix of the 7.1-channel content, using left, center,
+   right, left surround, right surround, and low-frequency effects
+   channels. One dependent substream supplies four channels:
+   replacements for left surround and right surround, along with two
+   additional surround channels (left back and right back).
+
+   The specification for E-AC-3 [ETSI] requires that all E-AC-3 decoders
+   be capable of decoding at least a baseline portion of any E-AC-3 bit
+   stream, which consists of the first independent substream of the
+   first program, and of ignoring the other elements of the bit stream.
+   This baseline is limited to 5.1 channels, and a system is also able
+   to convert to configurations with fewer channels for a presentation
+   that matches its output capabilities, if needed. More capable
+   decoders can optionally choose among and mix multiple programs, and
+   also decode configurations with more channels than the baseline by
+   decoding dependent substreams.
+
+2.1. E-AC-3 Bit Stream
+
+2.1.1. Sync Frames and Audio Blocks
+
+   The basic organizational building block in an E-AC-3 bit stream is
+   the sync frame (also called a frame in this document). A sync frame
+   contains the data necessary to decode time domain audio samples for
+   one or more channels over a time of one or more audio blocks, so a
+   frame is an Application Data Unit (ADU). Each E-AC-3 frame contains
+   a Sync Information (SI) field, a Bit Stream Information (BSI) field,
+   an Audio Frame (AF) field, and up to six audio blocks (ABs). Each AB
+   represents 256 Pulse Code Modulation (PCM) samples for each channel.
+   The frame ends with an optional auxiliary data field (AUX) and an
+   error correction field (CRC). Figure 1 shows the structure of an
+   E-AC-3 frame, where N is the number of blocks in the frame.
+
+   +---+---+---+---------+- ... -+---------+---+---+
+   |SI |BSI|AF |  AB(0)  |  ...  |  AB(N)  |AUX|CRC|
+   +---+---+---+---------+- ... -+---------+---+---+
+
+   Figure 1. E-AC-3 frame format with more than one block
+
+   The SI field contains information needed to acquire and maintain
+   codec synchronization. The BSI field contains parameters that
+   describe the coded audio service. It carries an indication of the
+   size of the frame in 16-bit words ('frmsiz', Section E.1.3 of [ETSI])
+   and an indication of the sampling rate ('fscod'). It also carries an
+   indication of the number of blocks in the frame ('numblkscod');
+   permitted values are one, two, three, or six blocks. The AF field
+   contains information about coding tools that applies to the entire
+   frame. Each block has a duration of 256 samples, so a frame's
+   duration is the corresponding multiple of 256 samples. The time
+   duration of the frame is also dependent on the sampling rate, as
+   shown in Table 1.
+
+   Table 1. Time duration of E-AC-3 frame (number of blocks vs.
+   sampling rate)
+
+   +------------------+--------+-----------------+-----------------+
+   | blocks per frame | 32 kHz |     44.1 kHz    |      48 kHz     |
+   +------------------+--------+-----------------+-----------------+
+   |         1        |  8 ms  |  approx. 5.8 ms |  approx. 5.3 ms |
+   |         2        |  16 ms | approx. 11.6 ms | approx. 10.7 ms |
+   |         3        |  24 ms | approx. 17.4 ms |      16 ms      |
+   |         6        |  48 ms | approx. 34.8 ms |      32 ms      |
+   +------------------+--------+-----------------+-----------------+
+
+   Each audio block contains header fields that indicate the use of
+   various coding tools: block switching, dither, coupling, spectral
+   extension, and exponent strategy. They also contain metadata,
+   optionally used to enhance playback, such as dynamic range control.
+   Finally, the exponents and bit allocation data needed to decode the
+   mantissas into audio data, and the mantissas themselves, are
+   included.
+   The format of audio blocks is described in detail in
+   [ETSI].
+
+2.1.2. Programs and Substreams
+
+   An E-AC-3 bit stream is logically arranged into programs. A bit
+   stream contains one or more programs, up to a maximum of eight. When
+   multiple programs are present in a bit stream, the frames that
+   constitute them are interleaved in time.
+
+   +----------+-     -+----------+----------+-     -+----------+-
+   |Program(1)| ...   |Program(N)|Program(1)| ...   |Program(N)| ...
+   | Frame 0  |       | Frame 0  | Frame 1  |       | Frame 1  |
+   +----------+-     -+----------+----------+-     -+----------+-
+
+   Figure 2. Interleaving of multiple programs in an E-AC-3 bit stream
+
+   Each program contains one independent substream and optionally
+   contains up to eight dependent substreams. The independent substream
+   carries a soundtrack of up to 5.1 channels, the multichannel format
+   that matches the capabilities of AC-3, and can be meaningfully
+   decoded and presented without any of the associated dependent
+   substreams. The dependent substreams are used to provide alternate
+   channel data that enable different channel configurations, for
+   example, to increase the number of channels beyond 5.1. A frame of a
+   dependent substream can be decoded by itself, but its content can
+   only be meaningfully presented in conjunction with the corresponding
+   independent substream. The type and identity of the substream to
+   which a frame belongs can be determined from parameters in the
+   frame's BSI (strmtyp and substreamid, in Section E.1.3.1 of [ETSI]).
+
+   When a program contains more than one substream, the frames belonging
+   to those substreams are interleaved in time, and taken together, the
+   frames of a program that correspond to the same time period are
+   called a 'program set'. Figure 3 shows the interleaving of
+   substreams for a single program.
+
+   / --------- program set for frame 0 ------- \
+   :                                           :
+   +-------------+-------------+-   -+-------------+-------------+-
+   | Program(1)  | Program(1)  |     | Program(1)  | Program(1)  |
+   | Independent | Dependent   | ... | Dependent   | Independent | ...
+   | Substream   | Substream(0)|     | Substream(n)| Substream   |
+   | Frame 0     | Frame 0     |     | Frame 0     | Frame 1     |
+   +-------------+-------------+-   -+-------------+-------------+-
+
+   Figure 3. Interleaving of multiple substreams in an E-AC-3 program
+
+2.1.3. Frame Sets
+
+   A further logical organization of the E-AC-3 bit stream is applied to
+   facilitate conversion of E-AC-3 bit streams to AC-3 bit streams. In
+   this organization, the frames carrying six consecutive audio blocks
+   are treated as a group, called a 'frame set', regardless of the
+   number of frames needed to carry six audio blocks. This grouping
+   extends across all programs and substreams that cover the time period
+   of the six blocks. Since E-AC-3 frames may carry one, two, three, or
+   six blocks, a frame set will consist of six, three, two, or one
+   frames. AC-3 frames always carry six blocks, so the frame set
+   provides framing synchronization between an E-AC-3 bit stream and an
+   AC-3 bit stream. Metadata that indicates the alignment is carried in
+   the first frame (which will be part of an independent substream) of
+   each frame set in an E-AC-3 stream. This first frame can be
+   identified by a parameter in the BSI field of the bit stream: the
+   Converter Synchronization flag (convsync, in Section E.1.3.1.34 of
+   [ETSI]) is set to true (1).
+
+3. RTP E-AC-3 Header Fields
+
+   The RTP header is defined in the RTP specification [RFC3550]. This
+   section defines how a number of fields in the header are used.
+
+   o  Payload Type (PT): The assignment of an RTP payload type for this
+      packet format is outside the scope of this document; it is
+      specified by the RTP profile under which this payload format is
+      used, or signaled dynamically out-of-band (e.g., using SDP).
+
+   o  Marker (M) bit: The M bit is set to one to indicate that the RTP
+      packet payload contains at least one complete E-AC-3 frame or
+      contains the final fragment of an E-AC-3 frame.
+
+   o  Extension (X) bit: Defined by the RTP profile used.
+
+   o  Timestamp: A 32-bit word that corresponds to the sampling instant
+      for the first E-AC-3 frame in the RTP packet. Packets containing
+      fragments of the same frame MUST have the same timestamp. The
+      timestamp of the first RTP packet sent SHOULD be selected at
+      random; thereafter, it increases linearly according to the number
+      of samples included in each frame. Note that the number of
+      samples in a frame depends on the number of blocks in the frame,
+      with 256 samples in each block. Also note that more than one
+      frame might correspond to the same time period when multiple
+      channel configurations or programs are present. If these frames
+      occupy multiple packets, it is possible that the resulting packets
+      will have the same timestamp value.
+
+4. RTP E-AC-3 Payload Format
+
+   This payload format is defined for E-AC-3, as defined in Annex E of
+   [ETSI]. Note that E-AC-3 decoders are required to be capable of
+   decoding AC-3 bit streams, so a receiver capable of receiving the
+   E-AC-3 payload format defined in this document MUST also receive the
+   payload format for AC-3 defined in [RFC4184].
+
+   According to [RFC2736], RTP payload formats should contain an
+   integral number of application data units (ADUs). The E-AC-3 frame
+   corresponds to an ADU in the context of this payload format. Each
+   RTP payload MUST start with the two-byte payload specific header
+   followed by an integral number of complete E-AC-3 frames, or a single
+   fragment of an E-AC-3 frame.
+
+   If an E-AC-3 frame exceeds the MTU for a network, it SHOULD be
+   fragmented for transmission within an RTP packet.
+   Section 4.2
+   provides guidelines for creating frame fragments.
+
+4.1. Payload Specific Header
+
+   There is a two-octet Payload header at the beginning of each payload.
+   Each E-AC-3 RTP payload MUST begin with the following Payload header.
+
+    0                   1
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |    MBZ      |F|       NF      |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+   Figure 4. E-AC-3 RTP Payload header
+
+   o  Must Be Zero (MBZ): Bits marked MBZ SHALL be set to the value zero
+      and SHALL be ignored by receivers. The bits are reserved for
+      future extensions.
+
+   o  Frame Type (F): This one-bit field indicates the type of frame(s)
+      present in the payload. It takes the following values: 0 - One
+      or more complete frames. 1 - Fragment of frame. (Note that the M
+      bit in the RTP header is set for the final fragment.)
+
+   o  Number of frames/fragments (NF): An 8-bit field whose meaning
+      depends on the Frame Type (F) in this payload. For complete
+      frames (F of 0), it is used to indicate the number of E-AC-3
+      frames in the RTP payload. For frame fragments (F of 1), it is
+      used to indicate the number of fragments (and therefore packets)
+      that make up the current frame. NF MUST be identical for packets
+      containing fragments of the same frame.
+
+   When receiving E-AC-3 payloads with F = 0 and more than a single
+   frame (NF > 1), a receiver needs to use the "frmsiz" field in the BSI
+   header in each E-AC-3 frame to determine the frame's length if the
+   receiver needs to determine the boundary of the next frame. Note
+   that the frame length varies from frame to frame in some
+   circumstances.
+
+4.2. Fragmentation of E-AC-3 Frames
+
+   The size of an E-AC-3 frame is signaled in the Frame Size (frmsiz)
+   field in a frame's BSI header. The value of this field is one less
+   than the number of 16-bit words in the frame.
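As a rough illustration of the frmsiz rule just stated and of the two-octet payload header in Figure 4, the following Python sketch converts frmsiz to a frame length in bytes, counts the fragments needed for a given per-packet budget, and packs and parses the payload header. It is not part of the RFC text. All function names are hypothetical, and max_fragment_bytes is an assumed caller-chosen budget (for example, the path MTU minus IP, UDP, RTP, and payload header overhead). Extracting frmsiz from the BSI is out of scope here because its bit position is specified in [ETSI], not in this document.

```python
def frame_size_bytes(frmsiz: int) -> int:
    """Frame length in bytes; frmsiz is one less than the number of 16-bit words."""
    if frmsiz < 0:
        raise ValueError("frmsiz must be non-negative")
    return (frmsiz + 1) * 2


def fragments_needed(frame_bytes: int, max_fragment_bytes: int) -> int:
    """Number of fragments (and packets) for one frame; this becomes NF when F = 1."""
    if frame_bytes <= 0 or max_fragment_bytes <= 0:
        raise ValueError("sizes must be positive")
    return -(-frame_bytes // max_fragment_bytes)  # ceiling division


def pack_payload_header(is_fragment: bool, nf: int) -> bytes:
    """Build the header: 7 MBZ bits (zero), the F bit, then the 8-bit NF field."""
    if not 0 <= nf <= 0xFF:
        raise ValueError("NF must fit in 8 bits")
    return bytes([1 if is_fragment else 0, nf])


def parse_payload_header(payload: bytes) -> tuple:
    """Return (is_fragment, nf); the MBZ bits are ignored, as receivers must do."""
    if len(payload) < 2:
        raise ValueError("payload shorter than the two-octet header")
    return bool(payload[0] & 0x01), payload[1]
```

For example, a frame with frmsiz = 767 is 1536 bytes long; with an assumed budget of 1000 bytes per fragment it would be sent as two packets, each carrying the header bytes 0x01 0x02.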
If the size of an + E-AC-3 frame exceeds the MTU size, the frame SHOULD be fragmented at + the RTP level. The fragmentation MAY be performed at any byte + boundary in the frame. RTP packets containing fragments of the same + E-AC-3 frame SHALL be sent in consecutive order, from first to last + fragment. This enables a receiver to assemble the fragments in the + correct order. + +4.3. Concatenation of E-AC-3 Frames + + There are cases where E-AC-3 frame sizes are smaller than the MTU + size and it is advantageous to include multiple frames in a packet. + + + +Link Standards Track [Page 9] + +RFC 4598 RTP Payload Format for E-AC-3-Audio July 2006 + + + It is useful to take into account the logical arrangement of the bit + stream into program sets and frame sets to constrain the effects of + the loss of a packet. It is desirable for a complete program set or + a complete frame set to be included in one packet. Also, it is + undesirable for frames from more than one program set or frame set to + be in the same packet, unless the sets are complete. In this way, + the loss of a packet is kept from causing the contents of another + packet to be unusable. + + Frames from more than one program set SHOULD NOT be included in the + same packet unless all program sets in the packet are complete. + Frames from more than one frame set SHOULD NOT be included in the + same packet unless all frame sets in the packet are complete. + +4.4. Carriage of AC-3 Frames + + The E-AC-3 specification [ETSI] requires that E-AC-3 decoders be + capable of decoding AC-3 frames. That specification also supports + carriage of AC-3 frames in an E-AC-3 bit stream. Due to differences + between E-AC-3 and AC-3 frames, there are restrictions placed on the + use of AC-3 frames: they are only used for the independent substream + of the first (or only) program in an E-AC-3 bit stream. 
Note that + carriage of only E-AC-3 frames, only AC-3 frames, and a mixture of + E-AC-3 and AC-3 frames are all legal configurations. It is legal to + change among the configurations in a bit stream. The AC-3 frame + format is described in [RFC4184] and specified in [ETSI]. + +5. Types and Names + +5.1. Media Type Registration + + This registration uses the template defined in [RFC4288] and follows + [RFC3555]. + + To: ietf-types@iana.org + Subject: Registration of media type audio/eac3 + + Type name: audio + + Subtype name: eac3 + + Required parameter: + + o rate: The RTP timestamp clock rate that is equal to the audio + sampling rate. Permitted rates are 32000, 44100, and 48000. + + + + + + +Link Standards Track [Page 10] + +RFC 4598 RTP Payload Format for E-AC-3-Audio July 2006 + + + Optional parameter: + + o bitStreamConfig: The configuration of programs and substreams in + the bit stream, expressed as a sequence of ASCII characters. This + parameter can serve two purposes. First, during the creation of a + session, the bitStreamConfig parameter might be used to negotiate + a match between the requirements of a bit stream and the + capabilities of a receiver to avoid using network bandwidth for + data that cannot be used. Second, it makes the configuration of + the bit stream explicit to the receiver so that whenever a packet + is lost, the receiver can identify which kind of frame(s) has been + lost to aid error mitigation. + + The format for the value for this parameter is to represent each + substream of the bit stream by a single character indicating its + type, immediately followed by the number of audio channels + resulting if a frame of that substream (plus any other required + substreams) is decoded. Note that even though Low-Frequency + Effects (LFE) channels are often described as "fractional" + channels (e.g., the ".1" in 5.1), for this parameter, an LFE + channel is counted as one (e.g., a 5.1-channel configuration is + indicated as 6). 
The configuration of the bit stream MUST match + the value of this parameter for the duration of the session. + + Allowed values for the substream type are as follows: + + i - Independent substream. + d - Dependent substream. + + The E-AC-3 specification [ETSI] defines which configurations of bit + streams are legal, which constrains the values the bitStreamConfig + parameter will take. Each program starts with, and contains exactly + one, independent substream ('i'). Each independent substream is + followed by between 0 and 8 dependent substreams ('d'), which belong + to the same program. See Section 2.1.2 for more discussion of + programs and substreams. + + For example, consider a bit stream containing two programs: + + * the first program with + + + a six-channel independent substream + + a dependent substream containing the additional channels needed + for eight channels + + a second dependent substream containing the further channels + needed for 14 channels + + + + + +Link Standards Track [Page 11] + +RFC 4598 RTP Payload Format for E-AC-3-Audio July 2006 + + + * along with a second program with + + + another six-channel independent substream + + a dependent substream containing the additional channels needed + for eight channels + + Then the configuration of the bit stream is indicated as follows: + + bitStreamConfig = i6d8d14i6d8 + + When the bitStreamConfig parameter is being used in an offer/answer + exchange, zero (0) for the number of channels for a substream in an + answer is used to indicate a substream that the answerer desires not + to receive. + + Encoding considerations: + + This media type is framed and contains binary data. + + Security considerations: + + See Section 6 of RFC 4598. 
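The bitStreamConfig syntax described above for the optional parameter (a substream type character, 'i' or 'd', immediately followed by a channel count) can be sketched in Python as follows. This is an illustrative reading of the text, not a normative grammar. The function name parse_bit_stream_config and the list-of-programs return shape are assumptions made for this example. The checks reflect only the limits stated in this document: each program starts with exactly one independent substream, followed by at most eight dependent substreams, with at most eight programs in a bit stream.

```python
import re

# One substream token: type character followed by a decimal channel count.
_TOKEN = re.compile(r"([id])(\d+)")


def parse_bit_stream_config(value: str) -> list:
    """Split a bitStreamConfig value into programs of (type, channels) tuples."""
    programs = []
    pos = 0
    while pos < len(value):
        match = _TOKEN.match(value, pos)
        if match is None:
            raise ValueError(f"unexpected text at offset {pos}: {value[pos:]!r}")
        kind, channels = match.group(1), int(match.group(2))
        if kind == "i":
            # An independent substream always opens a new program.
            if len(programs) >= 8:
                raise ValueError("more than eight programs")
            programs.append([(kind, channels)])
        else:
            if not programs:
                raise ValueError("dependent substream before any independent substream")
            if len(programs[-1]) - 1 >= 8:
                raise ValueError("more than eight dependent substreams in a program")
            programs[-1].append((kind, channels))
        pos = match.end()
    if not programs:
        raise ValueError("empty bitStreamConfig")
    return programs
```

Applied to the example value i6d8d14i6d8, the sketch yields two programs: the first with substreams of 6, 8, and 14 channels, the second with 6 and 8. A channel count of 0, as used in an answer to decline a substream, is accepted and returned unchanged.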
+ + Interoperability considerations: + + To maintain interoperability with AC-3-capable end-points, in cases + where negotiation is possible, an E-AC-3 end-point SHOULD declare + itself also as AC-3 capable (i.e., supporting also "audio/ac3" as + specified in RFC 4184 [RFC4184]). Note that all E-AC-3 end-points + are required to be AC-3 capable. + + Published specification: + + RFC 4598 and ETSI TS 102.366 [ETSI]. + + Applications that use this media type: + + Multichannel audio compression of audio, and audio for video. + + Additional information: + + Magic number(s): The first two octets of an E-AC-3 frame are + always the synchronization word, which has the hex value + 0x0B77. + + Person & email address to contact for further information: + + Brian Link <bdl@dolby.com> IETF AVT working group. + + + +Link Standards Track [Page 12] + +RFC 4598 RTP Payload Format for E-AC-3-Audio July 2006 + + + Intended usage: + + COMMON + + Restrictions on usage: + + This media type depends on RTP framing, and hence is only defined + for transfer via RTP [RFC3550]. Transport within other framing + protocols is not defined at this time. + + Author/Change controller: + + IETF Audio/Video Transport Working Group delegated from the IESG. + +5.2. SDP Usage + + The information carried in the media type specification has a + specific mapping to fields in the Session Description Protocol (SDP) + [RFC2327], which is commonly used to describe RTP sessions. When SDP + is used to specify sessions employing E-AC-3, the mapping is as + follows: + + o The Media type ("audio") goes in SDP "m=" as the media name. + + o The Media subtype ("eac3") goes in SDP "a=rtpmap" as the encoding + name. + + o The required parameter "rate" also goes in "a=rtpmap" as the clock + rate. (The optional "channels" rtpmap encoding parameter is not + used. Instead, the information is included in the optional + parameter bitStreamConfig.) 
+
+   o  The optional parameter "bitStreamConfig" goes in the SDP "a=fmtp"
+      attribute.
+
+   The following is an example of the SDP data for E-AC-3:
+
+      m=audio 49111 RTP/AVP 100
+      a=rtpmap:100 eac3/48000
+      a=fmtp:100 bitStreamConfig i6d8d14i6d8
+
+   Certain considerations are needed when SDP is used to perform
+   offer/answer exchanges [RFC3264].
+
+   o  The "rate" is a symmetric parameter, and the answer MUST use the
+      same value or the answerer removes the payload type.
+
+   o  The "bitStreamConfig" parameter is declarative and indicates, for
+      sendonly, the intended arrangement of substreams in the bit
+      stream, along with the channel configuration, to transmit, and for
+      recvonly or sendrecv, the desired bit stream arrangement and
+      channel configuration to receive.  The format of the
+      bitStreamConfig value in an answer MAY differ from the offer value
+      by replacing the number of channels for any undesired substreams
+      with '0'.  It is valid to zero out dependent substreams containing
+      undesired channel configurations and to zero out all the
+      substreams of an undesired program.  Then the sender MAY reoffer
+      the stream in the receiver's preferred configuration if it is
+      capable of providing that configuration.  Note that all receivers
+      are capable of receiving, and all decoders are capable of
+      decoding, any of the legal bit stream configurations, so the
+      parameter exchange is not needed for interoperability.  The
+      parameter exchange might be used to help optimize the transmission
+      to the number of programs or channels the receiver requests.
+
+   o  Since an AC-3 bit stream is a special case of an E-AC-3 bit
+      stream, it is permissible for an AC-3 bit stream to be carried in
+      the E-AC-3 payload format.  To ensure interoperability with
+      receivers that support the AC-3 payload format but not the E-AC-3
+      payload format, a sender that desires to send an AC-3 bit stream
+      in the E-AC-3 payload format SHOULD also offer the session in the
+      AC-3 payload format by including payload types for both media
+      subtypes: 'ac3' and 'eac3'.
+
+6.  Security Considerations
+
+   The payload format described in this document is subject to the
+   security considerations defined in RTP [RFC3550] and in any
+   applicable RTP profile (e.g., [RFC3551]).  To protect the user's
+   privacy and any copyrighted material, confidentiality protection
+   would have to be applied.  To also protect against modification by
+   intermediate entities and ensure the authenticity of the stream,
+   integrity protection and authentication would be required.
+   Confidentiality, integrity protection, and authentication have to be
+   solved by a mechanism external to this payload format, for example,
+   Secure Real-time Transport Protocol (SRTP) [RFC3711].
+
+   The E-AC-3 format is designed so that the validity of data frames can
+   be determined by decoders.  The required decoder response to a
+   malformed frame is to discard the malformed data and conceal the
+   errors in the audio output until a valid frame is detected and
+   decoded.  This is expected to prevent crashes and other abnormal
+   decoder behavior in response to errors or attacks.
+
+7.  Congestion Control
+
+   The general congestion control considerations for transporting RTP
+   data apply to E-AC-3 audio over RTP as well; see RTP [RFC3550], and
+   any applicable RTP profile (e.g., [RFC3551]).
+
+   E-AC-3 is a variable bit rate coding system so it is possible to use
+   a variety of techniques to adapt to network bandwidth.
+
+8.  IANA Considerations
+
+   The IANA has registered a new media subtype for E-AC-3 (see Section
+   5).
+
+9.  References
+
+9.1.  Normative References
+
+   [ETSI]    ETSI, "Digital Audio Compression (AC-3, Enhanced AC-3)
+             Standard", TS 102 366, February 2005.
+
+   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+             Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+   [RFC4184] Link, B., Hager, T., and J. Flaks, "RTP Payload Format for
+             AC-3 Audio", RFC 4184, October 2005.
+
+   [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
+             Jacobson, "RTP: A Transport Protocol for Real-Time
+             Applications", STD 64, RFC 3550, July 2003.
+
+   [RFC4288] Freed, N. and J. Klensin, "Media Type Specifications and
+             Registration Procedures", BCP 13, RFC 4288, December 2005.
+
+   [RFC3555] Casner, S. and P. Hoschka, "MIME Type Registration of RTP
+             Payload Formats", RFC 3555, July 2003.
+
+   [RFC2327] Handley, M. and V. Jacobson, "SDP: Session Description
+             Protocol", RFC 2327, April 1998.
+
+   [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
+             with Session Description Protocol (SDP)", RFC 3264, June
+             2002.
+
+9.2.  Informative References
+
+   [2004AES] Fielder, L., Andersen, R., Crockett, B., Davidson, G.,
+             Davis, M., Turner, S., Vinton, M., and P. Williams,
+             "Introduction to Dolby Digital Plus, an Enhancement to the
+             Dolby Digital Coding System", Preprint 6196, Presented at
+             the 117th Convention of the Audio Engineering Society,
+             October 2004.
+
+   [1994AES] Todd, C., Davidson, G., Davis, M., Fielder, L., Link, B.,
+             and S. Vernon, "AC-3: Flexible Perceptual Coding for Audio
+             Transmission and Storage", Preprint 3796, Presented at the
+             96th Convention of the Audio Engineering Society, May
+             1994.
+
+   [RFC2736] Handley, M. and C. Perkins, "Guidelines for Writers of RTP
+             Payload Format Specifications", BCP 36, RFC 2736, December
+             1999.
+
+   [RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
+             Video Conferences with Minimal Control", STD 65, RFC 3551,
+             July 2003.
+
+   [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
+             Norrman, "The Secure Real-time Transport Protocol (SRTP)",
+             RFC 3711, March 2004.
+
+Author's Address
+
+   Brian Link
+   Dolby Laboratories
+   100 Potrero Ave.
+   San Francisco, CA 94103
+   US
+
+   Phone: +1 415 558 0200
+   EMail: bdl@dolby.com
+
+Full Copyright Statement
+
+   Copyright (C) The Internet Society (2006).
+
+   This document is subject to the rights, licenses and restrictions
+   contained in BCP 78, and except as set forth therein, the authors
+   retain all their rights.
+
+   This document and the information contained herein are provided on an
+   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
+   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
+   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
+   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
+   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
+   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Intellectual Property
+
+   The IETF takes no position regarding the validity or scope of any
+   Intellectual Property Rights or other rights that might be claimed to
+   pertain to the implementation or use of the technology described in
+   this document or the extent to which any license under such rights
+   might or might not be available; nor does it represent that it has
+   made any independent effort to identify any such rights.  Information
+   on the procedures with respect to rights in RFC documents can be
+   found in BCP 78 and BCP 79.
+
+   Copies of IPR disclosures made to the IETF Secretariat and any
+   assurances of licenses to be made available, or the result of an
+   attempt made to obtain a general license or permission for the use of
+   such proprietary rights by implementers or users of this
+   specification can be obtained from the IETF on-line IPR repository at
+   http://www.ietf.org/ipr.
+
+   The IETF invites any interested party to bring to its attention any
+   copyrights, patents or patent applications, or other proprietary
+   rights that may cover technology that may be required to implement
+   this standard.  Please address the information to the IETF at
+   ietf-ipr@ietf.org.
+
+Acknowledgement
+
+   Funding for the RFC Editor function is provided by the IETF
+   Administrative Support Activity (IASA).