Diffstat (limited to 'doc/rfc/rfc6190.txt')
-rw-r--r-- doc/rfc/rfc6190.txt 5603
1 files changed, 5603 insertions, 0 deletions
diff --git a/doc/rfc/rfc6190.txt b/doc/rfc/rfc6190.txt
new file mode 100644
index 0000000..152a8d0
--- /dev/null
+++ b/doc/rfc/rfc6190.txt
@@ -0,0 +1,5603 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF)                         S. Wenger
+Request for Comments: 6190                                   Independent
+Category: Standards Track                                     Y.-K. Wang
+ISSN: 2070-1721                                      Huawei Technologies
+                                                              T. Schierl
+                                                          Fraunhofer HHI
+                                                        A. Eleftheriadis
+                                                                   Vidyo
+                                                                May 2011
+
+
+ RTP Payload Format for Scalable Video Coding
+
+Abstract
+
+ This memo describes an RTP payload format for Scalable Video Coding
+ (SVC) as defined in Annex G of ITU-T Recommendation H.264, which is
+ technically identical to Amendment 3 of ISO/IEC International
+ Standard 14496-10. The RTP payload format allows for packetization
+ of one or more Network Abstraction Layer (NAL) units in each RTP
+ packet payload, as well as fragmentation of a NAL unit in multiple
+ RTP packets. Furthermore, it supports transmission of an SVC stream
+ over a single as well as multiple RTP sessions. The payload format
+ defines a new media subtype name "H264-SVC", but is still backward
+ compatible with RFC 6184 since the base layer, when encapsulated in its
+ own RTP stream, must use the H.264 media subtype name ("H264") and
+ the packetization method specified in RFC 6184. The payload format
+ has wide applicability in videoconferencing, Internet video
+ streaming, and high-bitrate entertainment-quality video, among
+ others.
+
+Status of This Memo
+
+ This is an Internet Standards Track document.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Further information on
+ Internet Standards is available in Section 2 of RFC 5741.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ http://www.rfc-editor.org/info/rfc6190.
+
+
+
+
+
+
+
+Wenger, et al. Standards Track [Page 1]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+Copyright Notice
+
+ Copyright (c) 2011 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (http://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Simplified BSD License text as described in Section 4.e of
+ the Trust Legal Provisions and are provided without warranty as
+ described in the Simplified BSD License.
+
+ This document may contain material from IETF Documents or IETF
+ Contributions published or made publicly available before November
+ 10, 2008. The person(s) controlling the copyright in some of this
+ material may not have granted the IETF Trust the right to allow
+ modifications of such material outside the IETF Standards Process.
+ Without obtaining an adequate license from the person(s) controlling
+ the copyright in such materials, this document may not be modified
+ outside the IETF Standards Process, and derivative works of it may
+ not be created outside the IETF Standards Process, except to format
+ it for publication as an RFC or to translate it into languages other
+ than English.
+
+Table of Contents
+
+ 1. Introduction ....................................................5
+ 1.1. The SVC Codec ..............................................6
+ 1.1.1. Overview ............................................6
+ 1.1.2. Parameter Sets ......................................8
+ 1.1.3. NAL Unit Header .....................................9
+ 1.2. Overview of the Payload Format ............................12
+ 1.2.1. Design Principles ..................................12
+ 1.2.2. Transmission Modes and Packetization Modes .........13
+ 1.2.3. New Payload Structures .............................15
+ 2. Conventions ....................................................16
+ 3. Definitions and Abbreviations ..................................16
+ 3.1. Definitions ...............................................16
+ 3.1.1. Definitions from the SVC Specification .............16
+ 3.1.2. Definitions Specific to This Memo ..................18
+ 3.2. Abbreviations .............................................22
+ 4. RTP Payload Format .............................................23
+ 4.1. RTP Header Usage ..........................................23
+ 4.2. NAL Unit Extension and Header Usage .......................23
+ 4.2.1. NAL Unit Extension .................................23
+ 4.2.2. NAL Unit Header Usage ..............................24
+ 4.3. Payload Structures ........................................25
+ 4.4. Transmission Modes ........................................28
+ 4.5. Packetization Modes .......................................28
+ 4.5.1. Packetization Modes for Single-Session
+ Transmission .......................................28
+ 4.5.2. Packetization Modes for Multi-Session
+ Transmission .......................................29
+ 4.6. Single NAL Unit Packets ...................................32
+ 4.7. Aggregation Packets .......................................33
+ 4.7.1. Non-Interleaved Multi-Time Aggregation
+ Packets (NI-MTAPs) .................................33
+ 4.8. Fragmentation Units (FUs) .................................35
+ 4.9. Payload Content Scalability Information (PACSI) NAL Unit ..35
+ 4.10. Empty NAL unit ...........................................43
+ 4.11. Decoding Order Number (DON) ..............................43
+ 4.11.1. Cross-Session DON (CS-DON) for
+ Multi-Session Transmission ........................43
+ 5. Packetization Rules ............................................45
+ 5.1. Packetization Rules for Single-Session Transmission .......45
+ 5.2. Packetization Rules for Multi-Session Transmission ........46
+ 5.2.1. NI-T/NI-TC Packetization Rules .....................47
+ 5.2.2. NI-C/NI-TC Packetization Rules .....................49
+ 5.2.3. I-C Packetization Rules ............................50
+ 5.2.4. Packetization Rules for Non-VCL NAL Units ..........50
+ 5.2.5. Packetization Rules for Prefix NAL Units ...........51
+ 6. De-Packetization Process .......................................51
+ 6.1. De-Packetization Process for Single-Session Transmission ..51
+ 6.2. De-Packetization Process for Multi-Session Transmission ...51
+ 6.2.1. Decoding Order Recovery for the NI-T and
+ NI-TC Modes ........................................52
+ 6.2.1.1. Informative Algorithm for NI-T
+ Decoding Order Recovery within
+ an Access Unit ............................55
+ 6.2.2. Decoding Order Recovery for the NI-C,
+ NI-TC, and I-C Modes ...............................57
+ 7. Payload Format Parameters ......................................59
+ 7.1. Media Type Registration ...................................60
+ 7.2. SDP Parameters ............................................75
+ 7.2.1. Mapping of Payload Type Parameters to SDP ..........75
+ 7.2.2. Usage with the SDP Offer/Answer Model ..............76
+ 7.2.3. Dependency Signaling in Multi-Session
+ Transmission .......................................84
+ 7.2.4. Usage in Declarative Session Descriptions ..........85
+ 7.3. Examples ..................................................86
+ 7.3.1. Example for Offering a Single SVC Session ..........86
+ 7.3.2. Example for Offering a Single SVC Session Using
+ scalable-layer-id ..................................87
+ 7.3.3. Example for Offering Multiple Sessions in MST ......87
+ 7.3.4. Example for Offering Multiple Sessions in
+ MST Including Operation with Answerer Using
+ scalable-layer-id ..................................89
+ 7.3.5. Example for Negotiating an SVC Stream with
+ a Constrained Base Layer in SST ....................90
+ 7.4. Parameter Set Considerations ..............................91
+ 8. Security Considerations ........................................91
+ 9. Congestion Control .............................................92
+ 10. IANA Considerations ...........................................93
+ 11. Informative Appendix: Application Examples ....................93
+ 11.1. Introduction .............................................93
+ 11.2. Layered Multicast ........................................93
+ 11.3. Streaming ................................................94
+ 11.4. Videoconferencing (Unicast to MANE, Unicast to
+ Endpoints) ...............................................95
+ 11.5. Mobile TV (Multicast to MANE, Unicast to Endpoint) .......96
+ 12. Acknowledgements ..............................................97
+ 13. References ....................................................97
+ 13.1. Normative References .....................................97
+ 13.2. Informative References ...................................98
+
+1. Introduction
+
+ This memo specifies an RTP [RFC3550] payload format for the Scalable
+ Video Coding (SVC) extension of the H.264/AVC video coding standard.
+ SVC is specified in Amendment 3 to ISO/IEC 14496 Part 10
+ [ISO/IEC14496-10] and equivalently in Annex G of ITU-T Rec. H.264
+ [H.264]. In this memo, unless explicitly stated otherwise,
+ "H.264/AVC" refers to the specification of [H.264] excluding Annex G.
+
+ SVC covers the entire application range of H.264/AVC, from low-
+ bitrate mobile applications, to High-Definition Television (HDTV)
+ broadcasting, and even Digital Cinema that requires nearly lossless
+ coding and hundreds of megabits per second. The scalability features
+ that SVC adds to H.264/AVC enable several system-level
+ functionalities related to the ability of a system to adapt the
+ signal to different system conditions with no or minimal processing.
+ The adaptation relates both to the capabilities of potentially
+ heterogeneous receivers (differing in screen resolution, processing
+ speed, etc.), and to differing or time-varying network conditions.
+ The adaptation can be performed at the source, the destination, or in
+ intermediate media-aware network elements (MANEs). The payload
+ format specified in this memo exposes these system-level
+ functionalities so that system designers can take direct advantage of
+ these features.
+
+ Informative note: Since SVC streams contain, by design, a sub-
+ stream that is compliant with H.264/AVC, it is trivial for a MANE
+ to filter the stream so that all SVC-specific information is
+ removed. This memo, in fact, defines a media type parameter
+ (sprop-avc-ready, Section 7.2) that indicates whether or not the
+ stream can be converted to one compliant with [RFC6184] by
+ eliminating RTP packets, and rewriting RTP Control Protocol (RTCP)
+ to match the changes to the RTP packet stream as specified in
+ Section 7 of [RFC3550].
+
+ This memo defines two basic modes for transmission of SVC data,
+ single-session transmission (SST) and multi-session transmission
+ (MST). In SST, a single RTP session is used for the transmission of
+ all scalability layers comprising an SVC bitstream; in MST, the
+ scalability layers are transported on different RTP sessions. In
+ SST, packetization is a straightforward extension of [RFC6184]. For
+ MST, four different modes are defined in this memo. They differ on
+ whether or not they allow interleaving, i.e., transmitting Network
+ Abstraction Layer (NAL) units in an order different from the decoding
+ order, and by the technique used to effect inter-session NAL unit
+ decoding order recovery. Decoding order recovery is performed using
+ either inter-session timestamp alignment [RFC3550] or cross-session
+ decoding order numbers (CS-DONs). One of the MST modes supports both
+
+ decoding order recovery techniques, so that receivers can select
+ their preferred technique. More details can be found in Section
+ 1.2.2.
+
+ This memo further defines three new NAL unit types. The first type
+ is the payload content scalability information (PACSI) NAL unit,
+ which is used to provide an informative summary of the scalability
+ information of the data contained in an RTP packet, as well as
+ ancillary data (e.g., CS-DON values). The second and third new NAL
+ unit types are the empty NAL unit and the non-interleaved multi-time
+ aggregation packet (NI-MTAP) NAL unit. The empty NAL unit is used to
+ ensure inter-session timestamp alignment required for decoding order
+ recovery in MST. The NI-MTAP is used as a new payload structure
+ allowing the grouping of NAL units of different time instances in
+ decoding order. More details about the new packet structures can be
+ found in Section 1.2.3.
+
+ This memo also defines the signaling support for SVC transport over
+ RTP, including a new media subtype name (H264-SVC).
+
+ A non-normative overview of the SVC codec and the payload is given in
+ the remainder of this section.
+
+1.1. The SVC Codec
+
+1.1.1. Overview
+
+ SVC defines a coded video representation in which a given bitstream
+ offers representations of the source material at different levels of
+ fidelity (hence the term "scalable"). Scalable video coding
+ bitstreams, or scalable bitstreams, are constructed in a pyramidal
+ fashion: the coding process creates bitstream components that improve
+ the fidelity of hierarchically lower components.
+
+ The fidelity dimensions offered by SVC are spatial (picture size),
+ quality (or Signal-to-Noise Ratio (SNR)), and temporal (pictures per
+ second). Bitstream components associated with a given level of
+ spatial, quality, and temporal fidelity are identified using
+ corresponding parameters in the bitstream: dependency_id, quality_id,
+ and temporal_id (see also Section 1.1.3). The fidelity identifiers
+ have integer values, where higher values designate components that
+ are higher in the hierarchy. It is noted that SVC offers significant
+ flexibility in terms of how an encoder may choose to structure the
+ dependencies between the various components. Decoding of a
+ particular component requires the availability of all the components
+ it depends upon, either directly, or indirectly. An operation point
+ of an SVC bitstream consists of the bitstream components required to
+ be able to decode a particular dependency_id, quality_id, and
+ temporal_id combination.
+
+ The term "layer" is used in various contexts in this memo. For
+ example, in the terms "Video Coding Layer" and "Network Abstraction
+ Layer" it refers to conceptual organization levels. When referring
+ to bitstream syntax elements such as block layer or macroblock layer,
+ it refers to hierarchical bitstream structure levels. When used in
+ the context of bitstream scalability, e.g., "AVC base layer", it
+ refers to a level of representation fidelity of the source signal
+ with a specific set of NAL units included. The correct
+ interpretation is supported by providing the appropriate context.
+
+ SVC maintains the bitstream organization introduced in H.264/AVC.
+ Specifically, all bitstream components are encapsulated in Network
+ Abstraction Layer (NAL) units, which are organized as Access Units
+ (AUs). An AU is associated with a single sampling instance in time.
+ A subset of the NAL unit types correspond to the Video Coding Layer
+ (VCL), and contain the coded picture data associated with the source
+ content. Non-VCL NAL units carry ancillary data that may be
+ necessary for decoding (e.g., parameter sets as explained below) or
+ that facilitate certain system operations but are not needed by the
+ decoding process itself. Coded picture data at the various fidelity
+ dimensions are organized in slices. Within one AU, a coded picture
+ of an operation point consists of all the coded slices required for
+ decoding up to the particular combination of dependency_id and
+ quality_id values at the time instance corresponding to the AU.
+
+ It is noted that the concept of temporal scalability is already
+ present in H.264/AVC, as profiles defined in Annex A of [H.264]
+ already support it. Specifically, in H.264/AVC, the concept of sub-
+ sequences has been introduced to allow optional use of temporal
+ layers through Supplemental Enhancement Information (SEI) messages.
+ SVC extends this approach by exposing the temporal scalability
+ information using the temporal_id parameter, alongside (and unified
+ with) the dependency_id and quality_id values that are used for
+ spatial and quality scalability, respectively. For coded picture
+ data defined in Annex G of [H.264], this is accomplished by using a
+ new type of NAL unit, namely, coded slice in scalable extension NAL
+ unit (type 20), where the fidelity parameters are part of its header.
+ For coded picture data that follow H.264/AVC, and to ensure
+ compatibility with existing H.264/AVC decoders, another new type of
+ NAL unit, namely, prefix NAL unit (type 14), has been defined to
+ carry this header information. SVC additionally specifies a third
+ new type of NAL unit, namely, subset sequence parameter set NAL unit
+ (type 15), to contain sequence parameter set information for quality
+ and spatial enhancement layers. All three of these newly specified NAL
+ unit types (14, 15, and 20) are among those reserved in H.264/AVC and
+ are to be ignored by decoders conforming to one or more of the
+ profiles specified in Annex A of [H.264].
+
+ Within an AU, the VCL NAL units associated with a given dependency_id
+ and quality_id are referred to as a "layer representation". The
+ layer representation corresponding to the lowest values of
+ dependency_id and quality_id (i.e., zero for both) is compliant by
+ design with H.264/AVC. The set of VCL and associated non-VCL NAL units
+ across all AUs in a bitstream associated with a particular
+ combination of values of dependency_id and quality_id, and regardless
+ of the value of temporal_id, is conceptually a scalable layer. For
+ backward compatibility with H.264/AVC, it is important to
+ differentiate, however, whether or not SVC-specific NAL units are
+ present in a given bitstream. This is particularly important for the
+ lowest fidelity values in terms of dependency_id and quality_id (zero
+ for both), as the corresponding VCL data are compliant with
+ H.264/AVC, and may or may not be accompanied by associated prefix NAL
+ units. This memo therefore uses the term "AVC base layer" to
+ designate the layer that does not contain SVC-specific NAL units, and
+ "SVC base layer" to designate the same layer but with the addition of
+ the associated SVC prefix NAL units. Note that the SVC specification
+ uses the term "base layer" for what in this memo will be referred to
+ as "AVC base layer". Similarly, it is also important to be able to
+ differentiate, within a layer, the temporal fidelity components it
+ contains. This memo uses the term "T0" to indicate, within a
+ particular layer, the subset that contains the NAL units associated
+ with temporal_id equal to 0.
+
+ SNR scalability in SVC is offered in two different ways. In what is
+ called coarse-grain scalability (CGS), scalability is provided by
+ including or excluding a complete layer when decoding a particular
+ bitstream. In contrast, in medium-grain scalability (MGS),
+ scalability is provided by selectively omitting the decoding of
+ specific NAL units belonging to MGS layers. The selection of the NAL
+ units to omit can be based on fixed-length fields present in the NAL
+ unit header (see also Sections 1.1.3 and 4.2).
+
+1.1.2. Parameter Sets
+
+ SVC maintains the parameter sets concept in H.264/AVC and introduces
+ a new type of sequence parameter set, referred to as the subset
+ sequence parameter set [H.264]. Subset sequence parameter sets have
+ NAL unit type equal to 15, which is different from the NAL unit type
+ value (7) of sequence parameter sets. VCL NAL units of NAL unit type
+ 1 to 5 must only (indirectly) refer to sequence parameter sets, while
+ VCL NAL units of NAL unit type 20 must only (indirectly) refer to
+ subset sequence parameter sets. The references are indirect because
+ VCL NAL units refer to picture parameter sets (in their slice
+ header), which in turn refer to regular or subset sequence parameter
+ sets. Subset sequence parameter sets use an identifier value space
+ separate from that of sequence parameter sets.
+
+ In SVC, coded picture data from different layers may use the same or
+ different sequence and picture parameter sets. Let the variable DQId
+ be equal to dependency_id * 16 + quality_id. At any time instant
+ during the decoding process there is one active sequence parameter
+ set for the layer representation with the highest value of DQId and
+ one or more active layer SVC sequence parameter set(s) for layer
+ representations with lower values of DQId. The active sequence
+ parameter set or an active layer SVC sequence parameter set remains
+ unchanged throughout a coded video sequence in the scalable layer in
+ which the active sequence parameter set or active layer SVC sequence
+ parameter set is referred to. This means that the referred sequence
+ parameter set or subset sequence parameter set can only change at
+ instantaneous decoding refresh (IDR) access units for any layer. At
+ any time instant during the decoding process there may be one active
+ picture parameter set (for the layer representation with the highest
+ value of DQId) and one or more active layer picture parameter set(s)
+ (for layer representations with lower values of DQId). The active
+ picture parameter set or an active layer picture parameter set
+ remains unchanged throughout a layer representation in which the
+ active picture parameter set or active layer picture parameter set is
+ referred to, but may change from one AU to the next.
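As a small illustration of the DQId variable defined above, the following Python sketch computes DQId from dependency_id and quality_id and splits it back into the two identifiers. The function names `dqid` and `split_dqid` are hypothetical and do not come from [H.264]; the range checks assume the 3-bit and 4-bit header fields described in Section 1.1.3.

```python
from typing import Tuple


def dqid(dependency_id: int, quality_id: int) -> int:
    """Return DQId = dependency_id * 16 + quality_id."""
    if not 0 <= dependency_id <= 7:
        raise ValueError("dependency_id is a 3-bit field (0..7)")
    if not 0 <= quality_id <= 15:
        raise ValueError("quality_id is a 4-bit field (0..15)")
    return dependency_id * 16 + quality_id


def split_dqid(value: int) -> Tuple[int, int]:
    """Inverse of dqid(): return (dependency_id, quality_id)."""
    if not 0 <= value <= 127:
        raise ValueError("DQId must be in the range 0..127")
    return divmod(value, 16)
```

Because quality_id occupies the low four bits, comparing two DQId values orders layer representations first by dependency_id and then by quality_id, which is how "highest value of DQId" is used in the paragraph above.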
+
+1.1.3. NAL Unit Header
+
+ SVC extends the one-byte H.264/AVC NAL unit header by three
+ additional octets for NAL units of types 14 and 20. The header
+ indicates the type of the NAL unit, the (potential) presence of bit
+ errors or syntax violations in the NAL unit payload, information
+ regarding the relative importance of the NAL unit for the decoding
+ process, the layer identification information, and other fields as
+ discussed below.
+
+ The syntax and semantics of the NAL unit header are specified in
+ [H.264], but the essential properties of the NAL unit header are
+ summarized below for convenience.
+
+ The first byte of the NAL unit header has the following format (the
+ bit fields are the same as defined for the one-byte H.264/AVC NAL
+ unit header, while the semantics of some fields have changed
+ slightly, in a backward-compatible way):
+
+ +---------------+
+ |0|1|2|3|4|5|6|7|
+ +-+-+-+-+-+-+-+-+
+ |F|NRI|  Type   |
+ +---------------+
+
+ The semantics of the components of the NAL unit type octet, as
+ specified in [H.264], are described briefly below. In addition to
+ the name and size of each field, the corresponding syntax element
+ name in [H.264] is also provided.
+
+ F: 1 bit
+ forbidden_zero_bit. H.264/AVC declares a value of 1 as a
+ syntax violation.
+
+ NRI: 2 bits
+ nal_ref_idc. A value of "00" (in binary form) indicates that
+ the content of the NAL unit is not used to reconstruct
+ reference pictures for future prediction. Such NAL units can
+ be discarded without risking the integrity of the reference
+ pictures in the same layer. A value greater than "00"
+ indicates that the decoding of the NAL unit is required to
+ maintain the integrity of reference pictures in the same layer
+ or that the NAL unit contains parameter sets.
+
+ Type: 5 bits
+ nal_unit_type. This component specifies the NAL unit type as
+ defined in Table 7-1 of [H.264], and later within this memo.
+ For a reference of all currently defined NAL unit types and
+ their semantics, please refer to Section 7.4.1 in [H.264].
+
+ In H.264/AVC, NAL unit types 14, 15, and 20 are reserved for
+ future extensions. SVC uses these three NAL unit types as
+ follows: NAL unit type 14 is used for prefix NAL unit, NAL unit
+ type 15 is used for subset sequence parameter set, and NAL unit
+ type 20 is used for coded slice in scalable extension (see
+ Section 7.4.1 in [H.264]). NAL unit types 14 and 20 indicate
+ the presence of three additional octets in the NAL unit header,
+ as shown below.
+
+ +---------------+---------------+---------------+
+ |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |R|I|   PRID    |N| DID |  QID  | TID |U|D|O| RR|
+ +---------------+---------------+---------------+
+
+ R: 1 bit
+ reserved_one_bit. Reserved bit for future extension. R must
+ be equal to 1. The value of R must be ignored by decoders.
+
+ I: 1 bit
+ idr_flag. This component specifies whether the layer
+ representation is an instantaneous decoding refresh (IDR) layer
+ representation (when equal to 1) or not (when equal to 0).
+
+ PRID: 6 bits
+ priority_id. This component specifies a priority identifier for the
+ NAL unit. A lower value of PRID indicates a higher priority.
+
+ N: 1 bit
+ no_inter_layer_pred_flag. This flag specifies, when present in
+ a coded slice NAL unit, whether inter-layer prediction may be
+ used for decoding the coded slice (when equal to 0) or not
+ (when equal to 1).
+
+ DID: 3 bits
+ dependency_id. This component indicates the inter-layer coding
+ dependency level of a layer representation. At any access
+ unit, a layer representation with a given dependency_id may be
+ used for inter-layer prediction for coding of a layer
+ representation with a higher dependency_id, while a layer
+ representation with a given dependency_id shall not be used for
+ inter-layer prediction for coding of a layer representation
+ with a lower dependency_id.
+
+ QID: 4 bits
+ quality_id. This component indicates the quality level of an
+ MGS layer representation. At any access unit and for identical
+ dependency_id values, a layer representation with quality_id
+ equal to ql uses a layer representation with quality_id equal
+ to ql-1 for inter-layer prediction.
+
+ TID: 3 bits
+ temporal_id. This component indicates the temporal level of a
+ layer representation. The temporal_id is associated with the
+ frame rate, with lower values of temporal_id corresponding to
+ lower frame rates. A layer representation at a given
+ temporal_id typically depends on layer representations with
+ lower temporal_id values, but it never depends on layer
+ representations with higher temporal_id values.
+
+ U: 1 bit
+ use_ref_base_pic_flag. A value of 1 indicates that only
+ reference base pictures are used during the inter prediction
+ process. A value of 0 indicates that the reference base
+ pictures are not used during the inter prediction process.
+
+ D: 1 bit
+ discardable_flag. A value of 1 indicates that the current NAL
+ unit is not used for decoding NAL units with values of
+ dependency_id higher than the one of the current NAL unit, in
+ the current and all subsequent access units. Such NAL units
+ can be discarded without risking the integrity of layers with
+ higher dependency_id values. discardable_flag equal to 0
+ indicates that the decoding of the NAL unit is required to
+ maintain the integrity of layers with higher dependency_id.
+
+ O: 1 bit
+ output_flag. Affects the decoded picture output process as
+ defined in Annex C of [H.264].
+
+ RR: 2 bits
+ reserved_three_2bits. Reserved bits for future extension. RR
+ MUST be equal to "11" (in binary form). The value of RR must
+ be ignored by decoders.
+
+ This memo extends the semantics of F, NRI, I, PRID, DID, QID, TID, U,
+ and D per Annex G of [H.264] as described in Section 4.2.
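The header layout summarized in this section can be sketched as a small parser. The Python below is an illustrative reading of the two bit diagrams, with bit 0 taken as the most significant bit of each octet; the class and function names (`NalHeader`, `SvcExtension`, `parse_nal_header`) are hypothetical and are not defined by this memo or by [H.264]. The reserved R and RR bits are skipped, since decoders are to ignore their values.

```python
from dataclasses import dataclass
from typing import Optional

# NAL unit types that carry the three additional header octets:
# 14 = prefix NAL unit, 20 = coded slice in scalable extension.
SVC_EXTENSION_TYPES = (14, 20)


@dataclass
class SvcExtension:
    idr_flag: int                  # I
    priority_id: int               # PRID, 6 bits
    no_inter_layer_pred_flag: int  # N
    dependency_id: int             # DID, 3 bits
    quality_id: int                # QID, 4 bits
    temporal_id: int               # TID, 3 bits
    use_ref_base_pic_flag: int     # U
    discardable_flag: int          # D
    output_flag: int               # O


@dataclass
class NalHeader:
    forbidden_zero_bit: int        # F
    nal_ref_idc: int               # NRI, 2 bits
    nal_unit_type: int             # Type, 5 bits
    svc: Optional[SvcExtension] = None


def parse_nal_header(data: bytes) -> NalHeader:
    """Parse the one-octet header and, for types 14 and 20, the extension."""
    if len(data) < 1:
        raise ValueError("empty NAL unit")
    b0 = data[0]
    header = NalHeader(b0 >> 7, (b0 >> 5) & 0x03, b0 & 0x1F)
    if header.nal_unit_type in SVC_EXTENSION_TYPES:
        if len(data) < 4:
            raise ValueError("truncated SVC NAL unit header")
        b1, b2, b3 = data[1], data[2], data[3]
        header.svc = SvcExtension(
            idr_flag=(b1 >> 6) & 0x01,
            priority_id=b1 & 0x3F,
            no_inter_layer_pred_flag=b2 >> 7,
            dependency_id=(b2 >> 4) & 0x07,
            quality_id=b2 & 0x0F,
            temporal_id=b3 >> 5,
            use_ref_base_pic_flag=(b3 >> 4) & 0x01,
            discardable_flag=(b3 >> 3) & 0x01,
            output_flag=(b3 >> 2) & 0x01,
        )
    return header
```

For example, `parse_nal_header(bytes([0x74, 0x85, 0x12, 0x6F]))` describes a type 20 NAL unit with priority_id 5, dependency_id 1, quality_id 2, and temporal_id 3, while a one-octet header such as `0x65` (type 5) yields no extension fields.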
+
+1.2. Overview of the Payload Format
+
+ Similar to [RFC6184], this payload format can only be used to carry
+ the raw NAL unit stream over RTP and not the bytestream format
+ specified in Annex B of [H.264].
+
+ The design principles, transmission modes, and packetization modes as
+ well as new payload structures are summarized in this section. It is
+ assumed that the reader is familiar with the terminology and concepts
+ defined in [RFC6184].
+
+1.2.1. Design Principles
+
+ The following design principles have been observed for this payload
+ format:
+
+ o Backward compatibility with [RFC6184] wherever possible.
+
+ o The SVC base layer or any H.264/AVC compatible subset of the SVC
+ base layer, when transmitted in its own RTP stream, must be
+ encapsulated using [RFC6184]. This ensures that such an RTP
+ stream can be understood by [RFC6184] receivers.
+
+ o Media-aware network elements (MANEs) as defined in [RFC6184] are
+ signaling-aware, rely on signaling information, and have state.
+
+ o MANEs can aggregate multiple RTP streams, possibly from multiple
+ RTP sessions.
+
+ o MANEs can perform media-aware stream thinning (selective
+ elimination of packets or portions thereof). By using the payload
+ header information identifying layers within an RTP session, MANEs
+ are able to remove packets or portions thereof from the incoming
+ RTP packet stream. This implies rewriting the RTP headers of the
+ outgoing packet stream, and rewriting of RTCP packets as specified
+ in Section 7 of [RFC3550].
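The stream-thinning principle in the last bullet can be sketched as a keep-or-drop decision on the layer identifiers of each NAL unit. The Python below is a simplified sketch under stated assumptions, not the MANE behavior defined by this memo: it looks only at dependency_id, quality_id, and temporal_id, using the dependency rules summarized in Section 1.1.3, and it ignores priority_id, discardable_flag, non-VCL NAL units, and the RTP/RTCP rewriting a real MANE must perform. The names `LayerId`, `keep_for_target`, and `thin` are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class LayerId(NamedTuple):
    dependency_id: int
    quality_id: int
    temporal_id: int


def keep_for_target(nal: LayerId, target: LayerId) -> bool:
    """Return True if a NAL unit may be needed to decode the target point."""
    # Higher temporal levels are never needed by lower ones.
    if nal.temporal_id > target.temporal_id:
        return False
    # Higher dependency levels are never used to predict lower ones.
    if nal.dependency_id > target.dependency_id:
        return False
    # Within the target dependency level, drop higher quality levels.
    if (nal.dependency_id == target.dependency_id
            and nal.quality_id > target.quality_id):
        return False
    return True


def thin(nals: Iterable[LayerId], target: LayerId) -> List[LayerId]:
    """Keep only the NAL units retained for the target operation point."""
    return [nal for nal in nals if keep_for_target(nal, target)]
```

Note that the sketch keeps every quality level of lower dependency layers, since the target layer may be predicted from them; a deployed MANE would refine this using signaling information.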
+
+1.2.2. Transmission Modes and Packetization Modes
+
+ This memo allows the packetization of SVC data for both single-
+ session transmission (SST) and multi-session transmission (MST). In
+ the case of SST all SVC data are carried in a single RTP session. In
+ the case of MST two or more RTP sessions are used to carry the SVC
+ data, in accordance with the MST-specific packetization modes defined
+ in this memo, which are based on the packetization modes defined in
+ [RFC6184]. In MST, each RTP session is associated with one RTP
+ stream, which may carry one or more layers.
+
+ The base layer is, by design, compatible with H.264/AVC. During
+ transmission, the associated prefix NAL units, which are introduced
+ by SVC and, when present, are ignored by H.264/AVC decoders, may be
+ encapsulated within the same RTP packet stream as the H.264/AVC VCL
+ NAL units or in a different RTP packet stream (when MST is used).
+ For convenience, the term "AVC base layer" is used to refer to the
+ base layer without prefix NAL units, while the term "SVC base layer"
+ is used to refer to the base layer with prefix NAL units.
+
+ Furthermore, the base layer may have multiple temporal components
+ (i.e., supporting different frame rates). As a result, the lowest
+ temporal component ("T0") of the AVC or SVC base layer is used as the
+ starting point of the SVC bitstream hierarchy.
+
+ This memo allows encapsulating in a given RTP stream any of the
+ following three alternatives of layer combinations:
+
+
+
+
+
+Wenger, et al. Standards Track [Page 13]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ 1. the T0 AVC base layer or the T0 SVC base layer only;
+ 2. one or more enhancement layers only; or
+ 3. the T0 SVC base layer, and one or more enhancement layers.
+
+ SST should be used in point-to-point unicast applications and, in
+ general, whenever the potential benefit of using multiple RTP
+ sessions does not justify the added complexity. When SST is used,
+ the layer combination cases 1 and 3 above can be used. When an
+ H.264/AVC compatible subset of the SVC base layer is transmitted
+ using SST, the packetization of [RFC6184] must be used, thus ensuring
+ compatibility with [RFC6184] receivers. When, however, one or more
+ SVC quality or spatial enhancement layers are transmitted using SST,
+ the packetization defined in this memo must be used. In SST, any of
+ the three [RFC6184] packetization modes, namely, single NAL unit
+ mode, non-interleaved mode, and interleaved mode, can be used.
+
+ MST should be used in a multicast session when different receivers
+ may request different layers of the scalable bitstream. An operation
+ point for an SVC bitstream, as defined in this memo, corresponds to a
+ set of layers that together conform to one of the profiles defined in
+ Annex A or G of [H.264] and, when decoded, offer a representation of
+ the original video at a certain fidelity. The number of streams used
+ in MST should be at least equal to the number of operation points
+ that may be requested by the receivers. Depending on the
+ application, this may result in each layer being carried in its own
+ RTP session, or in having multiple layers encapsulated within one RTP
+ session.
+
+ Informative note: Layered multicast is a term commonly used to
+ describe the application where multicast is used to transmit
+ layered or scalable data that has been encapsulated into more than
+ one RTP session. This application allows different receivers in
+ the multicast session to receive different operation points of the
+ scalable bitstream. Layered multicast, among other application
+ examples, is discussed in more detail in Section 11.2.
+
+ When MST is used, any of the three layer combinations above can be
+ used for each of the sessions. When an H.264/AVC compatible subset
+ of the SVC base layer is transmitted in its own session in MST, the
+ packetization of [RFC6184] must be used, such that [RFC6184]
+ receivers can be part of the MST and receive only this session. For
+ MST, this memo defines four different MST-specific packetization
+ modes, namely, non-interleaved timestamp (NI-T) based mode, non-
+ interleaved CS-DON (NI-C) based mode, non-interleaved combined
+ timestamp and CS-DON mode (NI-TC), and interleaved CS-DON (I-C) based
+ mode (detailed in Section 4.5.2). The modes differ depending on
+ whether the SVC data are allowed to be interleaved, i.e., to be
+ transmitted in an order different from the intended decoding order,
+
+
+
+Wenger, et al. Standards Track [Page 14]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ and they also differ in the mechanisms provided in order to recover
+ the correct decoding order of the NAL units across the multiple RTP
+ sessions. These four MST modes reuse the packetization modes
+ introduced in [RFC6184] for the packetization of NAL units in each of
+ their individual RTP sessions.
+
+ As the names of the MST packetization modes imply, the NI-T, NI-C,
+ and NI-TC modes do not allow interleaved transmission, while the I-C
+ mode allows interleaved transmission. With any of the three non-
+ interleaved MST packetization modes, legacy [RFC6184] receivers with
+ implementation of the non-interleaved mode specified in [RFC6184] can
+ join a multi-session transmission of SVC, to receive the base RTP
+ session encapsulated according to [RFC6184].
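
As an informal illustration (not part of the specification), the properties of the four MST packetization modes named above can be tabulated as in the following sketch. The class and function names are hypothetical, and the order-recovery columns are inferred from the mode names used in this section rather than from the normative text in Section 4.5.2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MstMode:
    name: str
    interleaved: bool     # may transmission order differ from decoding order?
    uses_timestamp: bool  # order recovery based on RTP timestamps (inferred)
    uses_cs_don: bool     # order recovery based on CS-DON (inferred)


# Table of the four MST-specific packetization modes listed in this section.
MST_MODES = {
    "NI-T": MstMode("NI-T", interleaved=False, uses_timestamp=True, uses_cs_don=False),
    "NI-C": MstMode("NI-C", interleaved=False, uses_timestamp=False, uses_cs_don=True),
    "NI-TC": MstMode("NI-TC", interleaved=False, uses_timestamp=True, uses_cs_don=True),
    "I-C": MstMode("I-C", interleaved=True, uses_timestamp=False, uses_cs_don=True),
}


def legacy_receiver_join_stated(mode_name: str) -> bool:
    """True for the modes for which this section states that a legacy
    receiver implementing the non-interleaved mode can join the base
    RTP session, i.e., the three non-interleaved MST modes."""
    return not MST_MODES[mode_name].interleaved
```

For example, legacy_receiver_join_stated("NI-TC") is true, while legacy_receiver_join_stated("I-C") is false.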
+
+1.2.3. New Payload Structures
+
+ [RFC6184] specifies three basic payload structures, namely, single
+ NAL unit packet, aggregation packet, and fragmentation unit.
+ Depending on the basic payload structure, an RTP packet may contain a
+ NAL unit not aggregating other NAL units, one or more NAL units
+ aggregated in another NAL unit, or a fragment of a NAL unit not
+ aggregating other NAL units. Each NAL unit of a type specified in
+ [H.264] (i.e., 1 to 23, inclusive) may be carried in its entirety in
+ a single NAL unit packet, may be aggregated in an aggregation packet,
+ or may be fragmented and carried in a number of fragmentation unit
+ packets. To enable aggregation or fragmentation of NAL units while
+ still ensuring that the RTP packet payload is only composed of NAL
+ units, [RFC6184] introduced six new NAL unit types (24-29) to be used
+ as payload structures, selected from the NAL unit types left
+ unspecified in [H.264].
+
+ This memo reuses all the payload structures used in [RFC6184].
+ Furthermore, three new types of NAL units are defined: payload
+ content scalability information (PACSI) NAL unit, empty NAL unit, and
+ non-interleaved multi-time aggregation packet (NI-MTAP) (specified in
+ Sections 4.9, 4.10, and 4.7.1, respectively).
+
+ PACSI NAL units may be used for the following purposes:
+
+ o To enable MANEs to decide whether to forward, process, or discard
+ aggregation packets, by checking in PACSI NAL units the
+ scalability information and other characteristics of the
+ aggregated NAL units, rather than looking into the aggregated NAL
+ units themselves, which are defined by the video coding
+ specification.
+
+
+
+
+
+
+Wenger, et al. Standards Track [Page 15]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ o To enable correct decoding order recovery in MST using the NI-C or
+ NI-TC mode, with the help of the CS-DON information included in
+ PACSI NAL units.
+
+ o To improve resilience to packet losses, e.g., by utilizing the
+ following data or information included in PACSI NAL units:
+ repeated Supplemental Enhancement Information (SEI) messages,
+ information regarding the start and end of layer representations,
+ and the indices to layer representations of the lowest temporal
+ subset.
+
+ Empty NAL units may be used to enable correct decoding order recovery
+ in MST using the NI-T or NI-TC mode. NI-MTAP NAL units may be used
+ to aggregate NAL units from multiple access units but without
+ interleaving.
+
+2. Conventions
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in BCP 14, RFC 2119
+ [RFC2119].
+
+ This specification uses the notion of setting and clearing a bit when
+ bit fields are handled. Setting a bit is the same as assigning that
+ bit the value of 1 (On). Clearing a bit is the same as assigning
+ that bit the value of 0 (Off).
+
+3. Definitions and Abbreviations
+
+3.1. Definitions
+
+ This document uses the terms and definitions of [H.264]. Section
+ 3.1.1 lists relevant definitions copied from [H.264] for convenience.
+
+ When there is discrepancy, the definitions in [H.264] take
+ precedence. Section 3.1.2 gives definitions specific to this memo.
+ Some of the definitions in Section 3.1.2 are also present in
+ [RFC6184] and copied here with slight adaptations as needed.
+
+3.1.1. Definitions from the SVC Specification
+
+ access unit: A set of NAL units always containing exactly one primary
+ coded picture. In addition to the primary coded picture, an access
+ unit may also contain one or more redundant coded pictures, one
+ auxiliary coded picture, or other NAL units not containing slices or
+ slice data partitions of a coded picture. The decoding of an access
+ unit always results in a decoded picture.
+
+
+
+Wenger, et al. Standards Track [Page 16]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ base layer: A bitstream subset that contains all the NAL units with
+ the nal_unit_type syntax element equal to 1 or 5 of the bitstream and
+ does not contain any NAL unit with the nal_unit_type syntax element
+ equal to 14, 15, or 20 and conforms to one or more of the profiles
+ specified in Annex A of [H.264].
+
+ base quality layer representation: The layer representation of the
+ target dependency representation of an access unit that is associated
+ with the quality_id syntax element equal to 0.
+
+ coded video sequence: A sequence of access units that consists, in
+ decoding order, of an IDR access unit followed by zero or more non-
+ IDR access units including all subsequent access units up to but not
+ including any subsequent IDR access unit.
+
+ dependency representation: A subset of Video Coding Layer (VCL) NAL
+ units within an access unit that are associated with the same value
+ of the dependency_id syntax element, which is provided as part of the
+ NAL unit header or by an associated prefix NAL unit. A dependency
+ representation consists of one or more layer representations.
+
+ IDR access unit: An access unit in which the primary coded picture is
+ an IDR picture.
+
+ IDR picture: Instantaneous decoding refresh picture. A coded picture
+ in which all slices of the target dependency representation within
+ the access unit are I or EI slices that causes the decoding process
+ to mark all reference pictures as "unused for reference" immediately
+ after decoding the IDR picture. After the decoding of an IDR picture
+ all following coded pictures in decoding order can be decoded without
+ inter prediction from any picture decoded prior to the IDR picture.
+ The first picture of each coded video sequence is an IDR picture.
+
+ layer representation: A subset of VCL NAL units within an access unit
+ that are associated with the same values of the dependency_id and
+ quality_id syntax elements, which are provided as part of the VCL NAL
+ unit header or by an associated prefix NAL unit. One or more layer
+ representations represent a dependency representation.
+
+ prefix NAL unit: A NAL unit with nal_unit_type equal to 14 that
+ immediately precedes in decoding order a NAL unit with nal_unit_type
+ equal to 1, 5, or 12. The NAL unit that immediately succeeds in
+ decoding order the prefix NAL unit is referred to as the associated
+ NAL unit. The prefix NAL unit contains data associated with the
+ associated NAL unit, which are considered to be part of the
+ associated NAL unit.
+
+
+
+
+
+Wenger, et al. Standards Track [Page 17]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ reference base picture: A reference picture that is obtained by
+ decoding a base quality layer representation with the nal_ref_idc
+ syntax element not equal to 0 and the store_ref_base_pic_flag syntax
+ element equal to 1 of an access unit and all layer representations of
+ the access unit that are referred to by inter-layer prediction of the
+ base quality layer representation. A reference base picture is not
+ an output of the decoding process, but the samples of a reference
+ base picture may be used for inter prediction in the decoding process
+ of subsequent pictures in decoding order. Reference base picture is
+ a collective term for a reference base field or a reference base
+ frame.
+
+ scalable bitstream: A bitstream with the property that one or more
+ bitstream subsets that are not identical to the scalable bitstream
+ form another bitstream that conforms to the SVC specification
+ [H.264].
+
+ target dependency representation: The dependency representation of an
+ access unit that is associated with the largest value of the
+ dependency_id syntax element for all dependency representations of
+ the access unit.
+
+ target layer representation: The layer representation of the target
+ dependency representation of an access unit that is associated with
+ the largest value of the quality_id syntax element for all layer
+ representations of the target dependency representation of the access
+ unit.
+
+3.1.2. Definitions Specific to This Memo
+
+ anchor layer representation: A layer representation with the
+ property that, if decoding of the operation point
+ corresponding to the layer starts from the access unit containing
+ this layer representation, all the following layer representations of
+ the layer, in output order, can be correctly decoded. The output
+ order is defined in [H.264] as the order in which decoded pictures
+ are output from the decoded picture buffer of the decoder. As H.264
+ does not specify the picture display process, this more general term
+ is used instead of display order. An anchor layer representation is
+ a random access point to the layer to which the anchor layer
+ representation belongs. However, some layer representations that
+ succeed an anchor layer representation in decoding order but precede
+ it in output order may refer to earlier layer representations for
+ inter prediction, and hence their decoding may be incorrect if
+ random access is performed at the anchor layer representation.
+
+
+
+
+
+Wenger, et al. Standards Track [Page 18]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ AVC base layer: The subset of the SVC base layer in which all prefix
+ NAL units (type 14) are removed. Note that this is equivalent to the
+ term "base layer" as defined in Annex G of [H.264].
+
+ base RTP session: When multi-session transmission is used, the RTP
+ session that carries the RTP stream containing the T0 AVC base layer
+ or the T0 SVC base layer, and zero or more enhancement layers. This
+ RTP session does not depend on any other RTP session as indicated by
+ mechanisms defined in Section 7.2.3. The base RTP session may carry
+ NAL units of NAL unit type equal to 14 and 15.
+
+ decoding order number (DON): A field in the payload structure or a
+ derived variable indicating NAL unit decoding order. Values of DON
+ are in the range of 0 to 65535, inclusive. After reaching the
+ maximum value, the value of DON wraps around to 0. Note that this
+ definition also exists in [RFC6184] in exactly the same form.
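
The wrap-around behavior of DON described above can be sketched as follows. This is a minimal illustration, not a normative procedure: the function names are hypothetical, and the half-range comparison (including the choice to treat a distance of exactly 32768 as "behind") is an assumption of this sketch.

```python
DON_MODULUS = 65536  # DON values lie in the range 0..65535, inclusive


def next_don(don: int) -> int:
    """Return the DON value that follows `don`, wrapping 65535 -> 0."""
    return (don + 1) % DON_MODULUS


def don_distance(m: int, n: int) -> int:
    """Signed distance from DON m to DON n under wrap-around.

    A positive result means n follows m in decoding order, under the
    assumption that the two values are less than half the DON range
    apart.
    """
    d = (n - m) % DON_MODULUS
    return d if d < DON_MODULUS // 2 else d - DON_MODULUS
```

With these definitions, a NAL unit with DON 0 is treated as following one with DON 65535, since don_distance(65535, 0) is 1.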
+
+ Empty NAL unit: A NAL unit with NAL unit type equal to 31 and sub-
+ type equal to 1. An empty NAL unit consists of only the two-byte NAL
+ unit header with an empty payload.
+
+ enhancement RTP session: When multi-session transmission is used, an
+ RTP session that is not the base RTP session. An enhancement RTP
+ session typically contains an RTP stream that depends on at least one
+ other RTP session as indicated by mechanisms defined in Section
+ 7.2.3. A lower RTP session to an enhancement RTP session is an RTP
+ session on which the enhancement RTP session depends. The lowest RTP
+ session for a receiver is the RTP session that does not depend on any
+ other RTP session received by the receiver. The highest RTP session
+ for a receiver is the RTP session on which no other RTP session
+ received by the receiver depends.
+
+ cross-session decoding order number (CS-DON): A derived variable
+ indicating NAL unit decoding order number over all NAL units within
+ all the session-multiplexed RTP sessions that carry the same SVC
+ bitstream.
+
+ default level: The level indicated by the profile-level-id parameter.
+ In Session Description Protocol (SDP) Offer/Answer, the level is
+ downgradable, i.e., the answer may either use the default level or a
+ lower level. Note that this definition also exists in [RFC6184] in a
+ slightly different form.
+
+ default sub-profile: The subset of coding tools, which may be all
+ coding tools of one profile or the common subset of coding tools of
+ more than one profile, indicated by the profile-level-id parameter.
+ In SDP Offer/Answer, the default sub-profile must be used in a
+
+
+
+
+Wenger, et al. Standards Track [Page 19]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ symmetric manner, i.e., the answer must either use the same sub-
+ profile as the offer or reject the offer. Note that this definition
+ also exists in [RFC6184] in a slightly different form.
+
+ enhancement layer: A layer in which at least one of the values of
+ dependency_id or quality_id is higher than 0, or a layer in which
+ none of the NAL units is associated with the value of temporal_id
+ equal to 0. An operation point constructed using the maximum
+ temporal_id, dependency_id, and quality_id values associated with an
+ enhancement layer may or may not conform to one or more of the
+ profiles specified in Annex A of [H.264].
+
+ H.264/AVC compatible: The property of a bitstream subset that
+ conforms to one or more of the profiles specified in Annex A of
+ [H.264].
+
+ intra layer representation: A layer representation that contains
+ only slices that use intra prediction, and hence do not refer to any
+ earlier layer representation in decoding order in the same layer.
+ Note that in SVC intra prediction includes intra-layer intra
+ prediction as well as inter-layer intra prediction.
+
+ layer: A bitstream subset in which all NAL units of type 1, 5, 12,
+ 14, or 20 have the same values of dependency_id and quality_id,
+ either directly through their NAL unit header (for NAL units of type
+ 14 or 20) or through association to a prefix (type 14) NAL unit (for
+ NAL unit type 1, 5, or 12). A layer may contain NAL units associated
+ with more than one value of temporal_id.
+
+ media-aware network element (MANE): A network element, such as a
+ middlebox or application layer gateway that is capable of parsing
+ certain aspects of the RTP payload headers or the RTP payload and
+ reacting to their contents. Note that this definition also exists in
+ [RFC6184] in exactly the same form.
+
+ Informative note: The concept of a MANE goes beyond normal routers
+ or gateways in that a MANE has to be aware of the signaling (e.g.,
+ to learn about the payload type mappings of the media streams),
+ and in that it has to be trusted when working with Secure Real-
+ time Transport Protocol (SRTP). The advantage of using MANEs is
+ that they allow packets to be dropped according to the needs of
+ the media coding. For example, if a MANE has to drop packets due
+ to congestion on a certain link, it can identify and remove those
+ packets whose elimination produces the least adverse effect on the
+ user experience. After dropping packets, MANEs must rewrite RTCP
+ packets to match the changes to the RTP packet stream as specified
+ in Section 7 of [RFC3550].
+
+
+
+
+Wenger, et al. Standards Track [Page 20]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ multi-session transmission: The transmission mode in which the SVC
+ stream is transmitted over multiple RTP sessions. Dependency between
+ RTP sessions MUST be signaled according to Section 7.2.3 of this
+ memo.
+
+ NAL unit decoding order: A NAL unit order that conforms to the
+ constraints on NAL unit order given in Section G.7.4.1.2 in [H.264].
+ Note that this definition also exists in [RFC6184] in a slightly
+ different form.
+
+ NALU-time: The value that the RTP timestamp would have if the NAL
+ unit would be transported in its own RTP packet. Note that this
+ definition also exists in [RFC6184] in exactly the same form.
+
+ operation point: An operation point is identified by a set of values
+ of temporal_id, dependency_id, and quality_id. A bitstream
+ corresponding to an operation point can be constructed by removing
+ all NAL units associated with a higher value of dependency_id, and
+ all NAL units associated with the same value of dependency_id but
+ higher values of quality_id or temporal_id. An operation point
+ bitstream conforms to at least one of the profiles defined in Annex A
+ or G of [H.264], and offers a representation of the original video
+ signal at a certain fidelity.
+
+ Informative note: Additional NAL units may be removed (with lower
+ dependency_id or same dependency_id but lower quality_id) if they
+ are not required for decoding the bitstream at the particular
+ operation point. The resulting bitstream, however, may no longer
+ conform to any of the profiles defined in Annex A or G of [H.264].
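
The removal rule in the operation point definition above can be sketched as a simple filter. This is an illustrative sketch only: the type and function names are hypothetical, and the predicate follows the wording of the definition literally (NAL units with a lower dependency_id are kept regardless of their quality_id or temporal_id), which may differ from the complete extraction process in [H.264].

```python
from typing import Iterable, List, NamedTuple


class LayerId(NamedTuple):
    dependency_id: int
    quality_id: int
    temporal_id: int


def belongs_to_operation_point(nal: LayerId, op: LayerId) -> bool:
    """Return True if a NAL unit is kept for operation point `op`."""
    # Remove all NAL units with a higher value of dependency_id.
    if nal.dependency_id > op.dependency_id:
        return False
    # Remove NAL units with the same dependency_id but a higher
    # quality_id or temporal_id.
    if nal.dependency_id == op.dependency_id and (
        nal.quality_id > op.quality_id or nal.temporal_id > op.temporal_id
    ):
        return False
    return True


def extract_operation_point(nals: Iterable[LayerId], op: LayerId) -> List[LayerId]:
    """Keep only the NAL units that belong to operation point `op`."""
    return [nal for nal in nals if belongs_to_operation_point(nal, op)]
```

For an operation point with dependency_id 1, quality_id 0, and temporal_id 1, a NAL unit with identifiers (1, 0, 2) is removed, while one with identifiers (1, 0, 1) is kept.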
+
+ operation point representation: The set of all NAL units of an
+ operation point within the same access unit.
+
+ RTP packet stream: A sequence of RTP packets with increasing sequence
+ numbers (except for wrap-around), identical payload type and
+ identical SSRC (Synchronization Source), carried in one RTP session.
+ Within the scope of this memo, one RTP packet stream is utilized to
+ transport one or more layers.
+
+ single-session transmission: The transmission mode in which the SVC
+ bitstream is transmitted over a single RTP session.
+
+ SVC base layer: The layer that includes all NAL units associated with
+ dependency_id and quality_id values both equal to 0, including prefix
+ NAL units (NAL unit type 14).
+
+
+
+
+
+
+Wenger, et al. Standards Track [Page 21]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ SVC enhancement layer: A layer in which at least one of the values of
+ dependency_id or quality_id is higher than 0. An operation point
+ constructed using the maximum dependency_id and quality_id values and
+ any temporal_id value associated with an SVC enhancement layer does
+ not conform to any of the profiles specified in Annex A of [H.264].
+
+ SVC NAL unit: A NAL unit of NAL unit type 14, 15, or 20 as specified
+ in Annex G of [H.264].
+
+ SVC NAL unit header: A four-byte header resulting from the addition
+ of a three-byte SVC-specific header extension added in NAL unit types
+ 14 and 20.
+
+ SVC RTP session: Either the base RTP session or an enhancement RTP
+ session.
+
+ T0 AVC base layer: A subset of the AVC base layer constructed by
+ removing all VCL NAL units associated with temporal_id values higher
+ than 0 and non-VCL NAL units and SEI messages associated only with
+ the VCL NAL units being removed.
+
+ T0 SVC base layer: A subset of the SVC base layer constructed by
+ removing all VCL NAL units associated with temporal_id values higher
+ than 0 as well as prefix NAL units, non-VCL NAL units, and SEI
+ messages associated only with the VCL NAL units being removed.
+
+ transmission order: The order of packets in ascending RTP sequence
+ number order (in modulo arithmetic). Within an aggregation packet,
+ the NAL unit transmission order is the same as the order of
+ appearance of NAL units in the packet. Note that this definition
+ also exists in [RFC6184] in exactly the same form.
+
+3.2. Abbreviations
+
+ In addition to the abbreviations defined in [RFC6184], the following
+ abbreviations are used in this memo.
+
+ CGS: Coarse-Grain Scalability
+ CS-DON: Cross-Session Decoding Order Number
+ MGS: Medium-Grain Scalability
+ MST: Multi-Session Transmission
+ PACSI: Payload Content Scalability Information
+ SST: Single-Session Transmission
+ SNR: Signal-to-Noise Ratio
+ SVC: Scalable Video Coding
+
+
+
+
+
+
+Wenger, et al. Standards Track [Page 22]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+4. RTP Payload Format
+
+4.1. RTP Header Usage
+
+ In addition to Section 5.1 of [RFC6184], the following rules apply.
+
+ o Setting of the M bit:
+
+ The M bit of an RTP packet for which the packet payload is an NI-MTAP
+ MUST be equal to 1 if the last NAL unit, in decoding order, of the
+ access unit associated with the RTP timestamp is contained in the
+ packet.
+
+ o Setting of the RTP timestamp:
+
+ For an RTP packet for which the packet payload is an empty NAL unit,
+ the RTP timestamp must be set according to Section 4.10.
+
+ For an RTP packet for which the packet payload is a PACSI NAL unit,
+ the RTP timestamp MUST be equal to the NALU-time of the next non-
+ PACSI NAL unit in transmission order. Recall that the NALU-time of a
+ NAL unit in an MTAP is defined in [RFC6184] as the value that the RTP
+ timestamp would have if that NAL unit would be transported in its own
+ RTP packet.
+
+ o Setting of the SSRC:
+
+ For both SST and MST, the SSRC values MUST be set according to
+ [RFC3550].
+
+4.2. NAL Unit Extension and Header Usage
+
+4.2.1. NAL Unit Extension
+
+ This memo specifies a NAL unit extension mechanism to allow for
+ introduction of new types of NAL units, beyond the three NAL unit
+ types left undefined in [RFC6184] (i.e., 0, 30, and 31). The
+ extension mechanism utilizes the NAL unit type value 31 and is
+ specified as follows. When the NAL unit type value is equal to 31,
+ the one-byte NAL unit header consisting of the F, NRI, and Type
+ fields as specified in Section 1.1.3 is extended by one additional
+ octet, which consists of a 5-bit field named Subtype and three 1-bit
+ fields named J, K, and L, respectively. The additional octet is
+ shown in the following figure.
+
+
+
+
+
+
+
+Wenger, et al. Standards Track [Page 23]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ +---------------+
+ |0|1|2|3|4|5|6|7|
+ +-+-+-+-+-+-+-+-+
+ | Subtype |J|K|L|
+ +---------------+
+
+ The Subtype value determines the (extended) NAL unit type of this NAL
+ unit. The interpretation of the fields J, K, and L depends on the
+ Subtype. The semantics of the fields are as follows.
+
+ When Subtype is equal to 1, the NAL unit is an empty NAL unit as
+ specified in Section 4.10. When Subtype is equal to 2, the NAL unit
+ is an NI-MTAP NAL unit as specified in Section 4.7.1. All other
+ values of Subtype (0, 3-31) are reserved for future extensions, and
+ receivers MUST ignore the entire NAL unit when Subtype is equal to
+ any of these reserved values.
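
The layout of the additional octet can be illustrated with the following sketch, which splits the octet into the Subtype, J, K, and L fields and maps the Subtype to the extended NAL unit type. The names are hypothetical, and the sketch assumes that bit 0 in the figure above is the most significant bit of the octet, as is usual for such diagrams.

```python
from typing import NamedTuple


class ExtensionOctet(NamedTuple):
    subtype: int  # 5-bit Subtype field
    j: int        # 1-bit J field
    k: int        # 1-bit K field
    l: int        # 1-bit L field


def parse_extension_octet(octet: int) -> ExtensionOctet:
    """Split the second NAL unit header octet (Type == 31) into fields."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in the range 0..255")
    return ExtensionOctet(
        subtype=octet >> 3,
        j=(octet >> 2) & 1,
        k=(octet >> 1) & 1,
        l=octet & 1,
    )


def extended_nal_unit_kind(subtype: int) -> str:
    """Map a Subtype value to the extended NAL unit type."""
    if subtype == 1:
        return "empty"
    if subtype == 2:
        return "NI-MTAP"
    # Subtype values 0 and 3-31 are reserved; a receiver ignores the
    # entire NAL unit in this case.
    return "reserved"
```

A receiver following this sketch would drop any NAL unit for which extended_nal_unit_kind() returns "reserved".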
+
+4.2.2. NAL Unit Header Usage
+
+ The structure and semantics of the NAL unit header according to the
+ H.264 specification [H.264] were introduced in Section 1.1.3. This
+ section specifies the extended semantics of the NAL unit header
+ fields F, NRI, I, PRID, DID, QID, TID, U, and D, according to this
+ memo. When the Type field is equal to 31, the semantics of the
+ fields in the extension NAL unit header were specified in Section
+ 4.2.1.
+
+ The semantics of F specified in Section 5.3 of [RFC6184] also apply
+ in this memo. That is, a value of 0 for F indicates that the NAL
+ unit type octet and payload should not contain bit errors or other
+ syntax violations, whereas a value of 1 for F indicates that the NAL
+ unit type octet and payload may contain bit errors or other syntax
+ violations. MANEs SHOULD set the F bit to indicate bit errors in the
+ NAL unit.
+
+ For NRI, for a bitstream conforming to one of the profiles defined in
+ Annex A of [H.264] and transported using [RFC6184], the semantics
+ specified in Section 5.3 of [RFC6184] apply, i.e., NRI also indicates
+ the relative importance of NAL units. For a bitstream conforming to
+ one of the profiles defined in Annex G of [H.264] and transported
+ using this memo, in addition to the semantics specified in Annex G of
+ [H.264], NRI also indicates the relative importance of NAL units
+ within a layer.
+
+ For I, in addition to the semantics specified in Annex G of [H.264],
+ according to this memo, MANEs MAY use this information to protect NAL
+ units with I equal to 1 better than NAL units with I equal to 0.
+ MANEs MAY also utilize information of NAL units with I equal to 1 to
+
+
+
+Wenger, et al. Standards Track [Page 24]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ decide when to forward more packets for an RTP packet stream. For
+ example, when it is detected that spatial layer switching has
+ happened such that the operation point has changed to a higher value
+ of DID, MANEs MAY start to forward NAL units with the higher value of
+ DID only after forwarding a NAL unit with I equal to 1 with the
+ higher value of DID.
+
+ Note that, in the context of this section, "protecting a NAL unit"
+ means any RTP or network transport mechanism that could improve the
+ probability of successful delivery of the packet conveying the NAL
+ unit, including applying a Quality of Service (QoS) enabled network,
+ Forward Error Correction (FEC), retransmissions, and advanced
+ scheduling behavior, whenever possible.
+
+ For PRID, the semantics specified in Annex G of [H.264] apply. Note
+ that MANEs implementing unequal error protection MAY use this
+ information to protect NAL units with smaller PRID values better than
+ those with larger PRID values, for example, by including only the
+ more important NAL units in a FEC protection mechanism. The
+ importance for the decoding process decreases as the PRID value
+ increases.
+
+ For DID, QID, or TID, in addition to the semantics specified in Annex
+ G of [H.264], according to this memo, values of DID, QID, or TID
+ indicate the relative importance in their respective dimension. A
+ lower value of DID, QID, or TID indicates a higher importance if the
+ other two components are identical. MANEs MAY use this information
+ to protect more important NAL units better than less important NAL
+ units.
+
+ For U, in addition to the semantics specified in Annex G of [H.264],
+ according to this memo, MANEs MAY use this information to protect NAL
+ units with U equal to 1 better than NAL units with U equal to 0.
+
+ For D, in addition to the semantics specified in Annex G of [H.264],
+ according to this memo, MANEs MAY use this information to determine
+ whether a given NAL unit is required for successfully decoding a
+ certain operation point of the SVC bitstream, hence to decide whether
+ to forward the NAL unit.
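
As an illustration of how a MANE might read the fields discussed in this section, the following sketch extracts them from the three-byte SVC header extension that follows the first NAL unit header octet in NAL unit types 14 and 20. The bit layout used here (R, I, PRID, N, DID, QID, TID, U, D, O, RR, most significant bit first) is the one given in Section 1.1.3, which is not reproduced in this section, so it should be checked against that section. All names are hypothetical.

```python
from typing import NamedTuple


class SvcHeaderExtension(NamedTuple):
    i: int     # 1 bit
    prid: int  # 6 bits
    n: int     # 1 bit
    did: int   # 3 bits
    qid: int   # 4 bits
    tid: int   # 3 bits
    u: int     # 1 bit
    d: int     # 1 bit
    o: int     # 1 bit


def parse_svc_header_extension(ext: bytes) -> SvcHeaderExtension:
    """Parse the three octets following the first NAL unit header octet."""
    if len(ext) < 3:
        raise ValueError("the SVC header extension is three bytes long")
    b0, b1, b2 = ext[0], ext[1], ext[2]
    return SvcHeaderExtension(
        i=(b0 >> 6) & 1,   # bit 0 of b0 is the reserved bit R
        prid=b0 & 0x3F,
        n=(b1 >> 7) & 1,
        did=(b1 >> 4) & 0x07,
        qid=b1 & 0x0F,
        tid=(b2 >> 5) & 0x07,
        u=(b2 >> 4) & 1,
        d=(b2 >> 3) & 1,
        o=(b2 >> 2) & 1,   # the two lowest bits of b2 are reserved (RR)
    )
```

A MANE could, for example, compare the did, qid, and tid values of two NAL units to decide which one to protect better, or inspect d when deciding whether to forward a NAL unit.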
+
+4.3. Payload Structures
+
+ The NAL unit structure is central to H.264/AVC, [RFC6184], as well as
+ SVC and this memo. In H.264/AVC and SVC, all coded bits for
+ representing a video signal are encapsulated in NAL units. In
+ [RFC6184], each RTP packet payload is structured as a NAL unit, which
+ contains one or a part of one NAL unit specified in H.264/AVC, or
+ aggregates one or more NAL units specified in H.264/AVC.
+
+
+
+Wenger, et al. Standards Track [Page 25]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ [RFC6184] specifies three basic payload structures (in Section 5.2 of
+ [RFC6184]): single NAL unit packet, aggregation packet, and
+ fragmentation unit. It also specifies six new types (24 to 29) of NAL
+ units. The value of the Type field of the RTP packet payload header
+ (i.e., the first byte of the payload) may be equal to any value from
+ 1 to 23 for a single NAL unit packet, any value from 24 to 27 for an
+ aggregation packet, and 28 or 29 for a fragmentation unit.
+
+ In addition to the NAL unit types defined originally for H.264/AVC,
+ SVC defines three new NAL unit types specifically for SVC: coded
+ slice in scalable extension NAL units (type 20), prefix NAL units
+ (type 14), and subset sequence parameter set NAL units (type 15), as
+ described in Section 1.1.
+
+ This memo further introduces three new types of NAL units: PACSI NAL
+ unit (NAL unit type 30) as specified in Section 4.9, empty NAL unit
+ (type 31, subtype 1) as specified in Section 4.10, and NI-MTAP NAL
+ unit (type 31, subtype 2) as specified in Section 4.7.1.
+
+ The RTP packet payload structure in [RFC6184] is maintained with
+ slight extensions in this memo, as follows. Each RTP packet payload
+ is still structured as a NAL unit, which contains one or a part of
+ one NAL unit specified in H.264/AVC and SVC, or contains one PACSI
+ NAL unit or one empty NAL unit, or aggregates zero or more NAL units
+ specified in H.264/AVC and SVC, zero or one PACSI NAL unit, and zero
+ or more empty NAL units.
+
+ In this memo, one of the three basic payload structures,
+ fragmentation unit, remains the same as in [RFC6184], and the other
+ two, single NAL unit packet and aggregation packet, are extended as
+ follows. The value of the Type field of the payload header may be
+ equal to any value from 1 to 23, inclusive, and 30 to 31, inclusive,
+ for a single NAL unit packet, and any value from 24 to 27, inclusive,
+ and 31, for an aggregation packet. When the Type field of the
+ payload header is equal to 31 and the Subtype field of the payload
+ header is equal to 2, the packet is an aggregation packet (containing
+ an NI-MTAP NAL unit). When the Type field of the payload header is
+ equal to 31 and the Subtype field of the payload header is equal to
+ 1, the packet is a single NAL unit packet (containing an empty NAL
+ unit).
+
+ Note that, in this memo, the length of the payload header varies
+ depending on the value of the Type field in the first byte of the RTP
+ packet payload. If the value is equal to 14, 20, or 30, the first
+ four bytes of the packet payload form the payload header; otherwise,
+ if the value is equal to 31, the first two bytes of the payload form
+ the payload header; otherwise, the payload header is the first byte
+ of the packet payload.
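+
+ The rule above can be sketched as follows. This is an illustrative
+ Python sketch rather than part of the payload format. The function
+ name payload_header_length is hypothetical, and the sketch assumes
+ that the Type field occupies the five least significant bits of the
+ first payload byte, as in the H.264 NAL unit header.
+

```python
def payload_header_length(first_byte: int) -> int:
    """Return the payload header length in bytes, given the first payload byte."""
    nal_type = first_byte & 0x1F  # Type field: low five bits of the first byte
    if nal_type in (14, 20, 30):
        return 4  # four-byte SVC NAL unit header
    if nal_type == 31:
        return 2  # first byte plus the byte that carries Subtype
    return 1      # plain one-byte header, as in [RFC6184]
```

+
+ For example, a first byte of 0x7E (Type 30, a PACSI NAL unit) gives a
+ four-byte payload header, while 0x65 (Type 5) gives a one-byte header.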
+
+
+
+Wenger, et al. Standards Track [Page 26]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ Table 1 lists the NAL unit types introduced in SVC and this memo and
+ where they are described in this memo. Table 2 summarizes the basic
+ payload structure types for all NAL unit types when they are directly
+ used as RTP packet payloads according to this memo. Table 3
+ summarizes the NAL unit types allowed to be aggregated (i.e., used as
+ aggregation units in aggregation packets) or fragmented (i.e.,
+ carried in fragmentation units) according to this memo.
+
+ Table 1. NAL unit types introduced in SVC and this memo
+
+   Type  Subtype  NAL Unit Name                      Section Numbers
+   -----------------------------------------------------------------
+   14    -        Prefix NAL unit                    1.1
+   15    -        Subset sequence parameter set      1.1
+   20    -        Coded slice in scalable extension  1.1
+   30    -        PACSI NAL unit                     4.9
+   31    0        reserved                           4.2.1
+   31    1        Empty NAL unit                     4.10
+   31    2        NI-MTAP                            4.7.1
+   31    3-31     reserved                           4.2.1
+
+ Table 2. Basic payload structure types for all NAL unit
+ types when they are directly used as RTP packet payloads
+
+   Type   Subtype  Basic Payload Structure
+   ------------------------------------------
+   0      -        reserved
+   1-23   -        Single NAL Unit Packet
+   24-27  -        Aggregation Packet
+   28-29  -        Fragmentation Unit
+   30     -        Single NAL Unit Packet
+   31     0        reserved
+   31     1        Single NAL Unit Packet
+   31     2        Aggregation Packet
+   31     3-31     reserved
+
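+ Table 2 can be read as a small classification routine. The sketch
+ below is illustrative only. The function name
+ basic_payload_structure is hypothetical, and the position of the
+ Subtype field (the five most significant bits of the second payload
+ byte) is taken from the NI-MTAP header layout shown in Figure 2.
+

```python
def basic_payload_structure(payload: bytes) -> str:
    """Classify an RTP payload according to Table 2."""
    if not payload:
        raise ValueError("empty payload")
    nal_type = payload[0] & 0x1F  # Type: low five bits of the first byte
    if nal_type == 0:
        return "reserved"
    if 1 <= nal_type <= 23 or nal_type == 30:
        return "single NAL unit packet"
    if 24 <= nal_type <= 27:
        return "aggregation packet"
    if nal_type in (28, 29):
        return "fragmentation unit"
    # Type 31: the structure depends on the Subtype field.
    if len(payload) < 2:
        raise ValueError("Type 31 payload is missing the Subtype byte")
    subtype = payload[1] >> 3  # Subtype: high five bits of the second byte
    if subtype == 1:
        return "single NAL unit packet"  # empty NAL unit
    if subtype == 2:
        return "aggregation packet"      # NI-MTAP
    return "reserved"
```

+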
+ Table 3. Summary of the NAL unit types allowed to be
+ aggregated or fragmented (yes = allowed, no = disallowed,
+ - = not applicable/not specified)
+
+   Type   Subtype  STAP-A  STAP-B  MTAP16  MTAP24  FU-A  FU-B  NI-MTAP
+   ---------------------------------------------------------------------
+   0      -        -       -       -       -       -     -     -
+   1-23   -        yes     yes     yes     yes     yes   yes   yes
+   24-29  -        no      no      no      no      no    no    no
+   30     -        yes     yes     yes     yes     no    no    yes
+   31     0        -       -       -       -       -     -     -
+   31     1        yes     no      no      no      no    no    yes
+   31     2        no      no      no      no      no    no    no
+   31     3-31     -       -       -       -       -     -     -
+
+4.4. Transmission Modes
+
+ This memo enables transmission of an SVC bitstream over one or more
+ RTP sessions. If only one RTP session is used for transmission of
+ the SVC bitstream, the transmission mode is referred to as single-
+ session transmission (SST); otherwise (more than one RTP session is
+ used for transmission of the SVC bitstream), the transmission mode is
+ referred to as multi-session transmission (MST).
+
+ SST SHOULD be used for point-to-point unicast scenarios, while MST
+ SHOULD be used for point-to-multipoint multicast scenarios where
+ different receivers require different operation points of the same
+ SVC bitstream, to improve bandwidth utilization efficiency.
+
+ If the OPTIONAL mst-mode media type parameter (see Section 7.1) is
+ not present, SST MUST be used; otherwise (mst-mode is present), MST
+ MUST be used.
+
+4.5. Packetization Modes
+
+4.5.1. Packetization Modes for Single-Session Transmission
+
+ When SST is in use, Section 5.4 of [RFC6184] applies with the
+ following extensions.
+
+ The packetization modes specified in Section 5.4 of [RFC6184],
+ namely, single NAL unit mode, non-interleaved mode, and interleaved
+ mode, are also referred to as session packetization modes. Table 4
+ summarizes the allowed session packetization modes for SST.
+
+ Table 4. Summary of allowed session packetization modes
+ (denoted as "Session Mode" for simplicity) for SST (yes =
+ allowed, no = disallowed)
+
+   Session Mode          Allowed
+   -------------------------------------
+   Single NAL Unit Mode  yes
+   Non-Interleaved Mode  yes
+   Interleaved Mode      yes
+
+ For NAL unit types in the range of 0 to 29, inclusive, the NAL unit
+ types allowed to be directly used as packet payloads for each session
+ packetization mode are the same as specified in Section 5.4 of
+ [RFC6184]. For other NAL unit types, which are newly introduced in
+ this memo, the NAL unit types allowed to be directly used as packet
+ payloads for each session packetization mode are summarized in Table
+ 5.
+
+ Table 5. New NAL unit types allowed to be directly used
+ as packet payloads for each session packetization mode
+ (yes = allowed, no = disallowed, - = not applicable/not specified)
+
+   Type  Subtype  Single NAL  Non-Interleaved  Interleaved
+                  Unit Mode   Mode             Mode
+   -------------------------------------------------------------
+   30    -        yes         no               no
+   31    0        -           -                -
+   31    1        yes         yes              no
+   31    2        no          yes              no
+   31    3-31     -           -                -
+
+4.5.2. Packetization Modes for Multi-Session Transmission
+
+ For MST, this memo specifies four MST packetization modes:
+
+ o Non-interleaved timestamp based mode (NI-T);
+
+ o Non-interleaved cross-session decoding order number (CS-DON) based
+ mode (NI-C);
+
+ o Non-interleaved combined timestamp and CS-DON mode (NI-TC); and
+
+ o Interleaved CS-DON (I-C) mode.
+
+ These four modes differ in two ways. First, they differ in terms of
+ whether NAL units are required to be transmitted within each RTP
+ session in decoding order (i.e., non-interleaved), or they are
+ allowed to be transmitted in a different order (i.e., interleaved).
+ Second, they differ in the mechanisms they provide in order to
+ recover the correct decoding order of the NAL units across all RTP
+ sessions involved.
+
+ The NI-T, NI-C, and NI-TC modes do not allow interleaving, and are
+ thus targeted for systems that require relatively low end-to-end
+ latency, e.g., conversational systems. The I-C mode allows
+ interleaving and is thus targeted for systems that do not require
+ very low end-to-end latency. The benefits of interleaving are the
+ same as those of the interleaved mode specified in [RFC6184].
+
+ The NI-T mode uses timestamps to recover the decoding order of NAL
+ units, whereas the NI-C and I-C modes both use the CS-DON mechanism
+ (explained later) to do so. The NI-TC mode provides both timestamps
+ and the CS-DON method; receivers in this case may choose to use
+ either method for performing decoding order recovery. The MST
+ packetization mode in use MUST be signaled by the value of the
+ OPTIONAL mst-mode media type parameter. The MST packetization mode
+ in use governs which session packetization modes are allowed in the
+ associated RTP sessions, which in turn govern which NAL unit types
+ are allowed to be directly used as RTP packet payloads.
+
+ Table 6 summarizes the allowed session packetization modes for NI-T,
+ NI-C, and NI-TC. Table 7 summarizes the allowed session
+ packetization modes for I-C.
+
+ Table 6. Summary of allowed session packetization modes
+ (denoted as "Session Mode" for simplicity) for NI-T, NI-C, and
+ NI-TC (yes = allowed, no = disallowed)
+
+   Session Mode          Base Session  Enhancement Session
+   -----------------------------------------------------------
+   Single NAL Unit Mode  yes           no
+   Non-Interleaved Mode  yes           yes
+   Interleaved Mode      no            no
+
+ Table 7. Summary of allowed session packetization modes
+ (denoted as "Session Mode" for simplicity) for I-C
+ (yes = allowed, no = disallowed)
+
+   Session Mode          Base Session  Enhancement Session
+   -----------------------------------------------------------
+   Single NAL Unit Mode  no            no
+   Non-Interleaved Mode  no            no
+   Interleaved Mode      yes           yes
+
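+ Tables 4, 6, and 7 can be combined into one lookup, sketched below
+ for illustration. The function name allowed_session_modes is
+ hypothetical. The sketch also assumes that the mst-mode parameter
+ carries the mode abbreviations used in this section ("NI-T", "NI-C",
+ "NI-TC", "I-C"); the normative parameter syntax is given in Section
+ 7.1. An absent mst-mode (None) selects SST.
+

```python
from typing import Optional, Set

def allowed_session_modes(mst_mode: Optional[str],
                          base_session: bool = True) -> Set[str]:
    """Return the session packetization modes allowed for an RTP session."""
    if mst_mode is None:
        # mst-mode absent: SST, all three session modes allowed (Table 4).
        return {"single NAL unit", "non-interleaved", "interleaved"}
    if mst_mode in ("NI-T", "NI-C", "NI-TC"):
        # Table 6: single NAL unit mode is allowed in the base session only.
        modes = {"non-interleaved"}
        if base_session:
            modes.add("single NAL unit")
        return modes
    if mst_mode == "I-C":
        # Table 7: only the interleaved mode is allowed.
        return {"interleaved"}
    raise ValueError("unknown mst-mode value: %r" % (mst_mode,))
```

+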
+ For NAL unit types in the range of 0 to 29, inclusive, the NAL unit
+ types allowed to be directly used as packet payloads for each session
+ packetization mode are the same as specified in Section 5.4 of
+ [RFC6184]. For other NAL unit types, which are newly introduced in
+ this memo, the NAL unit types allowed to be directly used as packet
+ payloads for each allowed session packetization mode for NI-T, NI-C,
+ NI-TC, and I-C are summarized in Tables 8, 9, 10, and 11,
+ respectively.
+
+ Table 8. New NAL unit types allowed to be directly used
+ as packet payloads for each allowed session packetization
+ mode when NI-T is in use (yes = allowed, no = disallowed,
+ - = not applicable/not specified)
+
+   Type  Subtype  Single NAL  Non-Interleaved
+                  Unit Mode   Mode
+   ---------------------------------------------------
+   30    -        yes         no
+   31    0        -           -
+   31    1        yes         yes
+   31    2        no          yes
+   31    3-31     -           -
+
+ Table 9. New NAL unit types allowed to be directly used
+ as packet payloads for each allowed session packetization
+ mode when NI-C is in use (yes = allowed, no = disallowed,
+ - = not applicable/not specified)
+
+   Type  Subtype  Single NAL  Non-Interleaved
+                  Unit Mode   Mode
+   ---------------------------------------------------
+   30    -        yes         yes
+   31    0        -           -
+   31    1        no          no
+   31    2        no          yes
+   31    3-31     -           -
+
+ Table 10. New NAL unit types allowed to be directly used
+ as packet payloads for each allowed session packetization
+ mode when NI-TC is in use (yes = allowed, no = disallowed,
+ - = not applicable/not specified)
+
+   Type  Subtype  Single NAL  Non-Interleaved
+                  Unit Mode   Mode
+   ---------------------------------------------------
+   30    -        yes         yes
+   31    0        -           -
+   31    1        yes         yes
+   31    2        no          yes
+   31    3-31     -           -
+
+ Table 11. New NAL unit types allowed to be directly used
+ as packet payloads for the allowed session packetization
+ mode when I-C is in use (yes = allowed, no = disallowed,
+ - = not applicable/not specified)
+
+   Type  Subtype  Interleaved Mode
+   ------------------------------------
+   30    -        no
+   31    0        -
+   31    1        no
+   31    2        no
+   31    3-31     -
+
+ When MST is in use and the MST packetization mode in use is NI-C,
+ empty NAL units (type 31, subtype 1) MUST NOT be used, i.e., no RTP
+ packet is allowed to contain one or more empty NAL units.
+
+ When MST is in use and the MST packetization mode in use is I-C, both
+ empty NAL units (type 31, subtype 1) and NI-MTAP NAL units (type 31,
+ subtype 2) MUST NOT be used, i.e., no RTP packet is allowed to
+ contain one or more empty NAL units or an NI-MTAP NAL unit.
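+
+ The per-mode rules of Table 5 and Tables 8 through 11 can be
+ captured in a lookup table, as in the following illustrative sketch.
+ All names (ALLOWED_NEW_TYPES, new_type_allowed, and the mode strings)
+ are hypothetical. Reserved subtypes, marked "-" in the tables, are
+ reported as not allowed here, which is a choice of this sketch
+ rather than a statement of this memo.
+

```python
SINGLE, NON_INTERLEAVED, INTERLEAVED = "single", "non-interleaved", "interleaved"

# Transmission mode -> (Type, Subtype) -> session packetization modes in
# which that NAL unit type may be used directly as an RTP packet payload.
ALLOWED_NEW_TYPES = {
    "SST": {    # Table 5
        (30, None): {SINGLE},
        (31, 1): {SINGLE, NON_INTERLEAVED},
        (31, 2): {NON_INTERLEAVED},
    },
    "NI-T": {   # Table 8
        (30, None): {SINGLE},
        (31, 1): {SINGLE, NON_INTERLEAVED},
        (31, 2): {NON_INTERLEAVED},
    },
    "NI-C": {   # Table 9: empty NAL units are never allowed
        (30, None): {SINGLE, NON_INTERLEAVED},
        (31, 1): set(),
        (31, 2): {NON_INTERLEAVED},
    },
    "NI-TC": {  # Table 10
        (30, None): {SINGLE, NON_INTERLEAVED},
        (31, 1): {SINGLE, NON_INTERLEAVED},
        (31, 2): {NON_INTERLEAVED},
    },
    "I-C": {    # Table 11: none of the new types is allowed
        (30, None): set(),
        (31, 1): set(),
        (31, 2): set(),
    },
}

def new_type_allowed(transmission, session_mode, nal_type, subtype=None):
    """Return True if the new NAL unit type may be a direct packet payload."""
    table = ALLOWED_NEW_TYPES[transmission]
    return session_mode in table.get((nal_type, subtype), set())
```

+
+ With this table, new_type_allowed("NI-C", NON_INTERLEAVED, 31, 1)
+ is False, which matches the prohibition on empty NAL units in the
+ NI-C mode stated above.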
+
+4.6. Single NAL Unit Packets
+
+ Section 5.6 of [RFC6184] applies with the following extensions.
+
+ The payload of a single NAL unit packet MAY be a PACSI NAL unit (Type
+ 30) or an empty NAL unit (Type 31 and Subtype 1), in addition to a
+ NAL unit with NAL unit type equal to any value from 1 to 23,
+ inclusive.
+
+ If the Type field of the first byte of the payload is not equal to
+ 31, the payload header is the first byte of the payload. Otherwise,
+ (the Type field of the first byte of the payload is equal to 31), the
+ payload header is the first two bytes of the payload.
+
+4.7. Aggregation Packets
+
+ In addition to Section 5.7 of [RFC6184], the following applies in
+ this memo.
+
+4.7.1. Non-Interleaved Multi-Time Aggregation Packets (NI-MTAPs)
+
+ One new NAL unit type introduced in this memo is the non-interleaved
+ multi-time aggregation packet (NI-MTAP). An NI-MTAP consists of one
+ or more non-interleaved multi-time aggregation units.
+
+ The NAL units contained in NI-MTAPs MUST be aggregated in decoding
+ order.
+
+ A non-interleaved multi-time aggregation unit for the NI-MTAP
+ consists of 16 bits of unsigned size information of the following NAL
+ unit (in network byte order), and 16 bits (in network byte order) of
+ timestamp offset (TS offset) for the NAL unit. The structure is
+ presented in Figure 1. The starting or ending position of an
+ aggregation unit within a packet may or may not be on a 32-bit word
+ boundary. The NAL units in the NI-MTAP are ordered in NAL unit
+ decoding order.
+
+ The Type field of the NI-MTAP MUST be set equal to "31".
+
+ The F bit MUST be set to 0 if all the F bits of the aggregated NAL
+ units are zero; otherwise, it MUST be set to 1.
+
+ The value of NRI MUST be the maximum value of NRI across all NAL
+ units carried in the NI-MTAP packet.
+
+ The field Subtype MUST be equal to 2.
+
+ If the field J is equal to 1, the optional DON field MUST be present
+ for each of the non-interleaved multi-time aggregation units. For
+ SST, the J field MUST be equal to 0. For MST, in the NI-T mode the J
+ field MUST be equal to 0, whereas in the NI-C or NI-TC mode the J
+ field MUST be equal to 1. When the NI-C or NI-TC mode is in use, the
+ DON field, when present, MUST represent the CS-DON value for the
+ particular NAL unit as defined in Section 6.2.2.
+
+ The fields K and L MUST be both equal to 0.
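+
+ The field rules above can be sketched as a routine that builds the
+ two-byte NI-MTAP payload header. This is an illustration under
+ stated assumptions: the function name ni_mtap_header is
+ hypothetical, the byte layout (F, NRI, Type in the first byte;
+ Subtype, J, K, L in the second) follows Figure 2, and the caller
+ supplies the first header byte of each NAL unit to be aggregated.
+

```python
from typing import Sequence

def ni_mtap_header(nal_first_bytes: Sequence[int], j: int) -> bytes:
    """Build the NI-MTAP payload header from the aggregated NAL units'
    first header bytes and the J flag (0 or 1)."""
    if not nal_first_bytes:
        raise ValueError("an NI-MTAP carries at least one aggregation unit")
    if j not in (0, 1):
        raise ValueError("J must be 0 or 1")
    # F is 0 only if every aggregated NAL unit has F equal to 0.
    f = 1 if any(b & 0x80 for b in nal_first_bytes) else 0
    # NRI is the maximum NRI over all aggregated NAL units.
    nri = max((b >> 5) & 0x03 for b in nal_first_bytes)
    byte0 = (f << 7) | (nri << 5) | 31  # Type = 31
    byte1 = (2 << 3) | (j << 2)         # Subtype = 2, then J; K = L = 0
    return bytes([byte0, byte1])
```

+
+ For SST and for the NI-T mode the caller would pass j=0, and for the
+ NI-C and NI-TC modes j=1, following the rules for the J field above.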
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   :        NAL unit size          |        TS offset              |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |        DON (optional)         |                               |
+   |-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+         NAL unit              |
+   |                                                               |
+   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                               :
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ Figure 1. Non-interleaved multi-time aggregation unit for NI-MTAP
+
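+ A receiver could walk the aggregation units of Figure 1 roughly as
+ sketched below. The sketch is illustrative and rests on assumptions
+ that are not spelled out in this section: the function name
+ parse_ni_mtap is hypothetical, the NAL unit size field is taken to
+ count only the bytes of the NAL unit itself (not the TS offset or
+ DON fields), and any RTP padding has already been removed from the
+ payload.
+

```python
import struct
from typing import List, Optional, Tuple

def parse_ni_mtap(payload: bytes) -> List[Tuple[int, Optional[int], bytes]]:
    """Split an NI-MTAP payload into (TS offset, DON or None, NAL unit)."""
    if len(payload) < 2 or payload[0] & 0x1F != 31 or payload[1] >> 3 != 2:
        raise ValueError("not an NI-MTAP")
    has_don = bool(payload[1] & 0x04)  # J bit of the payload header
    fixed = 6 if has_don else 4        # size + TS offset (+ DON)
    pos = 2                            # skip the two-byte payload header
    units = []
    while pos < len(payload):
        if pos + fixed > len(payload):
            raise ValueError("truncated aggregation unit header")
        # Both fields are 16-bit unsigned integers in network byte order.
        size, ts_offset = struct.unpack_from("!HH", payload, pos)
        don = struct.unpack_from("!H", payload, pos + 4)[0] if has_don else None
        pos += fixed
        if pos + size > len(payload):
            raise ValueError("truncated NAL unit")
        units.append((ts_offset, don, payload[pos:pos + size]))
        pos += size
    return units
```

+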
+ Let TS be the RTP timestamp of the packet carrying the NAL unit.
+ Recall that the NALU-time of a NAL unit in an MTAP is defined in
+ [RFC6184] as the value that the RTP timestamp would have if that NAL
+ unit would be transported in its own RTP packet. The timestamp
+ offset field MUST be set to a value equal to the value of the
+ following formula:
+
+ if NALU-time >= TS, TS offset = NALU-time - TS
+ else, TS offset = NALU-time + (2^32 - TS)
+
+ For the "earliest" multi-time aggregation unit in an NI-MTAP, the
+ timestamp offset MUST be zero. Hence, the RTP timestamp of the NI-
+ MTAP itself is identical to the earliest NALU-time.
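+
+ The formula above amounts to a subtraction modulo 2^32, which the
+ following illustrative sketch makes explicit. The function name
+ ni_mtap_ts_offset is hypothetical, and the range check reflects the
+ 16-bit width of the TS offset field shown in Figure 1.
+

```python
def ni_mtap_ts_offset(nalu_time: int, ts: int) -> int:
    """Compute the TS offset of a NAL unit relative to RTP timestamp ts."""
    if nalu_time >= ts:
        offset = nalu_time - ts
    else:
        # The 32-bit RTP timestamp wrapped between ts and nalu_time.
        offset = nalu_time + (2**32 - ts)
    if offset > 0xFFFF:
        raise ValueError("offset does not fit in the 16-bit TS offset field")
    return offset
```

+
+ Both branches give the same result as (NALU-time - TS) mod 2^32, so
+ an implementation may prefer the modular form.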
+
+ Informative note: The "earliest" multi-time aggregation unit is
+ the one that would have the smallest extended RTP timestamp among
+ all the aggregation units of an NI-MTAP if the aggregation units
+ were encapsulated in single NAL unit packets. An extended
+ timestamp is a timestamp that has more than 32 bits and is capable
+ of counting the wraparound of the timestamp field, thus enabling
+ one to determine the smallest value if the timestamp wraps. Such
+ an "earliest" aggregation unit may or may not be the first one in
+ the order in which the aggregation units are encapsulated in an
+ NI-MTAP. The "earliest" NAL unit need not be the same as the
+ first NAL unit in the NAL unit decoding order either.
+
+ Figure 2 presents an example of an RTP packet that contains an NI-
+ MTAP that contains two non-interleaved multi-time aggregation units,
+ labeled as 1 and 2 in the figure.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                          RTP Header                           |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |F|NRI|  Type   | Subtype |J|K|L|                               |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
+   |                                                               |
+   |        Non-interleaved multi-time aggregation unit #1         |
+   :                                                               :
+   |                                 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                                 | Non-interleaved multi-time  |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                             |
+   |                      aggregation unit #2                      |
+   :                                                               :
+   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                               :...OPTIONAL RTP padding        |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ Figure 2. An RTP packet including an NI-MTAP containing two
+ non-interleaved multi-time aggregation units
+
+4.8. Fragmentation Units (FUs)
+
+ Section 5.8 of [RFC6184] applies.
+
+ Informative note: In case a NAL unit with the four-byte SVC NAL
+ unit header is fragmented, the three-byte SVC-specific header
+ extension is considered as part of the NAL unit payload. That is,
+ the three-byte SVC-specific header extension is only available in
+ the first fragment of the fragmented NAL unit.
+
+4.9. Payload Content Scalability Information (PACSI) NAL Unit
+
+ Another new type of NAL unit specified in this memo is the payload
+ content scalability information (PACSI) NAL unit. The Type field of
+ PACSI NAL units MUST be equal to 30 (a NAL unit type value left
+ unspecified in [H.264] and [RFC6184]). A PACSI NAL unit MAY be
+ carried in a single NAL unit packet or an aggregation packet, and
+ MUST NOT be fragmented.
+
+ PACSI NAL units may be used for the following purposes:
+
+ o To enable MANEs to decide whether to forward, process, or discard
+ aggregation packets, by checking in PACSI NAL units the
+ scalability information and other characteristics of the
+
+ aggregated NAL units, rather than looking into the aggregated NAL
+ units themselves, which are defined by the video coding
+ specification;
+
+ o To enable correct decoding order recovery in MST using the NI-C or
+ NI-TC mode, with the help of the CS-DON information included in
+ PACSI NAL units; and
+
+ o To improve resilience to packet losses, e.g., by utilizing the
+ following data or information included in PACSI NAL units:
+ repeated Supplemental Enhancement Information (SEI) messages,
+ information regarding the start and end of layer representations,
+ and the indices to layer representations of the lowest temporal
+ subset.
+
+ PACSI NAL units MAY be ignored in the NI-T mode without affecting the
+ decoding order recovery process.
+
+ When a PACSI NAL unit is present in an aggregation packet, the
+ following applies.
+
+ o The PACSI NAL unit MUST be the first aggregated NAL unit in the
+ aggregation packet.
+
+ o There MUST be at least one additional aggregated NAL unit in the
+ aggregation packet.
+
+ o The RTP header fields and the payload header fields of the
+ aggregation packet are set as if the PACSI NAL unit was not
+ included in the aggregation packet.
+
+ o If the aggregation packet is an MTAP16, MTAP24, or NI-MTAP with
+ the J field equal to 1, the decoding order number (DON) for the
+ PACSI NAL unit MUST be set to indicate that the PACSI NAL unit has
+ an identical DON to the first NAL unit in decoding order among the
+ remaining NAL units in the aggregation packet.
+
+ When a PACSI NAL unit is included in a single NAL unit packet, it is
+ associated with the next non-PACSI NAL unit in transmission order,
+ and the RTP header fields of the packet are set as if the next non-
+ PACSI NAL unit in transmission order was included in a single NAL
+ unit packet.
+
+ The PACSI NAL unit structure is as follows. The first four octets
+ are exactly the same as the four-byte SVC NAL unit header discussed
+ in Section 1.1.3. They are followed by one octet containing several
+ flags, then five optional octets, and finally zero or more SEI NAL
+ units. Each SEI NAL unit is preceded by a 16-bit unsigned size field
+ (in network byte order) that indicates the size of the following NAL
+ unit in bytes (excluding these two octets, but including the NAL unit
+ header octet of the SEI NAL unit). Figure 3 illustrates the PACSI
+ NAL unit structure and an example of a PACSI NAL unit containing two
+ SEI NAL units.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |F|NRI|  Type   |R|I|   PRID    |N| DID |  QID  | TID |U|D|O| RR|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |X|Y|T|A|P|C|S|E| TL0PICIDX (o) |        IDRPICID (o)           |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |          DONC (o)             |        NAL unit size 1        |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                                                               |
+   |                 SEI NAL unit 1                                |
+   |                                                               |
+   |               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |               |        NAL unit size 2        |               |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
+   |                                                               |
+   |                 SEI NAL unit 2                                |
+   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                               |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ Figure 3. PACSI NAL unit structure. Fields suffixed by
+ "(o)" are OPTIONAL.
+
+ The bits A, P, and C are specified only if the bit X is equal to 1.
+ The bits S and E are specified, and the fields TL0PICIDX and IDRPICID
+ are present, only if the bit Y is equal to 1. The field DONC is
+ present only if the bit T is equal to 1. The field T MUST be equal
+ to 0 if the PACSI NAL unit is contained in an STAP-B, MTAP16, MTAP24,
+ or NI-MTAP with the J field equal to 1.
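+
+ The layout of Figure 3 and the presence rules above can be sketched
+ as a parser. This is an illustrative sketch only: the function name
+ parse_pacsi and the dictionary keys are hypothetical, and the field
+ widths (TL0PICIDX 8 bits, IDRPICID 16 bits, DONC 16 bits) are read
+ off Figure 3, consistent with the five optional octets mentioned
+ earlier in this section.
+

```python
import struct

def parse_pacsi(nal: bytes) -> dict:
    """Parse a PACSI NAL unit into its header fields and SEI NAL units."""
    if len(nal) < 5 or nal[0] & 0x1F != 30:
        raise ValueError("not a PACSI NAL unit")
    b1, b2, b3, flags = nal[1], nal[2], nal[3], nal[4]
    info = {
        "prid": b1 & 0x3F,       # low six bits of the second octet
        "did": (b2 >> 4) & 0x07,
        "qid": b2 & 0x0F,
        "tid": b3 >> 5,
    }
    # Flag octet, most significant bit first: X Y T A P C S E.
    for i, name in enumerate("XYTAPCSE"):
        info[name] = (flags >> (7 - i)) & 1
    pos = 5
    if info["Y"]:  # TL0PICIDX and IDRPICID are present only if Y is 1
        info["tl0picidx"] = nal[pos]
        info["idrpicid"] = struct.unpack_from("!H", nal, pos + 1)[0]
        pos += 3
    if info["T"]:  # DONC is present only if T is 1
        info["donc"] = struct.unpack_from("!H", nal, pos)[0]
        pos += 2
    sei = []
    while pos < len(nal):  # zero or more size-prefixed SEI NAL units
        (size,) = struct.unpack_from("!H", nal, pos)
        pos += 2
        if pos + size > len(nal):
            raise ValueError("truncated SEI NAL unit")
        sei.append(nal[pos:pos + size])
        pos += size
    info["sei"] = sei
    return info
```

+
+ A caller would consult the A, P, and C entries only when X is 1, and
+ the S and E entries only when Y is 1, since those bits are
+ unspecified otherwise.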
+
+ The values of the fields in PACSI NAL unit MUST be set as follows.
+
+ o The F bit MUST be set to 1 if the F bit in at least one of the
+ remaining NAL units in the aggregation packet is equal to 1 (when
+ the PACSI NAL unit is included in an aggregation packet) or if the
+ next non-PACSI NAL unit in transmission order has the F bit equal
+ to 1 (when the PACSI NAL unit is included in a single NAL unit
+ packet). Otherwise, the F bit MUST be set to 0.
+
+ o The NRI field MUST be set to the highest value of NRI field among
+ all the remaining NAL units in the aggregation packet (when the
+ PACSI NAL unit is included in an aggregation packet) or the value
+ of the NRI field of the next non-PACSI NAL unit in transmission
+ order (when the PACSI NAL unit is included in a single NAL unit
+ packet).
+
+ o The Type field MUST be set to 30.
+
+ o The R bit MUST be set to 1. Receivers MUST ignore the value of R.
+
+ o The I bit MUST be set to 1 if the I bit of at least one of the
+ remaining NAL units in the aggregation packet is equal to 1 (when
+ the PACSI NAL unit is included in an aggregation packet) or if the
+ I bit of the next non-PACSI NAL unit in transmission order is
+ equal to 1 (when the PACSI NAL unit is included in a single NAL
+ unit packet). Otherwise, the I bit MUST be set to 0.
+
+ o The PRID field MUST be set to the lowest value of the PRID values
+ of the remaining NAL units in the aggregation packet (when the
+ PACSI NAL unit is included in an aggregation packet) or the PRID
+ value of the next non-PACSI NAL unit in transmission order (when
+ the PACSI NAL unit is included in a single NAL unit packet).
+
+ o The N bit MUST be set to 1 if the N bit of all the remaining NAL
+ units in the aggregation packet is equal to 1 (when the PACSI NAL
+ unit is included in an aggregation packet) or if the N bit of the
+ next non-PACSI NAL unit in transmission order is equal to 1 (when
+ the PACSI NAL unit is included in a single NAL unit packet).
+ Otherwise, the N bit MUST be set to 0.
+
+ o The DID field MUST be set to the lowest value of the DID values of
+ the remaining NAL units in the aggregation packet (when the PACSI
+ NAL unit is included in an aggregation packet) or the DID value of
+ the next non-PACSI NAL unit in transmission order (when the PACSI
+ NAL unit is included in a single NAL unit packet).
+
+ o The QID field MUST be set to the lowest value of the QID values of
+ the remaining NAL units with the lowest value of DID in the
+ aggregation packet (when the PACSI NAL unit is included in an
+ aggregation packet) or the QID value of the next non-PACSI NAL
+ unit in transmission order (when the PACSI NAL unit is included in
+ a single NAL unit packet).
+
+ o The TID field MUST be set to the lowest value of the TID values of
+ the remaining NAL units with the lowest value of DID in the
+ aggregation packet (when the PACSI NAL unit is included in an
+ aggregation packet) or the TID value of the next non-PACSI NAL
+ unit in transmission order (when the PACSI NAL unit is included in
+ a single NAL unit packet).
+
+ o The U bit MUST be set to 1 if the U bit of at least one of the
+ remaining NAL units in the aggregation packet is equal to 1 (when
+ the PACSI NAL unit is included in an aggregation packet) or if the
+ U bit of the next non-PACSI NAL unit in transmission order is
+ equal to 1 (when the PACSI NAL unit is included in a single NAL
+ unit packet). Otherwise, the U bit MUST be set to 0.
+
+ o The D bit MUST be set to 1 if the D value of all the remaining NAL
+ units in the aggregation packet is equal to 1 (when the PACSI NAL
+ unit is included in an aggregation packet) or if the D bit of the
+ next non-PACSI NAL unit in transmission order is equal to 1 (when
+ the PACSI NAL unit is included in a single NAL unit packet).
+ Otherwise, the D bit MUST be set to 0.
+
+ o The O bit MUST be set to 1 if the O bit of at least one of the
+ remaining NAL units in the aggregation packet is equal to 1 (when
+ the PACSI NAL unit is included in an aggregation packet) or if the
+ O bit of the next non-PACSI NAL unit in transmission order is
+ equal to 1 (when the PACSI NAL unit is included in a single NAL
+ unit packet). Otherwise, the O bit MUST be set to 0.
+
+ o The RR field MUST be set to "11" (in binary form). Receivers MUST
+ ignore the value of RR.
+
+ o If the X bit is equal to 1, the bits A, P, and C are specified as
+ below. Otherwise, the bits A, P, and C are unspecified, and
+ receivers MUST ignore the values of these bits. The X bit SHOULD
+ be identical for all the PACSI NAL units in all the RTP sessions
+ carrying the same SVC bitstream.
+
+ o If the Y bit is equal to 1, the OPTIONAL fields TL0PICIDX and
+ IDRPICID MUST be present and specified as below, and the bits S
+ and E are also specified as below. Otherwise, the fields
+ TL0PICIDX and IDRPICID MUST NOT be present, while the S and E bits
+ are unspecified and receivers MUST ignore the values of these
+ bits. The Y bit MUST be identical for all the PACSI NAL units in
+ all the RTP sessions carrying the same SVC bitstream. The Y bit
+ MUST be equal to 0 when the parameter packetization-mode is equal
+ to 2.
+
+ o If the T bit is equal to 1, the OPTIONAL field DONC MUST be
+ present and specified as below. Otherwise, the field DONC MUST
+ NOT be present. The field T MUST be equal to 0 if the PACSI NAL
+ unit is contained in an STAP-B, MTAP16, MTAP24, or NI-MTAP.
+
+ o The A bit MUST be set to 1 if at least one of the remaining NAL
+ units in the aggregation packet belongs to an anchor layer
+ representation (when the PACSI NAL unit is included in an
+ aggregation packet) or if the next non-PACSI NAL unit in
+ transmission order belongs to an anchor layer representation (when
+ the PACSI NAL unit is included in a single NAL unit packet).
+ Otherwise, the A bit MUST be set to 0.
+
+ Informative note: The A bit indicates whether CGS or spatial layer
+ switching at a non-IDR layer representation (a layer
+ representation with nal_unit_type not equal to 5 and idr_flag not
+ equal to 1) can be performed. With some picture coding structures
+ a non-IDR intra layer representation can be used for random
+ access. Compared to using only IDR layer representations, higher
+ coding efficiency can be achieved. The H.264/AVC or SVC solution
+ to indicate the random accessibility of a non-IDR intra layer
+ representation is using a recovery point SEI message. The A bit
+ offers direct access to this information, without having to parse
+ the recovery point SEI message, which may be buried deeply in an
+ SEI NAL unit. Furthermore, the SEI message may or may not be
+ present in the bitstream.
+
+ o The P bit MUST be set to 1 if all the remaining NAL units in the
+ aggregation packet have redundant_pic_cnt greater than 0 (when the
+ PACSI NAL unit is included in an aggregation packet) or the next
+ non-PACSI NAL unit in transmission order has redundant_pic_cnt
+ greater than 0 (when the PACSI NAL unit is included in a single
+ NAL unit packet). Otherwise, the P bit MUST be set to 0.
+
+ Informative note: The P bit indicates whether a packet can be
+ discarded because it contains only redundant slice NAL units.
+ Without this bit, the corresponding information can be obtained
+ from the syntax element redundant_pic_cnt, which is contained in
+ the variable-length coded slice header.
+
+ o The C bit MUST be set to 1 if at least one of the remaining NAL
+ units in the aggregation packet belongs to an intra layer
+ representation (when the PACSI NAL unit is included in an
+ aggregation packet) or if the next non-PACSI NAL unit in
+ transmission order belongs to an intra layer representation (when
+ the PACSI NAL unit is included in a single NAL unit packet).
+ Otherwise, the C bit MUST be set to 0.
+
+ Informative note: The C bit indicates whether a packet contains
+ intra slices, which may be the only packets to be forwarded, e.g.,
+ when the network conditions are particularly adverse.
+
+ o The S bit MUST be set to 1 if the first NAL unit following the
+ PACSI NAL unit in an aggregation packet is the first VCL NAL unit,
+ in decoding order, of a layer representation (when the PACSI NAL
+ unit is included in an aggregation packet) or if the next non-
+ PACSI NAL unit in transmission order is the first VCL NAL unit, in
+ decoding order, of a layer representation (when the PACSI NAL unit
+ is included in a single NAL unit packet). Otherwise, the S bit
+ MUST be set to 0.
+
+ o The E bit MUST be set to 1 if the last NAL unit following the
+ PACSI NAL unit in an aggregation packet is the last VCL NAL unit,
+ in decoding order, of a layer representation (when the PACSI NAL
+ unit is included in an aggregation packet) or if the next non-
+ PACSI NAL unit in transmission order is the last VCL NAL unit, in
+ decoding order, of a layer representation (when the PACSI NAL unit
+ is included in a single NAL unit packet). Otherwise, the E bit
+ MUST be set to 0.
+
+ Informative note: In an aggregation packet it is always possible
+ to detect the beginning or end of a layer representation by
+ detecting changes in the values of dependency_id, quality_id, and
+ temporal_id in NAL unit headers, except for the first and last
+ NAL units of a packet. The S or E bits are used to provide this
+ information, for both single NAL unit and aggregation packets, so
+ that previous or following packets do not have to be examined.
+ This enables MANEs to detect slice loss and take proper action
+ such as requesting a retransmission as soon as possible, as well
+ as to allow efficient playout buffer handling similarly to the M
+ bit present in the RTP header. The M bit in the RTP header still
+ indicates the end of an access unit, not the end of a layer
+ representation.
+
+ o When present, the TL0PICIDX field MUST be set equal to
+ tl0_dep_rep_idx as specified in Annex G of [H.264] for the layer
+ representation containing the first NAL unit following the PACSI
+ NAL unit in the aggregation packet (when the PACSI NAL unit is
+ included in an aggregation packet) or containing the next non-
+ PACSI NAL unit in transmission order (when the PACSI NAL unit is
+ included in a single NAL unit packet).
+
+ o When present, the IDRPICID field MUST be set equal to
+ effective_idr_pic_id as specified in Annex G of [H.264] for the
+ layer representation containing the first NAL unit following the
+ PACSI NAL unit in the aggregation packet (when the PACSI NAL unit
+ is included in an aggregation packet) or containing the next non-
+ PACSI NAL unit in transmission order (when the PACSI NAL unit is
+ included in a single NAL unit packet).
+
+ Informative note: The TL0PICIDX and IDRPICID fields enable the
+ detection of the loss of layer representations in the most
+ important temporal layer (with temporal_id equal to 0) by
+ receivers as well as MANEs. SVC provides a solution that uses SEI
+ messages, which are harder to parse and may or may not be present
+ in the bitstream. When the PACSI NAL unit is part of an NI-MTAP
+ packet, it is possible to infer the correct values of
+ tl0_dep_rep_idx and idr_pic_id for all layer representations
+ contained in the NI-MTAP by following the rules that specify how
+ these parameters are set as given in Annex G of [H.264] and by
+ detecting the different layer representations contained in the NI-
+ MTAP packet by detecting changes in the values of dependency_id,
+ quality_id, and temporal_id in the NAL unit headers as well as
+ using the S and E flags. The only exception is if NAL units of an
+ IDR picture are present in the NI-MTAP in a position other than
+ the first NAL unit following the PACSI NAL unit, in which case the
+ value of idr_pic_id cannot be inferred. In this case the NAL unit
+ has to be partially parsed to obtain the idr_pic_id. Note that,
+ due to the large size of IDR pictures, their inclusion in an NI-
+ MTAP, and especially in a position other than the first NAL unit
+ following the PACSI NAL unit, may be neither practical nor useful.
+
+ o When present, the field DONC indicates the cross-session decoding
+ order number (CS-DON) for the first of the remaining NAL units in
+ the aggregation packet (when the PACSI NAL unit is included in an
+ aggregation packet) or the CS-DON of the next non-PACSI NAL unit
+ in transmission order (when the PACSI NAL unit is included in a
+ single NAL unit packet). CS-DON is further discussed in Section
+ 4.11.
+
+ The PACSI NAL unit MAY include a subset of the SEI NAL units
+ associated with the access unit to which the first non-PACSI NAL unit
+ in the aggregation packet belongs, and MUST NOT contain SEI NAL units
+ associated with any other access unit.
+
+ Informative note: In H.264/AVC and SVC, within each access unit,
+ SEI NAL units must appear before any VCL NAL unit in decoding
+ order. Therefore, without using PACSI NAL units, SEI messages are
+ typically only conveyed in the first of the packets carrying an
+ access unit. Senders may repeat SEI NAL units in PACSI NAL units,
+ so that they are repeated in more than one packet and thus
+ increase robustness against packet losses. Receivers may use the
+ repeated SEI messages in place of missing SEI messages.
+
+ For a PACSI NAL unit included in an aggregation packet, an SEI
+ message SHOULD NOT be included in the PACSI NAL unit and also
+ included in one of the remaining NAL units contained in the same
+ aggregation packet.
+
+4.10. Empty NAL unit
+
+ An empty NAL unit MAY be included in a single NAL unit packet, an
+ STAP-A or an NI-MTAP packet. Empty NAL units MUST have an RTP
+ timestamp (when transported in a single NAL unit packet) or NALU-
+ time (when transported in an aggregation packet) that is associated
+ with an access unit for which there exists at least one NAL unit of
+ type 1, 5, or 20. When MST is used, the type 1, 5, or 20 NAL unit
+ may be in a different RTP session. Empty NAL units may be used in
+ the decoding order recovery process of the NI-T mode as described in
+ Section 5.2.1.
+
+ The packet structure is shown in the following figure.
+
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |F|NRI|  Type   | Subtype |J|K|L|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ Figure 4. Empty NAL unit structure.
+
+ The fields MUST be set as follows:
+
+ F MUST be equal to 0
+ NRI MUST be equal to 3
+ Type MUST be equal to 31
+ Subtype MUST be equal to 1
+ J MUST be equal to 0
+ K MUST be equal to 0
+ L MUST be equal to 0
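+
+ Informative note: With the field values above, the empty NAL unit
+ occupies exactly two octets. The following Python sketch packs
+ and checks those octets. It is an illustration only and not part
+ of the payload format; the function names are hypothetical.

```python
def build_empty_nal_unit() -> bytes:
    """Return the two-octet empty NAL unit laid out as in Figure 4."""
    # First octet: F (1 bit), NRI (2 bits), Type (5 bits).
    f, nri, nal_type = 0, 3, 31
    # Second octet: Subtype (5 bits), then the J, K, and L bits.
    subtype, j, k, l = 1, 0, 0, 0
    first = (f << 7) | (nri << 5) | nal_type
    second = (subtype << 3) | (j << 2) | (k << 1) | l
    return bytes([first, second])


def is_empty_nal_unit(nal: bytes) -> bool:
    """True if `nal` carries exactly the fixed field values listed above."""
    return nal == build_empty_nal_unit()
```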
+
+4.11. Decoding Order Number (DON)
+
+ The DON concept is introduced in [RFC6184] and is used to recover the
+ decoding order when interleaving is used within a single session.
+ Section 5.5 of [RFC6184] applies when using SST.
+
+ When using MST, it is necessary to recover the decoding order across
+ the various RTP sessions regardless of whether interleaving is used.
+ In addition to the timestamp mechanism described later, the CS-DON
+ mechanism is an extension of the DON facility that can be used for
+ this purpose, and is defined in the following section.
+
+4.11.1. Cross-Session DON (CS-DON) for Multi-Session Transmission
+
+ The cross-session decoding order number (CS-DON) is a number that
+ indicates the decoding order of NAL units across all RTP sessions
+ involved in MST. It is similar to the DON concept in [RFC6184], but
+ contrary to [RFC6184] where the DON was used only for interleaved
+ packetization, in this memo it is used not only in the interleaved
+ MST mode (I-C) but also in two of the non-interleaved MST modes (NI-C
+ and NI-TC).
+
+ When the NI-C or NI-TC MST modes are in use, the packetization of
+ each session MUST be as specified in Section 5.2.2. In PACSI NAL
+ units the CS-DON value is explicitly coded in the field DONC. For
+ non-PACSI NAL units the CS-DON value is derived as follows. Let SN
+ indicate the RTP sequence number of a packet.
+
+ o For each non-PACSI NAL unit carried in a session using the single
+ NAL unit session packetization mode, the CS-DON value of the NAL
+ unit is equal to (DONC_prev_PACSI + SN_diff - 1) % 65536, wherein
+ "%" is the modulo operation, DONC_prev_PACSI is the DONC value of
+ the previous PACSI NAL unit with the same NALU-time as the current
+ NAL unit, and SN_diff is calculated as follows:
+
+ if SN1 > SN2, SN_diff = SN1 - SN2
+ else SN_diff = SN1 + 65536 - SN2
+
+ where SN1 and SN2 are the SNs of the current NAL unit and the
+ previous PACSI NAL unit with the same NALU-time, respectively.
+
+ o For non-PACSI NAL units carried in a session using the non-
+ interleaved session packetization mode, the CS-DON value of each
+ non-PACSI NAL unit is derived as follows.
+
+ For a non-PACSI NAL unit in a single NAL unit packet, the
+ following applies.
+
+ If the previous PACSI NAL unit is contained in a single NAL
+ unit packet, the CS-DON value of the NAL unit is calculated
+ as above;
+
+ otherwise (the previous PACSI NAL unit is contained in an
+ STAP-A packet), the CS-DON value of the NAL unit is
+ calculated as above, with DONC_prev_PACSI being replaced by
+ the CS-DON value of the previous non-PACSI NAL unit in
+ decoding order (i.e., the CS-DON value of the last NAL unit
+ of the STAP-A packet).
+
+ For a non-PACSI NAL unit in an STAP-A packet, the following
+ applies.
+
+ If the non-PACSI NAL unit is the first non-PACSI NAL unit in
+ the STAP-A packet, the CS-DON value of the NAL unit is equal
+ to DONC of the PACSI NAL unit in the STAP-A packet;
+
+ otherwise (the non-PACSI NAL unit is not the first non-
+ PACSI NAL unit in the STAP-A packet), the CS-DON value of
+ the NAL unit is equal to: (the CS-DON value of the previous
+ non-PACSI NAL unit in decoding order + 1) % 65536, wherein
+ "%" is the modulo operation.
+
+ For a non-PACSI NAL unit in a number of FU-A packets, the CS-
+ DON value of the NAL unit is calculated the same way as when
+ the single NAL unit session packetization mode is in use, with
+ SN1 being the SN value of the first FU-A packet.
+
+ For a non-PACSI NAL unit in an NI-MTAP packet, the CS-DON value
+ is equal to the value of the DON field of the non-interleaved
+ multi-time aggregation unit.
+
+ When the I-C MST packetization mode is in use, the DON values derived
+ according to [RFC6184] for all the NAL units in each of the RTP
+ sessions MUST indicate CS-DON values.
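+
+ Informative note: The derivation of CS-DON values for non-PACSI
+ NAL units in single NAL unit packets (or in the first FU-A packet
+ of a fragmented NAL unit) and in STAP-A packets can be sketched
+ in Python as follows. This is a minimal illustration, not a
+ normative procedure. The function names are hypothetical, and
+ the sequence number distance is computed modulo 65536, which is
+ assumed to be the intent of SN_diff above.

```python
MOD = 1 << 16  # RTP sequence numbers and CS-DON values are 16 bits wide


def sn_diff(sn_current: int, sn_prev_pacsi: int) -> int:
    """Forward distance from the PACSI packet to the current packet (mod 65536)."""
    return (sn_current - sn_prev_pacsi) % MOD


def cs_don_single_nal(donc_prev_pacsi: int, sn_current: int, sn_prev_pacsi: int) -> int:
    """CS-DON = (DONC_prev_PACSI + SN_diff - 1) % 65536."""
    return (donc_prev_pacsi + sn_diff(sn_current, sn_prev_pacsi) - 1) % MOD


def cs_don_stap_a(donc: int, count: int) -> list:
    """CS-DON values of the `count` non-PACSI NAL units that follow a PACSI
    NAL unit carrying `donc` in the same STAP-A packet."""
    return [(donc + i) % MOD for i in range(count)]
```

+ With these definitions, the packet that directly follows a PACSI
+ packet inherits the DONC value unchanged, and both the sequence
+ number and the CS-DON wrap around at 65536.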
+
+5. Packetization Rules
+
+ Section 6 of [RFC6184] applies in this memo, with the following
+ additions.
+
+5.1. Packetization Rules for Single-Session Transmission
+
+ All receivers MUST support the single NAL unit packetization mode to
+ provide backward compatibility to endpoints supporting only the
+ single NAL unit mode of [RFC6184]. However, the use of single NAL
+ unit packetization mode (packetization-mode equal to 0) SHOULD be
+ avoided whenever possible, because encapsulating NAL units of small
+ sizes in their own packets (e.g., small NAL units containing
+ parameter sets, prefix NAL units, or SEI messages) is less efficient
+ due to the packet header overhead.
+
+ All receivers MUST support the non-interleaved mode.
+
+ Informative note: The non-interleaved mode of [RFC6184] does allow
+ an application to encapsulate a single NAL unit in a single RTP
+ packet. Historically, the single NAL unit mode has been included
+ in [RFC6184] only for compatibility with ITU-T Rec. H.241 Annex A
+ [H.241]. There is no point in carrying this historic ballast
+ towards a new application space such as the one provided with SVC.
+ The implementation complexity increase for supporting the
+ additional mechanisms of the non-interleaved mode (namely, STAP-A
+ and FU-A) is minor, whereas the benefits are significant. As a
+ result, the support of STAP-A and FU-A is required. Additionally,
+ support for two of the three NAL unit types defined in this memo,
+ namely, empty NAL units and NI-MTAP is needed, as specified in
+ Section 4.5.1.
+
+ A NAL unit of small size SHOULD be encapsulated in an aggregation
+ packet together with one or more other NAL units. For example, non-
+ VCL NAL units such as access unit delimiters, parameter sets, or SEI
+ NAL units are typically small.
+
+ A prefix NAL unit and the NAL unit with which it is associated, and
+ which follows the prefix NAL unit in decoding order, SHOULD be
+ included in the same aggregation packet whenever an aggregation
+ packet is used for the associated NAL unit, unless this would violate
+ session MTU constraints or if fragmentation units are used for the
+ associated NAL unit.
+
+ Informative note: Although the prefix NAL unit is ignored by an
+ H.264/AVC decoder, it is necessary in the SVC decoding process.
+
+ Given the small size of the prefix NAL unit, it is best if it is
+ transported in the same RTP packet as its associated NAL unit.
+
+ When only an H.264/AVC compatible subset of the SVC base layer is
+ transmitted in an RTP session, the subset MUST be encapsulated
+ according to [RFC6184]. This way, an [RFC6184] receiver will be able
+ to receive the H.264/AVC compatible bitstream subset.
+
+ When a set of layers including one or more SVC enhancement layers is
+ transmitted in an RTP session, the set SHOULD be carried in one RTP
+ stream that SHOULD be encapsulated according to this memo.
+
+5.2. Packetization Rules for Multi-Session Transmission
+
+ When MST is used, the packetization rules specified in Section 5.1
+ still apply. In addition, the following packetization rules MUST be
+ followed, to ensure that decoding order of NAL units carried in the
+ sessions can be correctly recovered for each of the MST packetization
+ modes using the de-packetization process specified in Section 6.2.
+
+ The NI-T and NI-TC modes both use timestamps to recover the decoding
+ order. In order to be able to do so, it is necessary for the RTP
+ packet stream to contain data for all sampling instances of a given
+ RTP session in all enhancement RTP sessions that depend on the given
+ RTP session. The NI-C and I-C modes do not have this limitation, and
+ use the CS-DON values as a means to explicitly indicate decoding
+ order, either directly coded in PACSI NAL units, or inferred from
+ them using the packetization rules. It is noted that the NI-TC mode
+ offers both alternatives and it is up to the receiver to select which
+ one to use.
+
+5.2.1. NI-T/NI-TC Packetization Rules
+
+ When using the NI-T mode and a PACSI NAL unit is present, the T bit
+ MUST be equal to 0, i.e., the DONC field MUST NOT be present.
+
+ When using the NI-T mode, the optional parameters sprop-mst-remux-
+ buf-size, sprop-remux-buf-req, remux-buf-cap, sprop-remux-init-buf-
+ time, and sprop-mst-max-don-diff MUST NOT be present.
+
+ When the NI-T or NI-TC MST mode is in use, the following applies.
+
+ If one or more NAL units of an access unit of sampling time instance
+ t are present in RTP session A, then one or more NAL units of the same
+ access unit MUST be present in any enhancement RTP session that
+ depends on RTP session A.
+
+ Informative note: The mapping between RTP and NTP format
+ timestamps is conveyed in RTCP SR packets. In addition, the
+ mechanisms for faster media timestamp synchronization discussed in
+ [RFC6051] may be used to speed up the acquisition of the RTP-to-
+ wall-clock mapping.
+
+ Informative note: The rule above may require the insertion of NAL
+ units, typically when temporal scalability is used, i.e., an
+ enhancement RTP session does not contain any NAL units for an
+ access unit with a particular NTP timestamp (media timestamp),
+ which, however, is present in a lower enhancement RTP session or
+ the base RTP session. There are two ways to insert additional NAL
+ units in order to satisfy this rule:
+
+ - One option for adding additional NAL units is to use empty NAL
+ units (defined in Section 4.10), which can be used by the
+ process described in Section 6.2.1 for the access unit
+ reordering process.
+
+ - Additional NAL units may also be added by the encoder itself,
+ for example, by transmitting coded data that simply instruct the
+ decoder to repeat the previous picture. This option, however,
+ may be difficult to use with pre-encoded content.
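+
+ Informative note: A sender or MANE can find the access units for
+ which NAL units have to be inserted by comparing the media
+ timestamps present in each session with those present in the
+ sessions it depends on. The Python sketch below illustrates
+ this. The data layout (a mapping from session name to a set of
+ media timestamps, and a mapping from each enhancement session to
+ all sessions it depends on, directly or indirectly) and the
+ function name are assumptions made for this example.

```python
def missing_access_units(timestamps: dict, depends_on: dict) -> dict:
    """Return, per enhancement session, the media timestamps that occur in a
    lower session but not in the enhancement session itself."""
    missing = {}
    for session, lower_sessions in depends_on.items():
        required = set()
        for lower in lower_sessions:
            required |= timestamps.get(lower, set())
        gaps = required - timestamps.get(session, set())
        if gaps:
            # Each gap needs, e.g., an empty NAL unit with a matching timestamp.
            missing[session] = sorted(gaps)
    return missing
```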
+
+ If a packet must be inserted in order to satisfy the above rule,
+ e.g., in case of a MANE generating multiple RTP streams out of a
+ single RTP stream, the inserted packet must have an RTP timestamp
+ that maps to the same wall-clock time (in NTP format) as the one of
+ the RTP timestamp of any packet of the access unit present in any
+ lower enhancement RTP session or the base RTP session. This is easy
+ to accomplish if the NAL unit or the packet can be inserted at the
+ time of the RTP stream generation, since the media timestamp (NTP
+ timestamp) must be the same for the inserted packet and the packet of
+ the corresponding access unit. If there is no knowledge of the media
+ time at RTP stream generation or if the RTP streams are not generated
+ at the same instant, the insertion can also be performed later in the
+ transmission process. In this case the RTP timestamp of the inserted
+ packet can be calculated as follows.
+
+ Assume that a packet A2 of an access unit with RTP timestamp TS_A2 is
+ present in base RTP session A, and that no packet of that access unit
+ is present in enhancement RTP session B, as shown in Figure 5. Thus,
+ a packet B2 must be inserted into session B following the rule above.
+ The most recent RTCP sender report in session A carries NTP timestamp
+ NTP_A and the RTP timestamp TS_A. The sender report in session B
+ that has a lower NTP timestamp than NTP_A carries NTP timestamp NTP_B
+ and the RTP timestamp TS_B.
+
+ RTP session B:..B0........B1........(B2)......................
+
+ RTCP session B:.....SR(NTP_B,TS_B).............................
+
+ RTP session A:..A0........A1........A2........................
+
+ RTCP session A:..................SR(NTP_A,TS_A)................
+
+ -----------------|--x------|-----x---|------------------------>
+                                                        NTP time
+ --------------------+<---------->+<->+------------------------>
+                           t1       t2            RTP TS(B) time
+
+ Figure 5. Example calculation of RTP timestamp for packet
+ insertion in an enhancement layer RTP session
+
+ The vertical bars ("|") in the NTP time line in the figure above
+ indicate that access unit data is present in at least one of the
+ sessions. The "x" marks indicate the times of the sender reports.
+ The RTP timestamp time line for session B, shown right below the NTP
+ time line, indicates two time segments, t1 and t2. t1 is the time
+ difference between the sender reports of the two sessions,
+ expressed in RTP timestamp clock ticks, and t2 is the time difference
+ from the session A sender report to the A2 packet, again expressed in
+ RTP timestamp clock ticks. The sum of these differences is added to
+ the RTP timestamp of the sender report from session B in order to
+ derive the correct RTP timestamp for the inserted packet B2. In
+ other words:
+
+ TS_B2 = TS_B + t1 + t2
+
+ Let toRTP() be a function that calculates the RTP time difference (in
+ clock ticks of the used clock) given an NTP timestamp difference, and
+ effRTPdiff() be a function that calculates the effective difference
+ between two timestamps, including wraparounds:
+
+ effRTPdiff( ts1, ts2 ):
+
+    if( ts1 >= ts2 ) then
+       effRTPdiff := ts1 - ts2
+    else
+       effRTPdiff := (4294967296 + ts1) - ts2
+
+ We have:
+
+ t1 = toRTP(NTP_A - NTP_B) and t2 = effRTPdiff(TS_A2, TS_A)
+
+ Hence, the RTP timestamp TS_B2 for the inserted packet B2 can be
+ calculated as follows.
+
+ TS_B2 = TS_B + toRTP(NTP_A - NTP_B) + effRTPdiff(TS_A2, TS_A)
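+
+ Informative note: The calculation above can be sketched in Python
+ as follows. This is an illustration under stated assumptions:
+ NTP timestamps are represented as seconds, the RTP clock rate
+ defaults to 90000 Hz, the function names are hypothetical, and
+ wraparound is handled with modulo arithmetic so that
+ eff_rtp_diff(ts1, ts2) is the number of ticks elapsed from ts2 to
+ ts1. The final reduction of TS_B2 modulo 2^32 is added here
+ because RTP timestamps are 32 bits wide.

```python
RTP_TS_MOD = 1 << 32  # RTP timestamps are 32 bits wide and wrap around


def to_rtp(ntp_diff_seconds: float, clock_rate: int = 90000) -> int:
    """Convert an NTP time difference in seconds to RTP clock ticks."""
    return round(ntp_diff_seconds * clock_rate)


def eff_rtp_diff(ts1: int, ts2: int) -> int:
    """Ticks elapsed from ts2 to ts1, allowing for one wraparound."""
    return (ts1 - ts2) % RTP_TS_MOD


def inserted_timestamp(ts_b: int, ntp_a: float, ntp_b: float,
                       ts_a2: int, ts_a: int, clock_rate: int = 90000) -> int:
    """TS_B2 = TS_B + toRTP(NTP_A - NTP_B) + effRTPdiff(TS_A2, TS_A)."""
    t1 = to_rtp(ntp_a - ntp_b, clock_rate)
    t2 = eff_rtp_diff(ts_a2, ts_a)
    return (ts_b + t1 + t2) % RTP_TS_MOD
```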
+
+5.2.2. NI-C/NI-TC Packetization Rules
+
+ When the NI-C or NI-TC MST mode is in use, the following applies for
+ each of the RTP sessions.
+
+ o For each single NAL unit packet containing a non-PACSI NAL unit,
+ the previous packet, if present, MUST have the same RTP timestamp
+ as the single NAL unit packet, and the following applies.
+
+ o If the NALU-time of the non-PACSI NAL unit is not equal to the
+ NALU-time of the previous non-PACSI NAL unit in decoding order,
+ the previous packet MUST contain a PACSI NAL unit containing
+ the DONC field.
+
+ o In an STAP-A packet the first NAL unit in the STAP-A packet MUST
+ be a PACSI NAL unit containing the DONC field.
+
+ o For an FU-A packet the previous packet MUST have the same RTP
+ timestamp as the FU-A packet, and the following applies.
+
+ o If the FU-A packet is the start of the fragmented NAL unit, the
+ following applies.
+
+ o If the NALU-time of the fragmented NAL unit is not equal to
+ the NALU-time of the previous non-PACSI NAL unit in decoding
+ order, the previous packet MUST contain a PACSI NAL unit
+ containing the DONC field;
+
+ o Otherwise, (the NALU-time of the fragmented NAL unit is
+ equal to the NALU-time of the previous non-PACSI NAL unit in
+ decoding order), the previous packet MAY contain a PACSI NAL
+ unit containing the DONC field.
+
+ o Otherwise, if the FU-A packet is the end of the fragmented NAL
+ unit, the following applies.
+
+ o If the next non-PACSI NAL unit in decoding order has NALU-
+ time equal to the NALU-time of the fragmented NAL unit, and
+ is carried in a number of FU-A packets or a single NAL unit
+ packet, the next packet MUST be a single NAL unit packet
+ containing a PACSI NAL unit containing the DONC field.
+
+ o Otherwise (the FU-A packet is neither the start nor the end
+ of the fragmented NAL unit), the previous packet MUST be a
+ FU-A packet.
+
+ o For each single NAL unit packet containing a PACSI NAL unit, if
+ present, the PACSI NAL unit MUST contain the DONC field.
+
+ o When the optional media type parameter sprop-mst-csdon-always-
+ present is equal to 1, the session packetization mode in use MUST
+ be the non-interleaved mode, and only STAP-A and NI-MTAP packets
+ can be used.
+
+5.2.3. I-C Packetization Rules
+
+ When the I-C MST packetization mode is in use, the following applies.
+
+ o When a PACSI NAL unit is present, the T bit MUST be equal to 0,
+ i.e., the DONC field is not present, and the Y bit MUST be equal
+ to 0, i.e., the TL0PICIDX and IDRPICID are not present.
+
+5.2.4. Packetization Rules for Non-VCL NAL Units
+
+ NAL units that do not directly encode video slices are known in H.264
+ as non-VCL NAL units. Non-VCL units that are only used by, or only
+ relevant to, enhancement RTP sessions SHOULD be sent in the lowest
+ session to which they are relevant.
+
+ Some senders, however, such as those sending pre-encoded data, may be
+ unable to easily determine which non-VCL units are relevant to which
+ session. Thus, non-VCL NAL units MAY, instead, be sent in a session
+ on which the session using these non-VCL NAL units depends (e.g., the
+ base RTP session).
+
+ If a non-VCL unit is relevant to more than one RTP session, none
+ of which depends on the other(s), the NAL unit MAY be sent in another
+ session on which all these sessions depend.
+
+5.2.5. Packetization Rules for Prefix NAL Units
+
+ Section 5.1 of this memo applies, with the following addition. If
+ the base layer is sent in a base RTP session using [RFC6184], prefix
+ NAL units MAY be sent in the lowest enhancement RTP session rather
+ than in the base RTP session.
+
+6. De-Packetization Process
+
+6.1. De-Packetization Process for Single-Session Transmission
+
+ For single-session transmission, where a single RTP session is used,
+ the de-packetization process specified in Section 7 of [RFC6184]
+ applies.
+
+6.2. De-Packetization Process for Multi-Session Transmission
+
+ For multi-session transmission, where more than one RTP session is
+ used to receive data from the same SVC bitstream, the de-
+ packetization process is specified as follows.
+
+ As for a single RTP session, the general concept behind the de-
+ packetization process is to reorder NAL units from transmission order
+ to the NAL unit decoding order.
+
+ The sessions to be received MUST be identified by mechanisms
+ specified in Section 7.2.3. An enhancement RTP session typically
+ contains an RTP stream that depends on at least one other RTP
+ session, as indicated by mechanisms defined in Section 7.2.3. A
+ lower RTP session to an enhancement RTP session is an RTP session on
+ which the enhancement RTP session depends. The lowest RTP session
+ for a receiver is the base RTP session, which does not depend on any
+ other RTP session received by the receiver. The highest RTP session
+ for a receiver is the RTP session on which no other RTP session
+ received by the receiver depends.
+
+ For each of the RTP sessions, the RTP reception process as specified
+ in RFC 3550 is applied. Then the received packets are passed into
+ the payload de-packetization process as defined in this memo.
+
+ The decoding order of the NAL units carried in all the associated RTP
+ sessions is then recovered by applying one of the following
+ subsections, depending on which of the MST packetization modes is in
+ use.
+
+6.2.1. Decoding Order Recovery for the NI-T and NI-TC Modes
+
+ The following process MUST be applied when the NI-T packetization
+ mode is in use. The following process MAY be applied when the NI-TC
+ packetization mode is in use.
+
+ The process is based on RTP session dependency signaling, RTP
+ sequence numbers, and timestamps.
+
+ The decoding order of NAL units within an RTP packet stream in an RTP
+ session is given by the ordering of sequence numbers SN of the RTP
+ packets that contain the NAL units, and the order of appearance of
+ NAL units within a packet.
+
+ Timing information according to the media timestamp TS, i.e., the NTP
+ timestamp as derived from the RTP timestamp of an RTP packet, is
+ associated with all NAL units contained in the same RTP packet
+ received in an RTP session.
+
+ For NI-MTAP packets the NALU-time is derived for each contained NAL
+ unit by using the "TS offset" value in the NI-MTAP packet as defined
+ in Section 4.10, and is used instead of the RTP packet timestamp to
+ derive the media timestamp, e.g., using the NTP wall clock as
+ provided via RTCP sender reports. NAL units contained in
+ fragmentation packets are handled as defragmented, entire NAL units
+ with their own media timestamps. All NAL units associated with the
+ same value of media timestamp TS are part of the same access unit
+ AU(TS). Any empty NAL units SHOULD be kept as, effectively, access
+ unit indicators in the reordering process. Empty NAL units and PACSI
+ NAL units SHOULD be removed before passing access unit data to the
+ decoder.
+
+ Informative note: These empty NAL units are used to associate NAL
+ units present in other RTP sessions with RTP sessions not
+ containing any data for an access unit of a particular time
+ instance. They act as access unit indicators in sessions that
+ would otherwise contain no data for the particular access unit.
+ The presence of these NAL units is ensured by the packetization
+ rules in Section 5.2.1.
+
+ It is assumed that the receiver has established an operation point
+ (DID, QID, and TID values), and has identified the highest
+ enhancement RTP session for this operation point. The decoding order
+ of NAL units from multiple RTP streams in multiple RTP sessions MUST
+ be recovered into a single sequence of NAL units, grouped into access
+ units, by performing any process equivalent to the following steps.
+ The general process is described in Section 4.2 of [RFC6051]. For
+ convenience the instructions of [RFC6051] are repeated and applied to
+ NAL units rather than to full RTP packets. Additionally, SVC-
+ specific extensions to the procedure in Section 4.2. of [RFC6051]
+ are presented in the following list:
+
+ o The process should be started with the NAL units received in
+ the highest RTP session with the first media timestamp TS (in
+ NTP format) available in the session's (de-jittering) buffer.
+ It is assumed that packets in the de-jittering buffer are
+ already stored in RTP sequence number order.
+
+ o Collect all NAL units associated with the same value of media
+ timestamp TS, starting from the highest RTP session, from all
+ the (de-jittering) buffers of the received RTP sessions. The
+ collected NAL units will be those associated with the access
+ unit AU(TS).
+
+ o Place the collected NAL units in the order of session
+ dependency as derived by the dependency indication as specified
+ in Section 7.2.3, starting from the lowest RTP session.
+
+ o Place the session ordered NAL units in decoding order within
+ the particular access unit by satisfying the NAL unit ordering
+ rules for SVC access units, as described in the informative
+ algorithm provided in Section 6.2.1.1.
+
+ o Remove NI-MTAP and any PACSI NAL units from the access unit
+ AU(TS).
+
+ o The access units can then be transferred to the decoder.
+ Access units AU(TS) are transferred to the decoder in the order
+ of appearance (given by the order of RTP sequence numbers) of
+ media timestamp values TS in the highest RTP session associated
+ with access unit AU(TS).
+
+ Informative note: Due to packet loss it is possible that not
+ all sessions may have NAL units present for the media
+ timestamp value TS present in the highest RTP session. In
+ such a case, an algorithm may: a) proceed to the next
+ complete access unit with NAL units present in all the
+ received RTP sessions; or b) consider a new highest RTP
+ session, the highest RTP session for which the access unit
+ is complete, and apply the process above. The algorithm may
+ return to the original highest RTP session when a complete
+ and error-free access unit that contains NAL units in all
+ the sessions is received.
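+
+ Informative note: The collection and session ordering steps above
+ can be sketched in Python as follows. This is a simplified
+ illustration, not a complete implementation: it assumes that the
+ media timestamp of every NAL unit has already been derived, it
+ does not perform the reordering within an access unit, and it
+ does not remove empty or PACSI NAL units. The data layout and
+ the function name are hypothetical.

```python
from collections import defaultdict


def recover_access_units(buffers: dict, session_order: list) -> list:
    """Group NAL units into access units AU(TS).

    buffers: session name -> list of (media_ts, nal) tuples, already in RTP
             sequence number order (de-jittering buffer content).
    session_order: session names from the lowest (base) to the highest.
    Returns a list of (media_ts, [nal, ...]) in the order in which the
    timestamps first appear in the highest session.
    """
    highest = session_order[-1]
    per_session = {s: defaultdict(list) for s in session_order}
    for s in session_order:
        for ts, nal in buffers.get(s, []):
            per_session[s][ts].append(nal)

    # Order of appearance of media timestamps in the highest session.
    ordered_ts = list(dict.fromkeys(ts for ts, _ in buffers.get(highest, [])))

    access_units = []
    for ts in ordered_ts:
        nals = []
        for s in session_order:  # lowest session first
            nals.extend(per_session[s].get(ts, []))
        access_units.append((ts, nals))
    return access_units
```

+ NAL units whose media timestamp never appears in the highest
+ session are left out of the result, which corresponds to ignoring
+ data that precedes the first timestamp of the highest session.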
+
+ The following gives an informative example.
+
+ The example shown in Figure 6 refers to three RTP sessions A, B, and
+ C containing an SVC bitstream transmitted as 3 sources. In the
+ example, the dependency signaling (described in Section 7.2.3)
+ indicates that session A is the base RTP session, B is the first
+ enhancement RTP session and depends on A, and C is the second
+ enhancement RTP session and depends on A and B. A hierarchical
+ picture coding prediction structure is used, in which session A has
+ the lowest frame rate and sessions B and C have the same but higher
+ frame rate.
+
+ The figure shows NAL units contained in RTP packets that are stored
+ in the de-jittering buffer at the receiver for session de-
+ packetization. The NAL units are already reordered according to
+ their RTP sequence number order and, if within an aggregation packet,
+ according to the order of their appearance within the aggregation
+ packet. The figure indicates for the received NAL units the decoding
+ order within the sessions, as well as the associated media (NTP)
+ timestamps ("TS[..]"). NAL units of the same access unit within a
+ session are grouped by "(.,.)" and share the same media timestamp TS,
+ which is shown at the bottom of the figure. Note that the timestamps
+ are not in increasing order since, in this example, the decoding
+ order is different from the output/display order.
+
+ The process first proceeds to the NAL units associated with the first
+ media timestamp TS[1] present in the highest session C and
+ removes/ignores all NAL units that precede (in decoding order) the NAL
+ units with TS[1] in each of the de-jittering buffers of RTP sessions
+ A, B, and C. Then, starting from session C, the first media
+ timestamp available in decoding order (TS[1]) is selected and NAL
+ units starting from RTP session A, and sessions B and C are placed in
+ order of the RTP session dependency as required by Section 7.2.3 of
+ this memo (in the example for TS[1]: first session B and then session
+ C) into the access unit AU(TS[1]) associated with media timestamp
+ TS[1]. Then the next media timestamp TS[3] in order of appearance in
+ the highest RTP session C is processed and the process described
+ above is repeated. Note that there may be access units with no NAL
+ units present, e.g., in the lowest RTP session A (see, e.g., TS[1]).
+ With TS[8], the first access unit with NAL units present in all the
+ RTP sessions appears in the buffers.
+
+ C: ------------(1,2)-(3,4)--(5)---(6)---(7,8)(9,10)-(11)--(12)----
+ | | | | | | | | | |
+ B: -(1,2)-(3,4)-(5)---(6)--(7,8)-(9,10)-(11)-(12)--(13,14)(15,16)-
+ | | | | | |
+ A: -------(1)---------------(2)---(3)---------------(4)----(5)----
+ ---------------------------------------------------decoding order-->
+
+ TS:  [4]   [2]  [1]   [3]   [8]   [6]   [5]  [7]    [12]   [10]
+
+ Key:
+ A, B, C - RTP sessions
+ Integer values in "()" - NAL unit decoding order within RTP session
+ "( )" - groups the NAL units of an access unit
+ in an RTP session
+ "|" - indicates corresponding NAL units of the
+ same access unit AU(TS[..]) in the RTP
+ sessions
+ Integer values in "[]" - media timestamp TS, sampling time
+ as derived, e.g., from NTP timestamp
+ associated with the access unit AU(TS[..]),
+ consisting of NAL units in the sessions
+ above each TS value.
+
+ Figure 6. Example of decoding order recovery in multi-source
+ transmission.
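+
+ The walk-through above can be sketched in code. The following
+ Python fragment is a non-normative illustration only: the function
+ name and the data layout (one list per RTP session, lowest session
+ first, holding (timestamp, label) tuples in session decoding order)
+ are assumptions made for this example and are not defined by this
+ memo.
+
```python
def recover_access_units(sessions):
    """Group NAL units into access units by media timestamp.

    sessions: list of per-session buffers in dependency order (lowest
    session first, highest last). Each buffer is a list of (ts, nal)
    tuples in session decoding order. Buffers are consumed in place.
    Returns a list of (ts, [nal, ...]) in recovered decoding order.
    """
    highest = sessions[-1]
    # Timestamps in order of first appearance in the highest session.
    ts_order = []
    for ts, _ in highest:
        if ts not in ts_order:
            ts_order.append(ts)

    access_units = []
    for ts in ts_order:
        au = []
        for buf in sessions:  # dependency order: A, then B, then C
            idx = next((i for i, (t, _) in enumerate(buf) if t == ts), None)
            if idx is None:
                continue  # no NAL units of this access unit in this session
            del buf[:idx]  # drop NAL units that precede the access unit
            while buf and buf[0][0] == ts:
                au.append(buf.pop(0)[1])
        access_units.append((ts, au))
    return access_units
```
+
+ With buffers that mirror the start of Figure 6, the first recovered
+ access unit is AU(TS[1]) with NAL units from sessions B and C only,
+ and AU(TS[8]) is the first one that draws on all three sessions.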
+
+6.2.1.1. Informative Algorithm for NI-T Decoding Order Recovery within
+ an Access Unit
+
+ Within an access unit, the [H.264] specification (Sections 7.4.1.2.3
+ and G.7.4.1.2.3) constrains the valid decoding order of NAL units.
+
+ These constraints make it possible to reconstruct a valid decoding
+ order for the NAL units of an access unit based only on the order of
+ NAL units in each session, the NAL unit headers, and Supplemental
+ Enhancement Information message headers.
+
+ This section specifies an informative algorithm to reconstruct a
+ valid decoding order for NAL units within an access unit. Other NAL
+ unit orderings may also be valid; however, any compliant NAL unit
+ ordering will describe the same video stream and ancillary data as
+ the one produced by this algorithm.
+
+ An actual implementation, of course, needs only to behave "as if"
+ this reordering is done. In particular, NAL units that are discarded
+ by an implementation's decoding process do not need to be reordered.
+
+ In this algorithm, NAL units within an access unit are first ordered
+ by NAL unit type, in the order specified in Table 12 below, except
+ for NAL unit type 14, which is handled specially as described in the
+ table. NAL units of the same type are then ordered as specified for
+ the type, if necessary.
+
+ For the purposes of this algorithm, "session order" is the order of
+ NAL units implied by their transmission order within an RTP session.
+ For the non-interleaved and single NAL unit modes, this is the RTP
+ sequence number order coupled with the order of NAL units within an
+ aggregation unit.
+
+ Table 12. Ordering of NAL unit types within an Access Unit
+
+ Type Description / Comments
+ -----------------------------------------------------------
+ 9 Access unit delimiter
+
+ 7 Sequence parameter set
+
+ 13 Sequence parameter set extension
+
+ 15 Subset sequence parameter set
+
+ 8 Picture parameter set
+
+ 16-18 Reserved
+
+ 6 Supplemental enhancement information (SEI)
+ If an SEI message with a first payload of 0 (Buffering
+ Period) is present, it must be the first SEI message.
+
+ If SEI messages with a Scalable Nesting (30) payload and
+ a nested payload of 0 (Buffering Period) are present,
+ these then follow the first SEI message. Such an SEI
+ message with the all_layer_representations_in_au_flag
+ equal to 1 is placed first, followed by any others,
+ sorted in increasing order of DQId.
+
+ All other SEI messages follow in any order.
+
+ 14 Prefix NAL unit in scalable extension
+ 1 Coded slice of a non-IDR picture
+ 5 Coded slice of an IDR picture
+
+ NAL units of type 1 or 5 will be sent within only a
+ single session for any given access unit. They are
+ placed in session order. (Note: Any given access unit
+ will contain only NAL units of type 1 or type 5, not
+ both.)
+
+ If NAL units of type 14 are present, every NAL unit of
+ type 1 or 5 is prefixed by a NAL unit of type 14. (Note:
+ Within an access unit, every NAL unit of type 14 is
+ identical, so correlation of type 14 NAL units with the
+ other NAL units is not necessary.)
+
+ 12 Filler data
+
+ The only restriction on filler data NAL units within an
+ access unit is that they shall not precede the first VCL
+ NAL unit within the same access unit.
+
+ 19 Coded slice of an auxiliary coded picture without
+ partitioning
+
+ These NAL units will be sent within only a single
+ session for any given access unit, and are placed in
+ session order.
+
+ 20 Coded slice in scalable extension
+ 21-23 Reserved
+
+ Type 20 NAL units are placed in increasing order of DQId.
+ Within each DQId value, they are placed in session order.
+
+ (Note: SVC slices with a given DQId value will be sent
+ within only a single session for any given access unit.)
+
+ Type 21-23 NAL units are placed immediately following
+ the non-reserved-type VCL NAL unit they follow in
+ session order.
+
+ 10 End of sequence
+
+ 11 End of stream
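+
+ The type-based ordering of Table 12 can be sketched as a sort on a
+ per-type rank. The Python fragment below is a simplified,
+ non-normative illustration. The dictionary layout of a NAL unit
+ (keys "type" and "dqid") and all names are assumptions made for this
+ example. The sketch leaves SEI messages in their input order instead
+ of applying the Buffering Period rules, and it drops reserved types
+ 21-23 rather than attaching them to the preceding VCL NAL unit.
+
```python
# Rank of each NAL unit type, following the row order of Table 12.
# Type 14 is absent on purpose: it is re-inserted before each type 1/5.
TYPE_RANK = {9: 0, 7: 1, 13: 2, 15: 3, 8: 4, 16: 5, 17: 5, 18: 5,
             6: 6, 1: 7, 5: 7, 12: 8, 19: 9, 20: 10, 10: 11, 11: 12}


def order_access_unit(nal_units):
    """Order the NAL units of one access unit.

    nal_units: list of dicts with key "type" and, for type 20, "dqid",
    given session by session, each session in session order.
    """
    # All type 14 NAL units of an access unit are identical, so one is enough.
    prefix = next((n for n in nal_units if n["type"] == 14), None)

    keyed = []
    for pos, nal in enumerate(nal_units):
        rank = TYPE_RANK.get(nal["type"])
        if rank is None:
            continue  # type 14 (handled below) or a type this sketch skips
        # Only type 20 is sorted by DQId; other types keep session order.
        dqid = nal["dqid"] if nal["type"] == 20 else 0
        keyed.append((rank, dqid, pos, nal))
    keyed.sort(key=lambda item: item[:3])

    ordered = []
    for _, _, _, nal in keyed:
        if prefix is not None and nal["type"] in (1, 5):
            ordered.append(prefix)  # every type 1/5 is prefixed by type 14
        ordered.append(nal)
    return ordered
```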
+
+6.2.2. Decoding Order Recovery for the NI-C, NI-TC, and I-C Modes
+
+ The following process MUST be used when either the NI-C or I-C MST
+ packetization mode is in use. The following process MAY be applied
+ when the NI-TC MST packetization mode is in use.
+
+ The RTP packets output from the RTP-level reception processing for
+ each session are placed into a re-multiplexing buffer.
+
+ It is RECOMMENDED to set the size of the re-multiplexing buffer (in
+ bytes) equal to or greater than the value of the sprop-remux-buf-req
+ media type parameter of the highest RTP session the receiver
+ receives.
+
+ The CS-DON value is calculated and stored for each NAL unit.
+
+ Informative note: The CS-DON value of a NAL unit may rely on
+ information carried in a packet other than the packet containing
+ the NAL unit. This happens, e.g., when the CS-DON values need to
+ be derived for non-PACSI NAL units contained in single NAL unit
+ packets, as the single NAL unit packets themselves do not contain
+ CS-DON information. In this case, when no packet containing
+ required CS-DON information is received for a NAL unit, this NAL
+ unit has to be discarded by the receiver as it cannot be fed to
+ the decoder in the correct order. When the optional media type
+ parameter sprop-mst-csdon-always-present is equal to 1, no such
+ dependency exists, i.e., the CS-DON value of any particular NAL
+ unit can be derived solely according to information in the packet
+ containing the NAL unit, and therefore, the receiver does not need
+ to discard any received NAL units.
+
+ The receiver operation is described below with the help of the
+ following functions and constants:
+
+ o Function AbsDON is specified in Section 8.1 of [RFC6184].
+
+ o Function don_diff is specified in Section 5.5 of [RFC6184].
+
+ o Constant N is the value of the OPTIONAL sprop-mst-remux-buf-size
+ media type parameter of the highest RTP session incremented by 1.
+
+ Initial buffering lasts until one of the following conditions is
+ fulfilled:
+
+ o There are N or more VCL NAL units in the re-multiplexing buffer.
+
+ o If sprop-mst-max-don-diff of the highest RTP session is present,
+ don_diff(m,n) is greater than the value of sprop-mst-max-don-diff
+ of the highest RTP session, where n corresponds to the NAL unit
+ having the greatest value of AbsDON among the received NAL units
+ and m corresponds to the NAL unit having the smallest value of
+ AbsDON among the received NAL units.
+
+ o Initial buffering has lasted for the duration equal to or greater
+ than the value of the OPTIONAL sprop-remux-init-buf-time media
+ type parameter of the highest RTP session.
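+
+ The three conditions above can be sketched as a single predicate.
+ The Python fragment below is a non-normative illustration. The
+ don_diff function follows the case analysis of Section 5.5 of
+ [RFC6184], which remains the normative definition. The predicate
+ name, its argument names, and the choice to pass the 16-bit DON
+ values of the NAL units with the smallest and greatest AbsDON are
+ assumptions made for this example.
+
```python
def don_diff(don_m, don_n):
    """Signed DON difference with 16-bit wraparound (RFC 6184, Section 5.5)."""
    if don_m == don_n:
        return 0
    if don_m < don_n:
        d = don_n - don_m
        return d if d < 32768 else -(don_m + 65536 - don_n)
    d = don_m - don_n
    return 65536 - don_m + don_n if d >= 32768 else -d


def initial_buffering_done(vcl_count, n, don_m=None, don_n=None,
                           max_don_diff=None, elapsed=None,
                           init_buf_time=None):
    """Return True once any one of the three conditions is fulfilled.

    vcl_count:     VCL NAL units currently in the re-multiplexing buffer
    n:             sprop-mst-remux-buf-size + 1
    don_m, don_n:  DON of the units with smallest / greatest AbsDON
    max_don_diff:  sprop-mst-max-don-diff, or None if not signaled
    elapsed, init_buf_time: same time unit, None if not signaled
    """
    if vcl_count >= n:
        return True
    if None not in (max_don_diff, don_m, don_n):
        if don_diff(don_m, don_n) > max_don_diff:
            return True
    if None not in (elapsed, init_buf_time) and elapsed >= init_buf_time:
        return True
    return False
```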
+
+ The NAL units to be removed from the re-multiplexing buffer are
+ determined as follows:
+
+ o If the re-multiplexing buffer contains at least N VCL NAL units,
+ NAL units are removed from the re-multiplexing buffer and passed
+ to the decoder in the order specified below until the buffer
+ contains N-1 VCL NAL units.
+
+ o If sprop-mst-max-don-diff of the highest RTP session is present,
+ all NAL units m for which don_diff(m,n) is greater than
+ sprop-mst-max-don-diff of the highest RTP session are removed from
+ the re-multiplexing buffer and passed to the decoder in the order
+ specified below. Herein, n corresponds to the NAL unit having the
+ greatest value of AbsDON among the NAL units in the
+ re-multiplexing buffer.
+
+ The order in which NAL units are passed to the decoder is specified
+ as follows:
+
+ o Let PDON be a variable that is initialized to 0 at the beginning
+ of the RTP sessions.
+
+ o For each NAL unit associated with a value of CS-DON, a CS-DON
+ distance is calculated as follows. If the value of CS-DON of the
+ NAL unit is larger than the value of PDON, the CS-DON distance is
+ equal to CS-DON - PDON. Otherwise, the CS-DON distance is equal
+ to 65535 - PDON + CS-DON + 1.
+
+ o NAL units are delivered to the decoder in increasing order of CS-
+ DON distance. If several NAL units share the same value of CS-
+ DON distance, they can be passed to the decoder in any order.
+
+ o When a desired number of NAL units have been passed to the
+ decoder, the value of PDON is set to the value of CS-DON for the
+ last NAL unit passed to the decoder.
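+
+ The CS-DON distance rule above can be sketched as follows. This
+ Python fragment is a non-normative illustration; the function names
+ are assumptions made for this example. Because Python's sort is
+ stable, NAL units that share a distance keep their input order,
+ which is one of the permitted orders.
+
```python
def cs_don_distance(cs_don, pdon):
    """Distance from PDON to CS-DON, both 16-bit values (0..65535)."""
    if cs_don > pdon:
        return cs_don - pdon
    return 65535 - pdon + cs_don + 1


def delivery_order(cs_dons, pdon):
    """Sort CS-DON values into the order they are passed to the decoder."""
    return sorted(cs_dons, key=lambda c: cs_don_distance(c, pdon))
```
+
+ For example, with PDON equal to 65534, the CS-DON values 2, 65535,
+ and 0 have distances 4, 1, and 2, so the corresponding NAL units
+ are delivered in the order 65535, 0, 2.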
+
+7. Payload Format Parameters
+
+ This section specifies the parameters that MAY be used to select
+ optional features of the payload format and certain features of the
+ bitstream. The parameters are specified here as part of the media
+ type registration for the SVC codec. A mapping of the parameters
+ into the Session Description Protocol (SDP) [RFC4566] is also
+ provided for applications that use SDP. Equivalent parameters could
+ be defined elsewhere for use with control protocols that do not use
+ SDP.
+
+ Some parameters provide a receiver with the properties of the stream
+ that will be sent. The names of all these parameters start with
+ "sprop" for stream properties. Some of these "sprop" parameters are
+ limited by other payload or codec configuration parameters. For
+ example, the sprop-parameter-sets parameter is constrained by the
+ profile-level-id parameter. The media sender selects all "sprop"
+ parameters rather than the receiver. This uncommon characteristic of
+ the "sprop" parameters may be incompatible with some signaling
+ protocol concepts, in which case the use of these parameters SHOULD
+ be avoided.
+
+7.1. Media Type Registration
+
+ The media subtype for the SVC codec has been allocated from the IETF
+ tree.
+
+ The receiver MUST ignore any unspecified parameter.
+
+ Informative note: Requiring that the receiver ignore unspecified
+ parameters allows for backward compatibility of future extensions.
+ For example, if a future specification that is backward compatible
+ to this specification specifies some new parameters, then a
+ receiver according to this specification is capable of receiving
+ data per the new payload but ignoring those parameters newly
+ specified in the new payload specification. This provision is
+ also present in [RFC6184].
+
+ Media Type name: video
+
+ Media subtype name: H264-SVC
+
+ Required parameters: none
+
+ OPTIONAL parameters:
+
+ In the following definitions of parameters, "the stream" or "the
+ NAL unit stream" refers to all NAL units conveyed in the current
+ RTP session in SST, and all NAL units conveyed in the current RTP
+ session and all NAL units conveyed in other RTP sessions that the
+ current RTP session depends on in MST.
+
+ profile-level-id:
+ A base16 [RFC4648] (hexadecimal) representation of the
+ following three bytes in the sequence parameter set or subset
+ sequence parameter set NAL unit specified in [H.264]: 1)
+ profile_idc; 2) a byte herein referred to as profile-iop,
+ composed of the values of constraint_set0_flag,
+ constraint_set1_flag, constraint_set2_flag,
+ constraint_set3_flag, constraint_set4_flag,
+ constraint_set5_flag, and reserved_zero_2bits, in bit-
+ significance order, starting from the most-significant bit, and
+ 3) level_idc. Note that reserved_zero_2bits is required to be
+ equal to 0 in [H.264], but other values for it may be specified
+ in the future by ITU-T or ISO/IEC.
+
+ The profile-level-id parameter indicates the default sub-
+ profile, i.e., the subset of coding tools that may have been
+ used to generate the stream or that the receiver supports, and
+ the default level of the stream or the one that the receiver
+ supports.
+
+ The default sub-profile is indicated collectively by the
+ profile_idc byte and some fields in the profile-iop byte.
+ Depending on the values of the fields in the profile-iop byte,
+ the default sub-profile may be the same set of coding tools
+ supported by one profile, or a common subset of coding tools of
+ multiple profiles, as specified in Subsection G.7.4.2.1.1 of
+ [H.264]. The default level is indicated by the level_idc byte,
+ and, when profile_idc is equal to 66, 77, or 88 (the Baseline,
+ Main, or Extended profile) and level_idc is equal to 11,
+ additionally by bit 4 (constraint_set3_flag) of the profile-iop
+ byte. When profile_idc is equal to 66, 77, or 88 (the
+ Baseline, Main, or Extended profile) and level_idc is equal to
+ 11, and bit 4 (constraint_set3_flag) of the profile-iop byte is
+ equal to 1, the default level is Level 1b.
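+
+ As a non-normative illustration, the three bytes and the Level 1b
+ rule above can be decoded as shown in the Python fragment below.
+ The function name and the returned tuple are assumptions made for
+ this example, and rendering the level as level_idc divided by 10
+ mirrors the wording used for max-recv-level in this memo.
+
```python
def parse_profile_level_id(value):
    """Split a 6-digit hexadecimal profile-level-id into its parts.

    Returns (profile_idc, profile_iop, level), where level is a string
    such as "1.1", "3", or "1b".
    """
    raw = bytes.fromhex(value)
    if len(raw) != 3:
        raise ValueError("profile-level-id must encode exactly three bytes")
    profile_idc, profile_iop, level_idc = raw

    # Bit 4 of profile-iop (counting from the least-significant bit as
    # bit 0) carries constraint_set3_flag.
    constraint_set3_flag = (profile_iop >> 4) & 1

    is_level_1b = (profile_idc in (66, 77, 88)
                   and level_idc == 11
                   and constraint_set3_flag == 1)
    level = "1b" if is_level_1b else "%g" % (level_idc / 10)
    return profile_idc, profile_iop, level
```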
+
+ Table 13 lists all profiles defined in Annexes A and G of
+ [H.264] and, for each of the profiles, the possible
+ combinations of profile_idc and profile-iop that represent the
+ same sub-profile.
+
+ Table 13. Combinations of profile_idc and profile-iop
+ representing the same sub-profile corresponding to the full set
+ of coding tools supported by one profile. In the following, x
+ may be either 0 or 1, while the profile names are indicated as
+ follows. CB: Constrained Baseline profile, B: Baseline
+ profile, M: Main profile, E: Extended profile, H: High profile,
+ H10: High 10 profile, H42: High 4:2:2 profile, H44: High 4:4:4
+ Predictive profile, H10I: High 10 Intra profile, H42I: High
+ 4:2:2 Intra profile, H44I: High 4:4:4 Intra profile, C44I:
+ CAVLC 4:4:4 Intra profile, SB: Scalable Baseline profile, SH:
+ Scalable High profile, and SHI: Scalable High Intra profile.
+
+ Profile      profile_idc      profile-iop
+              (hexadecimal)    (binary)
+
+ CB           42 (B)           x1xx0000
+     same as: 4D (M)           1xxx0000
+     same as: 58 (E)           11xx0000
+ B            42 (B)           x0xx0000
+     same as: 58 (E)           10xx0000
+ M            4D (M)           0x0x0000
+ E            58               00xx0000
+ H            64               00000000
+ H10          6E               00000000
+ H42          7A               00000000
+ H44          F4               00000000
+ H10I         6E               00010000
+ H42I         7A               00010000
+ H44I         F4               00010000
+ C44I         2C               00010000
+ SB           53               x0000000
+ SH           56               0x000000
+ SHI          56               0x010000
+
+ For example, in the table above, profile_idc equal to 58
+ (Extended) with profile-iop equal to 11xx0000 indicates the
+ same sub-profile corresponding to profile_idc equal to 42
+ (Baseline) with profile-iop equal to x1xx0000. Note that other
+ combinations of profile_idc and profile-iop (not listed in
+ Table 13) may represent a sub-profile equivalent to the common
+ subset of coding tools for more than one profile. Note also
+ that a decoder conforming to a certain profile may be able to
+ decode bitstreams conforming to other profiles.
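+
+ The equivalence between rows of Table 13 can be checked by
+ matching a profile-iop byte against the binary patterns, with "x"
+ as a wildcard. The Python fragment below is a non-normative
+ illustration that covers only the three Constrained Baseline rows;
+ the names are assumptions made for this example.
+
```python
def iop_matches(profile_iop, pattern):
    """True if the 8-bit profile-iop fits a Table 13 pattern ('x' = any)."""
    bits = format(profile_iop, "08b")
    return all(p in ("x", b) for p, b in zip(pattern, bits))


# The three Table 13 rows that all denote Constrained Baseline (CB).
CONSTRAINED_BASELINE = [(0x42, "x1xx0000"),
                        (0x4D, "1xxx0000"),
                        (0x58, "11xx0000")]


def is_constrained_baseline(profile_idc, profile_iop):
    return any(profile_idc == idc and iop_matches(profile_iop, pattern)
               for idc, pattern in CONSTRAINED_BASELINE)
```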
+
+ If profile-level-id is used to indicate stream properties, it
+ indicates that, to decode the stream, the minimum subset of
+ coding tools a decoder has to support is the default sub-
+ profile, and the lowest level the decoder has to support is the
+ default level.
+
+ If the profile-level-id parameter is used for capability
+ exchange or session setup, it indicates the subset of coding
+ tools, which is equal to the default sub-profile, that the
+ codec supports for both receiving and sending. If max-recv-
+ level is not present, the default level from profile-level-id
+ indicates the highest level the codec wishes to support. If
+ max-recv-level is present, it indicates the highest level the
+ codec supports for receiving. For either receiving or sending,
+ all levels that are lower than the highest level supported MUST
+ also be supported.
+
+ Informative note: Capability exchange and session setup
+ procedures should provide means to list the capabilities for
+ each supported sub-profile separately. For example, the
+ one-of-N codec selection procedure of the SDP Offer/Answer
+ model can be used (Section 10.2 of [RFC3264]). The one-of-N
+ codec selection procedure may also be used to provide
+ different combinations of profile_idc and profile-iop that
+ represent the same sub-profile. When there are many
+ different combinations of profile_idc and profile-iop that
+ represent the same sub-profile, using the one-of-N codec
+ selection procedure may result in a fairly large SDP
+ message. Therefore, a receiver should understand the
+ different equivalent combinations of profile_idc and
+ profile-iop that represent the same sub-profile, and be
+ ready to accept an offer using any of the equivalent
+ combinations.
+
+ If no profile-level-id is present, the Baseline Profile without
+ additional constraints at Level 1 MUST be implied.
+
+ max-recv-level:
+ This parameter MAY be used to indicate the highest level a
+ receiver supports when the highest level is higher than the
+ default level (the level indicated by profile-level-id). The
+ value of max-recv-level is a base16 (hexadecimal)
+ representation of the two bytes after the syntax element
+ profile_idc in the sequence parameter set NAL unit specified in
+ [H.264]: profile-iop (as defined above) and level_idc. If (the
+ level_idc byte of max-recv-level is equal to 11 and bit 4 of
+ the profile-iop byte of max-recv-level is equal to 1) or (the
+ level_idc byte of max-recv-level is equal to 9 and bit 4 of the
+ profile-iop byte of max-recv-level is equal to 0), the highest
+ level the receiver supports is Level 1b. Otherwise, the
+ highest level the receiver supports is equal to the level_idc
+ byte of max-recv-level divided by 10.
+
+ max-recv-level MUST NOT be present if the highest level the
+ receiver supports is not higher than the default level.
+
+ max-recv-base-level:
+ This parameter MAY be used to indicate the highest level a
+ receiver supports for the base layer when negotiating an SVC
+ stream. The value of max-recv-base-level is a base16
+ (hexadecimal) representation of the two bytes after the syntax
+ element profile_idc in the sequence parameter set NAL unit
+ specified in [H.264]: profile-iop (as defined above) and
+ level_idc. If (the level_idc byte of max-recv-base-level is
+ equal to 11 and bit 4 of the profile-iop byte of
+ max-recv-base-level is equal to 1) or (the level_idc byte of
+ max-recv-base-level is equal to 9 and bit 4 of the profile-iop
+ byte of max-recv-base-level is equal to 0), the highest level
+ the receiver supports for the base layer is Level 1b. Otherwise,
+ the highest level the receiver supports for the base layer is
+ equal to the level_idc byte of max-recv-base-level divided by 10.
+
+ max-mbps, max-fs, max-cpb, max-dpb, and max-br:
+ The common properties of these parameters are specified in
+ [RFC6184].
+
+ max-mbps: This parameter is as specified in [RFC6184].
+
+ max-fs: This parameter is as specified in [RFC6184].
+
+ max-cpb: The value of max-cpb is an integer indicating the maximum
+ coded picture buffer size in units of 1000 bits for the VCL HRD
+ parameters and in units of 1200 bits for the NAL HRD
+ parameters. Note that this parameter does not use units of
+ cpbBrVclFactor and cpbBrNALFactor (see Table A-1 of [H.264]).
+ The max-cpb parameter signals that the receiver has more memory
+ than the minimum amount of coded picture buffer memory required
+ by the signaled highest level conveyed in the value of the
+ profile-level-id parameter or the max-recv-level parameter.
+ When max-cpb is signaled, the receiver MUST be able to decode
+ NAL unit streams that conform to the signaled highest level,
+ with the exception that the MaxCPB value in Table A-1 of
+ [H.264] for the signaled highest level is replaced with the
+ value of max-cpb (after taking cpbBrVclFactor and
+ cpbBrNALFactor into consideration when needed). The value of
+ max-cpb (after taking cpbBrVclFactor and cpbBrNALFactor into
+ consideration when needed) MUST be greater than or equal to the
+ value of MaxCPB given in Table A-1 of [H.264] for the highest
+ level. Senders MAY use this knowledge to construct coded video
+ streams with greater variation of bitrate than can be achieved
+ with the MaxCPB value in Table A-1 of [H.264].
+
+ Informative note: The coded picture buffer is used in the
+ Hypothetical Reference Decoder (HRD, Annex C) of [H.264].
+ The use of the HRD is recommended in SVC encoders to verify
+ that the produced bitstream conforms to the standard and to
+ control the output bitrate. Thus, the coded picture buffer
+ is conceptually independent of any other potential buffers
+ in the receiver, including de-interleaving, re-multiplexing,
+ and de-jitter buffers. The coded picture buffer need not be
+ implemented in decoders as specified in Annex C of [H.264];
+ standard-compliant decoders can have any buffering
+ arrangements provided that they can decode standard-
+ compliant bitstreams. Thus, in practice, the input buffer
+ of the video decoder can be integrated with the
+ de-interleaving, re-multiplexing, and de-jitter buffers of
+ the receiver.
+
+ max-dpb: This parameter is as specified in [RFC6184].
+
+ max-br: The value of max-br is an integer indicating the maximum
+ video bitrate in units of 1000 bits per second for the VCL HRD
+ parameters and in units of 1200 bits per second for the NAL HRD
+ parameters. Note that this parameter does not use units of
+ cpbBrVclFactor and cpbBrNALFactor (see Table A-1 of [H.264]).
+
+ The max-br parameter signals that the video decoder of the
+ receiver is capable of decoding video at a higher bitrate than
+ is required by the signaled highest level conveyed in the value
+ of the profile-level-id parameter or the max-recv-level
+ parameter.
+
+ When max-br is signaled, the video codec of the receiver MUST
+ be able to decode NAL unit streams that conform to the signaled
+ highest level, with the following exceptions in the limits
+ specified by the highest level:
+
+ o The value of max-br (after taking cpbBrVclFactor and
+ cpbBrNALFactor into consideration when needed) replaces the
+ MaxBR value in Table A-1 of [H.264] for the highest level.
+
+ o When the max-cpb parameter is not present, the result of the
+ following formula replaces the value of MaxCPB in Table A-1
+ of [H.264]: (MaxCPB of the signaled level) * max-br / (MaxBR
+ of the signaled highest level).
+
+ For example, if a receiver signals capability for Main profile
+ Level 1.2 with max-br equal to 1550, this indicates a maximum
+ video bitrate of 1550 kbits/sec for VCL HRD parameters, a
+ maximum video bitrate of 1860 kbits/sec for NAL HRD parameters,
+ and a CPB size of 4036458 bits (1550000 / 384000 * 1000 *
+ 1000).
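+
+ The arithmetic of this example can be reproduced as follows. The
+ Python fragment is a non-normative illustration for the case in
+ which max-cpb is not signaled. The function name is an assumption
+ made for this example, and the Level 1.2 limits passed in (MaxBR of
+ 384 and MaxCPB of 1000, both in units of 1000) are the values
+ implied by the example above. The CPB size is truncated to an
+ integer, as in the example.
+
```python
def limits_from_max_br(max_br, level_max_br, level_max_cpb):
    """Derive bitrates and the VCL CPB size implied by max-br.

    max_br:        value of the max-br parameter
    level_max_br:  MaxBR of the signaled highest level (units of 1000 bits/s)
    level_max_cpb: MaxCPB of the signaled highest level (units of 1000 bits)
    Returns (vcl_bits_per_s, nal_bits_per_s, cpb_bits).
    """
    vcl_bits_per_s = max_br * 1000   # VCL HRD: units of 1000 bits/s
    nal_bits_per_s = max_br * 1200   # NAL HRD: units of 1200 bits/s
    # MaxCPB scales by max-br / MaxBR when max-cpb is not signaled.
    cpb_bits = level_max_cpb * 1000 * max_br // level_max_br
    return vcl_bits_per_s, nal_bits_per_s, cpb_bits
```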
+
+ The value of max-br (after taking cpbBrVclFactor and
+ cpbBrNALFactor into consideration when needed) MUST be greater
+ than or equal to the value MaxBR given in Table A-1 of [H.264]
+ for the signaled highest level.
+
+ Senders MAY use this knowledge to send higher-bitrate video as
+ allowed in the level definition of SVC, to achieve improved
+ video quality.
+
+ Informative note: This parameter was added primarily to
+ complement a similar codepoint in the ITU-T Recommendation
+ H.245, so as to facilitate signaling gateway designs. No
+ assumption can be made from the value of this parameter that
+ the network is capable of handling such bitrates at any
+ given time. In particular, no conclusion can be drawn that
+ the signaled bitrate is possible under congestion control
+ constraints.
+
+ redundant-pic-cap:
+ This parameter is as specified in [RFC6184].
+
+ sprop-parameter-sets:
+ This parameter MAY be used to convey any sequence parameter
+ set, subset sequence parameter set, and picture parameter set
+ NAL units (herein referred to as the initial parameter set NAL
+ units) that can be placed in the NAL unit stream to precede any
+ other NAL units in decoding order and that are associated with
+ the default level of profile-level-id. The parameter MUST NOT
+ be used to indicate codec capability in any capability exchange
+ procedure. The value of the parameter is a comma (',')
+ separated list of base64 [RFC4648] representations of the
+ parameter set NAL units as specified in Sections 7.3.2.1,
+ 7.3.2.2, and G.7.3.2.1 of [H.264]. Note that the number of
+ bytes in a parameter set NAL unit is typically less than 10,
+ but a picture parameter set NAL unit can contain several
+ hundred bytes.
+
+ Informative note: When several payload types are offered in
+ the SDP Offer/Answer model, each with its own sprop-
+ parameter-sets parameter, then the receiver cannot assume
+ that those parameter sets do not use conflicting storage
+ locations (i.e., identical values of parameter set
+ identifiers). Therefore, a receiver should buffer all
+ sprop-parameter-sets and make them available to the decoder
+ instance that decodes a certain payload type.
+
+ sprop-level-parameter-sets:
+ This parameter MAY be used to convey any sequence, subset
+ sequence, and picture parameter set NAL units (herein referred
+ to as the initial parameter set NAL units) that can be placed
+ in the NAL unit stream to precede any other NAL units in
+ decoding order and that are associated with one or more levels
+ different than the default level of profile-level-id. The
+ parameter MUST NOT be used to indicate codec capability in any
+ capability exchange procedure.
+
+ The sprop-level-parameter-sets parameter contains parameter
+ sets for one or more levels that are different than the default
+ level. All parameter sets targeted for use when one level of
+ the default sub-profile is accepted by a receiver are clustered
+ and prefixed with a three-byte field that has the same syntax
+ as profile-level-id. This enables the receiver to install the
+ parameter sets for the accepted level and discard the rest.
+ The three-byte field is named PLId, and all parameter sets
+ associated with one level are named PSL, which has the same
+ syntax as sprop-parameter-sets. Parameter sets for each level
+ are represented in the form of PLId:PSL, i.e., PLId followed by
+ a colon (':') and the base64 [RFC4648] representation of the
+ initial parameter set NAL units for the level. Each pair of
+ PLId:PSL is also separated by a colon. Note that a PSL can
+ contain multiple parameter sets for that level, separated with
+ commas (',').
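+
+ The PLId:PSL layout described above can be parsed as sketched
+ below. This Python fragment is a non-normative illustration; the
+ function name and the returned dictionary are assumptions made for
+ this example. Splitting on ':' and ',' is safe because neither
+ character belongs to the base16 or base64 alphabets.
+
```python
import base64


def parse_sprop_level_parameter_sets(value):
    """Map each PLId (hex string) to its list of parameter set NAL units."""
    fields = value.split(":")
    if len(fields) % 2 != 0:
        raise ValueError("expected an even number of ':'-separated fields")

    result = {}
    # Fields alternate: PLId, PSL, PLId, PSL, ...
    for plid, psl in zip(fields[0::2], fields[1::2]):
        if len(bytes.fromhex(plid)) != 3:
            raise ValueError("PLId must encode exactly three bytes")
        # A PSL is a comma-separated list of base64 parameter sets.
        result[plid.lower()] = [base64.b64decode(item)
                                for item in psl.split(",")]
    return result
```
+
+ A receiver that accepts one of the offered levels would keep the
+ entry for that PLId and discard the others.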
+
+ The subset of coding tools indicated by each PLId field MUST be
+ equal to the default sub-profile, and the level indicated by
+ each PLId field MUST be different than the default level.
+
+ Informative note: This parameter allows for efficient level
+ downgrade or upgrade in SDP Offer/Answer and out-of-band
+ transport of parameter sets, simultaneously.
+
+ in-band-parameter-sets:
+ This parameter MAY be used to indicate a receiver capability.
+ The value MAY be equal to either 0 or 1. The value 1 indicates
+ that the receiver discards out-of-band parameter sets in sprop-
+ parameter-sets and sprop-level-parameter-sets, therefore the
+ sender MUST transmit all parameter sets in-band. The value 0
+ indicates that the receiver utilizes out-of-band parameter sets
+ included in sprop-parameter-sets and/or sprop-level-parameter-
+ sets. However, in this case, the sender MAY still choose to
+ send parameter sets in-band. When the parameter is not
+ present, this receiver capability is not specified, and
+ therefore the sender MAY send out-of-band parameter sets only,
+ or it MAY send in-band parameter sets only, or it MAY send
+ both.
+
+ packetization-mode:
+ This parameter is as specified in [RFC6184]. When the mst-mode
+ parameter is present, the value of this parameter is
+ additionally constrained as follows. If mst-mode is equal to
+ "NI-T", "NI-C", or "NI-TC", packetization-mode MUST NOT be
+ equal to 2. Otherwise, (mst-mode is equal to "I-C"),
+ packetization-mode MUST be equal to 2.
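+
+ The constraint between packetization-mode and mst-mode can be
+ checked as sketched below. This Python fragment is a non-normative
+ illustration; the function name is an assumption made for this
+ example, and the set of packetization-mode values (0, 1, and 2) is
+ taken from [RFC6184].
+
```python
MST_MODES = ("NI-T", "NI-C", "NI-TC", "I-C")


def packetization_mode_allowed(packetization_mode, mst_mode=None):
    """Check packetization-mode against the mst-mode constraint.

    mst_mode is None when the mst-mode parameter is absent (SST).
    """
    if packetization_mode not in (0, 1, 2):
        return False
    if mst_mode is None:
        return True  # no additional constraint without mst-mode
    if mst_mode not in MST_MODES:
        raise ValueError("unknown mst-mode token: %r" % (mst_mode,))
    if mst_mode == "I-C":
        return packetization_mode == 2
    # "NI-T", "NI-C", and "NI-TC" exclude the interleaved mode.
    return packetization_mode != 2
```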
+
+ sprop-interleaving-depth:
+ This parameter is as specified in [RFC6184].
+
+ sprop-deint-buf-req:
+ This parameter is as specified in [RFC6184].
+
+ deint-buf-cap:
+ This parameter is as specified in [RFC6184].
+
+ sprop-init-buf-time:
+ This parameter is as specified in [RFC6184].
+
+ sprop-max-don-diff:
+ This parameter is as specified in [RFC6184].
+
+ max-rcmd-nalu-size:
+ This parameter is as specified in [RFC6184].
+
+ mst-mode:
+ This parameter MAY be used to signal the properties of a NAL
+ unit stream or the capabilities of a receiver implementation.
+ If this parameter is present, multi-session transmission MUST
+ be used. Otherwise (this parameter is not present), single-
+ session transmission MUST be used. When this parameter is
+ present, the following applies. When the value of mst-mode is
+ equal to "NI-T", the NI-T mode MUST be used. When the value of
+ mst-mode is equal to "NI-C", the NI-C mode MUST be used. When
+ the value of mst-mode is equal to "NI-TC", the NI-TC mode MUST
+ be used. When the value of mst-mode is equal to "I-C", the I-C
+ mode MUST be used. The value of mst-mode MUST be one of the
+ following tokens: "NI-T", "NI-C", "NI-TC", or "I-C".
+
+ All RTP sessions in an MST MUST have the same value of mst-
+ mode.
+
+ sprop-mst-csdon-always-present:
+ This parameter MUST NOT be present when mst-mode is not present
+ or the value of mst-mode is equal to "NI-T" or "I-C". This
+ parameter signals the properties of the NAL unit stream. When
+ sprop-mst-csdon-always-present is present and the value is
+ equal to 1, packetization-mode MUST be equal to 1, and all the
+ RTP packets carrying the NAL unit stream MUST be STAP-A packets
+ containing a PACSI NAL unit that further contains the DONC
+ field or NI-MTAP packets with the J field equal to 1. When
+ sprop-mst-csdon-always-present is present and the value is
+ equal to 1, the CS-DON value of any particular NAL unit can be
+ derived solely according to information in the packet
+ containing the NAL unit.
+
+ When sprop-mst-csdon-always-present is present in the current
+ RTP session, it MUST also be present in all the RTP sessions on
+ which the current RTP session depends, and its value MUST be
+ identical for the current RTP session and all of those RTP
+ sessions.
+
+ sprop-mst-remux-buf-size:
+ This parameter MUST NOT be present when mst-mode is not present
+ or the value of mst-mode is equal to "NI-T". This parameter
+ MUST be present when mst-mode is present and the value of mst-
+ mode is equal to "NI-C", "NI-TC", or "I-C".
+
+ This parameter signals the properties of the NAL unit stream.
+ It MUST be set to a value one less than the minimum re-
+ multiplexing buffer size (in NAL units), so that it is
+ guaranteed that receivers can reconstruct NAL unit decoding
+ order as specified in Subsection 6.2.2.
+
+ The value of sprop-mst-remux-buf-size MUST be an integer in the
+ range of 0 to 32767, inclusive.
+
+ sprop-remux-buf-req:
+ This parameter MUST NOT be present when mst-mode is not present
+ or the value of mst-mode is equal to "NI-T". It MUST be
+ present when mst-mode is present and the value of mst-mode is
+ equal to "NI-C", "NI-TC", or "I-C".
+
+ sprop-remux-buf-req signals the required size of the re-
+ multiplexing buffer for the NAL unit stream. It is guaranteed
+ that receivers can recover the decoding order of the received
+ NAL units from the current RTP session and the RTP sessions the
+ current RTP session depends on as specified in Section 6.2.2,
+ when the re-multiplexing buffer size is at least the value
+ of sprop-remux-buf-req in units of bytes.
+
+ The value of sprop-remux-buf-req MUST be an integer in the
+ range of 0 to 4294967295, inclusive.
+
+ remux-buf-cap:
+ This parameter MUST NOT be present when mst-mode is not present
+ or the value of mst-mode is equal to "NI-T". This parameter
+ MAY be used to signal the capabilities of a receiver
+ implementation and indicates the amount of re-multiplexing
+ buffer space in units of bytes that the receiver has available
+ for recovering the NAL unit decoding order as specified in
+ Section 6.2.2. A receiver is able to handle any NAL unit
+ stream for which the value of the sprop-remux-buf-req parameter
+ is smaller than or equal to this parameter.
+
+ If the parameter is not present, then a value of 0 MUST be used
+ for remux-buf-cap. The value of remux-buf-cap MUST be an
+ integer in the range of 0 to 4294967295, inclusive.
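+
+ Informative sketch (not part of the specification): the comparison
+ between sprop-remux-buf-req and remux-buf-cap, including the default
+ of 0 for an absent remux-buf-cap, might look as follows. The function
+ name is hypothetical.
+
```python
from typing import Optional

MAX_UINT32 = 4294967295  # upper bound stated for both parameters


def receiver_can_handle(sprop_remux_buf_req: int,
                        remux_buf_cap: Optional[int]) -> bool:
    """True if the receiver's buffer capability covers the stream's requirement."""
    # An absent remux-buf-cap is treated as the value 0.
    cap = 0 if remux_buf_cap is None else remux_buf_cap
    for value in (sprop_remux_buf_req, cap):
        if not 0 <= value <= MAX_UINT32:
            raise ValueError("value must be in 0..4294967295")
    return sprop_remux_buf_req <= cap
```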
+
+ sprop-remux-init-buf-time:
+ This parameter MAY be used to signal the properties of the NAL
+ unit stream. The parameter MUST NOT be present if mst-mode is
+ not present or the value of mst-mode is equal to "NI-T".
+
+ The parameter signals the initial buffering time that a
+ receiver MUST wait before starting to recover the NAL unit
+ decoding order as specified in Section 6.2.2 of this memo.
+
+ The parameter is coded as a non-negative base10 integer
+ representation in clock ticks of a 90-kHz clock. If the
+ parameter is not present, then no initial buffering time value
+ is defined. Otherwise, the value of sprop-remux-init-buf-time
+ MUST be an integer in the range of 0 to 4294967295, inclusive.
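+
+ Informative sketch (not part of the specification): converting the
+ signaled value from 90-kHz clock ticks to seconds. The function name
+ is hypothetical.
+
```python
CLOCK_HZ = 90_000          # the parameter is expressed in ticks of a 90-kHz clock
MAX_UINT32 = 4294967295    # upper bound stated for the parameter


def init_buf_time_seconds(value: str) -> float:
    """Convert a sprop-remux-init-buf-time string to seconds."""
    # Accept only a non-negative base10 integer written with ASCII digits.
    if not (value.isascii() and value.isdigit()):
        raise ValueError("expected a non-negative base10 integer")
    ticks = int(value)
    if ticks > MAX_UINT32:
        raise ValueError("value must be in 0..4294967295")
    return ticks / CLOCK_HZ
```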
+
+ sprop-mst-max-don-diff:
+ This parameter MAY be used to signal the properties of the NAL
+ unit stream. It MUST NOT be used to signal transmitter or
+ receiver or codec capabilities. The parameter MUST NOT be
+ present if mst-mode is not present or the value of mst-mode is
+ equal to "NI-T". sprop-mst-max-don-diff is an integer in the
+ range of 0 to 32767, inclusive. If sprop-mst-max-don-diff is
+ not present, the value of the parameter is unspecified. sprop-
+ mst-max-don-diff is calculated in the same way as sprop-max-don-diff,
+ as specified in [RFC6184], with the decoding order number
+ replaced by the cross-session decoding order number.
+
+ sprop-scalability-info:
+ This parameter MAY be used to convey the NAL unit containing
+ the scalability information SEI message as specified in Annex G
+ of [H.264]. This parameter MAY be used to signal the contained
+ layers of an SVC bitstream. The parameter MUST NOT be used to
+ indicate codec capability in any capability exchange procedure.
+ The value of the parameter is the base64 [RFC4648]
+ representation of the NAL unit containing the scalability
+ information SEI message. If present, the NAL unit MUST contain
+ only one SEI message that is a scalability information SEI
+ message.
+
+ This parameter MAY be used in an offering or declarative SDP
+ message to indicate what layers (operation points) can be
+ provided. A receiver MAY indicate its choice of one layer
+ using the optional media type parameter scalable-layer-id.
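+
+ Informative sketch (not part of the specification): producing and
+ reading the base64 value of sprop-scalability-info. The check relies
+ on SEI NAL units having nal_unit_type 6 in [H.264]; the SEI payload
+ itself is not parsed, and the function names are hypothetical.
+
```python
import base64

SEI_NAL_UNIT_TYPE = 6  # nal_unit_type of SEI NAL units in H.264


def _require_sei(nal_unit: bytes) -> None:
    # The low five bits of the first NAL unit header byte carry nal_unit_type.
    if not nal_unit or (nal_unit[0] & 0x1F) != SEI_NAL_UNIT_TYPE:
        raise ValueError("not an SEI NAL unit")


def encode_scalability_info(nal_unit: bytes) -> str:
    """Return the base64 parameter value for an SEI NAL unit."""
    _require_sei(nal_unit)
    return base64.b64encode(nal_unit).decode("ascii")


def decode_scalability_info(value: str) -> bytes:
    """Return the NAL unit bytes carried in a sprop-scalability-info value."""
    nal_unit = base64.b64decode(value, validate=True)
    _require_sei(nal_unit)
    return nal_unit
```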
+
+ scalable-layer-id:
+ This parameter MAY be used to signal a receiver's choice among the
+ operation points or layers offered or declared using sprop-
+ scalability-info or sprop-operation-point-info. The value of
+ scalable-layer-id is a base16 representation of the layer_id[ i
+ ] syntax element in the scalability information SEI message as
+ specified in Annex G of [H.264] or layer-ID contained in sprop-
+ operation-point-info.
+
+ sprop-operation-point-info:
+ This parameter MAY be used to describe the operation points of
+ an RTP session. The value of this parameter consists of a
+ comma-separated list of operation-point-description vectors.
+ The values given by the operation-point-description vectors are
+ the same as, or are derived from, the values that would be
+ given for a scalable layer in the scalability information SEI
+ message as specified in Annex G of [H.264], where the term
+ scalable layer in the scalability information SEI message
+ refers to all NAL units associated with the same values of
+ temporal_id, dependency_id, and quality_id. In this memo, such
+ a set of NAL units is called an operation point.
+
+ Each operation-point-description vector has ten elements,
+ provided as a comma-separated list of values as defined below.
+ The first value of the operation-point-description vector is
+ preceded by a '<', and the last value of the operation-point-
+ description vector is followed by a '>'. If the sprop-
+ operation-point-info is followed by exactly one operation-
+ point-description vector, this describes the highest operation
+ point contained in the RTP session. If there are two or more
+ operation-point-description vectors, the first describes the
+ lowest and the last describes the highest operation point
+ contained in the RTP session.
+
+ The values given by the operation-point-description vector are
+ as follows, in the order listed:
+
+ - layer-ID: This value specifies the layer identifier of the
+ operation point, which is identical to the layer_id that
+ would be indicated (for the same values of dependency_id,
+ quality_id, and temporal_id) in the scalability information
+ SEI message. This field MAY be empty, indicating that the
+ value is unspecified. When there are multiple operation-
+ point-description vectors with layer-ID, the values of
+ layer-ID do not need to be consecutive.
+
+ - temporal-ID: This value specifies the temporal_id of the
+ operation point. This field MUST NOT be empty.
+
+ - dependency-ID: This value specifies the dependency_id of
+ the operation point. This field MUST NOT be empty.
+
+ - quality-ID: This value specifies the quality_id of the
+ operation point. This field MUST NOT be empty.
+
+ - profile-level-ID: This value specifies the profile-level-idc
+ of the operation point in the base16 format. The default
+ sub-profile or default level indicated by the parameter
+ profile-level-ID in the sprop-operation-point-info vector
+ SHALL be equal to or lower than the default sub-profile or
+ default level indicated by profile-level-id, whether that
+ parameter is present or its default value is used. This field
+ MAY be empty, indicating that the value is unspecified.
+
+ - avg-framerate: This value specifies the average frame rate
+ of the operation point. This value is given as an integer
+ in frames per 256 seconds. The field MAY be empty,
+ indicating that the value is unspecified.
+
+ - width: This value specifies the width dimension in pixels of
+ decoded frames for the operation point. This parameter is
+ not directly given in the scalability information SEI
+ message. This field MAY be empty, indicating that the value
+ is unspecified.
+
+ - height: This value gives the height dimension in pixels of
+ decoded frames for the operation point. This parameter is
+ not directly given in the scalability information SEI. This
+ field MAY be empty, indicating that the value is
+ unspecified.
+
+ - avg-bitrate: This value specifies the average bitrate of the
+ operation point. This parameter is given as an integer in
+ kbits per second over the entire stream. Note that this
+ parameter is provided in the scalability information SEI
+ message in bits per second and calculated over a variable
+ time window. This field MAY be empty, indicating that the
+ value is unspecified.
+
+ - max-bitrate: This value specifies the maximum bitrate of the
+ operation point. This parameter is given as an integer in
+ kbits per second and describes the maximum bitrate over each
+ one-second window. Note that this parameter is provided in
+ the scalability information SEI message in bits per second
+ and is calculated over a variable time window. This field
+ MAY be empty, indicating that the value is unspecified.
+
+ Similarly to sprop-scalability-info, this parameter MAY be
+ used in an offering or declarative SDP message to indicate
+ what layers (operation points) can be provided. A receiver
+ MAY indicate its choice of the highest layer it wants to
+ send and/or receive using the optional media type parameter
+ scalable-layer-id.
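+
+ Informative sketch (not part of the specification): a parser for the
+ operation-point-description vectors described above. Field values
+ are kept as strings, since only profile-level-ID is stated to be
+ base16; reading avg-framerate as a base10 integer is an assumption.
+ The function names and the sample value in the usage note are
+ illustrative only.
+
```python
import re
from typing import Dict, List, Optional

FIELDS = ("layer-ID", "temporal-ID", "dependency-ID", "quality-ID",
          "profile-level-ID", "avg-framerate", "width", "height",
          "avg-bitrate", "max-bitrate")
REQUIRED = ("temporal-ID", "dependency-ID", "quality-ID")  # MUST NOT be empty

OperationPoint = Dict[str, Optional[str]]


def parse_operation_point_info(value: str) -> List[OperationPoint]:
    """Split a sprop-operation-point-info value into its ten-element vectors."""
    vectors = re.findall(r"<([^<>]*)>", value)  # each vector is '<' ... '>'
    if not vectors:
        raise ValueError("no operation-point-description vector found")
    points = []
    for vector in vectors:
        items = [item.strip() for item in vector.split(",")]
        if len(items) != len(FIELDS):
            raise ValueError("each vector must have ten elements")
        # An empty element means the value is unspecified.
        point = {name: (item or None) for name, item in zip(FIELDS, items)}
        if any(point[name] is None for name in REQUIRED):
            raise ValueError("temporal-ID, dependency-ID, quality-ID are required")
        points.append(point)
    return points


def frames_per_second(point: OperationPoint) -> Optional[float]:
    """avg-framerate is given in frames per 256 seconds (assumed base10)."""
    raw = point["avg-framerate"]
    return None if raw is None else int(raw) / 256
```
+
+ With two or more vectors, the first element of the returned list is
+ the lowest operation point and the last is the highest; for example,
+ an avg-framerate of 3200 corresponds to 12.5 frames per second.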
+
+ sprop-no-NAL-reordering-required:
+ This parameter MAY be used to signal the properties of the NAL
+ unit stream. This parameter MUST NOT be present when mst-mode
+ is not present or the value of mst-mode is not equal to "NI-T".
+ The presence of this parameter indicates that no reordering of
+ non-VCL or VCL NAL units is required for the decoding order
+ recovery process.
+
+ sprop-avc-ready:
+ This parameter MAY be used to indicate the properties of the
+ NAL unit stream. The presence of this parameter indicates that
+ the RTP session, if used in SST, or used in MST combined with
+ other RTP sessions also with this parameter present, can be
+ processed by an [RFC6184] receiver. This parameter MAY be used
+ with RTP sessions with media subtype H264-SVC.
+
+ Encoding considerations:
+ This media type is framed and binary; see Section 4.8 of RFC
+ 4288 [RFC4288].
+
+ Security considerations:
+ See Section 8 of RFC 6190.
+
+ Published specification:
+ Please refer to RFC 6190 and its Section 13.
+
+ Additional information:
+ none
+
+ File extensions: none
+
+ Macintosh file type code: none
+
+ Object identifier or OID: none
+
+ Person & email address to contact for further information:
+
+ Ye-Kui Wang, yekui.wang@huawei.com
+
+ Intended usage: COMMON
+
+ Restrictions on usage:
+ This media type depends on RTP framing, and hence is only
+ defined for transfer via RTP [RFC3550]. Transport within other
+ framing protocols is not defined at this time.
+
+ Interoperability considerations:
+ The media subtype name contains "SVC" to avoid potential
+ conflict with RFC 3984 and its potential future replacement RTP
+ payload format for H.264 non-SVC profiles.
+
+ Applications that use this media type:
+ Real-time video applications like video streaming, video
+ telephony, and video conferencing.
+
+ Author:
+
+ Ye-Kui Wang, yekui.wang@huawei.com
+
+ Change controller:
+ IETF Audio/Video Transport working group delegated from the
+ IESG.
+
+7.2. SDP Parameters
+
+7.2.1. Mapping of Payload Type Parameters to SDP
+
+ The media type string video/H264-SVC is mapped to fields in the
+ Session Description Protocol (SDP) as follows:
+
+ o The media name in the "m=" line of SDP MUST be video.
+
+ o The encoding name in the "a=rtpmap" line of SDP MUST be H264-SVC
+ (the media subtype).
+
+ o The clock rate in the "a=rtpmap" line MUST be 90000.
+
+ o The OPTIONAL parameters profile-level-id, max-recv-level, max-
+ recv-base-level, max-mbps, max-fs, max-cpb, max-dpb, max-br,
+ redundant-pic-cap, in-band-parameter-sets, packetization-mode,
+ sprop-interleaving-depth, deint-buf-cap, sprop-deint-buf-req,
+ sprop-init-buf-time, sprop-max-don-diff, max-rcmd-nalu-size, mst-
+ mode, sprop-mst-csdon-always-present, sprop-mst-remux-buf-size,
+ sprop-remux-buf-req, remux-buf-cap, sprop-remux-init-buf-time,
+ sprop-mst-max-don-diff, and scalable-layer-id, when present, MUST
+ be included in the "a=fmtp" line of SDP. These parameters are
+ expressed as a media type string, in the form of a semicolon-
+ separated list of parameter=value pairs.
+
+ o The OPTIONAL parameters sprop-parameter-sets, sprop-level-
+ parameter-sets, sprop-scalability-info, sprop-operation-point-
+ info, sprop-no-NAL-reordering-required, and sprop-avc-ready, when
+ present, MUST be included in the "a=fmtp" line of SDP or conveyed
+ using the "fmtp" source attribute as specified in Section 6.3 of
+ [RFC5576]. For a particular media format (i.e., RTP payload
+ type), a sprop-parameter-sets or sprop-level-parameter-sets MUST
+ NOT be both included in the "a=fmtp" line of SDP and conveyed
+ using the "fmtp" source attribute. When included in the "a=fmtp"
+ line of SDP, these parameters are expressed as a media type
+ string, in the form of a semicolon-separated list of
+ parameter=value pairs. When conveyed using the "fmtp" source
+ attribute, these parameters are only associated with the given
+ source and payload type as parts of the "fmtp" source attribute.
+
+ Informative note: Conveyance of sprop-parameter-sets and
+ sprop-level-parameter-sets using the "fmtp" source attribute
+ allows for out-of-band transport of parameter sets in
+ topologies like Topo-Video-switch-MCU [RFC5117].
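+
+ Informative sketch (not part of the specification): assembling the
+ "m=", "a=rtpmap", and "a=fmtp" lines from the mapping rules above.
+ The RTP/AVP profile, the port and payload type numbers, and the
+ function name are assumptions made for illustration.
+
```python
from typing import Dict, List


def build_sdp_media_lines(port: int, payload_type: int,
                          params: Dict[str, str]) -> List[str]:
    """Return SDP lines for one video/H264-SVC payload type."""
    lines = [
        f"m=video {port} RTP/AVP {payload_type}",     # media name is "video"
        f"a=rtpmap:{payload_type} H264-SVC/90000",    # encoding name / clock rate
    ]
    # Parameters form a semicolon-separated list of parameter=value pairs.
    fmtp = "; ".join(f"{name}={value}" for name, value in params.items())
    if fmtp:
        lines.append(f"a=fmtp:{payload_type} {fmtp}")
    return lines
```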
+
+7.2.2. Usage with the SDP Offer/Answer Model
+
+ When an SVC stream (with media subtype H264-SVC) is offered over RTP
+ using SDP in an Offer/Answer model [RFC3264] for negotiation for
+ unicast usage, the following limitations and rules apply:
+
+ o The parameters identifying a media format configuration for SVC
+ are profile-level-id, packetization-mode, and mst-mode. These
+ media configuration parameters (except for the level part of
+ profile-level-id) MUST be used symmetrically when the answerer
+ does not include scalable-layer-id in the answer; i.e., the
+ answerer MUST either maintain all configuration parameters or
+ remove the media format (payload type) completely, if one or more
+ of the parameter values are not supported. Note that the level
+ part of profile-level-id includes level_idc, and, for indication
+ of level 1b when profile_idc is equal to 66, 77, or 88, bit 4
+ (constraint_set3_flag) of profile-iop. The level part of profile-
+ level-id is changeable.
+
+ Informative note: The requirement for symmetric use does not
+ apply for the level part of profile-level-id, and does not
+ apply for the other stream properties and capability
+ parameters.
+
+ Informative note: In [H.264], all the levels except for Level
+ 1b are equal to the value of level_idc divided by 10. Level 1b
+ is a level higher than Level 1.0 but lower than Level 1.1, and
+ is signaled in an ad hoc manner. For the Baseline, Main, and
+ Extended profiles (with profile_idc equal to 66, 77, and 88,
+ respectively), Level 1b is indicated by level_idc equal to 11
+ (i.e., the same as level 1.1) and constraint_set3_flag equal to
+ 1. For other profiles, Level 1b is indicated by level_idc
+ equal to 9 (but note that Level 1b for these profiles is still
+ higher than Level 1, which has level_idc equal to 10, and lower
+ than Level 1.1). In SDP Offer/Answer, an answer may indicate a
+ level equal to or lower than the level indicated in the offer.
+ Due to the ad hoc indication of Level 1b, offerers and
+ answerers must check the value of bit 4 (constraint_set3_flag)
+ of the middle octet of the parameter profile-level-id, when
+ profile_idc is equal to 66, 77, or 88 and level_idc is equal to
+ 11.
+
+ To simplify handling and matching of these configurations, the
+ same RTP payload type number used in the offer should also be used
+ in the answer, as specified in [RFC3264]. The same RTP payload
+ type number used in the offer MUST also be used in the answer when
+ the answer includes scalable-layer-id. When the answer does not
+ include scalable-layer-id, the answer MUST NOT contain a payload
+ type number used in the offer unless the configuration is exactly
+ the same as in the offer or the configuration in the answer only
+ differs from that in the offer with a level lower than the default
+ level offered.
+
+ Informative note: When an offerer receives an answer that does
+ not include scalable-layer-id it has to compare payload types
+ not declared in the offer based on the media type (i.e.,
+ video/H264-SVC) and the above media configuration parameters
+ with any payload types it has already declared. This will
+ enable it to determine whether the configuration in question is
+ new or if it is equivalent to configuration already offered,
+ since a different payload type number may be used in the
+ answer.
+
+ Since an SVC stream may contain multiple operation points, a
+ facility is provided so that an answerer can select a different
+ operation point than the entire SVC stream. Specifically,
+ different operation points MAY be described using the sprop-
+ scalability-info or sprop-operation-point-info parameters. The
+ first one carries the entire scalability information SEI message
+ defined in Annex G of [H.264], whereas the second one may be
+ derived, e.g., as a subset of this SEI message that only contains
+ key information about an operation point. Operation points, in
+ both cases, are associated with a layer identifier.
+
+ If such information (sprop-operation-point-info or sprop-
+ scalability-info) is provided in an offer, an answerer MAY select
+ from the various operation points offered in the sprop-
+ scalability-info or sprop-operation-point-info parameters
+ by including scalable-layer-id in the answer. By this, the
+ answerer indicates its selection of a particular operation point
+ in the received and/or in the sent stream. When such operation
+ point selection takes place, i.e., the answerer includes scalable-
+ layer-id in the answer, the media configuration parameters MUST
+ NOT be present in the answer. Rather, the media configuration
+ that the answerer will use for receiving and/or sending is the one
+ used for the selected operation point as indicated in the offer.
+
+ Informative note: The ability to perform operation point
+ selection enables a receiver to utilize the scalable nature of
+ an SVC stream.
+
+ o The parameter max-recv-level, when present, declares the highest
+ level supported for receiving. In case max-recv-level is not
+ present, the highest level supported for receiving is equal to the
+ default level indicated by the level part of profile-level-id.
+ max-recv-level, when present, MUST be higher than the default
+ level.
+
+ o The parameter max-recv-base-level, when present, declares the
+ highest level of the base layer supported for receiving. When
+ max-recv-base-level is not present, the highest level supported
+ for the base layer is not constrained separately from the SVC
+ stream containing the base layer. The endpoint at the other side
+ MUST NOT send a scalable stream for which the base layer is of a
+ level higher than max-recv-base-level. Parameters declaring
+ receiver capabilities above the default level (max-mbps, max-
+ smbps, max-fs, max-cpb, max-dpb, max-br, and max-recv-level) do
+ not apply to the base layer when max-recv-base-level is present.
+
+ o The parameters sprop-deint-buf-req, sprop-interleaving-depth,
+ sprop-max-don-diff, sprop-init-buf-time, sprop-mst-csdon-always-
+ present, sprop-remux-buf-req, sprop-mst-remux-buf-size, sprop-
+ remux-init-buf-time, sprop-mst-max-don-diff, sprop-scalability-
+ info, sprop-operation-point-info, sprop-no-NAL-reordering-
+ required, and sprop-avc-ready describe the properties of the NAL
+ unit stream that the offerer or answerer is sending for the media
+ format configuration. This differs from the normal usage of the
+ Offer/Answer parameters: normally such parameters declare the
+ properties of the stream that the offerer or the answerer is able
+ to receive. When dealing with SVC, the offerer assumes that the
+ answerer will be able to receive media encoded using the
+ configuration being offered.
+
+ Informative note: The above parameters apply for any stream
+ sent by the declaring entity with the same configuration; i.e.,
+ they are dependent on their source. Rather than being bound to
+ the payload type, the values may have to be applied to another
+ payload type when being sent, as they apply for the
+ configuration.
+
+ o The capability parameters max-mbps, max-fs, max-cpb, max-dpb, max-
+ br, redundant-pic-cap, and max-rcmd-nalu-size MAY be used to
+ declare further capabilities of the offerer or answerer for
+ receiving. These parameters MUST NOT be present when the
+ direction attribute is sendonly, and the parameters describe the
+ limitations of what the offerer or answerer accepts for receiving
+ streams.
+
+ o When mst-mode is not present and packetization-mode is equal to 2,
+ the following applies.
+
+ o An offerer has to include the size of the de-interleaving
+ buffer, sprop-deint-buf-req, in the offer. To enable the
+ offerer and answerer to inform each other about their
+ capabilities for de-interleaving buffering, both parties are
+ RECOMMENDED to include deint-buf-cap. It is also RECOMMENDED
+ to consider offering multiple payload types with different
+ buffering requirements when the capabilities of the receiver
+ are unknown.
+
+ o When mst-mode is present and equal to "NI-C", "NI-TC", or "I-C",
+ the following applies.
+
+ o An offerer has to include sprop-remux-buf-req in the offer. To
+ enable the offerer and answerer to inform each other about
+ their capabilities for re-multiplexing buffering, both parties
+ are RECOMMENDED to include remux-buf-cap. It is also
+ RECOMMENDED to consider offering multiple payload types with
+ different buffering requirements when the capabilities of the
+ receiver are unknown.
+
+ o The sprop-parameter-sets or sprop-level-parameter-sets parameter,
+ when present (included in the "a=fmtp" line of SDP or conveyed
+ using the "fmtp" source attribute as specified in Section 6.3 of
+ [RFC5576]), is used for out-of-band transport of parameter sets.
+ However, when out-of-band transport of parameter sets is used,
+ parameter sets MAY still be additionally transported in-band.
+
+ The answerer MAY use either out-of-band or in-band transport of
+ parameter sets for the stream it is sending, regardless of whether
+ out-of-band parameter sets transport has been used in the offerer-
+ to-answerer direction. Parameter sets included in an answer are
+ independent of those parameter sets included in the offer, as they
+ are used for decoding two different video streams, one from the
+ answerer to the offerer, and the other in the opposite direction.
+
+ The following rules apply to transport of parameter sets in the
+ offerer-to-answerer direction.
+
+ o An offer MAY include either or both of sprop-parameter-sets
+ and sprop-level-parameter-sets. If neither sprop-parameter-
+ sets nor sprop-level-parameter-sets is present in the offer,
+ then only in-band transport of parameter sets is used.
+
+ o If the answer includes in-band-parameter-sets equal to 1, then
+ the offerer MUST transmit parameter sets in-band. Otherwise,
+ the following applies.
+
+ o If the level to use in the offerer-to-answerer direction is
+ equal to the default level in the offer, the following
+ applies.
+
+ The answerer MUST be prepared to use the parameter sets
+ included in sprop-parameter-sets, when present, for
+ decoding the incoming NAL unit stream, and ignore sprop-
+ level-parameter-sets, when present.
+
+ When sprop-parameter-sets is not present in the offer,
+ in-band transport of parameter sets MUST be used.
+
+ o Otherwise (the level to use in the offerer-to-answerer
+ direction is not equal to the default level in the offer),
+ the following applies.
+
+ The answerer MUST be prepared to use the parameter sets
+ that are included in sprop-level-parameter-sets for the
+ accepted level (i.e., the default level in the answer,
+ which is also the level to use in the offerer-to-answerer
+ direction), when present, for decoding the incoming NAL
+ unit stream, and ignore all other parameter sets included
+ in sprop-level-parameter-sets and sprop-parameter-sets,
+ when present.
+
+ When no parameter sets for the accepted level are present
+ in the sprop-level-parameter-sets, in-band transport of
+ parameter sets MUST be used.
+
+ The following rules apply to transport of parameter sets in the
+ answerer-to-offerer direction.
+
+ o An answer MAY include either sprop-parameter-sets or sprop-
+ level-parameter-sets, but MUST NOT include both. If
+ neither sprop-parameter-sets nor sprop-level-parameter-sets is
+ present in the answer, then only in-band transport of parameter
+ sets is used.
+
+ o If the offer includes in-band-parameter-sets equal to 1, then
+ the answerer MUST NOT include sprop-parameter-sets or sprop-
+ level-parameter-sets in the answer and MUST transmit parameter
+ sets in-band. Otherwise, the following applies.
+
+ o If the level to use in the answerer-to-offerer direction is
+ equal to the default level in the answer, the following
+ applies.
+
+ The offerer MUST be prepared to use the parameter sets
+ included in sprop-parameter-sets, when present, for
+ decoding the incoming NAL unit stream, and ignore sprop-
+ level-parameter-sets, when present.
+
+ When sprop-parameter-sets is not present in the answer,
+ the answerer MUST transmit parameter sets in-band.
+
+ o Otherwise (the level to use in the answerer-to-offerer
+ direction is not equal to the default level in the answer),
+ the following applies.
+
+ The offerer MUST be prepared to use the parameter sets
+ that are included in sprop-level-parameter-sets for the
+ level to use in the answerer-to-offerer direction, when
+ present in the answer, for decoding the incoming NAL unit
+ stream, and ignore all other parameter sets included in
+ sprop-level-parameter-sets and sprop-parameter-sets, when
+ present in the answer.
+
+ When no parameter sets for the level to use in the
+ answerer-to-offerer direction are present in sprop-level-
+ parameter-sets in the answer, the answerer MUST transmit
+ parameter sets in-band.
+
+ When sprop-parameter-sets or sprop-level-parameter-sets is
+ conveyed using the "fmtp" source attribute as specified in Section
+ 6.3 of [RFC5576], the receiver of the parameters MUST store the
+ parameter sets included in the sprop-parameter-sets or sprop-
+ level-parameter-sets for the accepted level and associate them to
+ the source given as a part of the "fmtp" source attribute.
+ Parameter sets associated with one source MUST only be used to
+ decode NAL units conveyed in RTP packets from the same source.
+ When this mechanism is in use, SSRC collision detection and
+ resolution MUST be performed as specified in [RFC5576].
+
+ Informative note: Conveyance of sprop-parameter-sets and sprop-
+ level-parameter-sets using the "fmtp" source attribute may be
+ used in topologies like Topo-Video-switch-MCU [RFC5117] to
+ enable out-of-band transport of parameter sets.
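+
+ Informative sketch (not part of the specification): the rules above
+ for choosing between out-of-band and in-band parameter sets in one
+ direction can be read as a small decision procedure. Modeling
+ sprop-level-parameter-sets as a mapping from a level label to a list
+ of parameter sets is a simplification of its actual syntax, and the
+ function name is hypothetical.
+
```python
from typing import Dict, List, Optional


def select_parameter_sets(
        peer_requires_in_band: bool,
        level_in_use: str,
        default_level: str,
        sprop_parameter_sets: Optional[List[str]],
        sprop_level_parameter_sets: Optional[Dict[str, List[str]]],
) -> Optional[List[str]]:
    """Return the out-of-band parameter sets the receiver prepares to use.

    None means that in-band transport of parameter sets has to be used.
    """
    # The peer signaled in-band-parameter-sets equal to 1.
    if peer_requires_in_band:
        return None
    if level_in_use == default_level:
        # Use sprop-parameter-sets; sprop-level-parameter-sets is ignored.
        return sprop_parameter_sets or None
    # Otherwise only the sets for the level in use apply; all others are ignored.
    if sprop_level_parameter_sets is None:
        return None
    return sprop_level_parameter_sets.get(level_in_use) or None
```
+
+ The same procedure applies in both directions: for the offerer-to-
+ answerer direction the inputs come from the offer (and the answer's
+ in-band-parameter-sets), and for the answerer-to-offerer direction
+ they come from the answer (and the offer's in-band-parameter-sets).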
+
+ For streams being delivered over multicast, the following rules
+ apply:
+
+ o The media format configuration is identified by profile-level-id,
+ including the level part, packetization-mode, and mst-mode. These
+ media format configuration parameters (including the level part of
+ profile-level-id) MUST be used symmetrically; i.e., the answerer
+ MUST either maintain all configuration parameters or remove the
+ media format (payload type) completely. Note that this implies
+ that the level part of profile-level-id for Offer/Answer in
+ multicast is not changeable.
+
+ To simplify handling and matching of these configurations, the
+ same RTP payload type number used in the offer should also be used
+ in the answer, as specified in [RFC3264]. An answer MUST NOT
+ contain a payload type number used in the offer unless the
+ configuration is the same as in the offer.
+
+ o Parameter sets received MUST be associated with the originating
+ source, and MUST be only used in decoding the incoming NAL unit
+ stream from the same source.
+
+ o The rules for other parameters are the same as above for unicast
+ as long as the above rules are obeyed.
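+
+ Informative sketch (not part of the specification): both the unicast
+ and the multicast rules depend on the level part of profile-level-id,
+ including the ad hoc signaling of Level 1b. The following decodes
+ the level and checks an answer against an offer. The function names
+ and the numeric rank given to Level 1b are illustrative choices.
+
```python
def decode_level(profile_level_id: str) -> str:
    """Return the level ("1.0", "1b", "1.1", ...) from a base16 profile-level-id."""
    if len(profile_level_id) != 6:
        raise ValueError("profile-level-id must be six hexadecimal digits")
    profile_idc, profile_iop, level_idc = bytes.fromhex(profile_level_id)
    constraint_set3_flag = (profile_iop >> 4) & 1  # bit 4 of profile-iop
    if profile_idc in (66, 77, 88):
        # Baseline, Main, Extended: level_idc 11 plus the flag means Level 1b.
        if level_idc == 11 and constraint_set3_flag:
            return "1b"
    elif level_idc == 9:
        # Other profiles signal Level 1b with level_idc equal to 9.
        return "1b"
    return f"{level_idc // 10}.{level_idc % 10}"


def level_rank(level: str) -> float:
    """Order levels so that 1.0 < 1b < 1.1."""
    return 1.05 if level == "1b" else float(level)


def level_ok(offer_plid: str, answer_plid: str, multicast: bool = False) -> bool:
    """Unicast: answer level may be equal or lower. Multicast: it must be equal."""
    offered = level_rank(decode_level(offer_plid))
    answered = level_rank(decode_level(answer_plid))
    return answered == offered if multicast else answered <= offered
```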
+
+ Table 14 lists the interpretation of all the parameters that MUST be
+ used for the various combinations of offer, answer, and direction
+ attributes. Note that the two columns wherein the scalable-layer-id
+ parameter is used only apply to answers, whereas the other columns
+ apply to both offers and answers.
+
+ Table 14. Interpretation of parameters for various combinations of
+ offers, answers, direction attributes, with and without scalable-
+ layer-id. Columns that do not indicate offer or answer apply to
+ both.
+
+                                      sendonly --+
+        answer: recvonly, scalable-layer-id --+  |
+          recvonly w/o scalable-layer-id --+  |  |
+  answer: sendrecv, scalable-layer-id --+  |  |  |
+    sendrecv w/o scalable-layer-id --+  |  |  |  |
+                                     |  |  |  |  |
+ profile-level-id                    C  X  C  X  P
+ max-recv-level                      R  R  R  R  -
+ max-recv-base-level                 R  R  R  R  -
+ packetization-mode                  C  X  C  X  P
+ mst-mode                            C  X  C  X  P
+ sprop-avc-ready                     P  P  -  -  P
+ sprop-deint-buf-req                 P  P  -  -  P
+ sprop-init-buf-time                 P  P  -  -  P
+ sprop-interleaving-depth            P  P  -  -  P
+ sprop-max-don-diff                  P  P  -  -  P
+ sprop-mst-csdon-always-present      P  P  -  -  P
+ sprop-mst-max-don-diff              P  P  -  -  P
+ sprop-mst-remux-buf-size            P  P  -  -  P
+ sprop-no-NAL-reordering-required    P  P  -  -  P
+ sprop-operation-point-info          P  P  -  -  P
+ sprop-remux-buf-req                 P  P  -  -  P
+ sprop-remux-init-buf-time           P  P  -  -  P
+ sprop-scalability-info              P  P  -  -  P
+ deint-buf-cap                       R  R  R  R  -
+ max-br                              R  R  R  R  -
+ max-cpb                             R  R  R  R  -
+ max-dpb                             R  R  R  R  -
+ max-fs                              R  R  R  R  -
+ max-mbps                            R  R  R  R  -
+ max-rcmd-nalu-size                  R  R  R  R  -
+ redundant-pic-cap                   R  R  R  R  -
+ remux-buf-cap                       R  R  R  R  -
+ in-band-parameter-sets              R  R  R  R  -
+ sprop-parameter-sets                S  S  -  -  S
+ sprop-level-parameter-sets          S  S  -  -  S
+ scalable-layer-id                   X  O  X  O  -
+
+ Legend:
+
+ C: configuration for sending and receiving streams
+ P: properties of the stream to be sent
+ R: receiver capabilities
+ S: out-of-band parameter sets
+ O: operation point selection
+ X: MUST NOT be present
+ -: not usable, when present SHOULD be ignored
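+
+ Informative sketch (not part of the specification): the "X" and "-"
+ entries of Table 14 can be checked mechanically. The grouping of
+ parameter names below follows the rows of the table; the function
+ name and the shape of the result are hypothetical.
+
```python
from typing import Dict, Set

CONFIG = {"profile-level-id", "packetization-mode", "mst-mode"}
OUT_OF_BAND = {"sprop-parameter-sets", "sprop-level-parameter-sets"}
RECEIVER_CAPS = {
    "max-recv-level", "max-recv-base-level", "deint-buf-cap", "max-br",
    "max-cpb", "max-dpb", "max-fs", "max-mbps", "max-rcmd-nalu-size",
    "redundant-pic-cap", "remux-buf-cap", "in-band-parameter-sets",
}


def check_parameters(direction: str, params: Set[str]) -> Dict[str, Set[str]]:
    """Classify parameters as violations ("X") or ignorable ("-") per Table 14."""
    if direction not in ("sendrecv", "recvonly", "sendonly"):
        raise ValueError("unknown direction attribute")
    selecting = "scalable-layer-id" in params
    violations: Set[str] = set()
    ignored: Set[str] = set()
    for name in params:
        if name == "scalable-layer-id":
            if direction == "sendonly":
                ignored.add(name)       # "-" in the sendonly column
        elif name in CONFIG:
            if selecting and direction != "sendonly":
                violations.add(name)    # "X": MUST NOT be present
        elif name in RECEIVER_CAPS:
            if direction == "sendonly":
                ignored.add(name)       # receiver capabilities unused when only sending
        elif name in OUT_OF_BAND or name.startswith("sprop-"):
            if direction == "recvonly":
                ignored.add(name)       # stream properties unused when only receiving
    return {"violations": violations, "ignored": ignored}
```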
+
+ Parameters used for declaring receiver capabilities are in general
+ downgradable; i.e., they express the upper limit for a sender's
+ possible behavior. Thus, a sender MAY select to set its encoder
+ using only lower/lesser or equal values of these parameters.
+
+ Parameters declaring a configuration point are not changeable, with
+ the exception of the level part of the profile-level-id parameter for
+ unicast usage. This expresses values a receiver expects to be used
+ and must be used verbatim on the sender side. If level downgrading
+ (for profile-level-id) is used, an answerer MUST NOT include the
+ scalable-layer-id parameter.
+
+ When a sender's capabilities are declared, and non-downgradable
+ parameters are used in this declaration, then these parameters
+ express a configuration that is acceptable for the sender to receive
+ streams. In order to achieve high interoperability levels, it is
+ often advisable to offer multiple alternative configurations, e.g.,
+ for the packetization mode. It is impossible to offer multiple
+ configurations in a single payload type. Thus, when multiple
+ configuration offers are made, each offer requires its own RTP
+ payload type associated with the offer.
+
+ A receiver SHOULD understand all media type parameters, even if it
+ only supports a subset of the payload format's functionality. This
+ ensures that a receiver is capable of understanding when an offer to
+ receive media can be downgraded to what is supported by the receiver
+ of the offer.
+
+ An answerer MAY extend the offer with additional media format
+ configurations. However, to enable their usage, in most cases a
+ second offer is required from the offerer to provide the stream
+ property parameters that the media sender will use. This also has
+ the effect that the offerer has to be able to receive this media
+ format configuration, not only to send it.
+
+ If an offerer wishes to have non-symmetric capabilities between
+ sending and receiving, the offerer can allow asymmetric levels via
+ level-asymmetry-allowed equal to 1. Alternatively, the offerer can
+ offer different RTP sessions, i.e., different media lines declared as
+ "recvonly" and "sendonly", respectively. This may have further
+ implications on the system, and may require additional external
+ semantics to associate the two media lines.
+
+7.2.3. Dependency Signaling in Multi-Session Transmission
+
+ If MST is used, the rules on signaling media decoding dependency in
+ SDP as defined in [RFC5583] apply. The rules on "hierarchical or
+ layered encoding" with multicast in Section 5.7 of [RFC4566] do not
+ apply, i.e., the notation for Connection Data "c=" SHALL NOT be used
+ with more than one address. Additionally, the order of dependencies
+ of the RTP sessions indicated by the "a=depend" attribute as defined
+ in [RFC5583] MUST represent the decoding order of the VCL NAL units
+ in an access unit, i.e., the order of session dependency is given
+ from the base or the lowest enhancement RTP session (the most
+ important) to the highest enhancement RTP session (the least
+ important).
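+
+ Informative note: The following sketch shows one way a receiver
+ might read "a=depend" values and derive the session order from the
+ base session to the highest enhancement session. It assumes the
+ simple "<fmt> <type> <mid>:<fmt>,<fmt> ..." layout that appears in
+ the examples of this memo; the function names are hypothetical, and
+ [RFC5583] remains the authoritative grammar.
+

```python
def parse_depend(value):
    """Parse the value of an "a=depend:" attribute.

    Returns {fmt: (dependency_type, {mid: [fmt, ...]})}, all as strings.
    """
    result = {}
    for entry in value.split(";"):
        tokens = entry.split()
        if not tokens:
            continue
        fmt, dep_type, targets = tokens[0], tokens[1], tokens[2:]
        deps = {}
        for target in targets:
            mid, _, fmts = target.partition(":")
            deps[mid] = fmts.split(",")
        result[fmt] = (dep_type, deps)
    return result


def session_order(depends_on):
    """Order media ids so that every session follows the sessions it depends on.

    depends_on maps a media id to the media ids it depends on.
    """
    ordered, remaining = [], dict(depends_on)
    while remaining:
        ready = sorted(m for m, deps in remaining.items() if set(deps) <= set(ordered))
        if not ready:
            raise ValueError("circular or unresolved dependency")
        ordered.extend(ready)
        for m in ready:
            del remaining[m]
    return ordered


print(parse_depend("99 lay L1:96,97; 100 lay L1:98"))
print(session_order({"L3": ["L1", "L2"], "L1": [], "L2": ["L1"]}))
```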
+
+7.2.4. Usage in Declarative Session Descriptions
+
+ When SVC over RTP is offered with SDP in a declarative style, as in
+ Real Time Streaming Protocol (RTSP) [RFC2326] or Session Announcement
+ Protocol (SAP) [RFC2974], the following considerations are necessary.
+
+ o All parameters capable of indicating both stream properties and
+ receiver capabilities are used to indicate only stream properties.
+ For example, in this case, the parameter profile-level-id declares
+ the values used by the stream, not the capabilities for receiving
+ streams. Consequently, the following interpretation of the
+ parameters MUST be used:
+
+ Declaring actual configuration or stream properties:
+
+ - profile-level-id
+ - packetization-mode
+ - mst-mode
+ - sprop-deint-buf-req
+ - sprop-interleaving-depth
+ - sprop-max-don-diff
+ - sprop-init-buf-time
+ - sprop-mst-csdon-always-present
+ - sprop-mst-remux-buf-size
+ - sprop-remux-buf-req
+ - sprop-remux-init-buf-time
+ - sprop-mst-max-don-diff
+ - sprop-scalability-info
+ - sprop-operation-point-info
+ - sprop-no-NAL-reordering-required
+ - sprop-avc-ready
+
+ Out-of-band transporting of parameter sets:
+
+ - sprop-parameter-sets
+ - sprop-level-parameter-sets
+
+ Not usable (when present, they SHOULD be ignored):
+
+ - max-mbps
+ - max-fs
+ - max-cpb
+ - max-dpb
+ - max-br
+ - max-recv-level
+ - max-recv-base-level
+ - redundant-pic-cap
+ - max-rcmd-nalu-size
+ - deint-buf-cap
+ - remux-buf-cap
+ - scalable-layer-id
+
+ o A receiver of the SDP is required to support all parameters and
+ values of the parameters provided; otherwise, the receiver MUST
+ reject (RTSP) or not participate in (SAP) the session. It falls
+ on the creator of the session to use values that are expected to
+ be supported by the receiving application.
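+
+ Informative note: The classification above can be captured in a
+ small lookup, sketched below. The parameter names are taken from the
+ lists in this section; the category labels and function names are
+ hypothetical.
+

```python
# Parameter names are copied from the lists in this section.
STREAM_PROPERTIES = {
    "profile-level-id", "packetization-mode", "mst-mode",
    "sprop-deint-buf-req", "sprop-interleaving-depth", "sprop-max-don-diff",
    "sprop-init-buf-time", "sprop-mst-csdon-always-present",
    "sprop-mst-remux-buf-size", "sprop-remux-buf-req",
    "sprop-remux-init-buf-time", "sprop-mst-max-don-diff",
    "sprop-scalability-info", "sprop-operation-point-info",
    "sprop-no-NAL-reordering-required", "sprop-avc-ready",
}
PARAMETER_SETS = {"sprop-parameter-sets", "sprop-level-parameter-sets"}
NOT_USABLE = {
    "max-mbps", "max-fs", "max-cpb", "max-dpb", "max-br", "max-recv-level",
    "max-recv-base-level", "redundant-pic-cap", "max-rcmd-nalu-size",
    "deint-buf-cap", "remux-buf-cap", "scalable-layer-id",
}


def classify_declarative(name):
    """Return the role a parameter plays in a declarative session description."""
    if name in STREAM_PROPERTIES:
        return "stream-property"
    if name in PARAMETER_SETS:
        return "parameter-sets"
    if name in NOT_USABLE:
        return "not-usable"
    return "unknown"


def drop_unusable(params):
    """Ignore parameters that are not usable in declarative descriptions."""
    return {k: v for k, v in params.items() if classify_declarative(k) != "not-usable"}


print(drop_unusable({"profile-level-id": "53000c", "max-fs": "1620"}))
```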
+
+7.3. Examples
+
+ In the following examples, "{data}" is used to indicate a data string
+ encoded as base64.
+
+7.3.1. Example for Offering a Single SVC Session
+
+ Example 1: The offerer offers one video media description including
+ two RTP payload types. The first payload type offers H264, and the
+ second offers H264-SVC. The two payload types differ in their fmtp
+ parameters, such as profile-level-id, packetization-mode, and
+ sprop-parameter-sets.
+
+ Offerer -> Answerer SDP message:
+
+ m=video 20000 RTP/AVP 97 96
+ a=rtpmap:96 H264/90000
+ a=fmtp:96 profile-level-id=4de00a; packetization-mode=0;
+ sprop-parameter-sets={sps0},{pps0};
+ a=rtpmap:97 H264-SVC/90000
+ a=fmtp:97 profile-level-id=53000c; packetization-mode=1;
+ sprop-parameter-sets={sps0},{pps0},{sps1},{pps1};
+
+ If the answerer does not support media subtype H264-SVC, it can issue
+ an answer accepting only the base layer offer (payload type 96). In
+ the following example, the receiver supports H264-SVC, so it lists
+ payload type 97 first as the preferred option.
+
+ Answerer -> Offerer SDP message:
+
+ m=video 40000 RTP/AVP 97 96
+ a=rtpmap:96 H264/90000
+ a=fmtp:96 profile-level-id=4de00a; packetization-mode=0;
+ sprop-parameter-sets={sps2},{pps2};
+ a=rtpmap:97 H264-SVC/90000
+ a=fmtp:97 profile-level-id=53000c; packetization-mode=1;
+ sprop-parameter-sets={sps2},{pps2},{sps3},{pps3};
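+
+ Informative note: A minimal sketch of splitting an "a=fmtp" line,
+ such as those in Example 1, into its parameters. It assumes the
+ continuation lines have already been joined into one line; the
+ function name is hypothetical. Splitting on the first "=" only keeps
+ any base64 padding inside parameter values intact.
+

```python
def parse_fmtp(line):
    """Split an "a=fmtp:<pt> name=value; ..." line into (payload_type, params)."""
    head, _, rest = line.partition(" ")
    payload_type = int(head.split(":", 1)[1])
    params = {}
    for item in rest.split(";"):
        item = item.strip()
        if item:
            # Split on the first "=" only: values may themselves contain "=".
            name, _, value = item.partition("=")
            params[name.strip()] = value.strip()
    return payload_type, params


line = ("a=fmtp:97 profile-level-id=53000c; packetization-mode=1; "
        "sprop-parameter-sets={sps0},{pps0},{sps1},{pps1};")
print(parse_fmtp(line))
```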
+
+7.3.2. Example for Offering a Single SVC Session Using
+ scalable-layer-id
+
+ Example 2: Offerer offers the same media configurations as shown in
+ the example above for receiving and sending the stream, but using a
+ single RTP payload type and including sprop-operation-point-info.
+
+ Offerer -> Answerer SDP message:
+
+ m=video 20000 RTP/AVP 97
+ a=rtpmap:97 H264-SVC/90000
+ a=fmtp:97 profile-level-id=53000c; packetization-mode=1;
+ sprop-parameter-sets={sps0},{sps1},{pps0},{pps1};
+ sprop-operation-point-info=<1,0,0,0,4de00a,3200,176,144,128,
+ 256>,<2,1,1,0,53000c,6400,352,288,256,512>;
+
+ In this example, the receiver supports H264-SVC and chooses the lower
+ operation point offered in the RTP payload type for sending and
+ receiving the stream.
+
+ Answerer -> Offerer SDP message:
+
+ m=video 40000 RTP/AVP 97
+ a=rtpmap:97 H264-SVC/90000
+ a=fmtp:97 sprop-parameter-sets={sps2},{sps3},{pps2},{pps3};
+ scalable-layer-id=1;
+
+ An equivalent example using sprop-scalability-info instead of
+ sprop-operation-point-info would replace the
+ sprop-operation-point-info parameter with sprop-scalability-info,
+ followed by the binary (base16) representation of the Scalability
+ Information SEI message.
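+
+ Informative note: The sketch below extracts the operation points
+ from a sprop-operation-point-info value and looks up the one chosen
+ by scalable-layer-id. It relies only on what the examples show: the
+ first field of each "<...>" group is the layer id matched by
+ scalable-layer-id, and the fifth field has the form of a
+ profile-level-id. The remaining fields are kept as raw strings, and
+ the function names are hypothetical.
+

```python
import re


def parse_operation_points(value):
    """Return one dict per "<...>" group found in the parameter value."""
    points = []
    for body in re.findall(r"<([^<>]*)>", value):
        fields = [f.strip() for f in body.split(",")]
        points.append({
            "layer_id": int(fields[0]),
            "profile_level_id": fields[4].lower(),
            "fields": fields,  # all fields, uninterpreted
        })
    return points


def select_operation_point(points, scalable_layer_id):
    """Return the operation point whose layer id equals scalable_layer_id, or None."""
    for point in points:
        if point["layer_id"] == scalable_layer_id:
            return point
    return None


value = "<1,0,0,0,4de00a,3200,176,144,128, 256>,<2,1,1,0,53000c,6400,352,288,256,512>"
print(select_operation_point(parse_operation_points(value), 1)["profile_level_id"])
```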
+
+7.3.3. Example for Offering Multiple Sessions in MST
+
+ Example 3: In this example, the offerer offers a multi-session
+ transmission with up to three sessions. The base session media
+ description includes payload types that are backward compatible with
+ [RFC6184], and three different payload types are offered. The other
+ two media are using payload types with media subtype H264-SVC. In
+ each media description, different values of profile-level-id,
+ packetization-mode, mst-mode, and sprop-parameter-sets are offered.
+
+ Offerer -> Answerer SDP message:
+
+ a=group:DDP L1 L2 L3
+ m=video 20000 RTP/AVP 96 97 98
+ a=rtpmap:96 H264/90000
+ a=fmtp:96 profile-level-id=4de00a; packetization-mode=0;
+ mst-mode=NI-T; sprop-parameter-sets={sps0},{pps0};
+ a=rtpmap:97 H264/90000
+ a=fmtp:97 profile-level-id=4de00a; packetization-mode=1;
+ mst-mode=NI-TC; sprop-parameter-sets={sps0},{pps0};
+ a=rtpmap:98 H264/90000
+ a=fmtp:98 profile-level-id=4de00a; packetization-mode=2;
+ mst-mode=I-C; sprop-init-buf-time=156320;
+ sprop-parameter-sets={sps0},{pps0};
+ a=mid:L1
+ m=video 20002 RTP/AVP 99 100
+ a=rtpmap:99 H264-SVC/90000
+ a=fmtp:99 profile-level-id=53000c; packetization-mode=1;
+ mst-mode=NI-T; sprop-parameter-sets={sps1},{pps1};
+ a=rtpmap:100 H264-SVC/90000
+ a=fmtp:100 profile-level-id=53000c; packetization-mode=2;
+ mst-mode=I-C; sprop-parameter-sets={sps1},{pps1};
+ a=mid:L2
+ a=depend:99 lay L1:96,97; 100 lay L1:98
+ m=video 20004 RTP/AVP 101
+ a=rtpmap:101 H264-SVC/90000
+ a=fmtp:101 profile-level-id=53001F; packetization-mode=1;
+ mst-mode=NI-T; sprop-parameter-sets={sps2},{pps2};
+ a=mid:L3
+ a=depend:101 lay L1:96,97 L2:99
+
+ It is assumed that in this example the answerer only supports the NI-
+ T mode for multi-session transmission. For this reason, it chooses
+ the corresponding payload type (96) for the base RTP session. For
+ the two enhancement RTP sessions, the answerer also chooses the
+ payload types that use the NI-T mode (99 and 101).
+
+ Answerer -> Offerer SDP message:
+
+ a=group:DDP L1 L2 L3
+ m=video 40000 RTP/AVP 96
+ a=rtpmap:96 H264/90000
+ a=fmtp:96 profile-level-id=4de00a; packetization-mode=0;
+ mst-mode=NI-T; sprop-parameter-sets={sps3},{pps3};
+ a=mid:L1
+ m=video 40002 RTP/AVP 99
+ a=rtpmap:99 H264-SVC/90000
+ a=fmtp:99 profile-level-id=53000c; packetization-mode=1;
+ mst-mode=NI-T; sprop-parameter-sets={sps4},{pps4};
+ a=mid:L2
+ a=depend:99 lay L1:96
+ m=video 40004 RTP/AVP 101
+ a=rtpmap:101 H264-SVC/90000
+ a=fmtp:101 profile-level-id=53001F; packetization-mode=1;
+ mst-mode=NI-T; sprop-parameter-sets={sps5},{pps5};
+ a=mid:L3
+ a=depend:101 lay L1:96 L2:99
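+
+ Informative note: The answerer's choice in Example 3 amounts to
+ keeping, in each media description, only the payload types whose
+ mst-mode it supports. A minimal sketch follows, with the offer
+ reduced to the mst-mode parameter; the data layout and function name
+ are hypothetical.
+

```python
def select_by_mst_mode(offer, supported_mode):
    """Keep, per media id, the payload types that use the supported mst-mode.

    offer maps a media id to {payload_type: fmtp_params}.
    """
    return {
        mid: [pt for pt, params in payload_types.items()
              if params.get("mst-mode") == supported_mode]
        for mid, payload_types in offer.items()
    }


offer = {
    "L1": {96: {"mst-mode": "NI-T"}, 97: {"mst-mode": "NI-TC"}, 98: {"mst-mode": "I-C"}},
    "L2": {99: {"mst-mode": "NI-T"}, 100: {"mst-mode": "I-C"}},
    "L3": {101: {"mst-mode": "NI-T"}},
}
print(select_by_mst_mode(offer, "NI-T"))
```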
+
+7.3.4. Example for Offering Multiple Sessions in MST Including
+ Operation with Answerer Using scalable-layer-id
+
+ Example 4: In this example, the offerer offers a multi-session
+ transmission of three layers with up to two sessions. The base
+ session media description has a payload type that is backward
+ compatible with [RFC6184]. Note that no parameter sets are provided,
+ in which case in-band transport must be used. The other media
+ description contains two enhancement layers and uses the media
+ subtype H264-SVC. It includes two operation point definitions.
+
+ Offerer -> Answerer SDP message:
+
+ a=group:DDP L1 L2
+ m=video 20000 RTP/AVP 96
+ a=rtpmap:96 H264/90000
+ a=fmtp:96 profile-level-id=4de00a; packetization-mode=0;
+ mst-mode=NI-T;
+ a=mid:L1
+ m=video 20002 RTP/AVP 97
+ a=rtpmap:97 H264-SVC/90000
+ a=fmtp:97 profile-level-id=53001F; packetization-mode=1;
+ mst-mode=NI-TC; sprop-operation-point-info=<2,0,1,0,53000c,
+ 3200,352,288,384,512>,<3,1,2,0,53001F,6400,704,576,768,1024>;
+ a=mid:L2
+ a=depend:97 lay L1:96
+
+ It is assumed that the answerer wants to send and receive the base
+ layer (payload type 96), but it only wants to send and receive the
+ lower enhancement layer, i.e., the one with layer id equal to 2. For
+ this reason, the response will include the selection of the desired
+ layer by setting scalable-layer-id equal to 2. Note that the answer
+ only includes the scalable-layer-id information. The answer could
+ also include sprop-parameter-sets.
+
+ Answerer -> Offerer SDP message:
+
+ a=group:DDP L1 L2
+ m=video 40000 RTP/AVP 96
+ a=rtpmap:96 H264/90000
+ a=fmtp:96 profile-level-id=4de00a; packetization-mode=0;
+ mst-mode=NI-T;
+ a=mid:L1
+ m=video 40002 RTP/AVP 97
+ a=rtpmap:97 H264-SVC/90000
+ a=fmtp:97 scalable-layer-id=2;
+ a=mid:L2
+ a=depend:97 lay L1:96
+
+7.3.5. Example for Negotiating an SVC Stream with a Constrained Base
+ Layer in SST
+
+ Example 5: The offerer (Alice) offers one video description including
+ two RTP payload types with differing levels and packetization modes.
+
+ Offerer -> Answerer SDP message:
+
+ m=video 20000 RTP/AVP 97 96
+ a=rtpmap:96 H264-SVC/90000
+ a=fmtp:96 profile-level-id=53001e; packetization-mode=0;
+ a=rtpmap:97 H264-SVC/90000
+ a=fmtp:97 profile-level-id=53001f; packetization-mode=1;
+
+ The answerer (Bridge) chooses packetization mode 1, and indicates
+ that it would receive an SVC stream with the base layer being
+ constrained.
+
+ Answerer -> Offerer SDP message:
+
+ m=video 40000 RTP/AVP 97
+ a=rtpmap:97 H264-SVC/90000
+ a=fmtp:97 profile-level-id=53001f; packetization-mode=1;
+ max-recv-base-level=000d
+
+ The answering endpoint must send an SVC stream at Level 3.1. Since
+ the offering endpoint did not declare max-recv-base-level, the base
+ layer of the SVC stream the answering endpoint must send is not
+ specifically constrained. The offering endpoint (Alice) must send an
+ SVC stream at Level 3.1, for which the base layer must be of a level
+ not higher than Level 1.3.
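+
+ Informative note: The level values in Example 5 can be checked with
+ the sketch below. It assumes that the three bytes of
+ profile-level-id are profile_idc, profile-iop, and level_idc, as in
+ [RFC6184], and it reads the second byte of max-recv-base-level as a
+ level_idc, which is consistent with the example (000d and Level
+ 1.3). Level 1b signaling is deliberately ignored, and the function
+ names are hypothetical.
+

```python
def decode_profile_level_id(value):
    """Split a 6-hex-digit profile-level-id into (profile_idc, profile_iop, level_idc)."""
    raw = bytes.fromhex(value)
    if len(raw) != 3:
        raise ValueError("profile-level-id must be exactly three bytes")
    profile_idc, profile_iop, level_idc = raw
    return profile_idc, profile_iop, level_idc


def level_string(level_idc):
    """Format a level_idc such as 31 as "3.1" (Level 1b is not handled)."""
    return f"{level_idc // 10}.{level_idc % 10}"


_, _, stream_level = decode_profile_level_id("53001f")
base_level = bytes.fromhex("000d")[1]  # assumption: second byte is level_idc
print(level_string(stream_level), level_string(base_level))
```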
+
+7.4. Parameter Set Considerations
+
+ Section 8.4 of [RFC6184] applies in this memo; the following applies
+ additionally for multi-session transmission (MST).
+
+ In MST, regardless of whether out-of-band or in-band transport of
+ parameter sets is in use, parameter sets required for decoding NAL
+ units carried in one particular RTP session SHOULD be carried in the
+ same session, MAY be carried in a session that the particular RTP
+ session depends on, and MUST NOT be carried in a session that the
+ particular RTP session does not depend on.
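+
+ Informative note: The placement rule can be expressed as a simple
+ check, sketched below. The dependency map is assumed to list, for
+ each session, every session it depends on (directly or indirectly);
+ the function name and the returned labels are hypothetical.
+

```python
def parameter_set_placement(carrying_session, using_session, depends_on):
    """Return the requirement level for carrying a parameter set in a session.

    using_session is the session whose NAL units need the parameter set;
    depends_on maps each session to the sessions it depends on.
    """
    if carrying_session == using_session:
        return "SHOULD"
    if carrying_session in depends_on.get(using_session, ()):
        return "MAY"
    return "MUST NOT"


deps = {"L1": [], "L2": ["L1"], "L3": ["L1", "L2"]}
print(parameter_set_placement("L2", "L2", deps))
print(parameter_set_placement("L1", "L2", deps))
print(parameter_set_placement("L3", "L2", deps))
```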
+
+8. Security Considerations
+
+ The security considerations of the RTP Payload Format for H.264 Video
+ specification [RFC6184] apply. Additionally, the following applies.
+
+ Decoders MUST exercise caution with respect to the handling of
+ reserved NAL unit types and reserved SEI messages, particularly if
+ they contain active elements, and MUST restrict their domain of
+ applicability to the presentation containing the stream. The safest
+ way is to simply discard these NAL units and SEI messages.
+
+ When integrity protection is applied to a stream, care MUST be taken
+ to account for the fact that the stream being transported may be
+ scalable; hence, a receiver may be able to access only part of the
+ entire stream.
+
+ End-to-end security with authentication, integrity, or
+ confidentiality protection will prevent a MANE from performing
+ media-aware operations other than discarding complete packets. With
+ confidentiality protection, the MANE is even prevented from
+ discarding packets in a media-aware way. To allow any
+ MANE to perform its operations, it will be required to be a trusted
+ entity that is included in the security context establishment. This
+ applies both for the media path and for the RTCP path, if RTCP
+ packets need to be rewritten.
+
+9. Congestion Control
+
+ Within any given RTP session carrying payload according to this
+ specification, the provisions of Section 10 of [RFC6184] apply.
+ Reducing the session bitrate is possible by one or more of the
+ following means:
+
+ a) Within the highest layer identified by the DID field remove any
+ NAL units with QID higher than a certain value.
+
+ b) Remove all NAL units with TID higher than a certain value.
+
+ c) Remove all NAL units associated with a DID higher than a certain
+ value.
+
+ Informative note: Removal of all coded slice NAL units
+ associated with DIDs higher than a certain value in the entire
+ stream is required in order to preserve conformance of the
+ resulting SVC stream.
+
+ d) Utilize the PRID field to indicate the relative importance of NAL
+ units, and remove all NAL units associated with a PRID higher than
+ a certain value. Note that the use of the PRID is application-
+ specific.
+
+ e) Remove NAL units or entire packets according to application-
+ specific rules. The result will depend on the particular coding
+ structure used as well as any additional application-specific
+ functionality (e.g., concealment performed at the receiving
+ decoder). In general, this will result in the reception of a non-
+ conforming bitstream and hence the decoder behavior is not
+ specified by [H.264]. Significant artifacts may therefore appear
+ in the decoded output if the particular decoder implementation
+ does not take appropriate action in response to congestion
+ control.
+
+ Informative note: The discussion above is centered on NAL units
+ rather than packets, primarily because that is the level where
+ senders can meaningfully manipulate the scalable bitstream. The
+ mapping of NAL units to RTP packets is fairly flexible when using
+ aggregation packets. Depending on the nature of the congestion
+ control algorithm, the "dimension" of congestion measurement
+ (packet count or bitrate) and reaction to it (reducing packet
+ count or bitrate or both) can be adjusted accordingly.
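+
+ Informative note: Means a) through d) can be sketched as a filter
+ over NAL units described by their DID, QID, TID, and PRID values.
+ The class, the function, and the threshold names are hypothetical;
+ a real sender or MANE would read these fields from the NAL unit
+ headers rather than from such a record.
+

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NalUnit:
    did: int       # dependency id
    qid: int       # quality id
    tid: int       # temporal id
    prid: int = 0  # priority id (application-specific meaning)


def thin(nal_units, max_did=None, max_tid=None, max_qid_in_top_layer=None, max_prid=None):
    """Drop NAL units above the given thresholds; None disables a threshold."""
    kept = [
        n for n in nal_units
        if (max_did is None or n.did <= max_did)        # means c)
        and (max_tid is None or n.tid <= max_tid)       # means b)
        and (max_prid is None or n.prid <= max_prid)    # means d)
    ]
    if max_qid_in_top_layer is not None and kept:
        # Means a): the QID limit applies only within the highest remaining DID.
        top = max(n.did for n in kept)
        kept = [n for n in kept if n.did != top or n.qid <= max_qid_in_top_layer]
    return kept


units = [NalUnit(0, 0, 0), NalUnit(0, 0, 1), NalUnit(1, 0, 0), NalUnit(1, 1, 0), NalUnit(2, 0, 0)]
print(thin(units, max_did=1, max_tid=0, max_qid_in_top_layer=0))
```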
+
+ All aforementioned means are available to the RTP sender, regardless
+ of whether that sender is located in the sending endpoint or in a
+ mixer-based MANE.
+
+ When a translator-based MANE is employed, then the MANE MAY
+ manipulate the session only on the MANE's outgoing path, so that the
+ sensed end-to-end congestion falls within the permissible envelope.
+ As with all translators, in this case, the MANE needs to rewrite RTCP
+ RRs to reflect the manipulations it has performed on the session.
+
+ Informative note: Applications MAY also implement, in addition or
+ separately, other congestion control mechanisms, e.g., as
+ described in [RFC5775] and [Yan].
+
+10. IANA Considerations
+
+ A new media type, as specified in Section 7.1 of this memo, has been
+ registered with IANA.
+
+11. Informative Appendix: Application Examples
+
+11.1. Introduction
+
+ Scalable video coding is a concept that has been around since at
+ least MPEG-2 [MPEG2], which goes back as early as 1993.
+ Nevertheless, it has never gained wide acceptance, perhaps partly
+ because applications didn't materialize in the form envisioned during
+ standardization.
+
+ ISO/IEC MPEG and ITU-T VCEG each performed a requirements analysis
+ for the SVC project. The MPEG and VCEG requirement documents are
+ available in [JVT-N026] and [JVT-N027], respectively.
+
+ The following introduces four main application scenarios that the
+ authors consider relevant and that are implementable with this
+ specification.
+
+11.2. Layered Multicast
+
+ This well-understood form of the use of layered coding [McCanne]
+ implies that all layers are individually conveyed in their own RTP
+ packet streams, each carried in its own RTP session using the IP
+ (multicast) address and port number as the single demultiplexing
+ point. Receivers "tune" into the layers by subscribing to the IP
+ multicast, normally by using IGMP [IGMP]. Depending on the
+ application scenario, it is also possible to convey a number of
+ layers in one RTP session, when finer operation points within the
+ subset of layers are not needed.
+
+ Layered multicast has the great advantage of simplicity and easy
+ implementation. However, it also has the great disadvantage of
+ utilizing many different transport addresses. While the authors
+ consider this not to be a major problem for a professionally
+ maintained content server, receiving client endpoints need to open
+ many ports to IP multicast addresses in their firewalls. This is a
+ practical problem from a firewall and network address translation
+ (NAT) viewpoint. Furthermore, even today IP multicast is not as
+ widely deployed as many wish.
+
+ The authors consider layered multicast an important application
+ scenario for the following reasons. First, it is well understood and
+ the implementation constraints are well known. Second, there may
+ well be large-scale IP networks outside the immediate Internet
+ context that may wish to employ layered multicast in the future. One
+ possible example could be a combination of content creation and core-
+ network distribution for the various mobile TV services, e.g., those
+ being developed by 3GPP (MBMS) [MBMS] and DVB (DVB-H) [DVB-H].
+
+11.3. Streaming
+
+ In this scenario, a streaming server has a repository of stored SVC
+ coded layers for a given content. At the time of streaming, and
+ according to the capabilities, connectivity, and congestion situation
+ of the client(s), the streaming server generates and serves a
+ scalable stream. Both unicast and multicast serving is possible. At
+ the same time, the streaming server may use the same repository of
+ stored layers to compose different streams (with a different set of
+ layers) intended for other audiences.
+
+ As every endpoint receives only a single SVC RTP session, the number
+ of firewall pinholes can be optimized to one.
+
+ The main difference between this scenario and straightforward
+ simulcasting lies in the architecture and the requirements of the
+ streaming server, and is therefore out of the scope of IETF
+ standardization. However, compelling arguments can be made why such
+ a streaming server design makes sense. One possible argument is
+ related to storage space and channel bandwidth. Another is bandwidth
+ adaptability without transcoding -- a considerable advantage in a
+ congestion controlled network. When the streaming server learns
+ about congestion, it can reduce the sending bitrate by choosing fewer
+ layers when composing the layered stream; see Section 9. SVC is
+ designed to gracefully support both bandwidth ramp-down and bandwidth
+ ramp-up with a considerable dynamic range. This payload format is
+ designed to allow for bandwidth flexibility in the mentioned sense.
+ While, in theory, a transcoding step could achieve a similar dynamic
+ range, the computational demands are impractically high and video
+ quality is typically lowered -- therefore, few (if any) streaming
+ servers implement full transcoding.
+
+11.4. Videoconferencing (Unicast to MANE, Unicast to Endpoints)
+
+ Videoconferencing has traditionally relied on Multipoint Control
+ Units (MCUs). These units connect endpoints in a star configuration
+ and operate as follows. Coded video is transmitted from each
+ endpoint to the MCU, where it is decoded, scaled, and composited to
+ construct output frames, which are then re-encoded and transmitted to
+ the endpoint(s). In systems supporting personalized layout (each
+ user is allowed to select the layout of his/her screen), the
+ compositing and encoding process is performed for each of the
+ receiving endpoints. Even without personalized layout, rate matching
+ still requires that the encoding process at the MCU is performed
+ separately for each endpoint. As a result, MCUs have considerable
+ complexity and introduce significant delay. The cascaded encodings
+ also reduce the video quality. Particularly for multipoint
+ connections, interactive communication is cumbersome as the end-to-
+ end delay is very high [G.114]. A simpler architecture is the
+ switching MCU, in which one of the incoming video streams is
+ redirected to the receiving endpoints. Obviously, only one user at a
+ time can be seen and rate matching cannot be performed, thus forcing
+ all transmitting endpoints to transmit at the lowest bit rate
+ available in the MCU-to-endpoint connections.
+
+ With scalable video coding the MCU can be replaced with an
+ application-level router (ALR): this unit simply selects which
+ incoming packets should be transmitted to which of the receiving
+ endpoints [Eleft]. In such a system, each endpoint performs its own
+ composition of the incoming video streams. Assuming, for example, a
+ system that uses spatial scalability with two layers, personalized
+ layout is equivalent to instructing the ALR to only send the required
+ packets for the corresponding resolution to the particular endpoint.
+ Similarly, rate matching at the ALR for a particular endpoint can be
+ performed by selecting an appropriate subset of the incoming video
+ packets to transmit to the particular endpoint. Personalized layout
+ and rate matching thus become routing decisions, and require no
+ signal processing. Note that scalability also allows participants to
+ enjoy the best video quality afforded by their links, i.e., users no
+ longer have to be forced to operate at the quality supported by the
+ weakest endpoint. Most importantly, the ALR has an insignificant
+ contribution to the end-to-end delay, typically an order of magnitude
+ less than an MCU. This makes it possible to have fully interactive
+ multipoint conferences with even a very large number of participants.
+ There are significant advantages as well in terms of error resilience
+ and, in fact, error tolerance can be increased by nearly an order of
+ magnitude here as well (e.g., using unequal error protection).
+ Finally, the very low delay of an ALR allows these systems to be
+ cascaded, with significant benefits in terms of system design and
+ deployment. Cascading of traditional MCUs is impossible due to the
+ very high delay that even a single MCU introduces.
+
+ Scalable video coding enables a very significant paradigm shift in
+ videoconferencing systems, bringing the complexity of video
+ communication systems (particularly the servers residing within the
+ network) in line with other types of network applications.
+
+11.5. Mobile TV (Multicast to MANE, Unicast to Endpoint)
+
+ This scenario is a bit more complex, and designed to optimize the
+ network traffic in a core network, while still requiring only a
+ single pinhole in the endpoint's firewall. One of its key
+ applications is the mobile TV market.
+
+ Consider a large private IP network, e.g., the core network of the
+ Third Generation Partnership Project (3GPP). Streaming servers
+ within this core network can be assumed to be professionally
+ maintained. It is assumed that these servers can have many ports
+ open to the network and that layered multicast is a real option.
+ Therefore, the streaming server multicasts SVC scalable layers,
+ instead of simulcasting different representations of the same content
+ at different bitrates.
+
+ Also consider many endpoints of different classes. Some of these
+ endpoints may lack the processing power or the display size to
+ meaningfully decode all layers; others may have these capabilities.
+ Users of some endpoints may wish not to pay for high quality and are
+ happy with a base service, which may be cheaper or even free. Other
+ users are willing to pay for high quality. Finally, some connected
+ users may have a bandwidth problem in that they can't receive the
+ bandwidth they would want to receive -- be it through congestion,
+ connectivity, change of service quality, or for whatever other
+ reasons. However, all these users have in common that they don't
+ want to be exposed too much, and therefore the number of firewall
+ pinholes needs to be small.
+
+ This situation can be handled best by introducing middleboxes close
+ to the edge of the core network, which receive the layered multicast
+ streams and compose the single SVC scalable bitstream according to
+ the needs of the endpoint connected. These middleboxes are called
+ MANEs throughout this specification. In practice, the authors
+ envision the MANE to be part of (or at least physically and
+ topologically close to) the base station of a mobile network, where
+ all the signaling and media traffic necessarily are multiplexed on
+ the same physical link.
+
+ MANEs necessarily need to be fairly complex devices. They certainly
+ need to understand the signaling in order, for example, to associate
+ the payload type octet in the RTP header with the SVC payload type.
+
+ A MANE may aggregate multiple RTP streams, possibly from multiple RTP
+ sessions, so as to reduce the number of firewall pinholes required at
+ the endpoints, or may optimize the outgoing RTP stream to the MTU
+ size of the outgoing path by utilizing the aggregation and
+ fragmentation mechanisms of this memo. This type of MANE is
+ conceptually easy to implement and can offer powerful features,
+ primarily because it necessarily can "see" the payload (including the
+ RTP payload headers), utilize the wealth of layering information
+ available therein, and manipulate it.
+
+ A MANE can also perform stream thinning, in order to adhere to
+ congestion control principles as discussed in Section 9. While the
+ implementation of the forward (media) channel of such a MANE appears
+ to be comparatively simple, the need to rewrite RTCP RRs makes even
+ such a MANE a complex device.
+
+ While the implementation complexity of either case of a MANE, as
+ discussed above, is fairly high, the computational demands are
+ comparatively low.
+
+12. Acknowledgements
+
+ Miska Hannuksela contributed significantly to the designs of the
+ PACSI NAL unit and the NI-C mode for decoding order recovery. Roni
+ Even organized and coordinated the design team for the development of
+ this memo, and provided valuable comments. Jonathan Lennox
+ contributed to the NAL unit reordering algorithm for MST and provided
+ input on several parts of this memo. Peter Amon, Sam Ganesan, Mike
+ Nilsson, Colin Perkins, and Thomas Wiegand were members of the design
+ team and provided valuable contributions. Magnus Westerlund has also
+ made valuable comments. Charles Eckel and Stuart Taylor provided
+ valuable comments after the first WGLC for this document. Xiaohui
+ (Joanne) Wei helped improve Table 13 and the SDP examples.
+
+ The work of Thomas Schierl has been supported by the European
+ Commission under contract number FP7-ICT-248036, project COAST.
+
+13. References
+
+13.1. Normative References
+
+ [H.264] ITU-T Recommendation H.264, "Advanced video coding for
+ generic audiovisual services", March 2010.
+
+ [RFC6184] Wang, Y.-K., Even, R., Kristensen, T., and R. Jesup, "RTP
+ Payload Format for H.264 Video", RFC 6184, May 2011.
+
+ [ISO/IEC14496-10]
+ ISO/IEC International Standard 14496-10:2005.
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+ [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
+ with Session Description Protocol (SDP)", RFC 3264, June
+ 2002.
+
+ [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
+ Jacobson, "RTP: A Transport Protocol for Real-Time
+ Applications", STD 64, RFC 3550, July 2003.
+
+ [RFC4288] Freed, N. and J. Klensin, "Media Type Specifications and
+ Registration Procedures", BCP 13, RFC 4288, December 2005.
+
+ [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
+ Description Protocol", RFC 4566, July 2006.
+
+ [RFC4648] Josefsson, S., "The Base16, Base32, and Base64 Data
+ Encodings", RFC 4648, October 2006.
+
+ [RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific
+ Media Attributes in the Session Description Protocol
+ (SDP)", RFC 5576, June 2009.
+
+ [RFC5583] Schierl, T. and S. Wenger, "Signaling Media Decoding
+ Dependency in the Session Description Protocol (SDP)", RFC
+ 5583, July 2009.
+
+ [RFC6051] Perkins, C. and T. Schierl, "Rapid Synchronisation of RTP
+ Flows", RFC 6051, November 2010.
+
+13.2. Informative References
+
+ [DVB-H] DVB - Digital Video Broadcasting (DVB); DVB-H
+ Implementation Guidelines, ETSI TR 102 377, 2005.
+
+ [Eleft] Eleftheriadis, A., R. Civanlar, and O. Shapiro,
+ "Multipoint Videoconferencing with Scalable Video Coding",
+ Journal of Zhejiang University SCIENCE A, Vol. 7, Nr. 5,
+ April 2006, pp. 696-705. (Proceedings of the Packet Video
+ 2006 Workshop.)
+
+
+
+
+Wenger, et al. Standards Track [Page 98]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+ [G.114] ITU-T Rec. G.114, "One-way transmission time", May 2003.
+
+ [H.241] ITU-T Rec. H.241, "Extended video procedures and control
+ signals for H.300-series terminals", May 2006.
+
+ [IGMP] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A.
+ Thyagarajan, "Internet Group Management Protocol, Version
+ 3", RFC 3376, October 2002.
+
+ [JVT-N026] Ohm, J.-R., Koenen, R., and Chiariglione, L. (ed.), "SVC
+ requirements specified by MPEG (ISO/IEC JTC1 SC29 WG11)",
+ JVT-N026, available from http://ftp3.itu.ch/av-arch/
+ jvt-site/2005_01_HongKong/JVT-N026.doc, Hong Kong, China,
+ January 2005.
+
+ [JVT-N027] Sullivan, G. and Wiegand, T. (ed.), "SVC requirements
+ specified by VCEG (ITU-T SG16 Q.6)", JVT-N027, available
+ from http://ftp3.itu.int/av-arch/
+ jvt-site/2005_01_HongKong/JVT-N027.doc, Hong Kong, China,
+ January 2005.
+
+ [McCanne] McCanne, S., Jacobson, V., and Vetterli, M., "Receiver-
+ driven layered multicast", in Proc. of ACM SIGCOMM'96,
+ pages 117-130, Stanford, CA, August 1996.
+
+ [MBMS] 3GPP - Technical Specification Group Services and System
+ Aspects; Multimedia Broadcast/Multicast Service (MBMS);
+ Protocols and codecs (Release 6), December 2005.
+
+ [MPEG2] ISO/IEC International Standard 13818-2:1993.
+
+ [RFC2326] Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time
+ Streaming Protocol (RTSP)", RFC 2326, April 1998.
+
+ [RFC2974] Handley, M., Perkins, C., and E. Whelan, "Session
+ Announcement Protocol", RFC 2974, October 2000.
+
+ [RFC5117] Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
+ January 2008.
+
+ [RFC5775] Luby, M., Watson, M., and L. Vicisano, "Asynchronous
+ Layered Coding (ALC) Protocol Instantiation", RFC 5775,
+ April 2010.
+
+ [Yan] Yan, J., Katrinis, K., May, M., and Plattner, R., "Media-
+ and TCP-friendly congestion control for scalable video
+ streams", in IEEE Trans. Multimedia, pages 196-206, April
+ 2006.
+
+
+
+Wenger, et al. Standards Track [Page 99]
+
+RFC 6190 RTP Payload Format for SVC May 2011
+
+
+Authors' Addresses
+
+ Stephan Wenger
+ 2400 Skyfarm Dr.
+ Hillsborough, CA 94010
+ USA
+
+ Phone: +1-415-713-5473
+ EMail: stewe@stewe.org
+
+
+ Ye-Kui Wang
+ Huawei Technologies
+ 400 Crossing Blvd, 2nd Floor
+ Bridgewater, NJ 08807
+ USA
+
+ Phone: +1-908-541-3518
+ EMail: yekui.wang@huawei.com
+
+
+ Thomas Schierl
+ Fraunhofer HHI
+ Einsteinufer 37
+ D-10587 Berlin
+ Germany
+
+ Phone: +49-30-31002-227
+ EMail: ts@thomas-schierl.de
+
+
+ Alex Eleftheriadis
+ Vidyo, Inc.
+ 433 Hackensack Ave.
+ Hackensack, NJ 07601
+ USA
+
+ Phone: +1-201-467-5135
+ EMail: alex@vidyo.com
+
+
+
+
+
+
+
+
+
+
+
+
+Wenger, et al. Standards Track [Page 100]
+