Diffstat (limited to 'doc/rfc/rfc7798.txt')
-rw-r--r-- | doc/rfc/rfc7798.txt | 4819 |
1 files changed, 4819 insertions, 0 deletions
diff --git a/doc/rfc/rfc7798.txt b/doc/rfc/rfc7798.txt new file mode 100644 index 0000000..c08c76e --- /dev/null +++ b/doc/rfc/rfc7798.txt @@ -0,0 +1,4819 @@ + + + + + + +Internet Engineering Task Force (IETF) Y.-K. Wang +Request for Comments: 7798 Qualcomm +Category: Standards Track Y. Sanchez +ISSN: 2070-1721 T. Schierl + Fraunhofer HHI + S. Wenger + Vidyo + M. M. Hannuksela + Nokia + March 2016 + + + RTP Payload Format for High Efficiency Video Coding (HEVC) + +Abstract + + This memo describes an RTP payload format for the video coding + standard ITU-T Recommendation H.265 and ISO/IEC International + Standard 23008-2, both also known as High Efficiency Video Coding + (HEVC) and developed by the Joint Collaborative Team on Video Coding + (JCT-VC). The RTP payload format allows for packetization of one or + more Network Abstraction Layer (NAL) units in each RTP packet payload + as well as fragmentation of a NAL unit into multiple RTP packets. + Furthermore, it supports transmission of an HEVC bitstream over a + single stream as well as multiple RTP streams. When multiple RTP + streams are used, a single transport or multiple transports may be + utilized. The payload format has wide applicability in + videoconferencing, Internet video streaming, and high-bitrate + entertainment-quality video, among others. + +Status of This Memo + + This is an Internet Standards Track document. + + This document is a product of the Internet Engineering Task Force + (IETF). It represents the consensus of the IETF community. It has + received public review and has been approved for publication by the + Internet Engineering Steering Group (IESG). Further information on + Internet Standards is available in Section 2 of RFC 5741. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + http://www.rfc-editor.org/info/rfc7798. + + + + + + + + +Wang, et al. 
Standards Track [Page 1] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + +Copyright Notice + + Copyright (c) 2016 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (http://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. Code Components extracted from this document must + include Simplified BSD License text as described in Section 4.e of + the Trust Legal Provisions and are provided without warranty as + described in the Simplified BSD License. + +Table of Contents + + 1. Introduction ....................................................3 + 1.1. Overview of the HEVC Codec .................................4 + 1.1.1. Coding-Tool Features ................................4 + 1.1.2. Systems and Transport Interfaces ....................6 + 1.1.3. Parallel Processing Support ........................11 + 1.1.4. NAL Unit Header ....................................13 + 1.2. Overview of the Payload Format ............................14 + 2. Conventions ....................................................15 + 3. Definitions and Abbreviations ..................................15 + 3.1. Definitions ...............................................15 + 3.1.1. Definitions from the HEVC Specification ...........15 + 3.1.2. Definitions Specific to This Memo .................17 + 3.2. Abbreviations .............................................19 + 4. RTP Payload Format .............................................20 + 4.1. RTP Header Usage ..........................................20 + 4.2. Payload Header Usage ......................................22 + 4.3. Transmission Modes ........................................23 + 4.4. 
Payload Structures ........................................24 + 4.4.1. Single NAL Unit Packets ............................24 + 4.4.2. Aggregation Packets (APs) ..........................25 + 4.4.3. Fragmentation Units ................................29 + 4.4.4. PACI Packets .......................................32 + 4.4.4.1. Reasons for the PACI Rules (Informative) ..34 + 4.4.4.2. PACI Extensions (Informative) .............35 + 4.5. Temporal Scalability Control Information ..................36 + 4.6. Decoding Order Number .....................................37 + 5. Packetization Rules ............................................39 + 6. De-packetization Process .......................................40 + 7. Payload Format Parameters ......................................42 + 7.1. Media Type Registration ...................................42 + 7.2. SDP Parameters ............................................64 + + + +Wang, et al. Standards Track [Page 2] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + 7.2.1. Mapping of Payload Type Parameters to SDP ..........64 + 7.2.2. Usage with SDP Offer/Answer Model ..................65 + 7.2.3. Usage in Declarative Session Descriptions ..........73 + 7.2.4. Considerations for Parameter Sets ..................75 + 7.2.5. Dependency Signaling in Multi-Stream Mode ..........75 + 8. Use with Feedback Messages .....................................75 + 8.1. Picture Loss Indication (PLI) .............................75 + 8.2. Slice Loss Indication (SLI) ...............................76 + 8.3. Reference Picture Selection Indication (RPSI) .............77 + 8.4. Full Intra Request (FIR) ..................................77 + 9. Security Considerations ........................................78 + 10. Congestion Control ............................................79 + 11. IANA Considerations ...........................................80 + 12. References ....................................................80 + 12.1. 
Normative References .....................................80 + 12.2. Informative References ...................................82 + Acknowledgments ...................................................85 + Authors' Addresses ................................................86 + + +1. Introduction + + The High Efficiency Video Coding specification, formally published as + both ITU-T Recommendation H.265 [HEVC] and ISO/IEC International + Standard 23008-2 [ISO23008-2], was ratified by the ITU-T in April + 2013; reportedly, it provides significant coding efficiency gains + over H.264 [H.264]. + + This memo describes an RTP payload format for HEVC. It shares its + basic design with the RTP payload formats of [RFC6184] and [RFC6190]. + With respect to design philosophy, security, congestion control, and + overall implementation complexity, it has similar properties to those + earlier payload format specifications. This is a conscious choice, + as at least RFC 6184 is widely deployed and generally known in the + relevant implementer communities. Mechanisms from RFC 6190 were + incorporated as HEVC version 1 supports temporal scalability. + + In order to help the overlapping implementer community, frequently + only the differences between RFCs 6184 and 6190 and the HEVC payload + format are highlighted in non-normative, explanatory parts of this + memo. Basic familiarity with both specifications is assumed for + those parts. However, the normative parts of this memo do not + require study of RFCs 6184 or 6190. + + + + + + + + +Wang, et al. Standards Track [Page 3] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + +1.1. Overview of the HEVC Codec + + H.264 and HEVC share a similar hybrid video codec design. In this + memo, we provide a very brief overview of those features of HEVC that + are, in some form, addressed by the payload format specified herein. 
+ Implementers have to read, understand, and apply the ITU-T/ISO/IEC + specifications pertaining to HEVC to arrive at interoperable, well- + performing implementations. Implementers should consider testing + their design (including the interworking between the payload format + implementation and the core video codec) using the tools provided by + ITU-T/ISO/IEC, for example, conformance bitstreams as specified in + [H.265.1]. Not doing so has historically led to systems that perform + badly and that are not secure. + + Conceptually, both H.264 and HEVC include a Video Coding Layer (VCL), + which is often used to refer to the coding-tool features, and a + Network Abstraction Layer (NAL), which is often used to refer to the + systems and transport interface aspects of the codecs. + +1.1.1. Coding-Tool Features + + Similar to earlier hybrid-video-coding-based standards, including + H.264, the following basic video coding design is employed by HEVC. + A prediction signal is first formed by either intra- or motion- + compensated prediction, and the residual (the difference between the + original and the prediction) is then coded. The gains in coding + efficiency are achieved by redesigning and improving almost all parts + of the codec over earlier designs. In addition, HEVC includes + several tools to make the implementation on parallel architectures + easier. Below is a summary of HEVC coding-tool features. + + Quad-tree block and transform structure + + One of the major tools that contributes significantly to the coding + efficiency of HEVC is the use of flexible coding blocks and + transforms, which are defined in a hierarchical quad-tree manner. + Unlike H.264, where the basic coding block is a macroblock of fixed- + size 16x16, HEVC defines a Coding Tree Unit (CTU) of a maximum size + of 64x64. Each CTU can be divided into smaller units in a + hierarchical quad-tree manner and can represent smaller blocks down + to size 4x4. 
Similarly, the transforms used in HEVC can have + different sizes, starting from 4x4 and going up to 32x32. Utilizing + large blocks and transforms contributes to the major gain of HEVC, + especially at high resolutions. + + + + + + + +Wang, et al. Standards Track [Page 4] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + Entropy coding + + HEVC uses a single entropy-coding engine, which is based on Context + Adaptive Binary Arithmetic Coding (CABAC) [CABAC], whereas H.264 uses + two distinct entropy coding engines. CABAC in HEVC shares many + similarities with CABAC of H.264, but contains several improvements. + Those include improvements in coding efficiency and lowered + implementation complexity, especially for parallel architectures. + + In-loop filtering + + H.264 includes an in-loop adaptive deblocking filter, where the + blocking artifacts around the transform edges in the reconstructed + picture are smoothed to improve the picture quality and compression + efficiency. In HEVC, a similar deblocking filter is employed but + with somewhat lower complexity. In addition, pictures undergo a + subsequent filtering operation called Sample Adaptive Offset (SAO), + which is a new design element in HEVC. SAO basically adds a pixel- + level offset in an adaptive manner and usually acts as a de-ringing + filter. It is observed that SAO improves the picture quality, + especially around sharp edges, contributing substantially to visual + quality improvements of HEVC. + + Motion prediction and coding + + There have been a number of improvements in this area that are + summarized as follows. The first category is motion merge and + Advanced Motion Vector Prediction (AMVP) modes. The motion + information of a prediction block can be inferred from the spatially + or temporally neighboring blocks. This is similar to the DIRECT mode + in H.264 but includes new aspects to incorporate the flexible quad- + tree structure and methods to improve the parallel implementations. 
+ In addition, the motion vector predictor can be signaled for improved + efficiency. The second category is high-precision interpolation. + The interpolation filter length is increased to 8-tap from 6-tap, + which improves the coding efficiency but also comes with increased + complexity. In addition, the interpolation filter is defined with + higher precision without any intermediate rounding operations to + further improve the coding efficiency. + + Intra prediction and intra-coding + + Compared to 8 intra prediction modes in H.264, HEVC supports angular + intra prediction with 33 directions. This increased flexibility + improves both objective coding efficiency and visual quality as the + edges can be better predicted and ringing artifacts around the edges + can be reduced. In addition, the reference samples are adaptively + smoothed based on the prediction direction. To avoid contouring + + + +Wang, et al. Standards Track [Page 5] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + artifacts a new interpolative prediction generation is included to + improve the visual quality. Furthermore, Discrete Sine Transform + (DST) is utilized instead of traditional Discrete Cosine Transform + (DCT) for 4x4 intra-transform blocks. + + Other coding-tool features + + HEVC includes some tools for lossless coding and efficient screen- + content coding, such as skipping the transform for certain blocks. + These tools are particularly useful, for example, when streaming the + user interface of a mobile device to a large display. + +1.1.2. Systems and Transport Interfaces + + HEVC inherited the basic systems and transport interfaces designs + from H.264. These include the NAL-unit-based syntax structure, the + hierarchical syntax and data unit structure, the Supplemental + Enhancement Information (SEI) message mechanism, and the video + buffering model based on the Hypothetical Reference Decoder (HRD). 
+ The hierarchical syntax and data unit structure consists of sequence- + level parameter sets, multi-picture-level or picture-level parameter + sets, slice-level header parameters, and lower-level parameters. In + the following, a list of differences in these aspects compared to + H.264 is summarized. + + Video parameter set + + A new type of parameter set, called Video Parameter Set (VPS), was + introduced. For the first (2013) version of [HEVC], the VPS NAL unit + is required to be available prior to its activation, while the + information contained in the VPS is not necessary for operation of + the decoding process. For future HEVC extensions, such as the 3D or + scalable extensions, the VPS is expected to include information + necessary for operation of the decoding process, e.g., decoding + dependency or information for reference picture set construction of + enhancement layers. The VPS provides a "big picture" of a bitstream, + including what types of operation points are provided, the profile, + tier, and level of the operation points, and some other high-level + properties of the bitstream that can be used as the basis for session + negotiation and content selection, etc. (see Section 7.1). + + Profile, tier, and level + + The profile, tier, and level syntax structure that can be included in + both the VPS and Sequence Parameter Set (SPS) includes 12 bytes of + data to describe the entire bitstream (including all temporally + scalable layers, which are referred to as sub-layers in the HEVC + specification), and can optionally include more profile, tier, and + + + +Wang, et al. Standards Track [Page 6] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + level information pertaining to individual temporally scalable + layers. 
The profile indicator shows the "best viewed as" profile + when the bitstream conforms to multiple profiles, similar to the + major brand concept in the ISO Base Media File Format (ISOBMFF) + [IS014496-12] [IS015444-12] and file formats derived based on + ISOBMFF, such as the 3GPP file format [3GPPFF]. The profile, tier, + and level syntax structure also includes indications such as 1) + whether the bitstream is free of frame-packed content, 2) whether the + bitstream is free of interlaced source content, and 3) whether the + bitstream is free of field pictures. When the answer is yes for both + 2) and 3), the bitstream contains only frame pictures of progressive + source. Based on these indications, clients/players without support + of post-processing functionalities for the handling of frame-packed, + interlaced source content or field pictures can reject those + bitstreams that contain such pictures. + + Bitstream and elementary stream + + HEVC includes a definition of an elementary stream, which is new + compared to H.264. An elementary stream consists of a sequence of + one or more bitstreams. An elementary stream that consists of two or + more bitstreams has typically been formed by splicing together two or + more bitstreams (or parts thereof). When an elementary stream + contains more than one bitstream, the last NAL unit of the last + access unit of a bitstream (except the last bitstream in the + elementary stream) must contain an end of bitstream NAL unit, and the + first access unit of the subsequent bitstream must be an Intra-Random + Access Point (IRAP) access unit. This IRAP access unit may be a + Clean Random Access (CRA), Broken Link Access (BLA), or Instantaneous + Decoding Refresh (IDR) access unit. + + Random access support + + HEVC includes signaling in the NAL unit header, through NAL unit + types, of IRAP pictures beyond IDR pictures. 
Three types of IRAP + pictures, namely IDR, CRA, and BLA pictures, are supported: IDR + pictures are conventionally referred to as closed group-of-pictures + (closed-GOP) random access points whereas CRA and BLA pictures are + conventionally referred to as open-GOP random access points. BLA + pictures usually originate from splicing of two bitstreams or part + thereof at a CRA picture, e.g., during stream switching. To enable + better systems usage of IRAP pictures, altogether six different NAL + units are defined to signal the properties of the IRAP pictures, + which can be used to better match the stream access point types as + defined in the ISOBMFF [IS014496-12] [IS015444-12], which are + utilized for random access support in both 3GP-DASH [3GPDASH] and + MPEG DASH [MPEGDASH]. Pictures following an IRAP picture in decoding + order and preceding the IRAP picture in output order are referred to + + + +Wang, et al. Standards Track [Page 7] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + as leading pictures associated with the IRAP picture. There are two + types of leading pictures: Random Access Decodable Leading (RADL) + pictures and Random Access Skipped Leading (RASL) pictures. RADL + pictures are decodable when the decoding started at the associated + IRAP picture; RASL pictures are not decodable when the decoding + started at the associated IRAP picture and are usually discarded. + HEVC provides mechanisms to enable specifying the conformance of a + bitstream wherein the originally present RASL pictures have been + discarded. Consequently, system components can discard RASL + pictures, when needed, without worrying about causing the bitstream + to become non-compliant. 
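The RASL handling described above can be sketched as a small filter. This is only an illustration, not part of the memo: the function name is hypothetical, the input is assumed to be one nal_unit_type value per coded picture in decoding order, and the numeric type values (RASL_N = 8, RASL_R = 9, IRAP range 16..23) are taken from Table 7-1 of [HEVC] rather than from the text above. It drops only the RASL pictures associated with the IRAP picture at which decoding starts.

```python
# Assumed nal_unit_type values from Table 7-1 of [HEVC] (not defined in this memo text).
RASL_TYPES = {8, 9}          # RASL_N, RASL_R
IRAP_TYPES = range(16, 24)   # BLA, IDR, CRA, and reserved IRAP types


def drop_rasl_at_random_access(picture_types):
    """Hypothetical helper: return the picture types that remain after
    discarding RASL pictures associated with the first IRAP picture.

    picture_types holds one nal_unit_type per coded picture, in decoding
    order, and must start with an IRAP picture (the random access point).
    """
    if not picture_types or picture_types[0] not in IRAP_TYPES:
        raise ValueError("decoding must start at an IRAP picture")

    kept = []
    irap_seen = 0
    for nal_type in picture_types:
        if nal_type in IRAP_TYPES:
            irap_seen += 1
        # Leading pictures belong to the most recent IRAP picture; only the
        # RASL pictures of the first one are undecodable in this scenario.
        if nal_type in RASL_TYPES and irap_seen == 1:
            continue
        kept.append(nal_type)
    return kept
```

RADL pictures and the RASL pictures of later IRAP pictures are kept, since their reference pictures are available once decoding has started earlier in the bitstream. The sketch deliberately ignores the special case of BLA pictures.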
+ + Temporal scalability support + + HEVC includes an improved support of temporal scalability, by + inclusion of the signaling of TemporalId in the NAL unit header, the + restriction that pictures of a particular temporal sub-layer cannot + be used for inter prediction reference by pictures of a lower + temporal sub-layer, the sub-bitstream extraction process, and the + requirement that each sub-bitstream extraction output be a conforming + bitstream. Media-Aware Network Elements (MANEs) can utilize the + TemporalId in the NAL unit header for stream adaptation purposes + based on temporal scalability. + + Temporal sub-layer switching support + + HEVC specifies, through NAL unit types present in the NAL unit + header, the signaling of Temporal Sub-layer Access (TSA) and Step- + wise Temporal Sub-layer Access (STSA). A TSA picture and pictures + following the TSA picture in decoding order do not use pictures prior + to the TSA picture in decoding order with TemporalId greater than or + equal to that of the TSA picture for inter prediction reference. A + TSA picture enables up-switching, at the TSA picture, to the sub- + layer containing the TSA picture or any higher sub-layer, from the + immediately lower sub-layer. An STSA picture does not use pictures + with the same TemporalId as the STSA picture for inter prediction + reference. Pictures following an STSA picture in decoding order with + the same TemporalId as the STSA picture do not use pictures prior to + the STSA picture in decoding order with the same TemporalId as the + STSA picture for inter prediction reference. An STSA picture enables + up-switching, at the STSA picture, to the sub-layer containing the + STSA picture, from the immediately lower sub-layer. + + Sub-layer reference or non-reference pictures + + The concept and signaling of reference/non-reference pictures in HEVC + are different from H.264. 
In H.264, if a picture may be used by any + other picture for inter prediction reference, it is a reference + + + +Wang, et al. Standards Track [Page 8] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + picture; otherwise, it is a non-reference picture, and this is + signaled by two bits in the NAL unit header. In HEVC, a picture is + called a reference picture only when it is marked as "used for + reference". In addition, the concept of sub-layer reference picture + was introduced. If a picture may be used by another picture + with the same TemporalId for inter prediction reference, it is a sub- + layer reference picture; otherwise, it is a sub-layer non-reference + picture. Whether a picture is a sub-layer reference picture or sub- + layer non-reference picture is signaled through NAL unit type values. + + Extensibility + + Besides the TemporalId in the NAL unit header, HEVC also includes the + signaling of a six-bit layer ID in the NAL unit header, which must be + equal to 0 for a single-layer bitstream. Extension mechanisms have + been included in the VPS, SPS, Picture Parameter Set (PPS), SEI NAL + unit, slice headers, and so on. All these extension mechanisms + enable future extensions in a backward-compatible manner, such that + bitstreams encoded according to potential future HEVC extensions can + be fed to then-legacy decoders (e.g., HEVC version 1 decoders), and + the then-legacy decoders can decode and output the base-layer + bitstream. + + Bitstream extraction + + HEVC includes a bitstream-extraction process as an integral part of + the overall decoding process. The bitstream extraction process is + used in the process of bitstream conformance tests, which is part of + the HRD buffering model. 
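As a rough illustration of how a MANE or receiver might use TemporalId for temporal down-scaling, the sketch below filters NAL units by their TemporalId. It is a simplification under stated assumptions, not the normative sub-bitstream extraction process of [HEVC]: the function names are hypothetical, each NAL unit is assumed to be a bytes object starting with the two-byte NAL unit header (F: 1 bit, Type: 6 bits, LayerId: 6 bits, TID: 3 bits), TemporalId is taken as nuh_temporal_id_plus1 minus 1, and layer IDs and SEI handling are ignored.

```python
def parse_nal_header(nal_unit: bytes):
    """Hypothetical helper: split the two-byte HEVC NAL unit header.

    Returns (forbidden_zero_bit, nal_unit_type, nuh_layer_id, temporal_id).
    """
    if len(nal_unit) < 2:
        raise ValueError("NAL unit shorter than its two-byte header")
    b0, b1 = nal_unit[0], nal_unit[1]
    forbidden_zero_bit = b0 >> 7
    nal_unit_type = (b0 >> 1) & 0x3F
    nuh_layer_id = ((b0 & 0x01) << 5) | (b1 >> 3)
    tid_plus1 = b1 & 0x07
    if tid_plus1 == 0:
        raise ValueError("nuh_temporal_id_plus1 must not be 0")
    return forbidden_zero_bit, nal_unit_type, nuh_layer_id, tid_plus1 - 1


def keep_up_to_temporal_id(nal_units, highest_tid):
    """Keep only NAL units whose TemporalId does not exceed highest_tid.

    Simplified stand-in for temporal sub-bitstream extraction; higher
    sub-layers can be dropped because lower ones never reference them.
    """
    return [n for n in nal_units if parse_nal_header(n)[3] <= highest_tid]
```

Dropping all NAL units above a chosen TemporalId is safe only because pictures of a lower temporal sub-layer never use higher sub-layer pictures for inter prediction reference.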
+ + Reference picture management + + The reference picture management of HEVC, including reference picture + marking and removal from the Decoded Picture Buffer (DPB) as well as + Reference Picture List Construction (RPLC), differs from that of + H.264. Instead of the reference picture marking mechanism based on a + sliding window plus adaptive Memory Management Control Operation + (MMCO) described in H.264, HEVC specifies a reference picture + management and marking mechanism based on Reference Picture Set + (RPS), and the RPLC is consequently based on the RPS mechanism. An + RPS consists of a set of reference pictures associated with a + picture, consisting of all reference pictures that are prior to the + associated picture in decoding order, that may be used for inter + prediction of the associated picture or any picture following the + associated picture in decoding order. The reference picture set + consists of five lists of reference pictures; RefPicSetStCurrBefore, + RefPicSetStCurrAfter, RefPicSetStFoll, RefPicSetLtCurr, and + RefPicSetLtFoll. RefPicSetStCurrBefore, RefPicSetStCurrAfter, and + + + +Wang, et al. Standards Track [Page 9] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + RefPicSetLtCurr contain all reference pictures that may be used in + inter prediction of the current picture and that may be used in inter + prediction of one or more of the pictures following the current + picture in decoding order. RefPicSetStFoll and RefPicSetLtFoll + consist of all reference pictures that are not used in inter + prediction of the current picture but may be used in inter prediction + of one or more of the pictures following the current picture in + decoding order. RPS provides an "intra-coded" signaling of the DPB + status, instead of an "inter-coded" signaling, mainly for improved + error resilience. 
The RPLC process in HEVC is based on the RPS, by + signaling an index to an RPS subset for each reference index; this + process is simpler than the RPLC process in H.264. + + Ultra-low delay support + + HEVC specifies a sub-picture-level HRD operation, for support of the + so-called ultra-low delay. The mechanism specifies a standard- + compliant way to enable delay reduction below a one-picture interval. + Coded Picture Buffer (CPB) and DPB parameters at the sub-picture + level may be signaled, and utilization of this information for the + derivation of CPB timing (wherein the CPB removal time corresponds to + decoding time) and DPB output timing (display time) is specified. + Decoders are allowed to operate the HRD at the conventional access- + unit level, even when the sub-picture-level HRD parameters are + present. + + New SEI messages + + HEVC inherits many H.264 SEI messages with changes in syntax and/or + semantics making them applicable to HEVC. Additionally, there are a + few new SEI messages reviewed briefly in the following paragraphs. + + The display orientation SEI message informs the decoder of a + transformation that is recommended to be applied to the cropped + decoded picture prior to display, such that the pictures can be + properly displayed, e.g., in an upside-up manner. + + The structure of pictures SEI message provides information on the NAL + unit types, picture-order count values, and prediction dependencies + of a sequence of pictures. The SEI message can be used, for example, + for concluding what impact a lost picture has on other pictures. + + The decoded picture hash SEI message provides a checksum derived from + the sample values of a decoded picture. It can be used for detecting + whether a picture was correctly received and decoded. + + + + + + +Wang, et al. 
Standards Track [Page 10] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + The active parameter sets SEI message includes the IDs of the active + video parameter set and the active sequence parameter set and can be + used to activate VPSs and SPSs. In addition, the SEI message + includes the following indications: 1) An indication of whether "full + random accessibility" is supported (when supported, all parameter + sets needed for decoding of the remaining of the bitstream when + random accessing from the beginning of the current CVS by completely + discarding all access units earlier in decoding order are present in + the remaining bitstream, and all coded pictures in the remaining + bitstream can be correctly decoded); 2) An indication of whether + there is no parameter set within the current CVS that updates another + parameter set of the same type preceding in decoding order. An + update of a parameter set refers to the use of the same parameter set + ID but with some other parameters changed. If this property is true + for all CVSs in the bitstream, then all parameter sets can be sent + out-of-band before session start. + + The decoding unit information SEI message provides information + regarding coded picture buffer removal delay for a decoding unit. + The message can be used in very-low-delay buffering operations. + + The region refresh information SEI message can be used together with + the recovery point SEI message (present in both H.264 and HEVC) for + improved support of gradual decoding refresh. This supports random + access from inter-coded pictures, wherein complete pictures can be + correctly decoded or recovered after an indicated number of pictures + in output/display order. + +1.1.3. 
Parallel Processing Support + + The reportedly significantly higher encoding computational demand of + HEVC over H.264, in conjunction with the ever-increasing video + resolution (both spatially and temporally) required by the market, + led to the adoption of VCL coding tools specifically targeted to + allow for parallelization on the sub-picture level. That is, + parallelization occurs, at the minimum, at the granularity of an + integer number of CTUs. The targets for this type of high-level + parallelization are multicore CPUs and DSPs as well as multiprocessor + systems. In a system design, to be useful, these tools require + signaling support, which is provided in Section 7 of this memo. This + section provides a brief overview of the tools available in [HEVC]. + + Many of the tools incorporated in HEVC were designed keeping in mind + the potential parallel implementations in multicore/multiprocessor + architectures. Specifically, for parallelization, four picture + partition strategies, as described below, are available. + + + + + +Wang, et al. Standards Track [Page 11] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + Slices are segments of the bitstream that can be reconstructed + independently from other slices within the same picture (though there + may still be interdependencies through loop filtering operations). + Slices are the only tool that can be used for parallelization that is + also available, in virtually identical form, in H.264. + Parallelization based on slices does not require much inter-processor + or inter-core communication (except for inter-processor or inter-core + data sharing for motion compensation when decoding a predictively + coded picture, which is typically much heavier than inter-processor + or inter-core data sharing due to in-picture prediction), as slices + are designed to be independently decodable. However, for the same + reason, slices can require some coding overhead. 
Further, slices (in + contrast to some of the other tools mentioned below) also serve as + the key mechanism for bitstream partitioning to match Maximum + Transfer Unit (MTU) size requirements, due to the in-picture + independence of slices and the fact that each regular slice is + encapsulated in its own NAL unit. In many cases, the goal of + parallelization and the goal of MTU size matching can place + contradicting demands to the slice layout in a picture. The + realization of this situation led to the development of the more + advanced tools mentioned below. + + Dependent slice segments allow for fragmentation of a coded slice + into fragments at CTU boundaries without breaking any in-picture + prediction mechanisms. They are complementary to the fragmentation + mechanism described in this memo in that they need the cooperation of + the encoder. As a dependent slice segment necessarily contains an + integer number of CTUs, a decoder using multiple cores operating on + CTUs can process a dependent slice segment without communicating + parts of the slice segment's bitstream to other cores. + Fragmentation, as specified in this memo, in contrast, does not + guarantee that a fragment contains an integer number of CTUs. + + In Wavefront Parallel Processing (WPP), the picture is partitioned + into rows of CTUs. Entropy decoding and prediction are allowed to + use data from CTUs in other partitions. Parallel processing is + possible through parallel decoding of CTU rows, where the start of + the decoding of a row is delayed by two CTUs, so to ensure that data + related to a CTU above and to the right of the subject CTU is + available before the subject CTU is being decoded. Using this + staggered start (which appears like a wavefront when represented + graphically), parallelization is possible with up to as many + processors/cores as the picture contains CTU rows. 
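The staggered start of WPP can be made concrete with a toy schedule. The sketch below is an idealized model, not something specified by [HEVC] or this memo: it assumes every CTU takes exactly one time step to decode and that one core is available per CTU row, and the function names are hypothetical. Under these assumptions, the CTU in row r and column c is decoded at step 2*r + c, which guarantees that the CTU above and to the right has already been finished.

```python
def wpp_decode_step(row: int, col: int) -> int:
    """Idealized time step at which CTU (row, col) is decoded.

    Each row starts two CTUs after the row above it.
    """
    return 2 * row + col


def wpp_schedule(rows: int, cols: int):
    """Return a list where entry s holds the CTUs decoded in step s."""
    if rows < 1 or cols < 1:
        raise ValueError("picture must contain at least one CTU")
    total_steps = cols + 2 * (rows - 1)
    schedule = [[] for _ in range(total_steps)]
    for r in range(rows):
        for c in range(cols):
            schedule[wpp_decode_step(r, c)].append((r, c))
    return schedule


def max_parallel_ctus(rows: int, cols: int) -> int:
    """Largest number of CTUs decoded in the same step of the toy schedule."""
    return max(len(step) for step in wpp_schedule(rows, cols))
```

In this model a picture of 3 CTU rows by 4 CTU columns needs 8 steps instead of 12, and the achievable parallelism is bounded both by the number of CTU rows and by roughly half the number of CTU columns.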
+ + Because in-picture prediction between neighboring CTU rows within a + picture is allowed, the required inter-processor/inter-core + communication to enable in-picture prediction can be substantial. + The WPP partitioning does not result in the creation of more NAL + + + +Wang, et al. Standards Track [Page 12] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + units compared to when it is not applied; thus, WPP cannot be used + for MTU size matching, though slices can be used in combination for + that purpose. + + Tiles define horizontal and vertical boundaries that partition a + picture into tile columns and rows. The scan order of CTUs is + changed to be local within a tile (in the order of a CTU raster scan + of a tile), before decoding the top-left CTU of the next tile in the + order of tile raster scan of a picture. Similar to slices, tiles + break in-picture prediction dependencies (including entropy decoding + dependencies). However, they do not need to be included into + individual NAL units (same as WPP in this regard); hence, tiles + cannot be used for MTU size matching, though slices can be used in + combination for that purpose. Each tile can be processed by one + processor/core, and the inter-processor/inter-core communication + required for in-picture prediction between processing units decoding + neighboring tiles is limited to conveying the shared slice header in + cases a slice is spanning more than one tile, and loop-filtering- + related sharing of reconstructed samples and metadata. Insofar, + tiles are less demanding in terms of inter-processor communication + bandwidth compared to WPP due to the in-picture independence between + two neighboring partitions. + +1.1.4. NAL Unit Header + + HEVC maintains the NAL unit concept of H.264 with modifications. + HEVC uses a two-byte NAL unit header, as shown in Figure 1. The + payload of a NAL unit refers to the NAL unit excluding the NAL unit + header. 
+
+   +---------------+---------------+
+   |0|1|2|3|4|5|6|7|0|1|2|3|4|5|6|7|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |F|   Type    |  LayerId  | TID |
+   +-------------+-----------------+
+
+   Figure 1: The Structure of the HEVC NAL Unit Header
+
+   The semantics of the fields in the NAL unit header are as specified
+   in [HEVC] and described briefly below for convenience. In addition
+   to the name and size of each field, the corresponding syntax element
+   name in [HEVC] is also provided.
+
+   F: 1 bit
+      forbidden_zero_bit. Required to be zero in [HEVC]. Note that the
+      inclusion of this bit in the NAL unit header was to enable
+      transport of HEVC video over MPEG-2 transport systems (avoidance
+      of start code emulations) [MPEG2S]. In the context of this memo,
+      the value 1 may be used to indicate a syntax violation, e.g., for
+      a NAL unit resulting from aggregating a number of fragmented units
+      of a NAL unit but missing the last fragment, as described in
+      Section 4.4.3.
+
+   Type: 6 bits
+      nal_unit_type. This field specifies the NAL unit type as defined
+      in Table 7-1 of [HEVC]. If the most significant bit of this field
+      of a NAL unit is equal to 0 (i.e., the value of this field is less
+      than 32), the NAL unit is a VCL NAL unit. Otherwise, the NAL unit
+      is a non-VCL NAL unit. For a reference of all currently defined
+      NAL unit types and their semantics, please refer to Section 7.4.2
+      in [HEVC].
+
+   LayerId: 6 bits
+      nuh_layer_id. Required to be equal to zero in [HEVC]. It is
+      anticipated that in future scalable or 3D video coding extensions
+      of this specification, this syntax element will be used to
+      identify additional layers that may be present in the CVS, wherein
+      a layer may be, e.g., a spatial scalable layer, a quality scalable
+      layer, a texture view, or a depth view.
+
+   TID: 3 bits
+      nuh_temporal_id_plus1.
This field specifies the temporal + identifier of the NAL unit plus 1. The value of TemporalId is + equal to TID minus 1. A TID value of 0 is illegal to ensure that + there is at least one bit in the NAL unit header equal to 1, so to + enable independent considerations of start code emulations in the + NAL unit header and in the NAL unit payload data. + +1.2. Overview of the Payload Format + + This payload format defines the following processes required for + transport of HEVC coded data over RTP [RFC3550]: + + o Usage of RTP header with this payload format + + o Packetization of HEVC coded NAL units into RTP packets using three + types of payload structures: a single NAL unit packet, aggregation + packet, and fragment unit + + o Transmission of HEVC NAL units of the same bitstream within a + single RTP stream or multiple RTP streams (within one or more RTP + sessions), where within an RTP stream transmission of NAL units + may be either non-interleaved (i.e., the transmission order of NAL + units is the same as their decoding order) or interleaved (i.e., + the transmission order of NAL units is different from the decoding + order) + + + +Wang, et al. Standards Track [Page 14] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + o Media type parameters to be used with the Session Description + Protocol (SDP) [RFC4566] + + o A payload header extension mechanism and data structures for + enhanced support of temporal scalability based on that extension + mechanism. + +2. Conventions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in BCP 14 [RFC2119]. + + In this document, the above key words will convey that interpretation + only when in ALL CAPS. Lowercase uses of these words are not to be + interpreted as carrying the significance described in RFC 2119. 
+ + This specification uses the notion of setting and clearing a bit when + bit fields are handled. Setting a bit is the same as assigning that + bit the value of 1 (On). Clearing a bit is the same as assigning + that bit the value of 0 (Off). + +3. Definitions and Abbreviations + +3.1. Definitions + + This document uses the terms and definitions of [HEVC]. Section + 3.1.1 lists relevant definitions from [HEVC] for convenience. + Section 3.1.2 provides definitions specific to this memo. + +3.1.1. Definitions from the HEVC Specification + + access unit: A set of NAL units that are associated with each other + according to a specified classification rule, that are consecutive in + decoding order, and that contain exactly one coded picture. + + BLA access unit: An access unit in which the coded picture is a BLA + picture. + + BLA picture: An IRAP picture for which each VCL NAL unit has + nal_unit_type equal to BLA_W_LP, BLA_W_RADL, or BLA_N_LP. + + Coded Video Sequence (CVS): A sequence of access units that consists, + in decoding order, of an IRAP access unit with NoRaslOutputFlag equal + to 1, followed by zero or more access units that are not IRAP access + units with NoRaslOutputFlag equal to 1, including all subsequent + access units up to but not including any subsequent access unit that + is an IRAP access unit with NoRaslOutputFlag equal to 1. + + + +Wang, et al. Standards Track [Page 15] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + Informative note: An IRAP access unit may be an IDR access unit, a + BLA access unit, or a CRA access unit. The value of + NoRaslOutputFlag is equal to 1 for each IDR access unit, each BLA + access unit, and each CRA access unit that is the first access + unit in the bitstream in decoding order, is the first access unit + that follows an end of sequence NAL unit in decoding order, or has + HandleCraAsBlaFlag equal to 1. + + CRA access unit: An access unit in which the coded picture is a CRA + picture. 
+ + CRA picture: An IRAP picture for which each VCL NAL unit has + nal_unit_type equal to CRA_NUT. + + IDR access unit: An access unit in which the coded picture is an IDR + picture. + + IDR picture: An IRAP picture for which each VCL NAL unit has + nal_unit_type equal to IDR_W_RADL or IDR_N_LP. + + IRAP access unit: An access unit in which the coded picture is an + IRAP picture. + + IRAP picture: A coded picture for which each VCL NAL unit has + nal_unit_type in the range of BLA_W_LP (16) to RSV_IRAP_VCL23 (23), + inclusive. + + layer: A set of VCL NAL units that all have a particular value of + nuh_layer_id and the associated non-VCL NAL units, or one of a set of + syntactical structures having a hierarchical relationship. + + operation point: A bitstream created from another bitstream by + operation of the sub-bitstream extraction process with the other + bitstream, a target highest TemporalId, and a target-layer identifier + list as input. + + random access: The act of starting the decoding process for a + bitstream at a point other than the beginning of the bitstream. + + sub-layer: A temporal scalable layer of a temporal scalable bitstream + consisting of VCL NAL units with a particular value of the TemporalId + variable, and the associated non-VCL NAL units. + + sub-layer representation: A subset of the bitstream consisting of NAL + units of a particular sub-layer and the lower sub-layers. + + tile: A rectangular region of coding tree blocks within a particular + tile column and a particular tile row in a picture. + + tile column: A rectangular region of coding tree blocks having a + height equal to the height of the picture and a width specified by + syntax elements in the picture parameter set.
+ + tile row: A rectangular region of coding tree blocks having a height + specified by syntax elements in the picture parameter set and a width + equal to the width of the picture. + +3.1.2. Definitions Specific to This Memo + + dependee RTP stream: An RTP stream on which another RTP stream + depends. All RTP streams in a Multiple RTP streams on a Single media + Transport (MRST) or Multiple RTP streams on Multiple media Transports + (MRMT), except for the highest RTP stream, are dependee RTP streams. + + highest RTP stream: The RTP stream on which no other RTP stream + depends. The RTP stream in a Single RTP stream on a Single media + Transport (SRST) is the highest RTP stream. + + Media-Aware Network Element (MANE): A network element, such as a + middlebox, selective forwarding unit, or application-layer gateway + that is capable of parsing certain aspects of the RTP payload headers + or the RTP payload and reacting to their contents. + + Informative note: The concept of a MANE goes beyond normal routers + or gateways in that a MANE has to be aware of the signaling (e.g., + to learn about the payload type mappings of the media streams), + and in that it has to be trusted when working with Secure RTP + (SRTP). The advantage of using MANEs is that they allow packets + to be dropped according to the needs of the media coding. For + example, if a MANE has to drop packets due to congestion on a + certain link, it can identify and remove those packets whose + elimination produces the least adverse effect on the user + experience. After dropping packets, MANEs must rewrite RTCP + packets to match the changes to the RTP stream, as specified in + Section 7 of [RFC3550]. + + Media Transport: As used in the MRST, MRMT, and SRST definitions + below, Media Transport denotes the transport of packets over a + transport association identified by a 5-tuple (source address, source + port, destination address, destination port, transport protocol). 
+ See also Section 2.1.13 of [RFC7656]. + + Informative note: The term "bitstream" in this document is + equivalent to the term "encoded stream" in [RFC7656]. + + + + + + +Wang, et al. Standards Track [Page 17] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + Multiple RTP streams on a Single media Transport (MRST): Multiple + RTP streams carrying a single HEVC bitstream on a Single Transport. + See also Section 3.5 of [RFC7656]. + + Multiple RTP streams on Multiple media Transports (MRMT): Multiple + RTP streams carrying a single HEVC bitstream on Multiple Transports. + See also Section 3.5 of [RFC7656]. + + NAL unit decoding order: A NAL unit order that conforms to the + constraints on NAL unit order given in Section 7.4.2.4 in [HEVC]. + + NAL unit output order: A NAL unit order in which NAL units of + different access units are in the output order of the decoded + pictures corresponding to the access units, as specified in [HEVC], + and in which NAL units within an access unit are in their decoding + order. + + NAL-unit-like structure: A data structure that is similar to NAL + units in the sense that it also has a NAL unit header and a payload, + with a difference that the payload does not follow the start code + emulation prevention mechanism required for the NAL unit syntax as + specified in Section 7.3.1.1 of [HEVC]. Examples of NAL-unit-like + structures defined in this memo are packet payloads of Aggregation + Packet (AP), PAyload Content Information (PACI), and Fragmentation + Unit (FU) packets. + + NALU-time: The value that the RTP timestamp would have if the NAL + unit would be transported in its own RTP packet. + + RTP stream: See [RFC7656]. Within the scope of this memo, one RTP + stream is utilized to transport one or more temporal sub-layers. + + Single RTP stream on a Single media Transport (SRST): Single RTP + stream carrying a single HEVC bitstream on a Single (Media) + Transport. See also Section 3.5 of [RFC7656]. 
+ + transmission order: The order of packets in ascending RTP sequence + number order (in modulo arithmetic). Within an aggregation packet, + the NAL unit transmission order is the same as the order of + appearance of NAL units in the packet. + + + + + + + + + + + +Wang, et al. Standards Track [Page 18] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + +3.2. Abbreviations + + AP Aggregation Packet + + BLA Broken Link Access + + CRA Clean Random Access + + CTB Coding Tree Block + + CTU Coding Tree Unit + + CVS Coded Video Sequence + + DPH Decoded Picture Hash + + FU Fragmentation Unit + + HRD Hypothetical Reference Decoder + + IDR Instantaneous Decoding Refresh + + IRAP Intra Random Access Point + + MANE Media-Aware Network Element + + MRMT Multiple RTP streams on Multiple media Transports + + MRST Multiple RTP streams on a Single media Transport + + MTU Maximum Transfer Unit + + NAL Network Abstraction Layer + + NALU Network Abstraction Layer Unit + + PACI PAyload Content Information + + PHES Payload Header Extension Structure + + PPS Picture Parameter Set + + RADL Random Access Decodable Leading (Picture) + + RASL Random Access Skipped Leading (Picture) + + RPS Reference Picture Set + + + + +Wang, et al. Standards Track [Page 19] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + SEI Supplemental Enhancement Information + + SPS Sequence Parameter Set + + SRST Single RTP stream on a Single media Transport + + STSA Step-wise Temporal Sub-layer Access + + TSA Temporal Sub-layer Access + + TSCI Temporal Scalability Control Information + + VCL Video Coding Layer + + VPS Video Parameter Set + +4. RTP Payload Format + +4.1. RTP Header Usage + + The format of the RTP header is specified in [RFC3550] (reprinted as + Figure 2 for convenience). This payload format uses the fields of + the header in a manner consistent with that specification. 
+
+   The RTP payload (and the settings for some RTP header bits) for
+   aggregation packets and fragmentation units are specified in Sections
+   4.4.2 and 4.4.3, respectively.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |V=2|P|X|  CC   |M|     PT      |       sequence number         |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                           timestamp                           |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |           synchronization source (SSRC) identifier            |
+   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
+   |            contributing source (CSRC) identifiers             |
+   |                             ....                              |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+   Figure 2: RTP Header According to [RFC3550]
+
+   The RTP header fields are set as follows for this RTP payload
+   format:
+
+   Marker bit (M): 1 bit
+
+      Set for the last packet of the access unit, carried in the current
+      RTP stream. This is in line with the normal use of the M bit in
+      video formats to allow an efficient playout buffer handling. When
+      MRST or MRMT is in use, if an access unit appears in multiple RTP
+      streams, the marker bit is set on each RTP stream's last packet of
+      the access unit.
+
+         Informative note: The content of a NAL unit does not tell
+         whether or not the NAL unit is the last NAL unit, in decoding
+         order, of an access unit. An RTP sender implementation may
+         obtain this information from the video encoder.
If, however, + the implementation cannot obtain this information directly from + the encoder, e.g., when the bitstream was pre-encoded, and also + there is no timestamp allocated for each NAL unit, then the + sender implementation can inspect subsequent NAL units in + decoding order to determine whether or not the NAL unit is the + last NAL unit of an access unit as follows. A NAL unit is + determined to be the last NAL unit of an access unit if it is + the last NAL unit of the bitstream. A NAL unit naluX is also + determined to be the last NAL unit of an access unit if both + the following conditions are true: 1) the next VCL NAL unit + naluY in decoding order has the high-order bit of the first + byte after its NAL unit header equal to 1, and 2) all NAL units + between naluX and naluY, when present, have nal_unit_type in + the range of 32 to 35, inclusive, equal to 39, or in the ranges + of 41 to 44, inclusive, or 48 to 55, inclusive. + + Payload Type (PT): 7 bits + + The assignment of an RTP payload type for this new packet format + is outside the scope of this document and will not be specified + here. The assignment of a payload type has to be performed either + through the profile used or in a dynamic way. + + Informative note: It is not required to use different payload + type values for different RTP streams in MRST or MRMT. + + Sequence Number (SN): 16 bits + + Set and used in accordance with [RFC3550]. + + + + + + +Wang, et al. Standards Track [Page 21] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + Timestamp: 32 bits + + The RTP timestamp is set to the sampling timestamp of the content. + A 90 kHz clock rate MUST be used. + + If the NAL unit has no timing properties of its own (e.g., + parameter set and SEI NAL units), the RTP timestamp MUST be set to + the RTP timestamp of the coded picture of the access unit in which + the NAL unit (according to Section 7.4.2.4.4 of [HEVC]) is + included. 
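Two of the header rules above lend themselves to a small sketch: the 90 kHz timestamp mapping and the check from the informative note under the marker bit. The code below is only an illustration, not a normative procedure. All function names are hypothetical, NAL units are assumed to be available as byte strings in decoding order, and the guard that treats NAL units of the listed "between" types as belonging to the following access unit is an assumption added here, not a statement of the note.

```python
CLOCK_RATE_HZ = 90_000  # clock rate required by this payload format


def rtp_timestamp(sampling_time_s: float, random_offset: int = 0) -> int:
    """Map a sampling instant in seconds to a 32-bit RTP timestamp."""
    return (random_offset + round(sampling_time_s * CLOCK_RATE_HZ)) % (1 << 32)


def nal_type(nalu: bytes) -> int:
    """Extract the 6-bit Type field from the first NAL unit header byte."""
    return (nalu[0] >> 1) & 0x3F


def allowed_between(t: int) -> bool:
    """NAL unit types that condition 2 of the note allows between naluX and naluY."""
    return 32 <= t <= 35 or t == 39 or 41 <= t <= 44 or 48 <= t <= 55


def is_last_nalu_of_au(nalus: list[bytes], i: int) -> bool:
    """Apply the two conditions of the informative note to nalus[i]."""
    if i == len(nalus) - 1:
        return True  # last NAL unit of the bitstream
    if allowed_between(nal_type(nalus[i])):
        # Assumption (not stated in the note): such a NAL unit is taken to
        # belong to the following access unit, so it is never reported as last.
        return False
    for nxt in nalus[i + 1:]:
        t = nal_type(nxt)
        if t < 32:
            # Next VCL NAL unit: test the high-order bit of the first byte
            # after the two-byte NAL unit header (condition 1).
            return len(nxt) > 2 and bool(nxt[2] & 0x80)
        if not allowed_between(t):
            return False  # condition 2 violated
    return False  # no following VCL NAL unit seen: undetermined, stay conservative
```

A sender that cannot learn access unit boundaries from its encoder could call `is_last_nalu_of_au` while packetizing and set the M bit on the packet that carries the NAL unit for which it returns True.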
+ + Receivers MUST use the RTP timestamp for the display process, even + when the bitstream contains picture timing SEI messages or + decoding unit information SEI messages as specified in [HEVC]. + However, this does not mean that picture timing SEI messages in + the bitstream should be discarded, as picture timing SEI messages + may contain frame-field information that is important in + appropriately rendering interlaced video. + + Synchronization source (SSRC): 32 bits + + Used to identify the source of the RTP packets. When using SRST, + by definition a single SSRC is used for all parts of a single + bitstream. In MRST or MRMT, different SSRCs are used for each RTP + stream containing a subset of the sub-layers of the single + (temporally scalable) bitstream. A receiver is required to + correctly associate the set of SSRCs that are included parts of + the same bitstream. + +4.2. Payload Header Usage + + The first two bytes of the payload of an RTP packet are referred to + as the payload header. The payload header consists of the same + fields (F, Type, LayerId, and TID) as the NAL unit header as shown in + Section 1.1.4, irrespective of the type of the payload structure. + + The TID value indicates (among other things) the relative importance + of an RTP packet, for example, because NAL units belonging to higher + temporal sub-layers are not used for the decoding of lower temporal + sub-layers. A lower value of TID indicates a higher importance. + More-important NAL units MAY be better protected against transmission + losses than less-important NAL units. + + + + + + + + + +Wang, et al. Standards Track [Page 22] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + +4.3. Transmission Modes + + This memo enables transmission of an HEVC bitstream over: + + o a Single RTP stream on a Single media Transport (SRST), + + o Multiple RTP streams over a Single media Transport (MRST), or + + o Multiple RTP streams on Multiple media Transports (MRMT). 
+ + Informative note: While this specification enables the use of MRST + within the H.265 RTP payload, the signaling of MRST within SDP + offer/answer is not fully specified at the time of this writing. + See [RFC5576] and [RFC5583] for what is supported today as well as + [RTP-MULTI-STREAM] and [SDP-NEG] for future directions. + + When in MRMT, the dependency of one RTP stream on another RTP stream + is typically indicated as specified in [RFC5583]. [RFC5583] can also + be utilized to specify dependencies within MRST, but only if the RTP + streams utilize distinct payload types. + + SRST or MRST SHOULD be used for point-to-point unicast scenarios, + whereas MRMT SHOULD be used for point-to-multipoint multicast + scenarios where different receivers require different operation + points of the same HEVC bitstream, to improve bandwidth utilizing + efficiency. + + Informative note: A multicast may degrade to a unicast after all + but one receivers have left (this is a justification of the first + "SHOULD" instead of "MUST"), and there might be scenarios where + MRMT is desirable but not possible, e.g., when IP multicast is not + deployed in certain network (this is a justification of the second + "SHOULD" instead of "MUST"). + + The transmission mode is indicated by the tx-mode media parameter + (see Section 7.1). If tx-mode is equal to "SRST", SRST MUST be used. + Otherwise, if tx-mode is equal to "MRST", MRST MUST be used. + Otherwise (tx-mode is equal to "MRMT"), MRMT MUST be used. + + Informative note: When an RTP stream does not depend on other RTP + streams, any of SRST, MRST, or MRMT may be in use for the RTP + stream. + + Receivers MUST support all of SRST, MRST, and MRMT. + + Informative note: The required support of MRMT by receivers does + not imply that multicast must be supported by receivers. + + + + +Wang, et al. Standards Track [Page 23] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + +4.4. 
Payload Structures
+
+   Four different types of RTP packet payload structures are specified.
+   A receiver can identify the type of an RTP packet payload through the
+   Type field in the payload header.
+
+   The four different payload structures are as follows:
+
+   o  Single NAL unit packet: Contains a single NAL unit in the payload,
+      and the NAL unit header of the NAL unit also serves as the payload
+      header. This payload structure is specified in Section 4.4.1.
+
+   o  Aggregation Packet (AP): Contains more than one NAL unit within
+      one access unit. This payload structure is specified in Section
+      4.4.2.
+
+   o  Fragmentation Unit (FU): Contains a subset of a single NAL unit.
+      This payload structure is specified in Section 4.4.3.
+
+   o  PACI carrying RTP packet: Contains a payload header (that differs
+      from other payload headers for efficiency), a Payload Header
+      Extension Structure (PHES), and a PACI payload. This payload
+      structure is specified in Section 4.4.4.
+
+4.4.1. Single NAL Unit Packets
+
+   A single NAL unit packet contains exactly one NAL unit, and consists
+   of a payload header (denoted as PayloadHdr), a conditional 16-bit
+   DONL field (in network byte order), and the NAL unit payload data
+   (the NAL unit excluding its NAL unit header) of the contained NAL
+   unit, as shown in Figure 3.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |           PayloadHdr          |      DONL (conditional)       |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                                                               |
+   |                  NAL unit payload data                        |
+   |                                                               |
+   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                               :...OPTIONAL RTP padding        |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+   Figure 3: The Structure of a Single NAL Unit Packet
+
+   The payload header SHOULD be an exact copy of the NAL unit header of
+   the contained NAL unit. However, the Type (i.e., nal_unit_type)
+   field MAY be changed, e.g., when it is desirable to handle a CRA
+   picture as a BLA picture [JCTVC-J0107].
+
+   The DONL field, when present, specifies the value of the 16 least
+   significant bits of the decoding order number of the contained NAL
+   unit. If sprop-max-don-diff is greater than 0 for any of the RTP
+   streams, the DONL field MUST be present, and the variable DON for the
+   contained NAL unit is derived as equal to the value of the DONL
+   field. Otherwise (sprop-max-don-diff is equal to 0 for all the RTP
+   streams), the DONL field MUST NOT be present.
+
+4.4.2. Aggregation Packets (APs)
+
+   Aggregation Packets (APs) are introduced to enable the reduction of
+   packetization overhead for small NAL units, such as most of the non-
+   VCL NAL units, which are often only a few octets in size.
+
+   An AP aggregates NAL units within one access unit. Each NAL unit to
+   be carried in an AP is encapsulated in an aggregation unit. NAL
+   units aggregated in one AP are in NAL unit decoding order.
+
+   An AP consists of a payload header (denoted as PayloadHdr) followed
+   by two or more aggregation units, as shown in Figure 4.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |    PayloadHdr (Type=48)       |                               |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
+   |                                                               |
+   |             two or more aggregation units                     |
+   |                                                               |
+   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                               :...OPTIONAL RTP padding        |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+   Figure 4: The Structure of an Aggregation Packet
+
+   The fields in the payload header are set as follows. The F bit MUST
+   be equal to 0 if the F bit of each aggregated NAL unit is equal to
+   zero; otherwise, it MUST be equal to 1.
The Type field MUST be equal + to 48. The value of LayerId MUST be equal to the lowest value of + LayerId of all the aggregated NAL units. The value of TID MUST be + the lowest value of TID of all the aggregated NAL units. + + + + + +Wang, et al. Standards Track [Page 25] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + Informative note: All VCL NAL units in an AP have the same TID + value since they belong to the same access unit. However, an AP + may contain non-VCL NAL units for which the TID value in the NAL + unit header may be different than the TID value of the VCL NAL + units in the same AP. + + An AP MUST carry at least two aggregation units and can carry as many + aggregation units as necessary; however, the total amount of data in + an AP obviously MUST fit into an IP packet, and the size SHOULD be + chosen so that the resulting IP packet is smaller than the MTU size + so to avoid IP layer fragmentation. An AP MUST NOT contain FUs + specified in Section 4.4.3. APs MUST NOT be nested; i.e., an AP must + not contain another AP. + + The first aggregation unit in an AP consists of a conditional 16-bit + DONL field (in network byte order) followed by a 16-bit unsigned size + information (in network byte order) that indicates the size of the + NAL unit in bytes (excluding these two octets, but including the NAL + unit header), followed by the NAL unit itself, including its NAL unit + header, as shown in Figure 5. 
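As an illustration of the AP rules given so far, the following sketch builds the payload of an AP from complete NAL units. It assumes that sprop-max-don-diff is equal to 0 for all RTP streams, so no DONL field is written. The function names and the `max_payload` parameter (a stand-in for whatever MTU budget the sender uses) are hypothetical and not defined by this memo.

```python
import struct

AP_TYPE = 48  # Type value of an aggregation packet
FU_TYPE = 49  # Type value of a fragmentation unit


def parse_nal_header(nalu: bytes) -> tuple[int, int, int, int]:
    """Return (F, Type, LayerId, TID) from the two-byte NAL unit header."""
    b0, b1 = nalu[0], nalu[1]
    return b0 >> 7, (b0 >> 1) & 0x3F, ((b0 & 0x01) << 5) | (b1 >> 3), b1 & 0x07


def build_nal_header(f: int, nal_type: int, layer_id: int, tid: int) -> bytes:
    """Pack F (1 bit), Type (6 bits), LayerId (6 bits), and TID (3 bits)."""
    return bytes([(f << 7) | (nal_type << 1) | (layer_id >> 5),
                  ((layer_id & 0x1F) << 3) | tid])


def build_ap(nalus: list[bytes], max_payload: int = 1200) -> bytes:
    """Build an AP payload without DONL fields (sprop-max-don-diff == 0 assumed)."""
    if len(nalus) < 2:
        raise ValueError("an AP carries at least two aggregation units")
    headers = [parse_nal_header(n) for n in nalus]
    if any(h[1] in (AP_TYPE, FU_TYPE) for h in headers):
        raise ValueError("an AP must not contain APs or FUs")
    f = int(any(h[0] for h in headers))      # 0 only if every F bit is 0
    layer_id = min(h[2] for h in headers)    # lowest LayerId of all units
    tid = min(h[3] for h in headers)         # lowest TID of all units
    out = bytearray(build_nal_header(f, AP_TYPE, layer_id, tid))
    for n in nalus:
        if len(n) > 0xFFFF:
            raise ValueError("NAL unit too large for a 16-bit size field")
        # 16-bit size in network byte order, then the NAL unit with its header.
        out += struct.pack("!H", len(n)) + n
    if len(out) > max_payload:
        raise ValueError("AP exceeds the payload budget; aggregate fewer units")
    return bytes(out)
```

The caller is expected to pass NAL units of one access unit in decoding order; the sketch does not verify either property.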
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+                   :       DONL (conditional)      |   NALU size   |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |   NALU size   |                                               |
+   +-+-+-+-+-+-+-+-+         NAL unit                              |
+   |                                                               |
+   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                               :
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+   Figure 5: The Structure of the First Aggregation Unit in an AP
+
+   The DONL field, when present, specifies the value of the 16 least
+   significant bits of the decoding order number of the aggregated NAL
+   unit.
+
+   If sprop-max-don-diff is greater than 0 for any of the RTP streams,
+   the DONL field MUST be present in an aggregation unit that is the
+   first aggregation unit in an AP, and the variable DON for the
+   aggregated NAL unit is derived as equal to the value of the DONL
+   field. Otherwise (sprop-max-don-diff is equal to 0 for all the RTP
+   streams), the DONL field MUST NOT be present in an aggregation unit
+   that is the first aggregation unit in an AP.
+
+   An aggregation unit that is not the first aggregation unit in an AP
+   consists of a conditional 8-bit DOND field followed by a 16-bit
+   unsigned size information (in network byte order) that indicates the
+   size of the NAL unit in bytes (excluding these two octets, but
+   including the NAL unit header), followed by the NAL unit itself,
+   including its NAL unit header, as shown in Figure 6.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+                   :  DOND (cond)  |          NALU size            |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                                                               |
+   |                           NAL unit                            |
+   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                               :
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+   Figure 6: The Structure of an Aggregation Unit That Is Not the
+   First Aggregation Unit in an AP
+
+   When present, the DOND field plus 1 specifies the difference between
+   the decoding order number values of the current aggregated NAL unit
+   and the preceding aggregated NAL unit in the same AP.
+
+   If sprop-max-don-diff is greater than 0 for any of the RTP streams,
+   the DOND field MUST be present in an aggregation unit that is not the
+   first aggregation unit in an AP, and the variable DON for the
+   aggregated NAL unit is derived as equal to the DON of the preceding
+   aggregated NAL unit in the same AP plus the value of the DOND field
+   plus 1 modulo 65536. Otherwise (sprop-max-don-diff is equal to 0 for
+   all the RTP streams), the DOND field MUST NOT be present in an
+   aggregation unit that is not the first aggregation unit in an AP, and
+   in this case the transmission order and decoding order of NAL units
+   carried in the AP are the same as the order the NAL units appear in
+   the AP.
+
+   Figure 7 presents an example of an AP that contains two aggregation
+   units, labeled as 1 and 2 in the figure, without the DONL and DOND
+   fields being present.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                          RTP Header                           |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |   PayloadHdr (Type=48)        |         NALU 1 Size           |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |          NALU 1 HDR           |                               |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+         NALU 1 Data           |
+   |                   . . .                                       |
+   |                                                               |
+   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |  . . .        | NALU 2 Size                   | NALU 2 HDR    |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   | NALU 2 HDR    |                                               |
+   +-+-+-+-+-+-+-+-+              NALU 2 Data                      |
+   |                   . . .                                       |
+   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                               :...OPTIONAL RTP padding        |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+   Figure 7: An Example of an AP Packet Containing Two Aggregation
+   Units without the DONL and DOND Fields
+
+   Figure 8 presents an example of an AP that contains two aggregation
+   units, labeled as 1 and 2 in the figure, with the DONL and DOND
+   fields being present.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                          RTP Header                           |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |   PayloadHdr (Type=48)        |        NALU 1 DONL            |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |          NALU 1 Size          |            NALU 1 HDR         |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                                                               |
+   |                 NALU 1 Data   . . .                           |
+   |                                                               |
+   +     . . .     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |               |  NALU 2 DOND  |          NALU 2 Size          |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |          NALU 2 HDR           |                               |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+          NALU 2 Data          |
+   |                                                               |
+   |        . . .                  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                               :...OPTIONAL RTP padding        |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+   Figure 8: An Example of an AP Containing Two Aggregation Units
+   with the DONL and DOND Fields
+
+4.4.3. Fragmentation Units
+
+   Fragmentation Units (FUs) are introduced to enable fragmenting a
+   single NAL unit into multiple RTP packets, possibly without
+   cooperation or knowledge of the HEVC encoder. A fragment of a NAL
+   unit consists of an integer number of consecutive octets of that NAL
+   unit. Fragments of the same NAL unit MUST be sent in consecutive
+   order with ascending RTP sequence numbers (with no other RTP packets
+   within the same RTP stream being sent between the first and last
+   fragment).
+
+   When a NAL unit is fragmented and conveyed within FUs, it is referred
+   to as a fragmented NAL unit. APs MUST NOT be fragmented. FUs MUST
+   NOT be nested; i.e., an FU must not contain a subset of another FU.
+
+   The RTP timestamp of an RTP packet carrying an FU is set to the NALU-
+   time of the fragmented NAL unit.
+
+   An FU consists of a payload header (denoted as PayloadHdr), an FU
+   header of one octet, a conditional 16-bit DONL field (in network byte
+   order), and an FU payload, as shown in Figure 9.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |    PayloadHdr (Type=49)       |   FU header   | DONL (cond)   |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-|
+   | DONL (cond)   |                                               |
+   |-+-+-+-+-+-+-+-+                                               |
+   |                         FU payload                            |
+   |                                                               |
+   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                               :...OPTIONAL RTP padding        |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                    Figure 9: The Structure of an FU
+
+   The fields in the payload header are set as follows.  The Type field
+   MUST be equal to 49.  The fields F, LayerId, and TID MUST be equal to
+   the fields F, LayerId, and TID, respectively, of the fragmented NAL
+   unit.
+
+   The FU header consists of an S bit, an E bit, and a 6-bit FuType
+   field, as shown in Figure 10.
+
+   +---------------+
+   |0|1|2|3|4|5|6|7|
+   +-+-+-+-+-+-+-+-+
+   |S|E|  FuType   |
+   +---------------+
+
+                 Figure 10: The Structure of FU Header
+
+   The semantics of the FU header fields are as follows:
+
+   S: 1 bit
+      When set to 1, the S bit indicates the start of a fragmented NAL
+      unit, i.e., the first byte of the FU payload is also the first
+      byte of the payload of the fragmented NAL unit.  When the FU
+      payload is not the start of the fragmented NAL unit payload, the S
+      bit MUST be set to 0.
+
+   E: 1 bit
+      When set to 1, the E bit indicates the end of a fragmented NAL
+      unit, i.e., the last byte of the payload is also the last byte of
+      the fragmented NAL unit.  When the FU payload is not the last
+      fragment of a fragmented NAL unit, the E bit MUST be set to 0.
+
+   FuType: 6 bits
+      The field FuType MUST be equal to the field Type of the fragmented
+      NAL unit.
+
+   The DONL field, when present, specifies the value of the 16 least
+   significant bits of the decoding order number of the fragmented NAL
+   unit.
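   As an illustration of the FU header layout in Figure 10, the
   following Python sketch unpacks the S, E, and FuType fields and
   rebuilds the two-octet NAL unit header of the fragmented NAL unit
   from the FU's PayloadHdr.  The function names are hypothetical, and
   the sketch assumes that PayloadHdr uses the 16-bit HEVC NAL unit
   header layout (F: 1 bit, Type: 6 bits, LayerId: 6 bits, TID: 3
   bits).  It is a sketch only, not a complete de-packetizer.

```python
def parse_fu_header(octet):
    """Split the one-octet FU header into (S, E, FuType)."""
    s_bit = (octet >> 7) & 0x1   # bit 0 in Figure 10: start of fragmented NAL unit
    e_bit = (octet >> 6) & 0x1   # bit 1 in Figure 10: end of fragmented NAL unit
    fu_type = octet & 0x3F       # bits 2..7: Type of the fragmented NAL unit
    if s_bit and e_bit:
        # A non-fragmented NAL unit must not be sent in one FU.
        raise ValueError("S and E must not both be set in the same FU header")
    return s_bit, e_bit, fu_type


def rebuild_nal_header(payload_hdr, fu_type):
    """Return the 16-bit NAL unit header of the fragmented NAL unit.

    payload_hdr is the FU PayloadHdr as a 16-bit integer (Type = 49).
    F, LayerId, and TID are kept; the Type field (bits 14..9) is
    replaced by FuType.
    """
    return (payload_hdr & 0x81FF) | ((fu_type & 0x3F) << 9)
```

   For example, an FU with PayloadHdr 0x6201 (Type 49, LayerId 0, TID 1)
   and FU header 0x93 (S = 1, E = 0, FuType = 19) yields the NAL unit
   header 0x2601.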
+ + If sprop-max-don-diff is greater than 0 for any of the RTP streams, + and the S bit is equal to 1, the DONL field MUST be present in the + FU, and the variable DON for the fragmented NAL unit is derived as + equal to the value of the DONL field. Otherwise (sprop-max-don-diff + is equal to 0 for all the RTP streams, or the S bit is equal to 0), + the DONL field MUST NOT be present in the FU. + + A non-fragmented NAL unit MUST NOT be transmitted in one FU; i.e., + the Start bit and End bit must not both be set to 1 in the same FU + header. + + The FU payload consists of fragments of the payload of the fragmented + NAL unit so that if the FU payloads of consecutive FUs, starting with + an FU with the S bit equal to 1 and ending with an FU with the E bit + equal to 1, are sequentially concatenated, the payload of the + fragmented NAL unit can be reconstructed. The NAL unit header of the + fragmented NAL unit is not included as such in the FU payload, but + rather the information of the NAL unit header of the fragmented NAL + unit is conveyed in F, LayerId, and TID fields of the FU payload + headers of the FUs and the FuType field of the FU header of the FUs. + An FU payload MUST NOT be empty. + + If an FU is lost, the receiver SHOULD discard all following + fragmentation units in transmission order corresponding to the same + fragmented NAL unit, unless the decoder in the receiver is known to + be prepared to gracefully handle incomplete NAL units. + + A receiver in an endpoint or in a MANE MAY aggregate the first n-1 + fragments of a NAL unit to an (incomplete) NAL unit, even if fragment + n of that NAL unit is not received. In this case, the + forbidden_zero_bit of the NAL unit MUST be set to 1 to indicate a + syntax violation. + + + + + +Wang, et al. Standards Track [Page 31] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + +4.4.4. PACI Packets + + This section specifies the PACI packet structure. 
The basic payload
+   header specified in this memo is intentionally limited to the 16 bits
+   of the NAL unit header so as to keep the packetization overhead to a
+   minimum.  However, cases have been identified where it is advisable
+   to include control information in an easily accessible position in
+   the packet header, despite the additional overhead.  One such piece
+   of control information is the TSCI as specified in Section 4.5.  PACI
+   packets carry this and future, similar structures.
+
+   The PACI packet structure is based on a payload header extension
+   mechanism that is generic and extensible to carry payload header
+   extensions.  In this section, the focus lies on the use within this
+   specification.  Section 4.4.4.2 provides guidance for the
+   specification designers in how to employ the extension mechanism in
+   future specifications.
+
+   A PACI packet consists of a payload header (denoted as PayloadHdr),
+   for which the structure follows what is described in Section 4.2.
+   The payload header is followed by the fields A, cType, PHSsize,
+   F[0..2], and Y.
+
+   Figure 11 shows a PACI packet in compliance with this memo, i.e.,
+   without any extensions.
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |    PayloadHdr (Type=50)       |A|   cType   | PHSsize |F0..2|Y|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |        Payload Header Extension Structure (PHES)              |
+   |=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=|
+   |                                                               |
+   |                  PACI payload: NAL unit                       |
+   |                   . . .                                       |
+   |                                                               |
+   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                               :...OPTIONAL RTP padding        |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+                   Figure 11: The Structure of a PACI
+
+   The fields in the payload header are set as follows.  The F bit MUST
+   be equal to 0.
The Type field MUST be equal to 50. The value of + LayerId MUST be a copy of the LayerId field of the PACI payload NAL + unit or NAL-unit-like structure. The value of TID MUST be a copy of + the TID field of the PACI payload NAL unit or NAL-unit-like + structure. + + The semantics of other fields are as follows: + + A: 1 bit + Copy of the F bit of the PACI payload NAL unit or NAL-unit-like + structure. + + cType: 6 bits + Copy of the Type field of the PACI payload NAL unit or NAL-unit- + like structure. + + PHSsize: 5 bits + Indicates the length of the PHES field. The value is limited to + be less than or equal to 32 octets, to simplify encoder design for + MTU size matching. + + F0: + This field equal to 1 specifies the presence of a temporal + scalability support extension in the PHES. + + F1, F2: + MUST be 0, available for future extensions, see Section 4.4.4.2. + Receivers compliant with this version of the HEVC payload format + MUST ignore F1=1 and/or F2=1, and also ignore any information in + the PHES indicated as present by F1=1 and/or F2=1. + + Informative note: The receiver can do that by first decoding + information associated with F0=1, and then skipping over any + remaining bytes of the PHES based on the value of PHSsize. + + Y: 1 bit + MUST be 0, available for future extensions, see Section 4.4.4.2. + Receivers compliant with this version of the HEVC payload format + MUST ignore Y=1, and also ignore any information in the PHES + indicated as present by Y. + + PHES: variable number of octets + A variable number of octets as indicated by the value of PHSsize. + + PACI Payload: + The single NAL unit packet or NAL-unit-like structure (such as: FU + or AP) to be carried, not including the first two octets. + + + +Wang, et al. Standards Track [Page 33] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + Informative note: The first two octets of the NAL unit or NAL- + unit-like structure carried in the PACI payload are not + included in the PACI payload. 
Rather, the respective values + are copied in locations of the PayloadHdr of the RTP packet. + This design offers two advantages: first, the overall structure + of the payload header is preserved, i.e., there is no special + case of payload header structure that needs to be implemented + for PACI. Second, no additional overhead is introduced. + + A PACI payload MAY be a single NAL unit, an FU, or an AP. PACIs + MUST NOT be fragmented or aggregated. The following subsection + documents the reasons for these design choices. + +4.4.4.1. Reasons for the PACI Rules (Informative) + + A PACI cannot be fragmented. If a PACI could be fragmented, and a + fragment other than the first fragment got lost, access to the + information in the PACI would not be possible. Therefore, a PACI + must not be fragmented. In other words, an FU must not carry + (fragments of) a PACI. + + A PACI cannot be aggregated. Aggregation of PACIs is inadvisable + from a compression viewpoint, as, in many cases, several to be + aggregated NAL units would share identical PACI fields and values + which would be carried redundantly for no reason. Most, if not all, + of the practical effects of PACI aggregation can be achieved by + aggregating NAL units and bundling them with a PACI (see below). + Therefore, a PACI must not be aggregated. In other words, an AP must + not contain a PACI. + + The payload of a PACI can be a fragment. Both middleboxes and + sending systems with inflexible (often hardware-based) encoders + occasionally find themselves in situations where a PACI and its + headers, combined, are larger than the MTU size. In such a scenario, + the middlebox or sender can fragment the NAL unit and encapsulate the + fragment in a PACI. Doing so preserves the payload header extension + information for all fragments, allowing downstream middleboxes and + the receiver to take advantage of that information. 
Therefore, a
+   sender may place a fragment into a PACI, and a receiver must be able
+   to handle such a PACI.
+
+   The payload of a PACI can be an aggregation NAL unit.  HEVC
+   bitstreams can contain unevenly sized and/or small (when compared to
+   the MTU size) NAL units.  In order to efficiently packetize such
+   small NAL units, APs were introduced.  The benefits of APs are
+   independent from the need for a payload header extension.  Therefore,
+   a sender may place an AP into a PACI, and a receiver must be able to
+   handle such a PACI.
+
+4.4.4.2.  PACI Extensions (Informative)
+
+   This section includes recommendations for future specification
+   designers on how to extend the PACI syntax to accommodate future
+   extensions.  Obviously, designers are free to specify whatever
+   appears to be appropriate to them at the time of their design.
+   However, a lot of thought has been invested into the extension
+   mechanism described below, and we suggest that deviations from it
+   warrant a good explanation.
+
+   This memo defines only a single payload header extension (TSCI,
+   described in Section 4.5); therefore, only the F0 bit carries
+   semantics.  F1 and F2 are already named (and not just marked as
+   reserved, as a typical video spec designer would do).  They are
+   intended to signal two additional extensions.  The Y bit allows one
+   to, recursively, add further F and Y bits to extend the mechanism
+   beyond three possible payload header extensions.  It is suggested to
+   define a new packet type (using a different value for Type) when
+   assigning the F1, F2, or Y bits different semantics than what is
+   suggested below.
+
+   When a Y bit is set, an 8-bit flag-extension is inserted after the Y
+   bit.  A flag-extension consists of 7 flags F[n..n+6], and another Y
+   bit.
+
+   The basic PACI header already includes F0, F1, and F2.
Therefore,
+   the Fx bits in the first flag-extension are numbered F3, F4, ...,
+   F9; the Fx bits in the second flag-extension are numbered F10, F11,
+   ..., F16, and so forth.  As a result, at least three Fx bits are
+   always in the PACI, but the number of Fx bits (and associated types
+   of extensions) can be increased by setting the next Y bit and adding
+   a flag-extension octet, carrying seven flags and another Y bit.
+   The size of this list of flags is subject to the limits specified in
+   Section 4.4.4 (32 octets for all flag-extensions and the PHES
+   information combined).
+
+   Each of the F bits can indicate either the presence or the absence of
+   certain information in the Payload Header Extension Structure (PHES).
+
+   When a spec developer devises a new syntax that takes advantage of
+   the PACI extension mechanism, he/she must follow the constraints
+   listed below; otherwise, the extension mechanism may break.
+
+      1) The fields added for a particular Fx bit MUST be fixed in
+         length and not depend on what other Fx bits are set (no parsing
+         dependency).
+
+      2) The Fx bits must be assigned in order.
+
+      3) An implementation that supports the n-th Fn bit for any value
+         of n must understand the syntax (though not necessarily the
+         semantics) of the fields Fk (with k < n), so as to be able to
+         either use those bits when present, or at least be able to skip
+         over them.
+
+4.5.  Temporal Scalability Control Information
+
+   This section describes the single payload header extension defined in
+   this specification, known as TSCI.  If, in the future, additional
+   payload header extensions become necessary, they could be specified
+   in this section of an updated version of this document, or in their
+   own documents.
+
+   When F0 is set to 1 in a PACI, this specifies that the PHES field
+   includes the TSCI fields TL0PICIDX, IrapPicID, S, and E as follows:
+
+    0                   1                   2                   3
+    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |    PayloadHdr (Type=50)       |A|   cType   | PHSsize |F0..2|Y|
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |   TL0PICIDX   |   IrapPicID   |S|E|    RES    |               |
+   |-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
+   |                           ....                                |
+   |               PACI payload: NAL unit                          |
+   |                                                               |
+   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+   |                               :...OPTIONAL RTP padding        |
+   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+     Figure 12: The Structure of a PACI with a PHES Containing a TSCI
+
+   TL0PICIDX (8 bits)
+      When present, the TL0PICIDX field MUST be set equal to
+      temporal_sub_layer_zero_idx as specified in Section D.3.22 of
+      [HEVC] for the access unit containing the NAL unit in the PACI.
+
+   IrapPicID (8 bits)
+      When present, the IrapPicID field MUST be set equal to
+      irap_pic_id as specified in Section D.3.22 of [HEVC] for the
+      access unit containing the NAL unit in the PACI.
+
+   S (1 bit)
+      The S bit MUST be set to 1 if any of the following conditions is
+      true and MUST be set to 0 otherwise:
+
+      o  The NAL unit in the payload of the PACI is the first VCL NAL
+         unit, in decoding order, of a picture.
+
+      o  The NAL unit in the payload of the PACI is an AP, and the NAL
+         unit in the first contained aggregation unit is the first VCL
+         NAL unit, in decoding order, of a picture.
+
+      o  The NAL unit in the payload of the PACI is an FU with its S bit
+         equal to 1 and the FU payload containing a fragment of the
+         first VCL NAL unit, in decoding order, of a picture.
+
+   E (1 bit)
+      The E bit MUST be set to 1 if any of the following conditions is
+      true and MUST be set to 0 otherwise:
+
+      o  The NAL unit in the payload of the PACI is the last VCL NAL
+         unit, in decoding order, of a picture.
+
+      o  The NAL unit in the payload of the PACI is an AP and the NAL
+         unit in the last contained aggregation unit is the last VCL NAL
+         unit, in decoding order, of a picture.
+
+      o  The NAL unit in the payload of the PACI is an FU with its E bit
+         equal to 1 and the FU payload containing a fragment of the last
+         VCL NAL unit, in decoding order, of a picture.
+
+   RES (6 bits)
+      MUST be equal to 0.  Reserved for future extensions.
+
+   The value of PHSsize MUST be set to 3.  Receivers MUST allow other
+   values of the fields F0, F1, F2, Y, and PHSsize, and MUST ignore any
+   fields in the PHES, when present, other than those specified above.
+
+4.6.  Decoding Order Number
+
+   For each NAL unit, the variable AbsDon is derived, representing the
+   decoding order number that is indicative of the NAL unit decoding
+   order.
+
+   Let NAL unit n be the n-th NAL unit in transmission order within an
+   RTP stream.
+
+   If sprop-max-don-diff is equal to 0 for all the RTP streams carrying
+   the HEVC bitstream, AbsDon[n], the value of AbsDon for NAL unit n, is
+   derived as equal to n.
+
+   Otherwise (sprop-max-don-diff is greater than 0 for any of the RTP
+   streams), AbsDon[n] is derived as follows, where DON[n] is the value
+   of the variable DON for NAL unit n:
+
+   o  If n is equal to 0 (i.e., NAL unit n is the very first NAL unit in
+      transmission order), AbsDon[0] is set equal to DON[0].
+
+   o  Otherwise (n is greater than 0), the following applies for
+      derivation of AbsDon[n]:
+
+         If DON[n] == DON[n-1],
+            AbsDon[n] = AbsDon[n-1]
+
+         If (DON[n] > DON[n-1] and DON[n] - DON[n-1] < 32768),
+            AbsDon[n] = AbsDon[n-1] + DON[n] - DON[n-1]
+
+         If (DON[n] < DON[n-1] and DON[n-1] - DON[n] >= 32768),
+            AbsDon[n] = AbsDon[n-1] + 65536 - DON[n-1] + DON[n]
+
+         If (DON[n] > DON[n-1] and DON[n] - DON[n-1] >= 32768),
+            AbsDon[n] = AbsDon[n-1] - (DON[n-1] + 65536 - DON[n])
+
+         If (DON[n] < DON[n-1] and DON[n-1] - DON[n] < 32768),
+            AbsDon[n] = AbsDon[n-1] - (DON[n-1] - DON[n])
+
+   For any two NAL units m and n, the following applies:
+
+   o  AbsDon[n] greater than AbsDon[m] indicates that NAL unit n follows
+      NAL unit m in NAL unit decoding order.
+
+   o  When AbsDon[n] is equal to AbsDon[m], the NAL unit decoding order
+      of the two NAL units can be in either order.
+
+   o  AbsDon[n] less than AbsDon[m] indicates that NAL unit n precedes
+      NAL unit m in decoding order.
+
+      Informative note: When two consecutive NAL units in the NAL
+      unit decoding order have different values of AbsDon, the
+      absolute difference between the two AbsDon values may be
+      greater than or equal to 1.
+
+      Informative note: There are multiple reasons to allow for the
+      absolute difference of the values of AbsDon for two consecutive
+      NAL units in the NAL unit decoding order to be greater than
+      one.  An increment by one is not required, as at the time of
+      associating values of AbsDon to NAL units, it may not be known
+      whether all NAL units are to be delivered to the receiver.  For
+      example, a gateway may not forward VCL NAL units of higher sub-
+      layers or some SEI NAL units when there is congestion in the
+      network.
In another example, the first intra-coded picture of + a pre-encoded clip is transmitted in advance to ensure that it + is readily available in the receiver, and when transmitting the + first intra-coded picture, the originator does not exactly know + how many NAL units will be encoded before the first intra-coded + picture of the pre-encoded clip follows in decoding order. + Thus, the values of AbsDon for the NAL units of the first + intra-coded picture of the pre-encoded clip have to be + estimated when they are transmitted, and gaps in values of + AbsDon may occur. Another example is MRST or MRMT with sprop- + max-don-diff greater than 0, where the AbsDon values must + indicate cross-layer decoding order for NAL units conveyed in + all the RTP streams. + +5. Packetization Rules + + The following packetization rules apply: + + o If sprop-max-don-diff is greater than 0 for any of the RTP + streams, the transmission order of NAL units carried in the RTP + stream MAY be different than the NAL unit decoding order and the + NAL unit output order. Otherwise (sprop-max-don-diff is equal to + 0 for all the RTP streams), the transmission order of NAL units + carried in the RTP stream MUST be the same as the NAL unit + decoding order and, when tx-mode is equal to "MRST" or "MRMT", + MUST also be the same as the NAL unit output order. + + o A NAL unit of a small size SHOULD be encapsulated in an + aggregation packet together with one or more other NAL units in + order to avoid the unnecessary packetization overhead for small + NAL units. For example, non-VCL NAL units such as access unit + delimiters, parameter sets, or SEI NAL units are typically small + and can often be aggregated with VCL NAL units without violating + MTU size constraints. 
+
+   o  Each non-VCL NAL unit SHOULD, when possible from an MTU size match
+      viewpoint, be encapsulated in an aggregation packet together with
+      its associated VCL NAL unit, as typically a non-VCL NAL unit would
+      be meaningless without the associated VCL NAL unit being
+      available.
+
+   o  For carrying exactly one NAL unit in an RTP packet, a single NAL
+      unit packet MUST be used.
+
+6.  De-packetization Process
+
+   The general concept behind de-packetization is to get the NAL units
+   out of the RTP packets in an RTP stream and all RTP streams the RTP
+   stream depends on, if any, and pass them to the decoder in the NAL
+   unit decoding order.
+
+   The de-packetization process is implementation dependent.  Therefore,
+   the following description should be seen as an example of a suitable
+   implementation.  Other schemes may be used as well, as long as the
+   output for the same input is the same as the process described below.
+   The output is the same when the set of output NAL units and their
+   order are both identical.  Optimizations relative to the described
+   algorithms are possible.
+
+   All normal RTP mechanisms related to buffer management apply.  In
+   particular, duplicated or outdated RTP packets (as indicated by the
+   RTP sequence number and the RTP timestamp) are removed.  To
+   determine the exact time for decoding, factors such as a possible
+   intentional delay to allow for proper inter-stream synchronization
+   must be taken into account.
+
+   NAL units with NAL unit type values in the range of 0 to 47,
+   inclusive, may be passed to the decoder.  NAL-unit-like structures
+   with NAL unit type values in the range of 48 to 63, inclusive, MUST
+   NOT be passed to the decoder.
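   The type ranges above can be checked on the first octet of an RTP
   payload.  The Python sketch below is illustrative only: the helper
   names are hypothetical, and it assumes the 6-bit Type field occupies
   the six bits after the F bit in the first octet, as in the HEVC NAL
   unit header.

```python
def nal_unit_type(first_octet):
    """Extract the 6-bit Type field from the first payload header octet."""
    return (first_octet >> 1) & 0x3F


def may_pass_to_decoder(first_octet):
    """True for NAL units (types 0..47); False for the NAL-unit-like
    structures of this payload format (types 48..63)."""
    return nal_unit_type(first_octet) <= 47
```

   A receiver built along these lines would unpack the structures with
   types 48 (AP), 49 (FU), and 50 (PACI) first and hand only the
   recovered NAL units to the decoder.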
+ + The receiver includes a receiver buffer, which is used to compensate + for transmission delay jitter within individual RTP streams and + across RTP streams, to reorder NAL units from transmission order to + the NAL unit decoding order, and to recover the NAL unit decoding + order in MRST or MRMT, when applicable. In this section, the + receiver operation is described under the assumption that there is no + transmission delay jitter within an RTP stream and across RTP + streams. To make a difference from a practical receiver buffer that + is also used for compensation of transmission delay jitter, the + receiver buffer is hereafter called the de-packetization buffer in + this section. Receivers should also prepare for transmission delay + jitter; that is, either reserve separate buffers for transmission + delay jitter buffering and de-packetization buffering or use a + receiver buffer for both transmission delay jitter and de- + packetization. Moreover, receivers should take transmission delay + jitter into account in the buffering operation, e.g., by additional + initial buffering before starting of decoding and playback. + + + + +Wang, et al. Standards Track [Page 40] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + When sprop-max-don-diff is equal to 0 for all the received RTP + streams, the de-packetization buffer size is zero bytes, and the + process described in the remainder of this paragraph applies. When + there is only one RTP stream received, the NAL units carried in the + single RTP stream are directly passed to the decoder in their + transmission order, which is identical to their decoding order. When + there is more than one RTP stream received, the NAL units carried in + the multiple RTP streams are passed to the decoder in their NTP + timestamp order. 
When there are several NAL units of different RTP
+   streams with the same NTP timestamp, the order to pass them to the
+   decoder is their dependency order, where NAL units of a dependee RTP
+   stream are passed to the decoder prior to the NAL units of the
+   dependent RTP stream.  When there are several NAL units of the same
+   RTP stream with the same NTP timestamp, the order to pass them to the
+   decoder is their transmission order.
+
+      Informative note: The mapping between RTP and NTP timestamps is
+      conveyed in RTCP SR packets.  In addition, the mechanisms for
+      faster media timestamp synchronization discussed in [RFC6051] may
+      be used to speed up the acquisition of the RTP-to-wall-clock
+      mapping.
+
+   When sprop-max-don-diff is greater than 0 for any of the received RTP
+   streams, the process described in the remainder of this section
+   applies.
+
+   There are two buffering states in the receiver: initial buffering and
+   buffering while playing.  Initial buffering starts when the reception
+   is initialized.  After initial buffering, decoding and playback are
+   started, and the buffering-while-playing mode is used.
+
+   Regardless of the buffering state, the receiver stores incoming NAL
+   units, in reception order, into the de-packetization buffer.  NAL
+   units carried in RTP packets are stored in the de-packetization
+   buffer individually, and the value of AbsDon is calculated and stored
+   for each NAL unit.  When MRST or MRMT is in use, NAL units of all RTP
+   streams of a bitstream are stored in the same de-packetization
+   buffer.  When NAL units carried in any two RTP streams are available
+   to be placed into the de-packetization buffer, those NAL units
+   carried in the RTP stream that is lower in the dependency tree are
+   placed into the buffer first.  For example, if RTP stream A depends
+   on RTP stream B, then NAL units carried in RTP stream B are placed
+   into the buffer first.
+
+
+
+
+
+
+
+
+Wang, et al.
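   The AbsDon value stored for each NAL unit follows the derivation in
   Section 4.6.  A minimal Python sketch of that derivation is given
   below, assuming the 16-bit DON values are already available in
   transmission order and that sprop-max-don-diff is greater than 0.
   The function names are hypothetical; the five cases mirror the
   conditions of Section 4.6 one to one.

```python
def next_abs_don(prev_abs_don, prev_don, don):
    """Derive AbsDon[n] from AbsDon[n-1], DON[n-1], and DON[n]."""
    if don == prev_don:
        return prev_abs_don
    if don > prev_don and don - prev_don < 32768:
        # Forward step without wraparound.
        return prev_abs_don + don - prev_don
    if don < prev_don and prev_don - don >= 32768:
        # Forward step across the 65535 -> 0 wraparound.
        return prev_abs_don + 65536 - prev_don + don
    if don > prev_don and don - prev_don >= 32768:
        # Backward step across the wraparound.
        return prev_abs_don - (prev_don + 65536 - don)
    # Remaining case: don < prev_don and prev_don - don < 32768,
    # a backward step without wraparound.
    return prev_abs_don - (prev_don - don)


def abs_don_values(dons):
    """Map DON values (in transmission order) to their AbsDon values."""
    result = []
    for n, don in enumerate(dons):
        if n == 0:
            result.append(don)  # AbsDon[0] is set equal to DON[0]
        else:
            result.append(next_abs_don(result[-1], dons[n - 1], don))
    return result
```

   With the DON sequence 65534, 65535, 0, 1, 65535, the sketch yields
   the AbsDon values 65534, 65535, 65536, 65537, 65535: the counter
   keeps increasing across the 16-bit wraparound, and the final NAL
   unit is placed earlier in decoding order than the two before it.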
Standards Track [Page 41] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + Initial buffering lasts until condition A (the difference between the + greatest and smallest AbsDon values of the NAL units in the de- + packetization buffer is greater than or equal to the value of sprop- + max-don-diff of the highest RTP stream) or condition B (the number of + NAL units in the de-packetization buffer is greater than the value of + sprop-depack-buf-nalus) is true. + + After initial buffering, whenever condition A or condition B is true, + the following operation is repeatedly applied until both condition A + and condition B become false: + + o The NAL unit in the de-packetization buffer with the smallest + value of AbsDon is removed from the de-packetization buffer and + passed to the decoder. + + When no more NAL units are flowing into the de-packetization buffer, + all NAL units remaining in the de-packetization buffer are removed + from the buffer and passed to the decoder in the order of increasing + AbsDon values. + +7. Payload Format Parameters + + This section specifies the parameters that MAY be used to select + optional features of the payload format and certain features or + properties of the bitstream or the RTP stream. The parameters are + specified here as part of the media type registration for the HEVC + codec. A mapping of the parameters into the Session Description + Protocol (SDP) [RFC4566] is also provided for applications that use + SDP. Equivalent parameters could be defined elsewhere for use with + control protocols that do not use SDP. + +7.1. Media Type Registration + + The media subtype for the HEVC codec is allocated from the IETF tree. + + The receiver MUST ignore any unrecognized parameter. + + Type name: video + + Subtype name: H265 + + Required parameters: none + + OPTIONAL parameters: + + profile-space, tier-flag, profile-id, profile-compatibility- + indicator, interop-constraints, and level-id: + + + + +Wang, et al. 
Standards Track [Page 42] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + These parameters indicate the profile, tier, default level, and + some constraints of the bitstream carried by the RTP stream and + all RTP streams the RTP stream depends on, or a specific set of + the profile, tier, default level, and some constraints the + receiver supports. + + The profile and some constraints are indicated collectively by + profile-space, profile-id, profile-compatibility-indicator, and + interop-constraints. The profile specifies the subset of + coding tools that may have been used to generate the bitstream + or that the receiver supports. + + Informative note: There are 32 values of profile-id, and + there are 32 flags in profile-compatibility-indicator, each + flag corresponding to one value of profile-id. According to + HEVC version 1 in [HEVC], when more than one of the 32 flags + is set for a bitstream, the bitstream would comply with all + the profiles corresponding to the set flags. However, in a + draft of HEVC version 2 in [HEVCv2], Subclause A.3.5, 19 + Format Range Extensions profiles have been specified, all + using the same value of profile-id (4), differentiated by + some of the 48 bits in interop-constraints; this (rather + unexpected way of profile signaling) means that one of the + 32 flags may correspond to multiple profiles. To be able to + support whatever HEVC extension profile that might be + specified and indicated using profile-space, profile-id, + profile-compatibility-indicator, and interop-constraints in + the future, it would be safe to require symmetric use of + these parameters in SDP offer/answer unless recv-sub-layer- + id is included in the SDP answer for choosing one of the + sub-layers offered. + + The tier is indicated by tier-flag. The default level is + indicated by level-id. 
The tier and the default level specify + the limits on values of syntax elements or arithmetic + combinations of values of syntax elements that are followed + when generating the bitstream or that the receiver supports. + + A set of profile-space, tier-flag, profile-id, profile- + compatibility-indicator, interop-constraints, and level-id + parameters ptlA is said to be consistent with another set of + these parameters ptlB if any decoder that conforms to the + profile, tier, level, and constraints indicated by ptlB can + decode any bitstream that conforms to the profile, tier, level, + and constraints indicated by ptlA. + + + + + + +Wang, et al. Standards Track [Page 43] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + In SDP offer/answer, when the SDP answer does not include the + recv-sub-layer-id parameter that is less than the sprop-sub- + layer-id parameter in the SDP offer, the following applies: + + o The profile-space, tier-flag, profile-id, profile- + compatibility-indicator, and interop-constraints + parameters MUST be used symmetrically, i.e., the value of + each of these parameters in the offer MUST be the same as + that in the answer, either explicitly signaled or + implicitly inferred. + + o The level-id parameter is changeable as long as the + highest level indicated by the answer is either equal to + or lower than that in the offer. Note that the highest + level is indicated by level-id and max-recv-level-id + together. 
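         The level rule in the second bullet can be sketched in Python
         as follows.  This is only an illustration: it assumes the
         parameters of the offer and the answer have already been
         parsed into dictionaries keyed by parameter name, that a
         numerically larger level-id denotes a higher level, and that
         an absent level-id is inferred as 93, as stated later in this
         section.  The function names are hypothetical.

```python
DEFAULT_LEVEL_ID = 93  # inferred when level-id is absent (level 3.1)


def highest_level(params):
    """Highest level indicated by level-id and max-recv-level-id together."""
    level_id = int(params.get("level-id", DEFAULT_LEVEL_ID))
    # When present, max-recv-level-id indicates the highest level;
    # otherwise the default level given by level-id does.
    return int(params.get("max-recv-level-id", level_id))


def answer_level_ok(offer, answer):
    """True if the answer's highest level does not exceed the offer's."""
    return highest_level(answer) <= highest_level(offer)
```

         For instance, an answer carrying level-id=93 is acceptable
         against an offer carrying level-id=120, whereas an answer that
         adds max-recv-level-id=150 is not.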
+ + In SDP offer/answer, when the SDP answer does include the recv- + sub-layer-id parameter that is less than the sprop-sub-layer-id + parameter in the SDP offer, the set of profile-space, tier- + flag, profile-id, profile-compatibility-indicator, interop- + constraints, and level-id parameters included in the answer + MUST be consistent with that for the chosen sub-layer + representation as indicated in the SDP offer, with the + exception that the level-id parameter in the SDP answer is + changeable as long as the highest level indicated by the answer + is either lower than or equal to that in the offer. + + More specifications of these parameters, including how they + relate to the values of the profile, tier, and level syntax + elements specified in [HEVC] are provided below. + + profile-space, profile-id: + + The value of profile-space MUST be in the range of 0 to 3, + inclusive. The value of profile-id MUST be in the range of 0 + to 31, inclusive. + + When profile-space is not present, a value of 0 MUST be + inferred. When profile-id is not present, a value of 1 (i.e., + the Main profile) MUST be inferred. + + When used to indicate properties of a bitstream, profile-space + and profile-id are derived from the profile, tier, and level + syntax elements in SPS or VPS NAL units as follows, where + general_profile_space, general_profile_idc, + sub_layer_profile_space[j], and sub_layer_profile_idc[j] are + specified in [HEVC]: + + + +Wang, et al. 
Standards Track [Page 44] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + If the RTP stream is the highest RTP stream, the following + applies: + + o profile-space = general_profile_space + o profile-id = general_profile_idc + + Otherwise (the RTP stream is a dependee RTP stream), the + following applies, with j being the value of the sprop-sub- + layer-id parameter: + + o profile-space = sub_layer_profile_space[j] + o profile-id = sub_layer_profile_idc[j] + + tier-flag, level-id: + + The value of tier-flag MUST be in the range of 0 to 1, + inclusive. The value of level-id MUST be in the range of 0 to + 255, inclusive. + + If the tier-flag and level-id parameters are used to indicate + properties of a bitstream, they indicate the tier and the + highest level the bitstream complies with. + + If the tier-flag and level-id parameters are used for + capability exchange, the following applies. If max-recv-level- + id is not present, the default level defined by level-id + indicates the highest level the codec wishes to support. + Otherwise, max-recv-level-id indicates the highest level the + codec supports for receiving. For either receiving or sending, + all levels that are lower than the highest level supported MUST + also be supported. + + If no tier-flag is present, a value of 0 MUST be inferred; if + no level-id is present, a value of 93 (i.e., level 3.1) MUST be + inferred. + + When used to indicate properties of a bitstream, the tier-flag + and level-id parameters are derived from the profile, tier, and + level syntax elements in SPS or VPS NAL units as follows, where + general_tier_flag, general_level_idc, sub_layer_tier_flag[j], + and sub_layer_level_idc[j] are specified in [HEVC]: + + If the RTP stream is the highest RTP stream, the following + applies: + + o tier-flag = general_tier_flag + o level-id = general_level_idc + + + + +Wang, et al. 
Standards Track [Page 45] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + Otherwise (the RTP stream is a dependee RTP stream), the + following applies, with j being the value of the + sprop-sub-layer-id parameter: + + o tier-flag = sub_layer_tier_flag[j] + o level-id = sub_layer_level_idc[j] + + interop-constraints: + + A base16 [RFC4648] (hexadecimal) representation of six bytes of + data, consisting of progressive_source_flag, + interlaced_source_flag, non_packed_constraint_flag, + frame_only_constraint_flag, and reserved_zero_44bits. + + If the interop-constraints parameter is not present, the + following MUST be inferred: + + o progressive_source_flag = 1 + o interlaced_source_flag = 0 + o non_packed_constraint_flag = 1 + o frame_only_constraint_flag = 1 + o reserved_zero_44bits = 0 + + When the interop-constraints parameter is used to indicate + properties of a bitstream, the following applies, where + general_progressive_source_flag, + general_interlaced_source_flag, + general_non_packed_constraint_flag, + general_frame_only_constraint_flag, + general_reserved_zero_44bits, + sub_layer_progressive_source_flag[j], + sub_layer_interlaced_source_flag[j], + sub_layer_non_packed_constraint_flag[j], + sub_layer_frame_only_constraint_flag[j], and + sub_layer_reserved_zero_44bits[j] are specified in [HEVC]: + + If the RTP stream is the highest RTP stream, the following + applies: + + o progressive_source_flag = general_progressive_source_flag + + o interlaced_source_flag = general_interlaced_source_flag + + o non_packed_constraint_flag = + general_non_packed_constraint_flag + + + + + +Wang, et al. 
Standards Track [Page 46] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + o frame_only_constraint_flag = + general_frame_only_constraint_flag + + o reserved_zero_44bits = general_reserved_zero_44bits + + Otherwise (the RTP stream is a dependee RTP stream), the + following applies, with j being the value of the sprop-sub- + layer-id parameter: + + o progressive_source_flag = + sub_layer_progressive_source_flag[j] + + o interlaced_source_flag = + sub_layer_interlaced_source_flag[j] + + o non_packed_constraint_flag = + sub_layer_non_packed_constraint_flag[j] + + o frame_only_constraint_flag = + sub_layer_frame_only_constraint_flag[j] + + o reserved_zero_44bits = sub_layer_reserved_zero_44bits[j] + + Using interop-constraints for capability exchange results in + a requirement on any bitstream to be compliant with the + interop-constraints. + + profile-compatibility-indicator: + + A base16 [RFC4648] representation of four bytes of data. + + When profile-compatibility-indicator is used to indicate + properties of a bitstream, the following applies, where + general_profile_compatibility_flag[j] and + sub_layer_profile_compatibility_flag[i][j] are specified in + [HEVC]: + + The profile-compatibility-indicator in this case indicates + additional profiles to the profile defined by profile-space, + profile-id, and interop-constraints the bitstream conforms + to. A decoder that conforms to any of all the profiles the + bitstream conforms to would be capable of decoding the + bitstream. These additional profiles are defined by + profile-space, each set bit of profile-compatibility- + indicator, and interop-constraints. + + + + + + +Wang, et al. 
Standards Track [Page 47] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + If the RTP stream is the highest RTP stream, the following + applies for each value of j in the range of 0 to 31, + inclusive: + + o bit j of profile-compatibility-indicator = + general_profile_compatibility_flag[j] + + Otherwise (the RTP stream is a dependee RTP stream), the + following applies for i equal to sprop-sub-layer-id and for + each value of j in the range of 0 to 31, inclusive: + + o bit j of profile-compatibility-indicator = + sub_layer_profile_compatibility_flag[i][j] + + Using profile-compatibility-indicator for capability exchange + results in a requirement on any bitstream to be compliant with + the profile-compatibility-indicator. This is intended to + handle cases where any future HEVC profile is defined as an + intersection of two or more profiles. + + If this parameter is not present, this parameter defaults to + the following: bit j, with j equal to profile-id, of profile- + compatibility-indicator is inferred to be equal to 1, and all + other bits are inferred to be equal to 0. + + sprop-sub-layer-id: + + This parameter MAY be used to indicate the highest allowed + value of TID in the bitstream. When not present, the value of + sprop-sub-layer-id is inferred to be equal to 6. + + The value of sprop-sub-layer-id MUST be in the range of 0 to 6, + inclusive. + + recv-sub-layer-id: + + This parameter MAY be used to signal a receiver's choice of the + offered or declared sub-layer representations in the sprop-vps. + The value of recv-sub-layer-id indicates the TID of the highest + sub-layer of the bitstream that a receiver supports. When not + present, the value of recv-sub-layer-id is inferred to be equal + to the value of the sprop-sub-layer-id parameter in the SDP + offer. + + The value of recv-sub-layer-id MUST be in the range of 0 to 6, + inclusive. + + + + + +Wang, et al. 
Standards Track [Page 48] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + max-recv-level-id: + + This parameter MAY be used to indicate the highest level a + receiver supports. The highest level the receiver supports is + equal to the value of max-recv-level-id divided by 30. + + The value of max-recv-level-id MUST be in the range of 0 to + 255, inclusive. + + When max-recv-level-id is not present, the value is inferred to + be equal to level-id. + + max-recv-level-id MUST NOT be present when the highest level + the receiver supports is not higher than the default level. + + tx-mode: + + This parameter indicates whether the transmission mode is SRST, + MRST, or MRMT. + + The value of tx-mode MUST be equal to "SRST", "MRST" or "MRMT". + When not present, the value of tx-mode is inferred to be equal + to "SRST". + + If the value is equal to "MRST", MRST MUST be in use. + Otherwise, if the value is equal to "MRMT", MRMT MUST be in + use. Otherwise (the value is equal to "SRST"), SRST MUST be in + use. + + The value of tx-mode MUST be equal to "MRST" for all RTP + streams in an MRST. + + The value of tx-mode MUST be equal to "MRMT" for all RTP + streams in an MRMT. + + sprop-vps: + + This parameter MAY be used to convey any video parameter set + NAL unit of the bitstream for out-of-band transmission of video + parameter sets. The parameter MAY also be used for capability + exchange and to indicate sub-stream characteristics (i.e., + properties of sub-layer representations as defined in [HEVC]). + The value of the parameter is a comma-separated (',') list of + base64 [RFC4648] representations of the video parameter set NAL + units as specified in Section 7.3.2.1 of [HEVC]. + + + + + + +Wang, et al. Standards Track [Page 49] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + The sprop-vps parameter MAY contain one or more than one video + parameter set NAL unit. 
However, all other video parameter sets + contained in the sprop-vps parameter MUST be consistent with + the first video parameter set in the sprop-vps parameter. A + video parameter set vpsB is said to be consistent with another + video parameter set vpsA if any decoder that conforms to the + profile, tier, level, and constraints indicated by the 12 bytes + of data starting from the syntax element general_profile_space + to the syntax element general_level_idc, inclusive, in the + first profile_tier_level( ) syntax structure in vpsA can decode + any bitstream that conforms to the profile, tier, level, and + constraints indicated by the 12 bytes of data starting from the + syntax element general_profile_space to the syntax element + general_level_idc, inclusive, in the first profile_tier_level( + ) syntax structure in vpsB. + + sprop-sps: + + This parameter MAY be used to convey sequence parameter set NAL + units of the bitstream for out-of-band transmission of sequence + parameter sets. The value of the parameter is a comma- + separated (',') list of base64 [RFC4648] representations of the + sequence parameter set NAL units as specified in Section + 7.3.2.2 of [HEVC]. + + sprop-pps: + + This parameter MAY be used to convey picture parameter set NAL + units of the bitstream for out-of-band transmission of picture + parameter sets. The value of the parameter is a comma- + separated (',') list of base64 [RFC4648] representations of the + picture parameter set NAL units as specified in Section 7.3.2.3 + of [HEVC]. + + sprop-sei: + + This parameter MAY be used to convey one or more SEI messages + that describe bitstream characteristics. When present, a + decoder can rely on the bitstream characteristics that are + described in the SEI messages for the entire duration of the + session, independently from the persistence scopes of the SEI + messages as specified in [HEVC]. 
+ + The value of the parameter is a comma-separated (',') list of + base64 [RFC4648] representations of SEI NAL units as specified + in Section 7.3.2.4 of [HEVC]. + + + + + +Wang, et al. Standards Track [Page 50] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + Informative note: Intentionally, no list of applicable or + inapplicable SEI messages is specified here. Conveying + certain SEI messages in sprop-sei may be sensible in some + application scenarios and meaningless in others. However, a + few examples are described below: + + 1) In an environment where the bitstream was created from + film-based source material, and no splicing is going + to occur during the lifetime of the session, the film + grain characteristics SEI message or the tone mapping + information SEI message are likely meaningful, and + sending them in sprop-sei rather than in the bitstream + at each entry point may help with saving bits and + allows one to configure the renderer only once, + avoiding unwanted artifacts. + + 2) The structure of pictures information SEI message in + sprop-sei can be used to inform a decoder of + information on the NAL unit types, picture-order count + values, and prediction dependencies of a sequence of + pictures. Having such knowledge can be helpful for + error recovery. + + 3) Examples for SEI messages that would be meaningless to + be conveyed in sprop-sei include the decoded picture + hash SEI message (it is close to impossible that all + decoded pictures have the same hashtag), the display + orientation SEI message when the device is a handheld + device (as the display orientation may change when the + handheld device is turned around), or the filler + payload SEI message (as there is no point in just + having more bits in SDP). + + max-lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, max-tc: + + These parameters MAY be used to signal the capabilities of a + receiver implementation. These parameters MUST NOT be used for + any other purpose. 
The highest level (specified by max-recv- + level-id) MUST be the highest that the receiver is fully + capable of supporting. max-lsr, max-lps, max-cpb, max-dpb, + max-br, max-tr, and max-tc MAY be used to indicate capabilities + of the receiver that extend the required capabilities of the + highest level, as specified below. + + When more than one parameter from the set (max-lsr, max-lps, + max-cpb, max-dpb, max-br, max-tr, max-tc) is present, the + receiver MUST support all signaled capabilities simultaneously. + For example, if both max-lsr and max-br are present, the + + + +Wang, et al. Standards Track [Page 51] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + highest level with the extension of both the picture rate and + bitrate is supported. That is, the receiver is able to decode + bitstreams in which the luma sample rate is up to max-lsr + (inclusive), the bitrate is up to max-br (inclusive), the coded + picture buffer size is derived as specified in the semantics of + the max-br parameter below, and the other properties comply + with the highest level specified by max-recv-level-id. + + Informative note: When the OPTIONAL media type parameters + are used to signal the properties of a bitstream, and max- + lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, and max-tc + are not present, the values of profile-space, tier-flag, + profile-id, profile-compatibility-indicator, interop- + constraints, and level-id must always be such that the + bitstream complies fully with the specified profile, tier, + and level. + + max-lsr: + + The value of max-lsr is an integer indicating the maximum + processing rate in units of luma samples per second. The max- + lsr parameter signals that the receiver is capable of decoding + video at a higher rate than is required by the highest level. 
+ + When max-lsr is signaled, the receiver MUST be able to decode + bitstreams that conform to the highest level, with the + exception that the MaxLumaSR value in Table A-2 of [HEVC] for + the highest level is replaced with the value of max-lsr. + Senders MAY use this knowledge to send pictures of a given size + at a higher picture rate than is indicated in the highest + level. + + When not present, the value of max-lsr is inferred to be equal + to the value of MaxLumaSR given in Table A-2 of [HEVC] for the + highest level. + + The value of max-lsr MUST be in the range of MaxLumaSR to 16 * + MaxLumaSR, inclusive, where MaxLumaSR is given in Table A-2 of + [HEVC] for the highest level. + + max-lps: + + The value of max-lps is an integer indicating the maximum + picture size in units of luma samples. The max-lps parameter + signals that the receiver is capable of decoding larger picture + sizes than are required by the highest level. When max-lps is + signaled, the receiver MUST be able to decode bitstreams that + conform to the highest level, with the exception that the + + + +Wang, et al. Standards Track [Page 52] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + MaxLumaPS value in Table A-1 of [HEVC] for the highest level is + replaced with the value of max-lps. Senders MAY use this + knowledge to send larger pictures at a proportionally lower + picture rate than is indicated in the highest level. + + When not present, the value of max-lps is inferred to be equal + to the value of MaxLumaPS given in Table A-1 of [HEVC] for the + highest level. + + The value of max-lps MUST be in the range of MaxLumaPS to 16 * + MaxLumaPS, inclusive, where MaxLumaPS is given in Table A-1 of + [HEVC] for the highest level. 
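The max-lsr and max-lps descriptions above share one pattern: an absent parameter falls back to the table limit of the highest level, and a signaled value must lie between that limit and 16 times that limit. A minimal sketch of the pattern follows; the function name is hypothetical, and the table limit (MaxLumaSR or MaxLumaPS) is assumed to be looked up in [HEVC] by the caller.

```python
def effective_limit(signaled, table_limit):
    """Return the limit a sender may rely on for one max-* parameter.

    signaled    -- integer value from the SDP, or None when absent
    table_limit -- the corresponding value from [HEVC] for the highest
                   level (supplied by the caller; not hard-coded here)
    """
    if signaled is None:
        # Absent parameter: the level's own limit is inferred.
        return table_limit
    if not table_limit <= signaled <= 16 * table_limit:
        raise ValueError(
            "value %d outside [%d, %d]" % (signaled, table_limit, 16 * table_limit)
        )
    return signaled
```

The placeholder limits used when exercising this function are arbitrary; real values have to come from Tables A-1 and A-2 of [HEVC].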
+ + max-cpb: + + The value of max-cpb is an integer indicating the maximum coded + picture buffer size in units of CpbBrVclFactor bits for the VCL + HRD parameters and in units of CpbBrNalFactor bits for the NAL + HRD parameters, where CpbBrVclFactor and CpbBrNalFactor are + defined in Section A.4 of [HEVC]. The max-cpb parameter + signals that the receiver has more memory than the minimum + amount of coded picture buffer memory required by the highest + level. When max-cpb is signaled, the receiver MUST be able to + decode bitstreams that conform to the highest level, with the + exception that the MaxCPB value in Table A-1 of [HEVC] for the + highest level is replaced with the value of max-cpb. Senders + MAY use this knowledge to construct coded bitstreams with + greater variation of bitrate than can be achieved with the + MaxCPB value in Table A-1 of [HEVC]. + + When not present, the value of max-cpb is inferred to be equal + to the value of MaxCPB given in Table A-1 of [HEVC] for the + highest level. + + The value of max-cpb MUST be in the range of MaxCPB to 16 * + MaxCPB, inclusive, where MaxCPB is given in Table A-1 of + [HEVC] for the highest level. + + Informative note: The coded picture buffer is used in the + hypothetical reference decoder (Annex C of [HEVC]). The use + of the hypothetical reference decoder is recommended in HEVC + encoders to verify that the produced bitstream conforms to + the standard and to control the output bitrate. Thus, the + coded picture buffer is conceptually independent of any + other potential buffers in the receiver, including + de-packetization and de-jitter buffers. The coded picture + buffer need not be implemented in decoders as specified in + Annex C of [HEVC], but rather standard-compliant decoders + can have any buffering arrangements provided that they can + decode standard-compliant bitstreams. 
Thus, in practice, + the input buffer for a video decoder can be integrated with + de-packetization and de-jitter buffers of the receiver. + + max-dpb: + + The value of max-dpb is an integer indicating the maximum + decoded picture buffer size in units of decoded pictures at the + MaxLumaPS for the highest level, i.e., the number of decoded + pictures at the maximum picture size defined by the highest + level. The value of max-dpb MUST be in the range of 1 to 16, + inclusive. The max-dpb parameter signals that the receiver + has more memory than the minimum amount of decoded picture + buffer memory required by default, which is MaxDpbPicBuf as + defined in [HEVC] (equal to 6). When max-dpb is signaled, the + receiver MUST be able to decode bitstreams that conform to the + highest level, with the exception that the MaxDpbPicBuf value + defined in [HEVC] as 6 is replaced with the value of max-dpb. + Consequently, a receiver that signals max-dpb MUST be capable + of storing the following number of decoded pictures + (MaxDpbSize) in its decoded picture buffer: +
+    if( PicSizeInSamplesY <= ( MaxLumaPS >> 2 ) )
+       MaxDpbSize = Min( 4 * max-dpb, 16 )
+    else if ( PicSizeInSamplesY <= ( MaxLumaPS >> 1 ) )
+       MaxDpbSize = Min( 2 * max-dpb, 16 )
+    else if ( PicSizeInSamplesY <= ( ( 3 * MaxLumaPS ) >> 2 ) )
+       MaxDpbSize = Min( (4 * max-dpb) / 3, 16 )
+    else
+       MaxDpbSize = max-dpb
+ + where MaxLumaPS is given in Table A-1 of [HEVC] for the highest + level and PicSizeInSamplesY is the current size of each decoded + picture in units of luma samples as defined in [HEVC]. + + The value of max-dpb MUST be greater than or equal to the value + of MaxDpbPicBuf (i.e., 6) as defined in [HEVC]. Senders MAY + use this knowledge to construct coded bitstreams with improved + compression. + + When not present, the value of max-dpb is inferred to be equal + to the value of MaxDpbPicBuf (i.e., 6) as defined in [HEVC]. 
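The MaxDpbSize derivation given for max-dpb above translates directly into code. The following is a sketch rather than a normative implementation: the function name is hypothetical, the division in the third branch is assumed to be integer division, MaxLumaPS must be taken from Table A-1 of [HEVC] by the caller, and the accepted range 6 to 16 combines the two range statements in the text.

```python
def max_dpb_size(pic_size_in_samples_y, max_luma_ps, max_dpb=6):
    """Number of decoded pictures a receiver signaling max-dpb must store.

    pic_size_in_samples_y -- current picture size in luma samples
    max_luma_ps           -- MaxLumaPS of the highest level (caller-supplied)
    max_dpb               -- signaled max-dpb; 6 (MaxDpbPicBuf) when absent
    """
    if not 6 <= max_dpb <= 16:
        raise ValueError("max-dpb must be between 6 and 16")
    if pic_size_in_samples_y <= (max_luma_ps >> 2):
        return min(4 * max_dpb, 16)
    if pic_size_in_samples_y <= (max_luma_ps >> 1):
        return min(2 * max_dpb, 16)
    if pic_size_in_samples_y <= ((3 * max_luma_ps) >> 2):
        # Assumption: "/" in the pseudo-code is integer division.
        return min((4 * max_dpb) // 3, 16)
    return max_dpb
```

Smaller pictures leave room for more of them, which is why the result grows as the picture size drops below one half and one quarter of MaxLumaPS, always capped at 16.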
+ + Informative note: This parameter was added primarily to + complement a similar codepoint in the ITU-T Recommendation + H.245, so as to facilitate signaling gateway designs. The + + + +Wang, et al. Standards Track [Page 54] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + decoded picture buffer stores reconstructed samples. There + is no relationship between the size of the decoded picture + buffer and the buffers used in RTP, especially de- + packetization and de-jitter buffers. + + max-br: + + The value of max-br is an integer indicating the maximum video + bitrate in units of CpbBrVclFactor bits per second for the VCL + HRD parameters and in units of CpbBrNalFactor bits per second + for the NAL HRD parameters, where CpbBrVclFactor and + CpbBrNalFactor are defined in Section A.4 of [HEVC]. + + The max-br parameter signals that the video decoder of the + receiver is capable of decoding video at a higher bitrate than + is required by the highest level. + + When max-br is signaled, the video codec of the receiver MUST + be able to decode bitstreams that conform to the highest level, + with the following exceptions in the limits specified by the + highest level: + + o The value of max-br replaces the MaxBR value in Table A-2 + of [HEVC] for the highest level. + + o When the max-cpb parameter is not present, the result of + the following formula replaces the value of MaxCPB in + Table A-1 of [HEVC]: + + (MaxCPB of the highest level) * max-br / (MaxBR of the + highest level) + + For example, if a receiver signals capability for Main profile + Level 2 with max-br equal to 2000, this indicates a maximum + video bitrate of 2000 kbits/sec for VCL HRD parameters, a + maximum video bitrate of 2200 kbits/sec for NAL HRD parameters, + and a CPB size of 2000000 bits (2000000 / 1500000 * 1500000). + + Senders MAY use this knowledge to send higher bitrate video as + allowed in the level definition of Annex A of [HEVC] to achieve + improved video quality. 
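The worked example above (Main profile, Level 2, max-br equal to 2000) can be reproduced with a short calculation. This sketch is illustrative only: the function name is hypothetical, the level limits of 1500 units and the factors 1000 and 1100 are taken from the numbers in the example itself, and flooring the CPB formula to an integer is an assumption, since the text does not state a rounding rule.

```python
def limits_from_max_br(max_br, level_max_br, level_max_cpb,
                       max_cpb=None, vcl_factor=1000, nal_factor=1100):
    """Derive bitrate and CPB limits, in bits, from a signaled max-br.

    level_max_br / level_max_cpb -- MaxBR and MaxCPB of the highest level,
    in table units (caller-supplied). vcl_factor and nal_factor stand for
    CpbBrVclFactor and CpbBrNalFactor; the defaults match the example only.
    """
    if max_cpb is None:
        # max-cpb absent: scale MaxCPB by max-br / MaxBR (floored, assumed).
        cpb_units = level_max_cpb * max_br // level_max_br
    else:
        cpb_units = max_cpb
    return {
        "vcl_bitrate_bps": max_br * vcl_factor,
        "nal_bitrate_bps": max_br * nal_factor,
        "vcl_cpb_bits": cpb_units * vcl_factor,
        "nal_cpb_bits": cpb_units * nal_factor,
    }
```

Calling `limits_from_max_br(2000, 1500, 1500)` yields a VCL bitrate of 2000000 bits per second, a NAL bitrate of 2200000 bits per second, and a VCL CPB size of 2000000 bits, matching the figures in the example.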
+ + When not present, the value of max-br is inferred to be equal + to the value of MaxBR given in Table A-2 of [HEVC] for the + highest level. + + The value of max-br MUST be in the range of MaxBR to 16 * + MaxBR, inclusive, where MaxBR is given in Table A-2 of [HEVC] + for the highest level. + + Informative note: This parameter was added primarily to + complement a similar codepoint in the ITU-T Recommendation + H.245, so as to facilitate signaling gateway designs. The + assumption that the network is capable of handling such + bitrates at any given time cannot be made from the value of + this parameter. In particular, no conclusion can be drawn + that the signaled bitrate is possible under congestion + control constraints. + + max-tr: + + The value of max-tr is an integer indicating the maximum number + of tile rows. The max-tr parameter signals that the receiver + is capable of decoding video with a larger number of tile rows + than the value allowed by the highest level. + + When max-tr is signaled, the receiver MUST be able to decode + bitstreams that conform to the highest level, with the + exception that the MaxTileRows value in Table A-1 of [HEVC] for + the highest level is replaced with the value of max-tr. + + Senders MAY use this knowledge to send pictures utilizing a + larger number of tile rows than the value allowed by the + highest level. + + When not present, the value of max-tr is inferred to be equal + to the value of MaxTileRows given in Table A-1 of [HEVC] for + the highest level. + + The value of max-tr MUST be in the range of MaxTileRows to 16 * + MaxTileRows, inclusive, where MaxTileRows is given in Table A-1 + of [HEVC] for the highest level. + + max-tc: + + The value of max-tc is an integer indicating the maximum number + of tile columns. 
The max-tc parameter signals that the + receiver is capable of decoding video with a larger number of + tile columns than the value allowed by the highest level. + + When max-tc is signaled, the receiver MUST be able to decode + bitstreams that conform to the highest level, with the + exception that the MaxTileCols value in Table A-1 of [HEVC] for + the highest level is replaced with the value of max-tc. + + Senders MAY use this knowledge to send pictures utilizing a + larger number of tile columns than the value allowed by the + highest level. + + When not present, the value of max-tc is inferred to be equal + to the value of MaxTileCols given in Table A-1 of [HEVC] for + the highest level. + + The value of max-tc MUST be in the range of MaxTileCols to 16 * + MaxTileCols, inclusive, where MaxTileCols is given in Table A-1 + of [HEVC] for the highest level. + + max-fps: + + The value of max-fps is an integer indicating the maximum + picture rate in units of pictures per 100 seconds that can be + effectively processed by the receiver. The max-fps parameter + MAY be used to signal that the receiver has a constraint in + that it is not capable of processing video effectively at the + full picture rate that is implied by the highest level and, + when present, one or more of the parameters max-lsr, max-lps, + and max-br. + + The value of max-fps is not necessarily the picture rate at + which the maximum picture size can be sent; it constitutes a + constraint on the maximum picture rate for all resolutions. + + Informative note: The max-fps parameter is semantically + different from max-lsr, max-lps, max-cpb, max-dpb, max-br, + max-tr, and max-tc in that max-fps is used to signal a + constraint, lowering the maximum picture rate from what is + implied by other parameters. + + The encoder MUST use a picture rate equal to or less than this + value. 
In cases where the max-fps parameter is absent, the + encoder is free to choose any picture rate according to the + highest level and any signaled optional parameters. + + The value of max-fps MUST be smaller than or equal to the full + picture rate that is implied by the highest level and, when + present, one or more of the parameters max-lsr, max-lps, and + max-br. + + sprop-max-don-diff: + + If tx-mode is equal to "SRST" and there is no NAL unit naluA + that is followed in transmission order by any NAL unit + preceding naluA in decoding order (i.e., the transmission order + of the NAL units is the same as the decoding order), the value + of this parameter MUST be equal to 0. + + Otherwise, if tx-mode is equal to "MRST" or "MRMT" and the + decoding order of the NAL units of all the RTP streams is the + same as both the NAL unit transmission order and the NAL unit + output order, the value of this parameter MUST be equal to + either 0 or 1. + + Otherwise, if tx-mode is equal to "MRST" or "MRMT" and the + decoding order of the NAL units of all the RTP streams is the + same as the NAL unit transmission order but not the same as the + NAL unit output order, the value of this parameter MUST be + equal to 1. + + Otherwise, this parameter specifies the maximum absolute + difference between the decoding order number (i.e., AbsDon) + values of any two NAL units naluA and naluB, where naluA + follows naluB in decoding order and precedes naluB in + transmission order. + + The value of sprop-max-don-diff MUST be an integer in the range + of 0 to 32767, inclusive. + + When not present, the value of sprop-max-don-diff is inferred + to be equal to 0. + + sprop-depack-buf-nalus: + + This parameter specifies the maximum number of NAL units that + precede a NAL unit in transmission order and follow the NAL + unit in decoding order. 
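The general-case definition of sprop-max-don-diff and the definition of sprop-depack-buf-nalus above both measure how far the transmission order departs from the decoding order. The sketch below computes both quantities for a sequence of NAL units. It is an illustration under stated assumptions: the function name is hypothetical, the input is a list of AbsDon values in transmission order, and decoding order is assumed to follow strictly increasing AbsDon (units with equal AbsDon are treated as unordered and ignored). The tx-mode special cases described above are not modeled.

```python
def interleaving_depth(abs_dons_in_tx_order):
    """Return (max_don_diff, depack_buf_nalus) for one NAL unit sequence.

    abs_dons_in_tx_order -- AbsDon of each NAL unit, in transmission order.
    Uses a simple quadratic scan; fine for illustration, not for long streams.
    """
    max_don_diff = 0
    depack_buf_nalus = 0
    for j, don_b in enumerate(abs_dons_in_tx_order):
        later_in_decoding = 0
        for don_a in abs_dons_in_tx_order[:j]:
            # don_a was sent earlier but is decoded after the current unit.
            if don_a > don_b:
                later_in_decoding += 1
                max_don_diff = max(max_don_diff, don_a - don_b)
        depack_buf_nalus = max(depack_buf_nalus, later_in_decoding)
    return max_don_diff, depack_buf_nalus
```

For a stream sent in decoding order both results are 0; swapping two adjacent units gives (1, 1), and sending units 3 and 4 ahead of units 0 and 1 gives a difference of 4 with two units waiting in the buffer.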
+ + The value of sprop-depack-buf-nalus MUST be an integer in the + range of 0 to 32767, inclusive. + + When not present, the value of sprop-depack-buf-nalus is + inferred to be equal to 0. + + When sprop-max-don-diff is present and greater than 0, this + parameter MUST be present and the value MUST be greater than 0. + + + + + +Wang, et al. Standards Track [Page 58] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + sprop-depack-buf-bytes: + + This parameter signals the required size of the de- + packetization buffer in units of bytes. The value of the + parameter MUST be greater than or equal to the maximum buffer + occupancy (in units of bytes) of the de-packetization buffer as + specified in Section 6. + + The value of sprop-depack-buf-bytes MUST be an integer in the + range of 0 to 4294967295, inclusive. + + When sprop-max-don-diff is present and greater than 0, this + parameter MUST be present and the value MUST be greater than 0. + When not present, the value of sprop-depack-buf-bytes is + inferred to be equal to 0. + + Informative note: The value of sprop-depack-buf-bytes + indicates the required size of the de-packetization buffer + only. When network jitter can occur, an appropriately sized + jitter buffer has to be available as well. + + depack-buf-cap: + + This parameter signals the capabilities of a receiver + implementation and indicates the amount of de-packetization + buffer space in units of bytes that the receiver has available + for reconstructing the NAL unit decoding order from NAL units + carried in one or more RTP streams. A receiver is able to + handle any RTP stream, and all RTP streams the RTP stream + depends on, when present, for which the value of the sprop- + depack-buf-bytes parameter is smaller than or equal to this + parameter. + + When not present, the value of depack-buf-cap is inferred to be + equal to 4294967295. The value of depack-buf-cap MUST be an + integer in the range of 1 to 4294967295, inclusive. 
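The relationship between sprop-depack-buf-bytes and depack-buf-cap described above reduces to one comparison once the inferred defaults are applied. A minimal sketch, with a hypothetical function name and with `None` standing for an absent parameter:

```python
UINT32_MAX = 4294967295


def receiver_can_handle(sprop_depack_buf_bytes=None, depack_buf_cap=None):
    """True when the stream's required buffer fits the receiver's capacity."""
    # Defaults inferred when the parameters are absent.
    needed = 0 if sprop_depack_buf_bytes is None else sprop_depack_buf_bytes
    cap = UINT32_MAX if depack_buf_cap is None else depack_buf_cap
    if not 0 <= needed <= UINT32_MAX:
        raise ValueError("sprop-depack-buf-bytes out of range")
    if not 1 <= cap <= UINT32_MAX:
        raise ValueError("depack-buf-cap out of range")
    return needed <= cap
```

Note that this covers the de-packetization buffer only; jitter buffering has to be budgeted separately.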
+ + Informative note: depack-buf-cap indicates the maximum + possible size of the de-packetization buffer of the receiver + only, without allowing for network jitter. + + + + + + + + + + + +Wang, et al. Standards Track [Page 59] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + sprop-segmentation-id: + + This parameter MAY be used to signal the segmentation tools + present in the bitstream and that can be used for + parallelization. The value of sprop-segmentation-id MUST be an + integer in the range of 0 to 3, inclusive. When not present, + the value of sprop-segmentation-id is inferred to be equal to + 0. + + When sprop-segmentation-id is equal to 0, no information about + the segmentation tools is provided. When sprop-segmentation-id + is equal to 1, it indicates that slices are present in the + bitstream. When sprop-segmentation-id is equal to 2, it + indicates that tiles are present in the bitstream. When sprop- + segmentation-id is equal to 3, it indicates that WPP is used in + the bitstream. + + sprop-spatial-segmentation-idc: + + A base16 [RFC4648] representation of the syntax element + min_spatial_segmentation_idc as specified in [HEVC]. This + parameter MAY be used to describe parallelization capabilities + of the bitstream. + + dec-parallel-cap: + + This parameter MAY be used to indicate the decoder's additional + decoding capabilities given the presence of tools enabling + parallel decoding, such as slices, tiles, and WPP, in the + bitstream. The decoding capability of the decoder may vary + with the setting of the parallel decoding tools present in the + bitstream, e.g., the size of the tiles that are present in a + bitstream. Therefore, multiple capability points may be + provided, each indicating the minimum required decoding + capability that is associated with a parallelism requirement, + which is a requirement on the bitstream that enables parallel + decoding. 
+ + Each capability point is defined as a combination of 1) a + parallelism requirement, 2) a profile (determined by profile- + space and profile-id), 3) a highest level, and 4) a maximum + processing rate, a maximum picture size, and a maximum video + bitrate that may be equal to or greater than that determined by + the highest level. The parameter's syntax in ABNF [RFC5234] is + as follows: + + + + + + +Wang, et al. Standards Track [Page 60] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + dec-parallel-cap = "dec-parallel-cap={" cap-point *("," + cap-point) "}" + + cap-point = ("w" / "t") ":" spatial-seg-idc 1*(";" + cap-parameter) + + spatial-seg-idc = 1*4DIGIT ; (1-4095) + + cap-parameter = tier-flag / level-id / max-lsr + / max-lps / max-br + + tier-flag = "tier-flag" EQ ("0" / "1") + + level-id = "level-id" EQ 1*3DIGIT ; (0-255) + + max-lsr = "max-lsr" EQ 1*20DIGIT ; (0- + 18,446,744,073,709,551,615) + + max-lps = "max-lps" EQ 1*10DIGIT ; (0-4,294,967,295) + + max-br = "max-br" EQ 1*20DIGIT ; (0- + 18,446,744,073,709,551,615) + + EQ = "=" + + The set of capability points expressed by the dec-parallel-cap + parameter is enclosed in a pair of curly braces ("{}"). Each + set of two consecutive capability points is separated by a + comma (','). Within each capability point, each set of two + consecutive parameters, and, when present, their values, is + separated by a semicolon (';'). + + The profile of all capability points is determined by profile- + space and profile-id, which are outside the dec-parallel-cap + parameter. + + Each capability point starts with an indication of the + parallelism requirement, which consists of a parallel tool + type, which may be equal to 'w' or 't', and a decimal value of + the spatial-seg-idc parameter. When the type is 'w', the + capability point is valid only for H.265 bitstreams with WPP in + use, i.e., entropy_coding_sync_enabled_flag equal to 1. 
When + the type is 't', the capability point is valid only for H.265 + bitstreams with WPP not in use (i.e., + entropy_coding_sync_enabled_flag equal to 0). The capability- + point is valid only for H.265 bitstreams with + min_spatial_segmentation_idc equal to or greater than spatial- + seg-idc. + + + +Wang, et al. Standards Track [Page 61] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + After the parallelism requirement indication, each capability + point continues with one or more pairs of parameter and value + in any order for any of the following parameters: + + o tier-flag + o level-id + o max-lsr + o max-lps + o max-br + + At most, one occurrence of each of the above five parameters is + allowed within each capability point. + + The values of dec-parallel-cap.tier-flag and dec-parallel- + cap.level-id for a capability point indicate the highest level + of the capability point. The values of dec-parallel-cap.max- + lsr, dec-parallel-cap.max-lps, and dec-parallel-cap.max-br for + a capability point indicate the maximum processing rate in + units of luma samples per second, the maximum picture size in + units of luma samples, and the maximum video bitrate (in units + of CpbBrVclFactor bits per second for the VCL HRD parameters + and in units of CpbBrNalFactor bits per second for the NAL HRD + parameters where CpbBrVclFactor and CpbBrNalFactor are defined + in Section A.4 of [HEVC]). + + When not present, the value of dec-parallel-cap.tier-flag is + inferred to be equal to the value of tier-flag outside the dec- + parallel-cap parameter. When not present, the value of dec- + parallel-cap.level-id is inferred to be equal to the value of + max-recv-level-id outside the dec-parallel-cap parameter. When + not present, the value of dec-parallel-cap.max-lsr, dec- + parallel-cap.max-lps, or dec-parallel-cap.max-br is inferred to + be equal to the value of max-lsr, max-lps, or max-br, + respectively, outside the dec-parallel-cap parameter. 
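The dec-parallel-cap grammar and the one-occurrence rule above can be sketched as a small parser. This is a minimal illustration rather than a reference implementation: the function name `parse_dec_parallel_cap` and the dictionary layout of the result are assumptions made here, the numeric limits are copied from the ABNF comments, and the ABNF digit-count limits (for example 1*3DIGIT) are not enforced separately from the value ranges.

```python
import re

# Upper bounds copied from the ABNF comments (lower bound is 0 for all five).
_PARAM_MAX = {
    "tier-flag": 1,
    "level-id": 255,
    "max-lsr": 2**64 - 1,
    "max-lps": 2**32 - 1,
    "max-br": 2**64 - 1,
}


def parse_dec_parallel_cap(text):
    """Parse 'dec-parallel-cap={...}' into a list of capability points.

    Hypothetical helper: each point is returned as a dict with the keys
    'tool', 'spatial-seg-idc', and 'params'.
    """
    match = re.fullmatch(r"dec-parallel-cap=\{(.*)\}", text.strip())
    if match is None:
        raise ValueError("not a dec-parallel-cap parameter")
    points = []
    # Capability points are comma-separated; parameters are semicolon-separated.
    for raw_point in match.group(1).split(","):
        head, *raw_params = raw_point.split(";")
        tool, sep, idc_text = head.partition(":")
        if tool not in ("w", "t") or sep != ":":
            raise ValueError("bad parallel tool type in %r" % head)
        if not re.fullmatch(r"[0-9]{1,4}", idc_text):
            raise ValueError("bad spatial-seg-idc in %r" % head)
        idc = int(idc_text)
        if not 1 <= idc <= 4095:
            raise ValueError("spatial-seg-idc out of range: %d" % idc)
        if not raw_params:
            raise ValueError("a capability point needs at least one parameter")
        params = {}
        for raw in raw_params:
            name, sep, value = raw.partition("=")
            if name not in _PARAM_MAX or sep != "=":
                raise ValueError("unknown cap-parameter %r" % raw)
            if not re.fullmatch(r"[0-9]+", value):
                raise ValueError("non-numeric value in %r" % raw)
            if name in params:
                raise ValueError("duplicate cap-parameter %r" % name)
            if int(value) > _PARAM_MAX[name]:
                raise ValueError("value out of range in %r" % raw)
            params[name] = int(value)
        points.append({"tool": tool, "spatial-seg-idc": idc, "params": params})
    return points
```

Parameters missing from a capability point are left out of `params`; a caller would then apply the inference rules above, taking tier-flag, max-recv-level-id, max-lsr, max-lps, or max-br from outside dec-parallel-cap.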
+ + The general decoding capability, expressed by the set of + parameters outside of dec-parallel-cap, is defined as the + capability point that is determined by the following + combination of parameters: 1) the parallelism requirement + corresponding to the value of sprop-segmentation-id equal to 0 + for a bitstream, 2) the profile determined by profile-space, + profile-id, profile-compatibility-indicator, and interop- + constraints, 3) the tier and the highest level determined by + tier-flag and max-recv-level-id, and 4) the maximum processing + rate, the maximum picture size, and the maximum video bitrate + determined by the highest level. The general decoding + capability MUST NOT be included as one of the set of capability + points in the dec-parallel-cap parameter. + + + +Wang, et al. Standards Track [Page 62] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + For example, the following parameters express the general + decoding capability of 720p30 (Level 3.1) plus an additional + decoding capability of 1080p30 (Level 4) given that the + spatially largest tile or slice used in the bitstream is equal + to or less than 1/3 of the picture size: + + a=fmtp:98 level-id=93;dec-parallel-cap={t:8;level-id=120} + + For another example, the following parameters express an + additional decoding capability of 1080p30, using dec-parallel- + cap.max-lsr and dec-parallel-cap.max-lps, given that WPP is + used in the bitstream: + + a=fmtp:98 level-id=93;dec-parallel-cap={w:8; + max-lsr=62668800;max-lps=2088960} + + Informative note: When min_spatial_segmentation_idc is + present in a bitstream and WPP is not used, [HEVC] specifies + that there is no slice or no tile in the bitstream + containing more than 4 * PicSizeInSamplesY / ( + min_spatial_segmentation_idc + 4 ) luma samples.
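The bound in the informative note can be checked with a short calculation. The sketch below assumes a 1920x1080 luma picture purely for illustration; the function name `max_segment_luma_samples` is hypothetical, and exact fractions are used so that no rounding is introduced.

```python
from fractions import Fraction


def max_segment_luma_samples(pic_size_in_samples_y, min_spatial_segmentation_idc):
    """Upper bound on the luma samples in any slice or tile when WPP is not
    used: 4 * PicSizeInSamplesY / (min_spatial_segmentation_idc + 4)."""
    if pic_size_in_samples_y <= 0 or min_spatial_segmentation_idc < 0:
        raise ValueError("arguments must be positive / non-negative")
    return Fraction(4 * pic_size_in_samples_y, min_spatial_segmentation_idc + 4)


# Assumed picture size for illustration: 1920x1080 luma samples.
pic_size = 1920 * 1080
bound = max_segment_luma_samples(pic_size, 8)
print(bound, bound / pic_size)  # prints "691200 1/3"
```

With a spatial-seg-idc of 8, the bound is 4 / (8 + 4) = 1/3 of the picture, which matches the "equal to or less than 1/3 of the picture size" wording of the tile example.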
+ + include-dph: + + This parameter is used to indicate the capability and + preference to utilize or include Decoded Picture Hash (DPH) SEI + messages (see Section D.3.19 of [HEVC]) in the bitstream. DPH + SEI messages can be used to detect picture corruption so the + receiver can request picture repair, see Section 8. The value + is a comma-separated list of hash types that is supported or + requested to be used, each hash type provided as an unsigned + integer value (0-255), with the hash types listed from most + preferred to the least preferred. Example: "include-dph=0,2", + which indicates the capability for MD5 (most preferred) and + Checksum (less preferred). If the parameter is not included or + the value contains no hash types, then no capability to utilize + DPH SEI messages is assumed. Note that DPH SEI messages MAY + still be included in the bitstream even when there is no + declaration of capability to use them, as in general SEI + messages do not affect the normative decoding process and + decoders are allowed to ignore SEI messages. + + Encoding considerations: + + This type is only defined for transfer via RTP (RFC 3550). + + + + + + +Wang, et al. Standards Track [Page 63] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + Security considerations: + + See Section 9 of RFC 7798. + + Published specification: + + Please refer to RFC 7798 and its Section 12. + + Additional information: None + + File extensions: none + + Macintosh file type code: none + + Object identifier or OID: none + + Person & email address to contact for further information: + + Ye-Kui Wang (yekui.wang@gmail.com) + + Intended usage: COMMON + + Author: See Authors' Addresses section of RFC 7798. + + Change controller: + + IETF Audio/Video Transport Payloads working group delegated from + the IESG. + +7.2. SDP Parameters + + The receiver MUST ignore any parameter unspecified in this memo. + +7.2.1. 
Mapping of Payload Type Parameters to SDP + + The media type video/H265 string is mapped to fields in the Session + Description Protocol (SDP) [RFC4566] as follows: + + o The media name in the "m=" line of SDP MUST be video. + + o The encoding name in the "a=rtpmap" line of SDP MUST be H265 (the + media subtype). + + o The clock rate in the "a=rtpmap" line MUST be 90000. + + o The OPTIONAL parameters profile-space, profile-id, tier-flag, + level-id, interop-constraints, profile-compatibility-indicator, + sprop-sub-layer-id, recv-sub-layer-id, max-recv-level-id, tx-mode, + + + +Wang, et al. Standards Track [Page 64] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + max-lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, max-tc, max- + fps, sprop-max-don-diff, sprop-depack-buf-nalus, sprop-depack-buf- + bytes, depack-buf-cap, sprop-segmentation-id, sprop-spatial- + segmentation-idc, dec-parallel-cap, and include-dph, when present, + MUST be included in the "a=fmtp" line of SDP. These parameters are + expressed as a media type string, in the form of a semicolon- + separated list of parameter=value pairs. + + o The OPTIONAL parameters sprop-vps, sprop-sps, and sprop-pps, when + present, MUST be included in the "a=fmtp" line of SDP or conveyed + using the "fmtp" source attribute as specified in Section 6.3 of + [RFC5576]. For a particular media format (i.e., RTP payload + type), sprop-vps, sprop-sps, or sprop-pps MUST NOT be both included + in the "a=fmtp" line of SDP and conveyed using the "fmtp" source + attribute. When included in the "a=fmtp" line of SDP, these + parameters are expressed as a media type string, in the form of a + semicolon-separated list of parameter=value pairs. When conveyed + in the "a=fmtp" line of SDP for a particular payload type, the + parameters sprop-vps, sprop-sps, and sprop-pps MUST be applied to + each SSRC with the payload type.
When conveyed using the "fmtp" + source attribute, these parameters are only associated with the + given source and payload type as parts of the "fmtp" source + attribute. + + Informative note: Conveyance of sprop-vps, sprop-sps, and + sprop-pps using the "fmtp" source attribute allows for out-of- + band transport of parameter sets in topologies like Topo-Video- + switch-MCU as specified in [RFC7667]. + + An example of media representation in SDP is as follows: + + m=video 49170 RTP/AVP 98 + a=rtpmap:98 H265/90000 + a=fmtp:98 profile-id=1; + sprop-vps=<video parameter sets data> + +7.2.2. Usage with SDP Offer/Answer Model + + When HEVC is offered over RTP using SDP in an offer/answer model + [RFC3264] for negotiation for unicast usage, the following + limitations and rules apply: + + o The parameters identifying a media format configuration for HEVC + are profile-space, profile-id, tier-flag, level-id, interop- + constraints, profile-compatibility-indicator, and tx-mode. These + media configuration parameters, except level-id, MUST be used + symmetrically when the answerer does not include recv-sub-layer-id + + + + +Wang, et al. Standards Track [Page 65] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + in the answer for the media format (payload type) or the included + recv-sub-layer-id is equal to sprop-sub-layer-id in the offer. 
+ The answerer MUST: + + 1) maintain all configuration parameters with the values remaining + the same as in the offer for the media format (payload type), + with the exception that the value of level-id is changeable as + long as the highest level indicated by the answer is not higher + than that indicated by the offer; + + 2) include in the answer the recv-sub-layer-id parameter, with a + value less than the sprop-sub-layer-id parameter in the offer, + for the media format (payload type), and maintain all + configuration parameters with the values being the same as + signaled in the sprop-vps for the chosen sub-layer + representation, with the exception that the value of level-id + is changeable as long as the highest level indicated by the + answer is not higher than the level indicated by the sprop-vps + in offer for the chosen sub-layer representation; or + + 3) remove the media format (payload type) completely (when one or + more of the parameter values are not supported). + + Informative note: The above requirement for symmetric use + does not apply for level-id, and does not apply for the + other bitstream or RTP stream properties and capability + parameters. + + o The profile-compatibility-indicator, when offered as sendonly, + describes bitstream properties. The answerer MAY accept an RTP + payload type even if the decoder is not capable of handling the + profile indicated by the profile-space, profile-id, and interop- + constraints parameters, but capable of any of the profiles + indicated by the profile-space, profile-compatibility-indicator, + and interop-constraints. However, when the profile-compatibility- + indicator is used in a recvonly or sendrecv media description, the + bitstream using this RTP payload type is required to conform to + all profiles indicated by profile-space, profile-compatibility- + indicator, and interop-constraints. 
+ + o To simplify handling and matching of these configurations, the + same RTP payload type number used in the offer SHOULD also be used + in the answer, as specified in [RFC3264]. + + o The same RTP payload type number used in the offer for the media + subtype H265 MUST be used in the answer when the answer includes + recv-sub-layer-id. When the answer does not include recv-sub- + layer-id, the answer MUST NOT contain a payload type number used + + + +Wang, et al. Standards Track [Page 66] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + in the offer for the media subtype H265 unless the configuration + is exactly the same as in the offer or the configuration in the + answer only differs from that in the offer with a different value + of level-id. The answer MAY contain the recv-sub-layer-id + parameter if an HEVC bitstream contains multiple operation points + (using temporal scalability and sub-layers) and sprop-vps is + included in the offer where information on sub-layers is present + in the first video parameter set contained in sprop-vps. If the + sprop-vps is provided in an offer, an answerer MAY select a + particular operation point indicated in the first video parameter + set contained in sprop-vps.
When the answer includes a recv-sub- + layer-id that is less than a sprop-sub-layer-id in the offer, all + video parameter sets contained in the sprop-vps parameter in the + SDP answer and all video parameter sets sent in-band for either + the offerer-to-answerer direction or the answerer-to-offerer + direction MUST be consistent with the first video parameter set in + the sprop-vps parameter of the offer (see the semantics of sprop- + vps in Section 7.1 of this document on one video parameter set + being consistent with another video parameter set), and the + bitstream sent in either direction MUST conform to the profile, + tier, level, and constraints of the chosen sub-layer + representation as indicated by the first profile_tier_level( ) + syntax structure in the first video parameter set in the sprop-vps + parameter of the offer. + + Informative note: When an offerer receives an answer that does + not include recv-sub-layer-id, it has to compare payload types + not declared in the offer based on the media type (i.e., + video/H265) and the above media configuration parameters with + any payload types it has already declared. This will enable it + to determine whether the configuration in question is new or if + it is equivalent to configuration already offered, since a + different payload type number may be used in the answer. The + ability to perform operation point selection enables a receiver + to utilize the temporal scalable nature of an HEVC bitstream. + + o The parameters sprop-max-don-diff, sprop-depack-buf-nalus, and + sprop-depack-buf-bytes describe the properties of an RTP stream, + and all RTP streams the RTP stream depends on, when present, that + the offerer or the answerer is sending for the media format + configuration. This differs from the normal usage of the + offer/answer parameters: normally such parameters declare the + properties of the bitstream or RTP stream that the offerer or the + answerer is able to receive. 
When dealing with HEVC, the offerer + assumes that the answerer will be able to receive media encoded + using the configuration being offered. + + + + + +Wang, et al. Standards Track [Page 67] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + Informative note: The above parameters apply for any RTP + stream and all RTP streams the RTP stream depends on, when + present, sent by a declaring entity with the same + configuration. In other words, the applicability of the above + parameters to RTP streams depends on the source endpoint. + Rather than being bound to the payload type, the values may + have to be applied to another payload type when being sent, as + they apply for the configuration. + + o The capability parameters max-lsr, max-lps, max-cpb, max-dpb, max- + br, max-tr, and max-tc MAY be used to declare further capabilities + of the offerer or answerer for receiving. These parameters MUST + NOT be present when the direction attribute is sendonly. + + o The capability parameter max-fps MAY be used to declare lower + capabilities of the offerer or answerer for receiving. This + parameter MUST NOT be present when the direction attribute is + sendonly. + + o The capability parameter dec-parallel-cap MAY be used to declare + additional decoding capabilities of the offerer or answerer for + receiving. Upon receiving such a declaration of a receiver, a + sender MAY send a bitstream to the receiver utilizing those + capabilities under the assumption that the bitstream fulfills the + parallelism requirement. A bitstream that is sent based on + choosing a capability point with parallel tool type 'w' from dec- + parallel-cap MUST have entropy_coding_sync_enabled_flag equal to 1 + and min_spatial_segmentation_idc equal to or larger than dec- + parallel-cap.spatial-seg-idc of the capability point.
A bitstream + that is sent based on choosing a capability point with parallel + tool type 't' from dec-parallel-cap MUST have + entropy_coding_sync_enabled_flag equal to 0 and + min_spatial_segmentation_idc equal to or larger than dec-parallel- + cap.spatial-seg-idc of the capability point. + + o An offerer has to include the size of the de-packetization buffer, + sprop-depack-buf-bytes, as well as sprop-max-don-diff and sprop- + depack-buf-nalus, in the offer for an interleaved HEVC bitstream + or for the MRST or MRMT transmission mode when sprop-max-don-diff + is greater than 0 for at least one of the RTP streams. To enable + the offerer and answerer to inform each other about their + capabilities for de-packetization buffering in receiving RTP + streams, both parties are RECOMMENDED to include depack-buf-cap. + For interleaved RTP streams or in MRST or MRMT, it is also + RECOMMENDED to consider offering multiple payload types with + different buffering requirements when the capabilities of the + receiver are unknown. + + + + +Wang, et al. Standards Track [Page 68] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + o The capability parameter include-dph MAY be used to declare the + capability to utilize decoded picture hash SEI messages and which + types of hashes in any HEVC RTP streams received by the offerer or + answerer. + + o The sprop-vps, sprop-sps, or sprop-pps, when present (included in + the "a=fmtp" line of SDP or conveyed using the "fmtp" source + attribute as specified in Section 6.3 of [RFC5576]), are used for + out-of-band transport of the parameter sets (VPS, SPS, or PPS, + respectively). + + o The answerer MAY use either out-of-band or in-band transport of + parameter sets for the bitstream it is sending, regardless of + whether out-of-band parameter sets transport has been used in the + offerer-to-answerer direction. 
Parameter sets included in an + answer are independent of those parameter sets included in the + offer, as they are used for decoding two different bitstreams, one + from the answerer to the offerer and the other in the opposite + direction. In case some RTP streams are sent before the SDP + offer/answer settles down, in-band parameter sets MUST be used for + those RTP stream parts sent before the SDP offer/answer. + + o The following rules apply to transport of parameter set in the + offerer-to-answerer direction. + + + An offer MAY include sprop-vps, sprop-sps, and/or sprop-pps. + If none of these parameters is present in the offer, then only + in-band transport of parameter sets is used. + + + If the level to use in the offerer-to-answerer direction is + equal to the default level in the offer, the answerer MUST be + prepared to use the parameter sets included in sprop-vps, + sprop-sps, and sprop-pps (either included in the "a=fmtp" line + of SDP or conveyed using the "fmtp" source attribute) for + decoding the incoming bitstream, e.g., by passing these + parameter set NAL units to the video decoder before passing any + NAL units carried in the RTP streams. Otherwise, the answerer + MUST ignore sprop-vps, sprop-sps, and sprop-pps (either + included in the "a=fmtp" line of SDP or conveyed using the + "fmtp" source attribute) and the offerer MUST transmit + parameter sets in-band. + + + In MRST or MRMT, the answerer MUST be prepared to use the + parameter sets out-of-band transmitted for the RTP stream and + all RTP streams the RTP stream depends on, when present, for + decoding the incoming bitstream, e.g., by passing these + parameter set NAL units to the video decoder before passing any + NAL units carried in the RTP streams. + + + +Wang, et al. Standards Track [Page 69] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + o The following rules apply to transport of parameter set in the + answerer-to-offerer direction. 
+ + + An answer MAY include sprop-vps, sprop-sps, and/or sprop-pps. + If none of these parameters is present in the answer, then only + in-band transport of parameter sets is used. + + + The offerer MUST be prepared to use the parameter sets included + in sprop-vps, sprop-sps, and sprop-pps (either included in the + "a=fmtp" line of SDP or conveyed using the "fmtp" source + attribute) for decoding the incoming bitstream, e.g., by + passing these parameter set NAL units to the video decoder + before passing any NAL units carried in the RTP streams. + + + In MRST or MRMT, the offerer MUST be prepared to use the + parameter sets out-of-band transmitted for the RTP stream and + all RTP streams the RTP stream depends on, when present, for + decoding the incoming bitstream, e.g., by passing these + parameter set NAL units to the video decoder before passing any + NAL units carried in the RTP streams. + + o When sprop-vps, sprop-sps, and/or sprop-pps are conveyed using the + "fmtp" source attribute as specified in Section 6.3 of [RFC5576], + the receiver of the parameters MUST store the parameter sets + included in sprop-vps, sprop-sps, and/or sprop-pps and associate + them with the source given as part of the "fmtp" source attribute. + Parameter sets associated with one source (given as part of the + "fmtp" source attribute) MUST only be used to decode NAL units + conveyed in RTP packets from the same source (given as part of the + "fmtp" source attribute). When this mechanism is in use, SSRC + collision detection and resolution MUST be performed as specified + in [RFC5576]. + + For bitstreams being delivered over multicast, the following rules + apply: + + o The media format configuration is identified by profile-space, + profile-id, tier-flag, level-id, interop-constraints, profile- + compatibility-indicator, and tx-mode. 
These media format + configuration parameters, including level-id, MUST be used + symmetrically; that is, the answerer MUST either maintain all + configuration parameters or remove the media format (payload + type) completely. Note that this implies that the level-id for + offer/answer in multicast is not changeable. + + + + + + + +Wang, et al. Standards Track [Page 70] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + o To simplify the handling and matching of these configurations, + the same RTP payload type number used in the offer SHOULD also + be used in the answer, as specified in [RFC3264]. An answer + MUST NOT contain a payload type number used in the offer unless + the configuration is the same as in the offer. + + o Parameter sets received MUST be associated with the originating + source and MUST only be used in decoding the incoming bitstream + from the same source. + + o The rules for other parameters are the same as above for + unicast as long as the three above rules are obeyed. + + Table 1 lists the interpretation of all the parameters that MUST be + used for the various combinations of offer, answer, and direction + attributes. Note that the two columns wherein the recv-sub-layer-id + parameter is used only apply to answers, whereas the other columns + apply to both offers and answers. + + Table 1. Interpretation of parameters for various combinations of + offers, answers, direction attributes, with and without recv-sub- + layer-id. Columns that do not indicate offer or answer apply to + both. + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Wang, et al. 
Standards Track [Page 71] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + sendonly --+ + answer: recvonly, recv-sub-layer-id --+ | + recvonly w/o recv-sub-layer-id --+ | | + answer: sendrecv, recv-sub-layer-id --+ | | | + sendrecv w/o recv-sub-layer-id --+ | | | | + | | | | | + profile-space C D C D P + profile-id C D C D P + tier-flag C D C D P + level-id D D D D P + interop-constraints C D C D P + profile-compatibility-indicator C D C D P + tx-mode C C C C P + max-recv-level-id R R R R - + sprop-max-don-diff P P - - P + sprop-depack-buf-nalus P P - - P + sprop-depack-buf-bytes P P - - P + depack-buf-cap R R R R - + sprop-segmentation-id P P P P P + sprop-spatial-segmentation-idc P P P P P + max-br R R R R - + max-cpb R R R R - + max-dpb R R R R - + max-lsr R R R R - + max-lps R R R R - + max-tr R R R R - + max-tc R R R R - + max-fps R R R R - + sprop-vps P P - - P + sprop-sps P P - - P + sprop-pps P P - - P + sprop-sub-layer-id P P - - P + recv-sub-layer-id X O X O - + dec-parallel-cap R R R R - + include-dph R R R R - + + Legend: + + C: configuration for sending and receiving bitstreams + D: changeable configuration, same as C except possible + to answer with a different but consistent value (see the + semantics of the six parameters related to profile, tier, + and level on these parameters being consistent) + P: properties of the bitstream to be sent + R: receiver capabilities + O: operation point selection + X: MUST NOT be present + -: not usable, when present MUST be ignored + + + +Wang, et al. Standards Track [Page 72] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + Parameters used for declaring receiver capabilities are, in general, + downgradable; i.e., they express the upper limit for a sender's + possible behavior. Thus, a sender MAY select to set its encoder + using only lower/lesser or equal values of these parameters. 
+ + When the answer does not include a recv-sub-layer-id that is less + than the sprop-sub-layer-id in the offer, parameters declaring a + configuration point are not changeable, with the exception of the + level-id parameter for unicast usage, and these parameters express + values a receiver expects to be used and MUST be used verbatim in the + answer as in the offer. + + When a sender's capabilities are declared with the configuration + parameters, these parameters express a configuration that is + acceptable for the sender to receive bitstreams. In order to achieve + high interoperability levels, it is often advisable to offer multiple + alternative configurations. It is impossible to offer multiple + configurations in a single payload type. Thus, when multiple + configuration offers are made, each offer requires its own RTP + payload type associated with the offer. However, it is possible to + offer multiple operation points using one configuration in a single + payload type by including sprop-vps in the offer and recv-sub-layer- + id in the answer. + + A receiver SHOULD understand all media type parameters, even if it + only supports a subset of the payload format's functionality. This + ensures that a receiver is capable of understanding when an offer to + receive media can be downgraded to what is supported by the receiver + of the offer. + + An answerer MAY extend the offer with additional media format + configurations. However, to enable their usage, in most cases a + second offer is required from the offerer to provide the bitstream + property parameters that the media sender will use. This also has + the effect that the offerer has to be able to receive this media + format configuration, not only to send it. + +7.2.3. 
Usage in Declarative Session Descriptions + + When HEVC over RTP is offered with SDP in a declarative style, as in + Real Time Streaming Protocol (RTSP) [RFC2326] or Session Announcement + Protocol (SAP) [RFC2974], the following considerations are necessary. + + + + + + + + + +Wang, et al. Standards Track [Page 73] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + o All parameters capable of indicating both bitstream properties + and receiver capabilities are used to indicate only bitstream + properties. For example, in this case, the parameter profile- + tier-level-id declares the values used by the bitstream, not + the capabilities for receiving bitstreams. As a result, the + following interpretation of the parameters MUST be used: + + + Declaring actual configuration or bitstream properties: + - profile-space + - profile-id + - tier-flag + - level-id + - interop-constraints + - profile-compatibility-indicator + - tx-mode + - sprop-vps + - sprop-sps + - sprop-pps + - sprop-max-don-diff + - sprop-depack-buf-nalus + - sprop-depack-buf-bytes + - sprop-segmentation-id + - sprop-spatial-segmentation-idc + + + Not usable (when present, they MUST be ignored): + - max-lps + - max-lsr + - max-cpb + - max-dpb + - max-br + - max-tr + - max-tc + - max-fps + - max-recv-level-id + - depack-buf-cap + - sprop-sub-layer-id + - dec-parallel-cap + - include-dph + + o A receiver of the SDP is required to support all parameters and + values of the parameters provided; otherwise, the receiver MUST + reject (RTSP) or not participate in (SAP) the session. It + falls on the creator of the session to use values that are + expected to be supported by the receiving application. + + + + + + + +Wang, et al. Standards Track [Page 74] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + +7.2.4. 
Considerations for Parameter Sets + + When out-of-band transport of parameter sets is used, parameter sets + MAY still be additionally transported in-band unless explicitly + disallowed by an application, and some of these additional parameter + sets may update some of the out-of-band transported parameter sets. + Update of a parameter set refers to the sending of a parameter set of + the same type using the same parameter set ID but with different + values for at least one other parameter of the parameter set. + +7.2.5. Dependency Signaling in Multi-Stream Mode + + If MRST or MRMT is used, the rules on signaling media decoding + dependency in SDP as defined in [RFC5583] apply. The rules on + "hierarchical or layered encoding" with multicast in Section 5.7 of + [RFC4566] do not apply. This means that the notation for Connection + Data "c=" SHALL NOT be used with more than one address, i.e., the + sub-field <number of addresses> in the sub-field <connection-address> + of the "c=" field, described in [RFC4566], must not be present. The + order of session dependency is given from the RTP stream containing + the lowest temporal sub-layer to the RTP stream containing the + highest temporal sub-layer. + +8. Use with Feedback Messages + + The following subsections define the use of the Picture Loss + Indication (PLI), Slice Loss Indication (SLI), Reference Picture + Selection Indication (RPSI), and Full Intra Request (FIR) feedback + messages with HEVC. The PLI, SLI, and RPSI messages are defined in + [RFC4585], and the FIR message is defined in [RFC5104]. + +8.1. Picture Loss Indication (PLI) + + As specified in RFC 4585, Section 6.3.1, the reception of a PLI by a + media sender indicates "the loss of an undefined amount of coded + video data belonging to one or more pictures".
Without having any + specific knowledge of the setup of the bitstream (such as use and + location of in-band parameter sets, non-IDR decoder refresh points, + picture structures, and so forth), a reaction to the reception of an + PLI by an HEVC sender SHOULD be to send an IDR picture and relevant + parameter sets; potentially with sufficient redundancy so to ensure + correct reception. However, sometimes information about the + bitstream structure is known. For example, state could have been + established outside of the mechanisms defined in this document that + parameter sets are conveyed out of band only, and stay static for the + duration of the session. In that case, it is obviously unnecessary + to send them in-band as a result of the reception of a PLI. Other + + + + +Wang, et al. Standards Track [Page 75] + +RFC 7798 RTP Payload Format for HEVC March 2016 + + + examples could be devised based on a priori knowledge of different + aspects of the bitstream structure. In all cases, the timing and + congestion control mechanisms of RFC 4585 MUST be observed. + +8.2. Slice Loss Indication (SLI) + + The SLI described in RFC 4585 can be used to indicate, to a sender, + the loss of a number of Coded Tree Blocks (CTBs) in a CTB raster scan + order of a picture. In the SLI's Feedback Control Indication (FCI) + field, the subfield "First" MUST be set to the CTB address of the + first lost CTB. Note that the CTB address is in CTB-raster-scan + order of a picture. For the first CTB of a slice segment, the CTB + address is the value of slice_segment_address when present, or 0 when + the value of first_slice_segment_in_pic_flag is equal to 1; both + syntax elements are in the slice segment header. The subfield + "Number" MUST be set to the number of consecutive lost CTBs, again in + CTB-raster-scan order of a picture. 
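
   As an illustration only, the following Python sketch shows how a
   receiver might derive "First" and "Number" values for a lost
   rectangular region of CTBs, and how the "PictureID" subfield could
   be derived from PicOrderCntVal.  The function names and the
   rectangle parameters are hypothetical and not part of this memo.
   The 13-bit width of "First" and "Number" comes from RFC 4585;
   treating a negative PicOrderCntVal as two's complement when taking
   its 6 least significant bits is an assumption of this sketch.

```python
from typing import List, Tuple

# "First" and "Number" are 13-bit subfields of the SLI FCI in RFC 4585.
SLI_FIELD_MAX = (1 << 13) - 1


def sli_picture_id(pic_order_cnt_val: int) -> int:
    """Return the 6 least significant bits of PicOrderCntVal.

    Assumption: a negative value is treated as two's complement, which
    is how Python's & operator behaves for negative integers.
    """
    return pic_order_cnt_val & 0x3F


def sli_runs(pic_width_in_ctbs: int, col0: int, row0: int,
             cols: int, rows: int) -> List[Tuple[int, int]]:
    """Return (First, Number) pairs, in CTB raster-scan order, that
    cover a lost rectangle of CTBs (hypothetical helper)."""
    if (cols <= 0 or rows <= 0 or col0 < 0 or row0 < 0
            or col0 + cols > pic_width_in_ctbs):
        raise ValueError("region must lie inside the picture")
    if cols == pic_width_in_ctbs:
        # Full-width region: the lost CTBs are consecutive in raster scan.
        runs = [(row0 * pic_width_in_ctbs, cols * rows)]
    else:
        # Narrower region: one run of consecutive CTBs per CTB row.
        runs = [((row0 + r) * pic_width_in_ctbs + col0, cols)
                for r in range(rows)]
    for first, number in runs:
        if first > SLI_FIELD_MAX or number > SLI_FIELD_MAX:
            raise ValueError("value does not fit a 13-bit SLI subfield")
    return runs
```

   For a picture that is 10 CTBs wide, a lost region 3 CTBs wide and 2
   CTB rows high starting at column 2, row 1 yields the runs (12, 3)
   and (22, 3), i.e., one SLI message per CTB row.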
   Note that due to both the "First" and "Number" being counted in CTBs
   in CTB-raster-scan order of a picture, not in tile-scan order (which
   is the bitstream order of CTBs), multiple SLI messages may be needed
   to report the loss of one tile covering multiple CTB rows but less
   wide than the picture.

   The subfield "PictureID" MUST be set to the 6 least significant bits
   of a binary representation of the value of PicOrderCntVal, as defined
   in [HEVC], of the picture for which the lost CTBs are indicated.
   Note that for IDR pictures the syntax element slice_pic_order_cnt_lsb
   is not present, but then the value is inferred to be equal to 0.

   As described in RFC 4585, an encoder in a media sender can use this
   information to "clean up" the corrupted picture by sending intra
   information, while observing the constraints described in RFC 4585,
   for example, with respect to congestion control.  In many cases,
   error tracking is required to identify the corrupted region in the
   receiver's state (reference pictures) because of error import in
   uncorrupted regions of the picture through motion compensation.
   Reference-picture selection can also be used to "clean up" the
   corrupted picture, which is usually more efficient and less likely to
   generate congestion than sending intra information.

   In contrast to the video codecs contemplated in RFCs 4585 and 5104
   [RFC5104], in HEVC, the "macroblock size" is not fixed to 16x16 luma
   samples, but is variable.  That, however, does not create a
   conceptual difficulty with SLI, because the setting of the CTB size
   is a sequence-level functionality, and using a slice loss indication
   across CVS boundaries is meaningless as there is no prediction across
   sequence boundaries.  However, a proper use of SLI messages is not as
   straightforward as it was with older, fixed-macroblock-sized video
   codecs, as the state of the sequence parameter set (where the CTB
   size is located) has to be taken into account when interpreting the
   "First" subfield in the FCI.

8.3.  Reference Picture Selection Indication (RPSI)

   Feedback-based reference picture selection has been shown to be a
   powerful tool to stop temporal error propagation for improved error
   resilience [Girod99][Wang05].  In one approach, the decoder side
   tracks errors in the decoded pictures and informs the encoder side
   that a particular picture that has been decoded relatively earlier is
   correct and still present in the decoded picture buffer; it requests
   the encoder to use that correct picture-availability information when
   encoding the next picture, so as to stop further temporal error
   propagation.  For this approach, the decoder side should use the RPSI
   feedback message.

   Encoders can encode some long-term reference pictures as specified in
   H.264 or HEVC for purposes described in the previous paragraph
   without the need for a huge decoded picture buffer.  As shown in
   [Wang05], with a flexible reference picture management scheme, as in
   H.264 and HEVC, even a decoded picture buffer size of two picture
   storage buffers would work for the approach described in the previous
   paragraph.

   The field "Native RPSI bit string defined per codec" is a base16
   [RFC4648] representation of the 8 bits consisting of the 2 most
   significant bits equal to 0 and 6 bits of nuh_layer_id, as defined in
   [HEVC], followed by the 32 bits representing the value of the
   PicOrderCntVal (in network byte order), as defined in [HEVC], for the
   picture that is indicated by the RPSI feedback message.

   The use of the RPSI feedback message as positive acknowledgement with
   HEVC is deprecated.
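
   As an illustration only, the layout of the native RPSI bit string
   described earlier in this section can be sketched in Python as
   follows.  The function names are hypothetical.  Representing a
   negative PicOrderCntVal as a 32-bit two's complement value is an
   assumption of this sketch, and placement and padding of the result
   within the RPSI FCI (see RFC 4585) are not shown.

```python
import base64
import struct


def rpsi_native_octets(nuh_layer_id: int, pic_order_cnt_val: int) -> bytes:
    """Build the 8 + 32 bits: two zero bits and the 6-bit nuh_layer_id,
    followed by PicOrderCntVal in network byte order."""
    if not 0 <= nuh_layer_id <= 63:
        raise ValueError("nuh_layer_id must fit in 6 bits")
    if not -(1 << 31) <= pic_order_cnt_val < (1 << 31):
        raise ValueError("PicOrderCntVal must fit in 32 bits")
    # Assumption: negative values are carried as 32-bit two's complement.
    return struct.pack("!BI", nuh_layer_id, pic_order_cnt_val & 0xFFFFFFFF)


def rpsi_native_base16(nuh_layer_id: int, pic_order_cnt_val: int) -> str:
    """Return the base16 (RFC 4648) text form of the 5 octets."""
    raw = rpsi_native_octets(nuh_layer_id, pic_order_cnt_val)
    return base64.b16encode(raw).decode("ascii")
```

   For nuh_layer_id equal to 0 and PicOrderCntVal equal to 37, the
   sketch yields the octets 00 00 00 00 25.  The sketch covers only the
   request form of RPSI, since its use as a positive acknowledgement is
   deprecated.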
   In other words, the RPSI feedback message MUST only be used as a
   reference picture selection request, such that it can also be used in
   multicast.

8.4.  Full Intra Request (FIR)

   The purpose of the FIR message is to force an encoder to send an
   independent decoder refresh point as soon as possible (observing, for
   example, the congestion-control-related constraints set out in RFC
   5104).

   Upon reception of a FIR, a sender MUST send an IDR picture.
   Parameter sets MUST also be sent, except when there is a priori
   knowledge that the parameter sets have been correctly established.  A
   typical example for that is an understanding between sender and
   receiver, established by means outside this document, that parameter
   sets are exclusively sent out-of-band.

9.  Security Considerations

   The scope of this Security Considerations section is limited to the
   payload format itself and to one feature of HEVC that may pose a
   particularly serious security risk if implemented naively.  The
   payload format, in isolation, does not form a complete system.
   Implementers are advised to read and understand relevant security-
   related documents, especially those pertaining to RTP (see the
   Security Considerations section in [RFC3550]), and the security of
   the call-control stack chosen (that may make use of the media type
   registration of this memo).  Implementers should also consider known
   security vulnerabilities of video coding and decoding implementations
   in general and avoid those.

   Within this RTP payload format, and with the exception of the user
   data SEI message as described below, no security threats other than
   those common to RTP payload formats are known.  In other words,
   neither the various media-plane-based mechanisms, nor the signaling
   part of this memo, seems to pose a security risk beyond those common
   to all RTP-based systems.

   RTP packets using the payload format defined in this specification
   are subject to the security considerations discussed in the RTP
   specification [RFC3550], and in any applicable RTP profile such as
   RTP/AVP [RFC3551], RTP/AVPF [RFC4585], RTP/SAVP [RFC3711], or
   RTP/SAVPF [RFC5124].  However, as "Securing the RTP Framework: Why
   RTP Does Not Mandate a Single Media Security Solution" [RFC7202]
   discusses, it is not an RTP payload format's responsibility to
   discuss or mandate what solutions are used to meet the basic security
   goals like confidentiality, integrity and source authenticity for RTP
   in general.  This responsibility lies with anyone using RTP in an
   application.  They can find guidance on available security mechanisms
   and important considerations in "Options for Securing RTP Sessions"
   [RFC7201].  Applications SHOULD use one or more appropriate strong
   security mechanisms.  The rest of this section discusses the
   security-impacting properties of the payload format itself.

   Because the data compression used with this payload format is applied
   end-to-end, any encryption needs to be performed after compression.
   A potential denial-of-service threat exists for data encodings using
   compression techniques that have non-uniform receiver-end
   computational load.  The attacker can inject pathological datagrams
   into the bitstream that are complex to decode and that cause the
   receiver to be overloaded.  H.265 is particularly vulnerable to such
   attacks, as it is extremely simple to generate datagrams containing
   NAL units that affect the decoding process of many future NAL units.
   Therefore, the usage of data origin authentication and data integrity
   protection of at least the RTP packet is RECOMMENDED, for example,
   with SRTP [RFC3711].

   Like [H.264], HEVC includes a user data Supplemental Enhancement
   Information (SEI) message.  This SEI message allows inclusion of an
   arbitrary bitstring into the video bitstream.  Such a bitstring could
   include JavaScript, machine code, and other active content.  HEVC
   leaves the handling of this SEI message to the receiving system.  In
   order to avoid harmful side effects of the user data SEI message,
   decoder implementations cannot naively trust its content.  For
   example, it would be a bad and insecure implementation practice to
   forward any JavaScript a decoder implementation detects to a web
   browser.  The safest way to deal with user data SEI messages is to
   simply discard them, but that can have negative side effects on the
   quality of experience by the user.

   End-to-end security with authentication, integrity, or
   confidentiality protection will prevent a MANE from performing media-
   aware operations other than discarding complete packets.  In the case
   of confidentiality protection, it will even be prevented from
   discarding packets in a media-aware way.  To be allowed to perform
   such operations, a MANE is required to be a trusted entity that is
   included in the security context establishment.

10.  Congestion Control

   Congestion control for RTP SHALL be used in accordance with RTP
   [RFC3550] and with any applicable RTP profile, e.g., AVP [RFC3551].
   If best-effort service is being used, an additional requirement is
   that users of this payload format MUST monitor packet loss to ensure
   that the packet loss rate is within an acceptable range.  Packet loss
   is considered acceptable if a TCP flow across the same network path,
   and experiencing the same network conditions, would achieve an
   average throughput, measured on a reasonable timescale, that is not
   less than what all RTP streams combined are achieving.  This
   condition can be satisfied by implementing congestion-control
   mechanisms to adapt the transmission rate, the number of layers
   subscribed for a layered multicast session, or by arranging for a
   receiver to leave the session if the loss rate is unacceptably high.

   The bitrate adaptation necessary for obeying the congestion control
   principle is easily achievable when real-time encoding is used, for
   example, by adequately tuning the quantization parameter.

   However, when pre-encoded content is being transmitted, bandwidth
   adaptation requires the pre-coded bitstream to be tailored for such
   adaptivity.  The key mechanism available in HEVC is temporal
   scalability.  A media sender can remove NAL units belonging to higher
   temporal sub-layers (i.e., those NAL units with a high value of TID)
   until the sending bitrate drops to an acceptable range.  HEVC
   contains mechanisms that allow the lightweight identification of
   switching points in temporal enhancement layers, as discussed in
   Section 1.1.2 of this memo.  An HEVC media sender can send packets
   belonging to NAL units of temporal enhancement layers starting from
   these switching points to probe for available bandwidth and to
   utilize bandwidth that has been shown to be available.

   The above mechanisms generally work within a defined profile and
   level and, therefore, no renegotiation of the channel is required.
   Only when non-downgradable parameters (such as profile) are required
   to be changed does it become necessary to terminate and restart the
   RTP stream(s).
   This may be accomplished by using different RTP payload types.

   MANEs MAY remove certain unusable packets from the RTP stream when
   that RTP stream was damaged due to previous packet losses.  This can
   help reduce the network load in certain special cases.  For example,
   MANEs can remove those FUs where the leading FUs belonging to the
   same NAL unit have been lost or those dependent slice segments when
   the leading slice segments belonging to the same slice have been
   lost, because the trailing FUs or dependent slice segments are
   meaningless to most decoders.  MANEs can also remove higher temporal
   scalable layers if the outbound transmission (from the MANE's
   viewpoint) experiences congestion.

11.  IANA Considerations

   A new media type, as specified in Section 7.1 of this memo, has been
   registered with IANA.

12.  References

12.1.  Normative References

   [H.264]    ITU-T, "Advanced video coding for generic audiovisual
              services", ITU-T Recommendation H.264, April 2013.

   [HEVC]     ITU-T, "High efficiency video coding", ITU-T
              Recommendation H.265, April 2013.

   [ISO23008-2]
              ISO/IEC, "Information technology -- High efficiency coding
              and media delivery in heterogeneous environments -- Part
              2: High efficiency video coding", ISO/IEC 23008-2, 2013.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <http://www.rfc-editor.org/info/rfc2119>.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              DOI 10.17487/RFC3264, June 2002,
              <http://www.rfc-editor.org/info/rfc3264>.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
              July 2003, <http://www.rfc-editor.org/info/rfc3550>.

   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
              Video Conferences with Minimal Control", STD 65, RFC 3551,
              DOI 10.17487/RFC3551, July 2003,
              <http://www.rfc-editor.org/info/rfc3551>.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, DOI 10.17487/RFC3711, March 2004,
              <http://www.rfc-editor.org/info/rfc3711>.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
              July 2006, <http://www.rfc-editor.org/info/rfc4566>.

   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
              "Extended RTP Profile for Real-time Transport Control
              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
              DOI 10.17487/RFC4585, July 2006,
              <http://www.rfc-editor.org/info/rfc4585>.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006,
              <http://www.rfc-editor.org/info/rfc4648>.

   [RFC5104]  Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
              "Codec Control Messages in the RTP Audio-Visual Profile
              with Feedback (AVPF)", RFC 5104, DOI 10.17487/RFC5104,
              February 2008, <http://www.rfc-editor.org/info/rfc5104>.

   [RFC5124]  Ott, J. and E. Carrara, "Extended Secure RTP Profile for
              Real-time Transport Control Protocol (RTCP)-Based Feedback
              (RTP/SAVPF)", RFC 5124, DOI 10.17487/RFC5124, February
              2008, <http://www.rfc-editor.org/info/rfc5124>.

   [RFC5234]  Crocker, D., Ed., and P. Overell, "Augmented BNF for
              Syntax Specifications: ABNF", STD 68, RFC 5234,
              DOI 10.17487/RFC5234, January 2008,
              <http://www.rfc-editor.org/info/rfc5234>.

   [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
              Media Attributes in the Session Description Protocol
              (SDP)", RFC 5576, DOI 10.17487/RFC5576, June 2009,
              <http://www.rfc-editor.org/info/rfc5576>.

   [RFC5583]  Schierl, T. and S. Wenger, "Signaling Media Decoding
              Dependency in the Session Description Protocol (SDP)",
              RFC 5583, DOI 10.17487/RFC5583, July 2009,
              <http://www.rfc-editor.org/info/rfc5583>.

12.2.  Informative References

   [3GPDASH]  3GPP, "Transparent end-to-end Packet-switched Streaming
              Service (PSS); Progressive Download and Dynamic Adaptive
              Streaming over HTTP (3GP-DASH)", 3GPP TS 26.247 12.1.0,
              December 2013.

   [3GPPFF]   3GPP, "Transparent end-to-end packet switched streaming
              service (PSS); 3GPP file format (3GP)", 3GPP TS 26.244
              12.20, December 2013.

   [CABAC]    Sole, J., Joshi, R., Nguyen, N., Ji, T., Karczewicz, M.,
              Clare, G., Henry, F., and Duenas, A., "Transform
              coefficient coding in HEVC", IEEE Transactions on Circuits
              and Systems for Video Technology, Vol. 22, No. 12,
              pp. 1765-1777, DOI 10.1109/TCSVT.2012.2223055, December
              2012.

   [Girod99]  Girod, B. and Faerber, F., "Feedback-based error control
              for mobile video transmission", Proceedings of the IEEE,
              Vol. 87, No. 10, pp. 1707-1723, DOI 10.1109/5.790632,
              October 1999.

   [H.265.1]  ITU-T, "Conformance specification for ITU-T H.265 high
              efficiency video coding", ITU-T Recommendation H.265.1,
              October 2014.

   [HEVCv2]   Flynn, D., Naccari, M., Rosewarne, C., Sharman, K., Sole,
              J., Sullivan, G. J., and T. Suzuki, "High Efficiency Video
              Coding (HEVC) Range Extensions text specification: Draft
              7", JCT-VC document JCTVC-Q1005, 17th JCT-VC meeting,
              Valencia, Spain, March/April 2014.

   [IS014496-12]
              ISO/IEC, "Information technology - Coding of audio-visual
              objects - Part 12: ISO base media file format", ISO/IEC
              14496-12, 2015.

   [IS015444-12]
              ISO/IEC, "Information technology - JPEG 2000 image coding
              system - Part 12: ISO base media file format", ISO/IEC
              15444-12, 2015.

   [JCTVC-J0107]
              Wang, Y.-K., Chen, Y., Joshi, R., and Ramasubramonian, K.,
              "AHG9: On RAP pictures", JCT-VC document JCTVC-J0107, 10th
              JCT-VC meeting, Stockholm, Sweden, July 2012.

   [MPEG2S]   ISO/IEC, "Information technology - Generic coding of
              moving pictures and associated audio information - Part 1:
              Systems", ISO International Standard 13818-1, 2013.

   [MPEGDASH] ISO/IEC, "Information technology - Dynamic adaptive
              streaming over HTTP (DASH) -- Part 1: Media presentation
              description and segment formats", ISO International
              Standard 23009-1, 2012.

   [RFC2326]  Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time
              Streaming Protocol (RTSP)", RFC 2326,
              DOI 10.17487/RFC2326, April 1998,
              <http://www.rfc-editor.org/info/rfc2326>.

   [RFC2974]  Handley, M., Perkins, C., and E. Whelan, "Session
              Announcement Protocol", RFC 2974, DOI 10.17487/RFC2974,
              October 2000, <http://www.rfc-editor.org/info/rfc2974>.

   [RFC6051]  Perkins, C. and T. Schierl, "Rapid Synchronisation of RTP
              Flows", RFC 6051, DOI 10.17487/RFC6051, November 2010,
              <http://www.rfc-editor.org/info/rfc6051>.

   [RFC6184]  Wang, Y.-K., Even, R., Kristensen, T., and R. Jesup, "RTP
              Payload Format for H.264 Video", RFC 6184,
              DOI 10.17487/RFC6184, May 2011,
              <http://www.rfc-editor.org/info/rfc6184>.

   [RFC6190]  Wenger, S., Wang, Y.-K., Schierl, T., and A.
              Eleftheriadis, "RTP Payload Format for Scalable Video
              Coding", RFC 6190, DOI 10.17487/RFC6190, May 2011,
              <http://www.rfc-editor.org/info/rfc6190>.

   [RFC7201]  Westerlund, M. and C. Perkins, "Options for Securing RTP
              Sessions", RFC 7201, DOI 10.17487/RFC7201, April 2014,
              <http://www.rfc-editor.org/info/rfc7201>.

   [RFC7202]  Perkins, C. and M. Westerlund, "Securing the RTP
              Framework: Why RTP Does Not Mandate a Single Media
              Security Solution", RFC 7202, DOI 10.17487/RFC7202, April
              2014, <http://www.rfc-editor.org/info/rfc7202>.

   [RFC7656]  Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
              B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms
              for Real-Time Transport Protocol (RTP) Sources", RFC 7656,
              DOI 10.17487/RFC7656, November 2015,
              <http://www.rfc-editor.org/info/rfc7656>.

   [RFC7667]  Westerlund, M. and S. Wenger, "RTP Topologies", RFC 7667,
              DOI 10.17487/RFC7667, November 2015,
              <http://www.rfc-editor.org/info/rfc7667>.

   [RTP-MULTI-STREAM]
              Lennox, J., Westerlund, M., Wu, Q., and C. Perkins,
              "Sending Multiple Media Streams in a Single RTP Session",
              Work in Progress, draft-ietf-avtcore-rtp-multi-stream-11,
              December 2015.

   [SDP-NEG]  Holmberg, C., Alvestrand, H., and C. Jennings,
              "Negotiating Media Multiplexing Using Session Description
              Protocol (SDP)", Work in Progress,
              draft-ietf-mmusic-sdp-bundle-negotiation-25, January 2016.

   [Wang05]   Wang, Y.-K., Zhu, C., and Li, H., "Error resilient video
              coding using flexible reference frames", Visual
              Communications and Image Processing 2005 (VCIP 2005),
              Beijing, China, July 2005.

Acknowledgements

   Muhammed Coban and Marta Karczewicz are thanked for discussions on
   the specification of the use with feedback messages and other aspects
   in this memo.  Jonathan Lennox and Jill Boyce are thanked for their
   contributions to the PACI design included in this memo.  Rickard
   Sjoberg, Arild Fuldseth, Bo Burman, Magnus Westerlund, and Tom
   Kristensen are thanked for their contributions to signaling related
   to parallel processing.  Magnus Westerlund, Jonathan Lennox, Bernard
   Aboba, Jonatan Samuelsson, Roni Even, Rickard Sjoberg, Sachin
   Deshpande, Woo Johnman, Mo Zanaty, Ross Finlayson, Danny Hong, Bo
   Burman, Ben Campbell, Brian Carpenter, Qin Wu, Stephen Farrell, and
   Min Wang made valuable review comments that led to improvements.

Authors' Addresses

   Ye-Kui Wang
   Qualcomm Incorporated
   5775 Morehouse Drive
   San Diego, CA 92121
   United States
   Phone: +1-858-651-8345
   Email: yekui.wang@gmail.com

   Yago Sanchez
   Fraunhofer HHI
   Einsteinufer 37
   D-10587 Berlin
   Germany
   Phone: +49 30 31002-663
   Email: yago.sanchez@hhi.fraunhofer.de

   Thomas Schierl
   Fraunhofer HHI
   Einsteinufer 37
   D-10587 Berlin
   Germany
   Phone: +49-30-31002-227
   Email: thomas.schierl@hhi.fraunhofer.de

   Stephan Wenger
   Vidyo, Inc.
   433 Hackensack Ave., 7th floor
   Hackensack, NJ 07601
   United States
   Phone: +1-415-713-5473
   Email: stewe@stewe.org

   Miska M. Hannuksela
   Nokia Corporation
   P.O. Box 1000
   33721 Tampere
   Finland
   Phone: +358-7180-08000
   Email: miska.hannuksela@nokia.com