Diffstat (limited to 'doc/rfc/rfc8082.txt')
-rw-r--r--  doc/rfc/rfc8082.txt  619
1 files changed, 619 insertions, 0 deletions
diff --git a/doc/rfc/rfc8082.txt b/doc/rfc/rfc8082.txt
new file mode 100644
index 0000000..86587ef
--- /dev/null
+++ b/doc/rfc/rfc8082.txt
@@ -0,0 +1,619 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF)                         S. Wenger
+Request for Comments: 8082                                     J. Lennox
+Updates: 5104                                                Vidyo, Inc.
+Category: Standards Track                                      B. Burman
+ISSN: 2070-1721                                            M. Westerlund
+                                                                Ericsson
+                                                              March 2017
+
+
+ Using Codec Control Messages in the RTP Audio-Visual Profile with
+ Feedback with Layered Codecs
+
+Abstract
+
+ This document updates RFC 5104 by fixing a shortcoming in the
+ specification language of the Codec Control Message Full Intra
+ Request (FIR) description when using it with layered codecs. In
+ particular, a decoder refresh point needs to be sent by a media
+ sender when a FIR is received on any layer of the layered bitstream,
+ regardless of whether those layers are being sent in a single or in
+ multiple RTP flows. The other payload-specific feedback messages
+ defined in RFC 5104 and RFC 4585 (which was updated by RFC 5506) have
+ also been analyzed, and no corresponding shortcomings have been
+ found.
+
+Status of This Memo
+
+ This is an Internet Standards Track document.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Further information on
+ Internet Standards is available in Section 2 of RFC 7841.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ http://www.rfc-editor.org/info/rfc8082.
+
+
+
+
+
+
+
+
+
+
+
+
+
+Wenger, et al.               Standards Track                    [Page 1]
+
+RFC 8082                 CCM for Layered Codecs               March 2017
+
+
+Copyright Notice
+
+ Copyright (c) 2017 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (http://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Simplified BSD License text as described in Section 4.e of
+ the Trust Legal Provisions and are provided without warranty as
+ described in the Simplified BSD License.
+
+Table of Contents
+
+ 1. Introduction and Problem Statement
+ 2. Requirements Language
+ 3. Updated Definition of Decoder Refresh Point
+ 4. Full Intra Request for Layered Codecs
+ 5. Identifying the Use of Layered Bitstreams (Informative)
+ 6. Layered Codecs and Non-FIR Codec Control Messages (Informative)
+   6.1. Picture Loss Indication (PLI)
+   6.2. Slice Loss Indication (SLI)
+   6.3. Reference Picture Selection Indication (RPSI)
+   6.4. Temporal-Spatial Trade-Off Request and Notification
+        (TSTR/TSTN)
+   6.5. H.271 Video Back Channel Message (VBCM)
+ 7. IANA Considerations
+ 8. Security Considerations
+ 9. References
+   9.1. Normative References
+   9.2. Informative References
+ Acknowledgements
+ Authors' Addresses
+
+1. Introduction and Problem Statement
+
+ The "Extended RTP Profile for Real-time Transport Control Protocol
+ (RTCP)-Based Feedback (RTP/AVPF)" [RFC4585] and "Codec Control
+ Messages in the RTP Audio-Visual Profile with Feedback (AVPF)"
+ [RFC5104] specify a number of payload-specific feedback messages that
+ a media receiver can use to inform a media sender of certain
+ conditions or to make certain requests. The feedback messages are
+ being sent as RTCP receiver reports, and RFC 4585 specifies timing
+ rules that make the use of those messages practical for time-
+ sensitive codec control.
+
+ Since the time those RFCs were developed, layered codecs have gained
+ in popularity and deployment. Layered codecs use multiple sub-
+ bitstreams called "layers" to represent the content in different
+ fidelities. Depending on the media codec and its RTP payload format
+ in use, a number of options exist on how to transport those layers in
+ RTP. Summarizing "A Taxonomy of Semantics and Mechanisms for Real-
+ Time Transport Protocol (RTP) Sources" [RFC7656]:
+
+ o single layers or groups of layers may be sent in their own RTP
+   streams in Multiple RTP streams on a Single media Transport (MRST)
+   or Multiple RTP streams on Multiple media Transports (MRMT) mode;
+
+ o using media-codec specific multiplexing mechanisms, multiple
+   layers may be sent in a single RTP stream in Single RTP stream on
+   a Single media Transport (SRST) mode.
+
+ The dependency relationship between layers in a truly layered,
+ pyramid-shaped bitstream forms a directed graph, with the base layer
+ at the root. Enhancement layers depend on the base layer and
+ potentially on other enhancement layers, and the target layer and all
+ layers it depends on have to be decoded jointly in order to recreate
+ the uncompressed media signal at the fidelity of the target layer.
+ Such a layering structure is assumed henceforth; for more exotic
+ layering structures, please see Section 5.
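
As an illustration of the joint-decoding rule above, the following
Python sketch walks a layer dependency graph and returns the set of
layers a decoder would need for a given target layer. It is only a
sketch under stated assumptions: the function name layers_to_decode,
the layer names, and the DEPENDS_ON table are hypothetical and do not
come from this document or from any codec specification.

```python
from typing import Dict, List, Set


def layers_to_decode(target: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return the target layer plus every layer it transitively depends on."""
    needed: Set[str] = set()
    stack = [target]
    while stack:
        layer = stack.pop()
        if layer in needed:
            continue  # already visited via another dependency path
        needed.add(layer)
        stack.extend(depends_on.get(layer, []))
    return needed


# Hypothetical three-layer pyramid: E2 depends on E1, E1 depends on BASE.
DEPENDS_ON = {"BASE": [], "E1": ["BASE"], "E2": ["E1"]}
```

With this table, asking for E2 yields all three layers, while asking
for BASE yields only the base layer.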
+
+ Implementation experience has shown that the Full Intra Request (FIR)
+ command as defined in [RFC5104] is underspecified when used with
+ layered codecs and when more than one RTP stream is used to transport
+ the layers of a layered bitstream at a given fidelity. In
+ particular, from the [RFC5104] specification language, it is not
+ clear whether a FIR received for only a single RTP stream of multiple
+ RTP streams covering the same layered bitstream necessarily triggers
+ the sending of a decoder refresh point (as defined in [RFC5104],
+ Section 2.2) for all layers, or only for the layer that is
+ transported in the RTP stream that the FIR request is associated
+ with.
+
+ This document fixes this shortcoming by:
+
+ a. Updating the definition of the decoder refresh point (as defined
+ in [RFC5104], Section 2.2) to cover layered codecs, in line with
+ the corresponding definitions used in a popular layered codec
+ format, namely H.264/SVC (Scalable Video Coding) [H.264].
+ Specifically, a decoder refresh point, in conjunction with
+ layered codecs, resets the state of the whole decoder, which
+ implies that it includes hard or gradual single-layer decoder
+ refresh for all layers;
+
+ b. Requiring a media sender to send a decoder refresh point after
+ the media sender has received a FIR over an RTCP stream
+ associated with any of the RTP streams over which a part of the
+ layered bitstream is transported;
+
+ c. Requiring that a media receiver send the FIR on the RTCP stream
+ associated with the base layer. The option of receiving FIR on
+ the enhancement-layer-associated RTCP stream as specified in
+ point b) above is kept for backward compatibility; and
+
+ d. Providing guidance on how to detect that a layered bitstream is
+ in use for which the above rules apply.
+
+ While, clearly, the reaction to FIR for layered codecs in [RFC5104]
+ and the companion documents is underspecified, it appears that this
+ is not the case for any of the other payload-specific codec control
+ messages defined in [RFC4585] and [RFC5104]. A brief summary of the
+ analysis that led to this conclusion is also included in this
+ document.
+
+2. Requirements Language
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in RFC 2119 [RFC2119].
+
+3. Updated Definition of Decoder Refresh Point
+
+ The remainder of this section replaces the definition of decoder
+ refresh point in Section 2.2 of [RFC5104] in its entirety.
+
+ Decoder Refresh Point: A bit string, packetized in one or more RTP
+ packets, that completely resets the decoder to a known state.
+
+ Examples of "hard" single-layer decoder refresh points are Intra
+ pictures in H.261 [H.261], H.263 [H.263], MPEG-1 [MPEG-1], MPEG-2
+ [MPEG-2], and MPEG-4 [MPEG-4]; Instantaneous Decoder Refresh (IDR)
+ pictures in H.264 [H.264] and H.265 [H.265]; and keyframes in VP8
+ [RFC6386] and VP9 [VP9-BITSTREAM]. "Gradual" decoder refresh points
+ may also be used; see, for example, H.264 [H.264]. While both "hard"
+ and "gradual" decoder refresh points are acceptable in the scope of
+ this specification, in most cases the user experience will benefit
+ from using a "hard" decoder refresh point.
+
+ A decoder refresh point also contains all header information above
+ the syntactical level of the picture layer that is conveyed in-band.
+ In [H.264], for example, a decoder refresh point contains those
+ parameter set Network Abstraction Layer (NAL) units that generate
+ parameter sets necessary for the decoding of the following slice/data
+ partition NAL units. (That is, assuming the parameter sets have not
+ been conveyed out of band.)
+
+ When a layered codec is in use, the above definition -- in
+ particular, the requirement to completely reset the decoder to a
+ known state -- implies that the decoder refresh point includes hard
+ or gradual single-layer decoder refresh points for all layers.
+
+4. Full Intra Request for Layered Codecs
+
+ A media receiver or middlebox may decide to send a FIR command based
+ on the guidance provided in Section 4.3.1 of [RFC5104]. When sending
+ the FIR command, it MUST target the RTP stream that carries the base
+ layer of the layered bitstream, and this is done by setting the
+ Feedback Control Information (FCI) (and, in particular, the
+ synchronization source (SSRC) field therein) to refer to the SSRC of
+ the forward RTP stream that carries the base layer.
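
The receiver-side rule above can be sketched as follows. The sketch
assumes the FIR wire layout of [RFC5104], Section 4.3.1 (payload-
specific feedback packet type 206, FMT 4, one FCI entry holding the
target SSRC, an 8-bit sequence number, and 24 reserved bits) and the
common feedback header of [RFC4585], in which the "SSRC of media
source" field is set to 0 for FIR. The function name build_fir and its
parameters are hypothetical; a real implementation would also need
compound-packet framing and the RTCP timing rules, which are not
modeled here.

```python
import struct

RTCP_VERSION = 2
PT_PSFB = 206  # payload-specific feedback packet type (RFC 4585)
FMT_FIR = 4    # Full Intra Request feedback message type (RFC 5104)


def build_fir(sender_ssrc: int, base_layer_ssrc: int, seq_nr: int) -> bytes:
    """Build one FIR feedback packet whose FCI targets the base-layer SSRC."""
    first_byte = (RTCP_VERSION << 6) | FMT_FIR  # padding bit left at 0
    length_words = 4  # 20 bytes total = 5 32-bit words, minus one
    # Common header: V/P/FMT, PT, length, packet sender SSRC, media source SSRC (0).
    header = struct.pack("!BBHII", first_byte, PT_PSFB, length_words, sender_ssrc, 0)
    # FCI entry: target SSRC, 8-bit command sequence number, 24 reserved zero bits.
    fci = struct.pack("!IB3x", base_layer_ssrc, seq_nr & 0xFF)
    return header + fci
```

The point of the sketch is the base_layer_ssrc argument: the SSRC
placed in the FCI is that of the RTP stream carrying the base layer,
even when the receiver noticed the problem on an enhancement layer.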
+
+ When a Full Intra Request command is received by the designated media
+ sender in the RTCP stream associated with any of the RTP streams in
+ which any layer of a layered bitstream is sent, the designated media
+ sender MUST send a decoder refresh point (Section 3) as defined above
+ at its earliest opportunity. The requirements related to congestion
+ control on the forward RTP streams as specified in Sections 3.5.1 and
+ 5 of [RFC5104] apply for the RTP streams both in isolation and
+ combined.
+
+ Note: the requirement to react to FIR commands associated with
+ enhancement layers is included for robustness and backward-
+ compatibility reasons.
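
A minimal sketch of the sender-side behavior described in this section
follows. It assumes a hypothetical LayeredSender class that knows
which layers travel on which SSRC; the class, its methods, and the
recorded "refreshes" list stand in for a real encoder interface and
are not defined by this document. Congestion control and the handling
of repeated FIR sequence numbers are deliberately left out.

```python
from typing import Dict, List, Set


class LayeredSender:
    """Tracks which layers of one layered bitstream are sent on which SSRC."""

    def __init__(self, ssrc_to_layers: Dict[int, List[str]]) -> None:
        self.ssrc_to_layers = ssrc_to_layers
        # Each entry records one decoder refresh point request and its layers.
        self.refreshes: List[Set[str]] = []

    def all_layers(self) -> Set[str]:
        return {layer for layers in self.ssrc_to_layers.values() for layer in layers}

    def on_fir(self, target_ssrc: int) -> bool:
        """React to a FIR whose FCI names target_ssrc.

        A FIR for any stream of the layered bitstream, base or enhancement,
        results in a decoder refresh point that covers all layers.
        """
        if target_ssrc not in self.ssrc_to_layers:
            return False  # FIR does not concern this layered bitstream
        self.refreshes.append(self.all_layers())
        return True
```

For example, a sender carrying BASE on one SSRC and E1 and E2 on
another would schedule a refresh of all three layers whichever of the
two SSRCs the FIR names.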
+
+5. Identifying the Use of Layered Bitstreams (Informative)
+
+ The above modifications to RFC 5104 unambiguously define how to deal
+ with FIR commands when layered bitstreams are in use. However, it is
+ surprisingly difficult to identify the use of a layered bitstream.
+ In general, it is expected that implementers know when layered
+ bitstreams (in their commonly understood sense: with inter-layer
+ prediction between pyramid-arranged layers) are in use and when not
+ and can therefore implement the above updates to RFC 5104 correctly.
+ However, there are scenarios in which layered codecs are employed
+ creating non-pyramid-shaped bitstreams. Those scenarios may be
+ viewed as somewhat exotic today but clearly are supported by certain
+ video coding syntaxes, such as H.264/SVC. When blindly applying the
+ above rules to those non-pyramid-arranged layering structures,
+ suboptimal system behavior would result. Nothing would break, and
+ there would not be an interoperability failure, but the user
+ experience may suffer through the sending or receiving of decoder
+ refresh points at times or on parts of the bitstream that are
+ unnecessary from a user experience viewpoint. Therefore, this
+ informative section provides the current understanding of when a
+ layered bitstream is in use and when it is not.
+
+ The key observation made here is that the RTP payload format
+ negotiated for the RTP streams, in isolation, is not necessarily an
+ indicator for the use of a layered bitstream. Some layered codecs
+ (including H.264/SVC) can form decodable bitstreams including only
+ (one or more) enhancement layers, without the base layer, effectively
+ creating simulcastable sub-bitstreams within a single scalable
+ bitstream (as defined in the video coding standard), but without
+ inter-layer prediction. In such a scenario, it is potentially,
+ though not necessarily, counterproductive to send a decoder refresh
+ point on all layers for that payload format and media source. It is
+ beyond the scope of this document to discuss optimized reactions to
+ FIRs received on RTP streams carrying such exotic bitstreams.
+
+ One good indication of the likely use of pyramid-shaped layering with
+ inter-layer prediction is when the various RTP streams are "bound"
+ together on the signaling level. In an SDP environment, this would
+ be the case if they are marked as being dependent on each other using
+ "The Session Description Protocol (SDP) Grouping Framework" [RFC5888]
+ and layer dependency [RFC5583].
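
The signaling-level indication described above could be checked along
the following lines. The sketch assumes the SDP syntax of [RFC5583]: a
session-level "a=group:DDP" line and media-level "a=depend:" lines
whose dependency type is "lay". The function name looks_layered and
the sample SDP are hypothetical, and the check is a heuristic only: as
noted in this section, signaling is a good indication but not proof
that pyramid-shaped layering is in use.

```python
def looks_layered(sdp: str) -> bool:
    """Heuristic: True if the SDP groups streams with DDP and declares
    at least one layered ("lay") decoding dependency."""
    has_ddp_group = False
    has_layer_dependency = False
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("a=group:DDP "):
            has_ddp_group = True
        elif line.startswith("a=depend:"):
            # Entries look like "<pt> <type> <mid>:<pt-list>", separated by ";".
            for entry in line[len("a=depend:"):].split(";"):
                fields = entry.split()
                if len(fields) >= 2 and fields[1] == "lay":
                    has_layer_dependency = True
    return has_ddp_group and has_layer_dependency


# Illustrative SDP fragment (made up for this sketch, in the style of RFC 5583).
EXAMPLE_SDP = "\n".join([
    "v=0",
    "a=group:DDP L1 L2",
    "m=video 20000 RTP/AVPF 96",
    "a=rtpmap:96 H264/90000",
    "a=mid:L1",
    "m=video 20002 RTP/AVPF 97",
    "a=rtpmap:97 H264-SVC/90000",
    "a=depend:97 lay L1:96",
    "a=mid:L2",
])
```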
+
+6. Layered Codecs and Non-FIR Codec Control Messages (Informative)
+
+ Between them, AVPF [RFC4585] and Codec Control Messages [RFC5104]
+ define a total of seven payload-specific feedback messages. For the
+ FIR command message, guidance has been provided above. In this
+ section, some information is provided with respect to the remaining
+ six codec control messages.
+
+6.1. Picture Loss Indication (PLI)
+
+ PLI is defined in Section 6.3.1 of [RFC4585]. The prudent response
+ to a PLI message received for an enhancement layer is to "repair"
+ that enhancement layer and all dependent enhancement layers through
+ appropriate source-coding-specific means. However, the reference
+ layer or layers used by the enhancement layer for which the PLI was
+ received do not require repair. The encoder can figure out by itself
+ what constitutes a dependent enhancement layer and does not need help
+ from the system stack in doing so. Thus, there is nothing that needs
+ to be specified herein.
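
To make the difference from FIR concrete, the following sketch
computes the set of layers an encoder might repair after a PLI: the
layer the PLI was received for plus every layer that depends on it,
but none of its reference layers. The function name layers_to_repair
and the example dependency table are hypothetical; how the repair is
actually performed is a source-coding matter outside this sketch.

```python
from typing import Dict, List, Set


def layers_to_repair(target: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return the target layer plus all layers that transitively depend on it."""
    repair: Set[str] = {target}
    changed = True
    while changed:
        changed = False
        for layer, references in depends_on.items():
            # A layer needs repair if any layer it references needs repair.
            if layer not in repair and any(ref in repair for ref in references):
                repair.add(layer)
                changed = True
    return repair


# Hypothetical pyramid: E2 depends on E1, E1 depends on BASE.
PYRAMID = {"BASE": [], "E1": ["BASE"], "E2": ["E1"]}
```

With this table, a PLI for E1 leads to repair of E1 and E2 while BASE
is left alone, whereas a FIR on any layer refreshes all three.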
+
+6.2. Slice Loss Indication (SLI)
+
+ SLI is defined in Section 6.3.2 of [RFC4585]. The current
+ understanding is that the prudent response to an SLI message received
+ for an enhancement layer is to "repair" the affected spatial area of
+ that enhancement layer and all dependent enhancement layers through
+ appropriate source-coding-specific means. As in PLI, the reference
+ layers used by the enhancement layer for which the SLI was received
+ do not need to be repaired. Again, as in PLI, the encoder can
+ determine by itself what constitutes a dependent enhancement layer
+ and does not need help from the system stack in doing so. Thus,
+ there is nothing that needs to be specified herein. SLI has seen
+ very little implementation and, as far as it is known, none in
+ conjunction with layered systems.
+
+6.3. Reference Picture Selection Indication (RPSI)
+
+ RPSI is defined in Section 6.3.3 of [RFC4585]. While a technical
+ equivalent of RPSI has been in use with non-layered systems for many
+ years, no implementations are known in conjunction with layered codecs.
+ The current understanding is that the reception of an RPSI message on
+ any layer indicating a missing reference picture forces the encoder
+ to appropriately handle that missing reference picture in the layer
+ indicated, and in all dependent layers. Thus, RPSI should work
+ without further need for specification language.
+
+6.4. Temporal-Spatial Trade-Off Request and Notification (TSTR/TSTN)
+
+ TSTR/TSTN are defined in Sections 4.3.2 and 4.3.3 of [RFC5104],
+ respectively. The TSTR request communicates guidance on the
+ preferred trade-off between spatial quality and frame rate. A
+ technical equivalent of TSTR/TSTN has seen deployment for many years
+ in non-scalable systems.
+
+ TSTR and TSTN messages include an SSRC target, which, similarly to
+ FIR, may refer to an RTP stream carrying a base layer, an enhancement
+ layer, or multiple layers. Therefore, the current understanding is
+ that the semantics of the message applies to the layers present in
+ the targeted RTP stream.
+
+ It is noted that per-layer TSTR/TSTN is a mechanism that is, in some
+ ways, counterproductive in a system using layered codecs. Given a
+ sufficiently complex layered bitstream layout, a sending system has
+ flexibility in adjusting the spatio/temporal quality balance by
+ adding and removing temporal, spatial, or quality enhancement layers.
+ At present, it is unclear whether an allowed (or even recommended)
+ response to the reception of a TSTR is to adjust the bit allocation
+ within the layer(s) present in the addressed RTP stream or to adjust
+ the layering structure accordingly -- which can involve more than
+ just the addressed RTP stream.
+
+ Until there is a sufficient critical mass of implementation practice,
+ it is probably prudent for an implementer not to assume either of the
+ two options or any middle ground that may exist between the two.
+ Instead, it is suggested that an implementation be liberal in
+ accepting TSTR messages and, upon receipt, respond with a TSTN
+ indicating "no change". Further, it is suggested that new
+ implementations do not send TSTR messages except when operating in
+ SRST mode as defined in [RFC7656]. Finally, implementers are
+ encouraged to contribute to the IETF documentation of any
+ implementation requirements that make per-layer TSTR/TSTN useful.
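
The conservative policy suggested above might look as follows in
code. This is a sketch under assumptions: the Tstr and Tstn classes
are simplified, hypothetical stand-ins for the messages of [RFC5104],
and "no change" is interpreted here as a TSTN that echoes the
request's SSRC and sequence number while reporting the trade-off index
the sender is already using.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tstr:
    ssrc: int    # SSRC the request is addressed to
    seq_nr: int  # request sequence number
    index: int   # requested trade-off index


@dataclass(frozen=True)
class Tstn:
    ssrc: int
    seq_nr: int
    index: int   # trade-off index the sender is using


def answer_tstr(request: Tstr, current_index: int) -> Tstn:
    """Accept any TSTR, but report the unchanged current index."""
    return Tstn(ssrc=request.ssrc, seq_nr=request.seq_nr, index=current_index)


def may_send_tstr(mode: str) -> bool:
    """New implementations send TSTR only in SRST mode (RFC 7656 terminology)."""
    return mode == "SRST"
```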
+
+6.5. H.271 Video Back Channel Message (VBCM)
+
+ VBCM is defined in Section 4.3.4 of [RFC5104]. What was said above
+ for RPSI (Section 6.3) applies here as well.
+
+7. IANA Considerations
+
+ This memo includes no request to IANA.
+
+8. Security Considerations
+
+ The security considerations of AVPF [RFC4585] (as updated by "Support
+ for Reduced-Size Real-Time Transport Control Protocol (RTCP):
+ Opportunities and Consequences" [RFC5506]) and Codec Control Messages
+ [RFC5104] apply. The clarified response to FIR does not introduce
+ additional security considerations.
+
+9. References
+
+9.1. Normative References
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119,
+ DOI 10.17487/RFC2119, March 1997,
+ <http://www.rfc-editor.org/info/rfc2119>.
+
+ [RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
+ "Extended RTP Profile for Real-time Transport Control
+ Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
+ DOI 10.17487/RFC4585, July 2006,
+ <http://www.rfc-editor.org/info/rfc4585>.
+
+ [RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman,
+ "Codec Control Messages in the RTP Audio-Visual Profile
+ with Feedback (AVPF)", RFC 5104, DOI 10.17487/RFC5104,
+ February 2008, <http://www.rfc-editor.org/info/rfc5104>.
+
+ [RFC5506] Johansson, I. and M. Westerlund, "Support for Reduced-Size
+ Real-Time Transport Control Protocol (RTCP): Opportunities
+ and Consequences", RFC 5506, DOI 10.17487/RFC5506, April
+ 2009, <http://www.rfc-editor.org/info/rfc5506>.
+
+9.2. Informative References
+
+ [H.261] ITU-T, "Video codec for audiovisual services at p x 64
+ kbit/s", ITU-T Recommendation H.261, March 1993,
+ <http://handle.itu.int/11.1002/1000/1088>.
+
+ [H.263] ITU-T, "Video coding for low bit rate communication",
+ ITU-T Recommendation H.263, January 2005,
+ <http://handle.itu.int/11.1002/1000/7497>.
+
+ [H.264] ITU-T, "Advanced video coding for generic audiovisual
+ services", ITU-T Recommendation H.264, Version 11, October
+ 2016, <http://handle.itu.int/11.1002/1000/12904>.
+
+ [H.265] ITU-T, "High efficiency video coding", ITU-T
+ Recommendation H.265, Version 4, December 2016,
+ <http://handle.itu.int/11.1002/1000/12905>.
+
+ [MPEG-1] ISO/IEC, "Information technology -- Coding of moving
+ pictures and associated audio for digital storage media at
+ up to about 1,5 Mbit/s -- Part 2: Video", ISO/
+ IEC 11172-2:1993, August 1993.
+
+ [MPEG-2] ISO/IEC, "Information technology -- Generic coding of
+ moving pictures and associated audio information -- Part
+ 2: Video", ISO/IEC 13818-2:2013, October 2013.
+
+ [MPEG-4] ISO/IEC, "Information technology -- Coding of audio-visual
+ objects -- Part 2: Visual", ISO/IEC 14496-2:2004, June
+ 2004.
+
+ [RFC5583] Schierl, T. and S. Wenger, "Signaling Media Decoding
+ Dependency in the Session Description Protocol (SDP)",
+ RFC 5583, DOI 10.17487/RFC5583, July 2009,
+ <http://www.rfc-editor.org/info/rfc5583>.
+
+ [RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description
+ Protocol (SDP) Grouping Framework", RFC 5888,
+ DOI 10.17487/RFC5888, June 2010,
+ <http://www.rfc-editor.org/info/rfc5888>.
+
+ [RFC6386] Bankoski, J., Koleszar, J., Quillio, L., Salonen, J.,
+ Wilkins, P., and Y. Xu, "VP8 Data Format and Decoding
+ Guide", RFC 6386, DOI 10.17487/RFC6386, November 2011,
+ <http://www.rfc-editor.org/info/rfc6386>.
+
+ [RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
+ B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms
+ for Real-Time Transport Protocol (RTP) Sources", RFC 7656,
+ DOI 10.17487/RFC7656, November 2015,
+ <http://www.rfc-editor.org/info/rfc7656>.
+
+ [VP9-BITSTREAM]
+ Grange, A., de Rivaz, P., and J. Hunt, "VP9 Bitstream &
+ Decoding Process Specification", Version 0.6, March 2016,
+ <https://storage.googleapis.com/downloads.webmproject.org/
+ docs/vp9/vp9-bitstream-specification-
+ v0.6-20160331-draft.pdf>.
+
+Acknowledgements
+
+ The authors want to thank Mo Zanaty for useful discussions.
+
+Authors' Addresses
+
+ Stephan Wenger
+ Vidyo, Inc.
+
+ Email: stewe@stewe.org
+
+
+ Jonathan Lennox
+ Vidyo, Inc.
+
+ Email: jonathan@vidyo.com
+
+
+ Bo Burman
+ Ericsson
+ Kistavagen 25
+ SE - 164 80 Kista
+ Sweden
+
+ Email: bo.burman@ericsson.com
+
+
+ Magnus Westerlund
+ Ericsson
+ Farogatan 2
+ SE - 164 80 Kista
+ Sweden
+
+ Phone: +46107148287
+ Email: magnus.westerlund@ericsson.com