author     Thomas Voss <mail@thomasvoss.com>     2024-11-27 20:54:24 +0100
committer  Thomas Voss <mail@thomasvoss.com>     2024-11-27 20:54:24 +0100
commit     4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree       e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc9317.txt
parent     ea76e11061bda059ae9f9ad130a9895cc85607db (diff)
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc9317.txt')
-rw-r--r--  doc/rfc/rfc9317.txt  2127
1 file changed, 2127 insertions, 0 deletions
diff --git a/doc/rfc/rfc9317.txt b/doc/rfc/rfc9317.txt
new file mode 100644
index 0000000..dce5151
--- /dev/null
+++ b/doc/rfc/rfc9317.txt
@@ -0,0 +1,2127 @@
+
+
+
+
+Internet Engineering Task Force (IETF) J. Holland
+Request for Comments: 9317 Akamai Technologies, Inc.
+Category: Informational A. Begen
+ISSN: 2070-1721 Networked Media
+ S. Dawkins
+ Tencent America LLC
+ October 2022
+
+
+ Operational Considerations for Streaming Media
+
+Abstract
+
+ This document provides an overview of operational networking and
+ transport protocol issues that pertain to the quality of experience
+ (QoE) when streaming video and other high-bitrate media over the
+ Internet.
+
+ This document explains the characteristics of streaming media
+ delivery that have surprised network designers or transport experts
+ who lack specific media expertise, since streaming media highlights
+ key differences between common assumptions in existing networking
+ practices and observations of media delivery issues encountered when
+ streaming media over those existing networks.
+
+Status of This Memo
+
+ This document is not an Internet Standards Track specification; it is
+ published for informational purposes.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Not all documents
+ approved by the IESG are candidates for any level of Internet
+ Standard; see Section 2 of RFC 7841.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ https://www.rfc-editor.org/info/rfc9317.
+
+Copyright Notice
+
+ Copyright (c) 2022 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (https://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Revised BSD License text as described in Section 4.e of the
+ Trust Legal Provisions and are provided without warranty as described
+ in the Revised BSD License.
+
+Table of Contents
+
+ 1. Introduction
+ 1.1. Key Definitions
+ 1.2. Document Scope
+ 2. Our Focus on Streaming Video
+ 3. Bandwidth Provisioning
+ 3.1. Scaling Requirements for Media Delivery
+ 3.1.1. Video Bitrates
+ 3.1.2. Virtual Reality Bitrates
+ 3.2. Path Bottlenecks and Constraints
+ 3.2.1. Recognizing Changes from a Baseline
+ 3.3. Path Requirements
+ 3.4. Caching Systems
+ 3.5. Predictable Usage Profiles
+ 3.6. Unpredictable Usage Profiles
+ 3.6.1. Peer-to-Peer Applications
+ 3.6.2. Impact of Global Pandemic
+ 4. Latency Considerations
+ 4.1. Ultra-Low-Latency
+ 4.1.1. Near-Real-Time Latency
+ 4.2. Low-Latency Live
+ 4.3. Non-Low-Latency Live
+ 4.4. On-Demand
+ 5. Adaptive Encoding, Adaptive Delivery, and Measurement
+ Collection
+ 5.1. Overview
+ 5.2. Adaptive Encoding
+ 5.3. Adaptive Segmented Delivery
+ 5.4. Advertising
+ 5.5. Bitrate Detection Challenges
+ 5.5.1. Idle Time between Segments
+ 5.5.2. Noisy Measurements
+ 5.5.3. Wide and Rapid Variation in Path Capacity
+ 5.6. Measurement Collection
+ 6. Transport Protocol Behaviors and Their Implications for Media
+ Transport Protocols
+ 6.1. Media Transport over Reliable Transport Protocols
+ 6.2. Media Transport over Unreliable Transport Protocols
+ 6.3. QUIC and Changing Transport Protocol Behavior
+ 7. Streaming Encrypted Media
+ 7.1. General Considerations for Streaming Media Encryption
+ 7.2. Considerations for Hop-by-Hop Media Encryption
+ 7.3. Considerations for End-to-End Media Encryption
+ 8. Additional Resources for Streaming Media
+ 9. IANA Considerations
+ 10. Security Considerations
+ 11. Informative References
+ Acknowledgments
+ Authors' Addresses
+
+1. Introduction
+
+ This document provides an overview of operational networking and
+ transport protocol issues that pertain to the quality of experience
+ (QoE) when streaming video and other high-bitrate media over the
+ Internet.
+
+ This document is intended to explain the characteristics of streaming
+ media delivery that have surprised network designers or transport
+ experts who lack specific media expertise, since streaming media
+ highlights key differences between common assumptions in existing
+ networking practices and observations of media delivery issues
+ encountered when streaming media over those existing networks.
+
+1.1. Key Definitions
+
+ This document defines "high-bitrate streaming media over the
+ Internet" as follows:
+
+ * "High-bitrate" is a context-sensitive term broadly intended to
+ capture rates that can be sustained over some but not all of the
+ target audience's network connections. A snapshot of values
+ commonly qualifying as high-bitrate on today's Internet is given
+ by the higher-value entries in Section 3.1.1.
+
+ * "Streaming" means the continuous transmission of media segments
+ from a server to a client and its simultaneous consumption by the
+ client.
+
+ - The term "simultaneous" is critical, as media segment
+ transmission is not considered "streaming" if one downloads a
+ media file and plays it after the download is completed.
+ Instead, this would be called "download and play".
+
+ - This has two implications. First, the sending rate for media
+ segments must match the client's consumption rate (whether
+ loosely or tightly) to provide uninterrupted playback. That
+ is, the client must not run out of media segments (buffer
+ underrun) and must not accept more media segments than it can
+ buffer before playback (buffer overrun).
+
+ - Second, the client's media segment consumption rate is limited
+ not only by the path's available bandwidth but also by media
+ segment availability. The client cannot fetch media segments
+ that a media server cannot provide (yet).
+
+ * "Media" refers to any type of media and associated streams, such
+ as video, audio, metadata, etc.
+
+ * "Over the Internet" means that a single operator does not have
+ control of the entire path between media servers and media
+ clients, so it is not a "walled garden".
+
+ This document uses these terms to describe the streaming media
+ ecosystem:
+
+ Streaming Media Operator: an entity that provides streaming media
+ servers
+
+ Media Server: a server that provides streaming media to a media
+ player, which is also referred to as a streaming media server, or
+ simply a server
+
+ Intermediary: an entity that is on-path, between the streaming media
+ operator and the ultimate media consumer, and that is media aware
+
+ When the streaming media is encrypted, an intermediary must have
+ credentials that allow the intermediary to decrypt the media in
+ order to be media aware.
+
+ An intermediary can be one of many specialized subtypes that meet
+ this definition.
+
+ Media Player: an endpoint that requests streaming media from a media
+ server for an ultimate media consumer, which is also referred to
+ as a streaming media client, or simply a client
+
+ Ultimate Media Consumer: a human or machine using a media player
+
+1.2. Document Scope
+
+ A full review of all streaming media considerations for all types of
+ media over all types of network paths is too broad a topic to cover
+ comprehensively in a single document.
+
+ This document focuses chiefly on the large-scale delivery of
+ streaming high-bitrate media to end users. It is primarily intended
+ for those controlling endpoints involved in delivering streaming
+ media traffic. This can include origin servers publishing content,
+ intermediaries like content delivery networks (CDNs), and providers
+ for client devices and media players.
+
+ Most of the considerations covered in this document apply to both
+ "live media" (created and streamed as an event is in progress) and
+ "media on demand" (previously recorded media that is streamed from
+ storage), except where noted.
+
+ Most of the considerations covered in this document apply to both
+ media that is consumed by a media player, for viewing by a human, and
+ media that is consumed by a machine, such as a media recorder that is
+ executing an adaptive bitrate (ABR) streaming algorithm, except where
+ noted.
+
+ This document contains
+
+ * a short description of streaming video characteristics in
+ Section 2 to set the stage for the rest of the document,
+
+ * general guidance on bandwidth provisioning (Section 3) and latency
+ considerations (Section 4) for streaming media delivery,
+
+ * a description of adaptive encoding and adaptive delivery
+ techniques in common use for streaming video, along with a
+ description of the challenges media senders face in detecting the
+ bitrate available between the media sender and media receiver, and
+ a collection of measurements by a third party for use in analytics
+ (Section 5),
+
+ * a description of existing transport protocols used for media
+ streaming and the issues encountered when using those protocols,
+ along with a description of the QUIC transport protocol [RFC9000]
+ more recently used for streaming media (Section 6),
+
+ * a description of implications when streaming encrypted media
+ (Section 7), and
+
+ * a pointer to additional resources for further reading on this
+ rapidly changing subject (Section 8).
+
+ Topics outside this scope include the following:
+
+ * an in-depth examination of real-time, two-way interactive media,
+ such as videoconferencing; although this document touches lightly
+ on topics related to this space, the intent is to let readers know
+ that for more in-depth coverage they should look to other
+ documents, since the techniques and issues for interactive real-
+ time, two-way media differ so dramatically from those in large-
+ scale, one-way delivery of streaming media.
+
+ * specific recommendations on operational practices to mitigate
+ issues described in this document; although some known mitigations
+ are mentioned in passing, the primary intent is to provide a point
+ of reference for future solution proposals to describe how new
+ technologies address or avoid existing problems.
+
+ * generalized network performance techniques; while considerations,
+ such as data center design, transit network design, and "walled
+ garden" optimizations, can be crucial components of a performant
+ streaming media service, these are considered independent topics
+ that are better addressed by other documents.
+
+ * transparent tunnels; while tunnels can have an impact on streaming
+ media via issues like the round-trip time and the maximum
+ transmission unit (MTU) of packets carried over tunnels, for the
+ purposes of this document, these issues are considered as part of
+ the set of network path properties.
+
+ Questions about whether this document also covers "Web Real-Time
+ Communication (WebRTC)" have come up often. It does not. WebRTC's
+ principal media transport protocol [RFC8834] [RFC8835], the Real-time
+ Transport Protocol (RTP), is mentioned in this document. However, as
+ noted in Section 2, it is difficult to give general guidance for
+ unreliable media transport protocols used to carry interactive real-
+ time media.
+
+2. Our Focus on Streaming Video
+
+ As the Internet has grown, an increasingly large share of the traffic
+ delivered to end users has become video. The most recent available
+ estimates found that 75% of the total traffic to end users was video
+ in 2019 (as described in [RFC8404], such traffic surveys have since
+ become impossible to conduct due to ubiquitous encryption). At that
+ time, the share of video traffic had been growing for years and was
+ projected to continue growing (Appendix D of [CVNI]).
+
+ A substantial part of this growth is due to the increased use of
+ streaming video. However, video traffic in real-time communications
+ (for example, online videoconferencing) has also grown significantly.
+ While both streaming video and videoconferencing have real-time
+ delivery and latency requirements, these requirements vary from one
+ application to another. For additional discussion of latency
+ requirements, see Section 4.
+
+ In many contexts, media traffic can be handled transparently as
+ generic application-level traffic. However, as the volume of media
+ traffic continues to grow, it is becoming increasingly important to
+ consider the effects of network design decisions on application-level
+ performance, with considerations for the impact on media delivery.
+
+ Much of the focus of this document is on media streaming over HTTP.
+ HTTP is widely used for media streaming because
+
+ * support for HTTP is widely available in a wide range of operating
+ systems,
+
+ * HTTP is also used in a wide variety of other applications,
+
+ * HTTP has been demonstrated to provide acceptable performance over
+ the open Internet,
+
+ * HTTP includes state-of-the-art standardized security mechanisms,
+ and
+
+ * HTTP can use already-deployed caching infrastructure, such as
+ CDNs, local proxies, and browser caches.
+
+ Various HTTP versions have been used for media delivery. HTTP/1.0,
+ HTTP/1.1, and HTTP/2 are carried over TCP [RFC9293], and TCP's
+ transport behavior is described in Section 6.1. HTTP/3 is carried
+ over QUIC, and QUIC's transport behavior is described in Section 6.3.
+
+ Unreliable media delivery using RTP and other UDP-based protocols is
+ also discussed in Sections 4.1, 6.2, and 7.2, but it is difficult to
+ give general guidance for these applications. For instance, when
+ packet loss occurs, the most appropriate response may depend on the
+ type of codec being used.
+
+3. Bandwidth Provisioning
+
+3.1. Scaling Requirements for Media Delivery
+
+3.1.1. Video Bitrates
+
+ Video bitrate selection depends on many variables including the
+ resolution (height and width), frame rate, color depth, codec,
+ encoding parameters, scene complexity, and amount of motion.
+ Generally speaking, as the resolution, frame rate, color depth, scene
+ complexity, and amount of motion increase, the encoding bitrate
+ increases. As newer codecs with better compression tools are used,
+ the encoding bitrate decreases. Similarly, a multi-pass encoding
+ generally produces better quality output compared to single-pass
+ encoding at the same bitrate or delivers the same quality at a lower
+ bitrate.
+
+ Here are a few common resolutions used for video content, with
+ typical ranges of bitrates for the two most popular video codecs
+ [Encodings].
+
+ +============+================+============+============+
+ | Name | Width x Height | H.264 | H.265 |
+ +============+================+============+============+
+ | DVD | 720 x 480 | 1.0 Mbps | 0.5 Mbps |
+ +------------+----------------+------------+------------+
+ | 720p (1K) | 1280 x 720 | 3-4.5 Mbps | 2-4 Mbps |
+ +------------+----------------+------------+------------+
+ | 1080p (2K) | 1920 x 1080 | 6-8 Mbps | 4.5-7 Mbps |
+ +------------+----------------+------------+------------+
+ | 2160p (4k) | 3840 x 2160 | N/A | 10-20 Mbps |
+ +------------+----------------+------------+------------+
+
+ Table 1: Typical Resolutions and Bitrate Ranges Used
+ for Video Encoding
+
+ * Note that these codecs do not take the actual "available
+ bandwidth" between media servers and media players into account
+ when encoding because the codec does not have any idea what
+ network paths and network path conditions will carry the encoded
+ video at some point in the future. It is common for codecs to
+ offer a small number of resource variants, differing only in the
+ bandwidth each variant targets.
+
+ * Note that media players attempting to receive encoded video across
+ a network path with insufficient available path bandwidth might
+ request the media server to provide video encoded for lower
+ bitrates, at the cost of lower video quality, as described in
+ Section 5.3.
+
+ * In order to provide multiple encodings for video resources, the
+ codec must produce multiple variants (also called renditions) of
+ the video resource encoded at various bitrates, as described in
+ Section 5.2.
+
+3.1.2. Virtual Reality Bitrates
+
+ The bitrates given in Section 3.1.1 describe video streams that
+ provide the user with a single, fixed point of view -- therefore, the
+ user has no "degrees of freedom", and the user sees all of the video
+ image that is available.
+
+ Even basic virtual reality (360-degree) videos that allow users to
+ look around freely (referred to as "three degrees of freedom" or
+ 3DoF) require substantially larger bitrates when they are captured
+ and encoded, as such videos require multiple fields of view of the
+ scene. Yet, due to smart delivery methods, such as viewport-based or
+ tile-based streaming, there is no need to send the whole scene to the
+ user. Instead, the user needs only the portion corresponding to its
+ viewpoint at any given time [Survey360].
+
+ In more immersive applications, where limited user movement ("three
+ degrees of freedom plus" or 3DoF+) or full user movement ("six
+ degrees of freedom" or 6DoF) is allowed, the required bitrate grows
+ even further. In this case, immersive content is typically referred
+ to as volumetric media. One way to represent the volumetric media is
+ to use point clouds, where streaming a single object may easily
+ require a bitrate of 30 Mbps or higher. Refer to [MPEGI] and [PCC]
+ for more details.
+
+3.2. Path Bottlenecks and Constraints
+
+ Even when the bandwidth requirements for media streams along a path
+ are well understood, additional analysis is required to understand
+ the constraints on bandwidth at various points along the path between
+ media servers and media players. Media streams can encounter
+ bottlenecks at many points along a path, whether the bottleneck
+ happens at a node or at a path segment along the path, and these
+ bottlenecks may involve a lack of processing power, buffering
+ capacity, link speed, or any other exhaustible resource.
+
+ Media servers may react to bandwidth constraints using two
+ independent feedback loops:
+
+ * Media servers often respond to application-level feedback from the
+ media player that indicates a bottleneck somewhere along the path
+ by sending a different media bitrate. This is described in
+ greater detail in Section 5.
+
+ * Media servers also typically rely on transport protocols with
+ capacity-seeking congestion controllers that probe for available
+ path bandwidth and adjust the media sending rate based on
+ transport mechanisms. This is described in greater detail in
+ Section 6.
+
+ The result is that these two (potentially competing) "helpful"
+ mechanisms each respond to the same bottleneck with no coordination
+ between themselves, so that each is unaware of actions taken by the
+ other, and this can result in QoE for users that is significantly
+ lower than what could have been achieved.
+
+ One might wonder why media servers and transport protocols are each
+ unaware of what the other is doing, and there are multiple reasons
+ for that. One reason is that media servers are often implemented as
+ applications executing in user space, relying on a general-purpose
+ operating system that typically has its transport protocols
+ implemented in the operating system kernel, making decisions that the
+ media server never knows about.
+
+ As one example, if a media server overestimates the available
+ bandwidth to the media player,
+
+ * the transport protocol may detect loss due to congestion and
+ reduce its sending window size per round trip,
+
+ * the media server adapts to application-level feedback from the
+ media player and reduces its own sending rate, and/or
+
+ * the transport protocol sends media at the new, lower rate and
+ confirms that this new, lower rate is "safe" because no transport-
+ level loss is occurring.
+
+ However, because the media server continues to send at the new, lower
+ rate, the transport protocol's maximum sending rate is now limited by
+ the amount of information the media server queues for transmission.
+ Therefore, the transport protocol cannot probe for available path
+ bandwidth by sending at a higher rate until the media player requests
+ segments that buffer enough data for the transport to perform the
+ probing.
+
+ To avoid these types of situations, which can potentially affect all
+ the users whose streaming media segments traverse a bottleneck path
+ segment, there are several possible mitigations that streaming
+ operators can use. However, the first step toward mitigating a
+ problem is knowing that a problem is occurring.
+
+3.2.1. Recognizing Changes from a Baseline
+
+ There are many reasons why path characteristics might change in
+ normal operation. For example:
+
+ * If the path topology changes. For example, routing changes, which
+ can happen in normal operation, may result in traffic being
+ carried over a new path topology that is partially or entirely
+ disjointed from the previous path, especially if the new path
+ topology includes one or more path segments that are more heavily
+ loaded, offer lower total bandwidth, change the overall Path MTU
+ size, or simply cover more distance between the path endpoints.
+
+ * If cross traffic that also traverses part or all of the same path
+ topology increases or decreases, especially if this new cross
+ traffic is "inelastic" and does not respond to indications of path
+ congestion.
+
+ * Wireless links (Wi-Fi, 5G, LTE, etc.) may see rapid changes to
+ capacity from changes in radio interference and signal strength as
+ endpoints move.
+
+ To recognize that a path carrying streaming media has experienced a
+ change, maintaining a baseline that captures its prior properties is
+ fundamental. Analytics that aid in that recognition can be more or
+ less sophisticated and can usefully operate on several different time
+ scales, from milliseconds to hours or days.
+
+ Useful properties to monitor for changes can include the following:
+
+ * round-trip times
+
+ * loss rate (and explicit congestion notification (ECN) [RFC3168]
+ when in use)
+
+ * out-of-order packet rate
+
+ * packet and byte receive rate
+
+ * application-level goodput
+
+ * properties of other connections carrying competing traffic, in
+ addition to the connections carrying the streaming media
+
+ * externally provided measurements, for example, from network cards
+ or metrics collected by the operating system
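+
+   A minimal sketch of such analytics, written in Python with
+   illustrative constants, is shown below: an exponentially weighted
+   moving average (EWMA) baseline with a deviation-based alarm,
+   applied here to round-trip time samples.  The smoothing factor,
+   threshold, and warm-up length are examples rather than
+   recommendations.
+
+      # Sketch: EWMA baseline with a deviation-based change alarm.
+      # Constants and thresholds are illustrative only.
+
+      class BaselineMonitor:
+          def __init__(self, alpha=0.1, threshold=3.0, warmup=5):
+              self.alpha = alpha          # EWMA smoothing factor
+              self.threshold = threshold  # alarm at N deviations
+              self.warmup = warmup        # samples before alarming
+              self.count = 0
+              self.mean = 0.0             # smoothed baseline
+              self.dev = 0.0              # smoothed abs. deviation
+
+          def update(self, sample):
+              """Feed one measurement (e.g., an RTT in ms); return
+              True when it departs sharply from the baseline."""
+              self.count += 1
+              if self.count == 1:
+                  self.mean = float(sample)
+                  return False
+              delta = abs(sample - self.mean)
+              alarmed = (self.count > self.warmup and
+                         delta > self.threshold * max(self.dev, 1e-9))
+              self.mean += self.alpha * (sample - self.mean)
+              self.dev += self.alpha * (delta - self.dev)
+              return alarmed
+
+      # Example: a stable path whose RTT suddenly doubles.
+      monitor = BaselineMonitor()
+      for rtt_ms in [20, 21, 19, 20, 22, 20, 21, 45]:
+          if monitor.update(rtt_ms):
+              print("baseline change suspected at", rtt_ms, "ms")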
+
+3.3. Path Requirements
+
+ The bitrate requirements in Section 3.1 are per end user actively
+ consuming a media feed, so in the worst case, the bitrate demands can
+ be multiplied by the number of simultaneous users to find the
+ bandwidth requirements for a delivery path with that number of users
+ downstream. For example, at a node with 10,000 downstream users
+ simultaneously consuming video streams, approximately 80 Gbps might
+ be necessary for all of them to get typical content at 1080p
+ resolution.
+
+ However, when there is some overlap in the feeds being consumed by
+ end users, it is sometimes possible to reduce the bandwidth
+ provisioning requirements for the network by performing some kind of
+ replication within the network. This can be achieved via object
+ caching with the delivery of replicated objects over individual
+ connections and/or by packet-level replication using multicast.
+
+ To the extent that replication of popular content can be performed,
+ bandwidth requirements at peering or ingest points can be reduced to
+ as low as a per-feed requirement instead of a per-user requirement.
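+
+   As a worked example of this arithmetic (using the upper 1080p
+   figure of 8 Mbps from Table 1 and illustrative audience numbers):
+
+      # Worked example of the provisioning arithmetic above.
+      per_user_bps = 8_000_000      # 8 Mbps per active 1080p viewer
+      simultaneous_users = 10_000
+
+      aggregate_bps = per_user_bps * simultaneous_users
+      print(aggregate_bps / 1e9, "Gbps")   # -> 80.0 Gbps
+
+      # If replication (caching or multicast) can serve each distinct
+      # feed once at a peering or ingest point, the requirement there
+      # can approach a per-feed cost instead of a per-user cost.
+      distinct_feeds = 50                  # illustrative
+      upstream_bps = per_user_bps * distinct_feeds
+      print(upstream_bps / 1e9, "Gbps")    # -> 0.4 Gbps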
+
+3.4. Caching Systems
+
+ When demand for content is relatively predictable, and especially
+ when that content is relatively static, caching content close to
+ requesters and preloading caches to respond quickly to initial
+ requests are often useful (for example, HTTP/1.1 caching is described
+ in [RFC9111]). This is subject to the usual considerations for
+ caching -- for example, how much data must be cached to make a
+ significant difference to the requester and how the benefit of
+ caching and preloading cache balances against the costs of tracking
+ stale content in caches and refreshing that content.
+
+ It is worth noting that not all high-demand content is "live"
+ content. One relevant example is when popular streaming content can
+ be staged close to a significant number of requesters, as can happen
+ when a new episode of a popular show is released. This content may
+ be largely stable and is therefore low-cost to maintain in multiple
+ places throughout the Internet. This can reduce demands for high
+ end-to-end bandwidth without having to use mechanisms like multicast.
+
+ Caching and preloading can also reduce exposure to peering point
+ congestion, since less traffic crosses the peering point exchanges if
+ the caches are placed in peer networks. This is especially true when
+ the content can be preloaded during off-peak hours and if the
+ transfer can make use of "A Lower-Effort Per-Hop Behavior (LE PHB)
+ for Differentiated Services" [RFC8622], "Low Extra Delay Background
+ Transport (LEDBAT)" [RFC6817], or similar mechanisms.
+
+ All of this depends, of course, on the ability of a streaming media
+ operator to predict usage and provision bandwidth, caching, and other
+ mechanisms to meet the needs of users. In some cases (Section 3.5),
+ this is relatively routine, but in other cases, it is more difficult
+ (Section 3.6).
+
+ With the emergence of ultra-low-latency streaming, responses have to
+ start streaming to the end user while still being transmitted to the
+ cache and while the cache does not yet know the size of the object.
+ Some of the popular caching systems were designed around a cache
+ footprint and had deeply ingrained assumptions about knowing the size
+ of objects that are being stored, so the change in design
+ requirements in long-established systems caused some errors in
+ production. Incidents occurred where a transmission error in the
+ connection from the upstream source to the cache could result in the
+ cache holding a truncated segment and transmitting it to the end
+ user's device. In this case, players rendering the stream often had
+ a playback freeze until the player was reset. In some cases, the
+ truncated object was even cached that way and served later to other
+ players as well, causing continued stalls at the same spot in the
+ media for all players playing the segment delivered from that cache
+ node.
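+
+   One commonly used guard against the truncated-object failure
+   described above can be sketched as follows (Python; the types and
+   cache structure are simplified placeholders): an object is admitted
+   to the cache only when the number of bytes received from upstream
+   matches the length the upstream origin declared.  Note that this
+   particular check is unavailable when the object size is not yet
+   known, which is exactly the ultra-low-latency case discussed above.
+
+      # Sketch: do not cache a segment whose received byte count
+      # differs from the upstream-declared Content-Length, so a
+      # single upstream error is not replayed to later viewers.
+      from dataclasses import dataclass
+
+      @dataclass
+      class Response:
+          headers: dict
+          body: bytes
+
+      cache = {}
+
+      def admit_to_cache(key, resp):
+          declared = resp.headers.get("Content-Length")
+          if declared is None or len(resp.body) != int(declared):
+              return False     # truncated or unsized: do not cache
+          cache[key] = resp.body
+          return True
+
+      # A transfer cut short upstream is served but not stored.
+      short = Response({"Content-Length": "4096"}, b"\x00" * 1000)
+      print(admit_to_cache("/seg/1001.m4s", short))   # -> False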
+
+3.5. Predictable Usage Profiles
+
+   Historical data shows that users consume more video than in the
+   past and that this video is encoded at higher bitrates.
+ Improvements in the codecs that help reduce the encoding bitrates
+ with better compression algorithms have not offset the increase in
+ the demand for the higher quality video (higher resolution, higher
+ frame rate, better color gamut, better dynamic range, etc.). In
+ particular, mobile data usage in cellular access networks has shown a
+ large jump over the years due to increased consumption of
+ entertainment and conversational video.
+
+3.6. Unpredictable Usage Profiles
+
+ It is also possible for usage profiles to change significantly and
+ suddenly. These changes are more difficult to plan for, but at a
+ minimum, recognizing that sudden changes are happening is critical.
+
+ The two examples that follow are instructive.
+
+3.6.1. Peer-to-Peer Applications
+
+ In the first example, described in "Report from the IETF Workshop on
+ Peer-to-Peer (P2P) Infrastructure, May 28, 2008" [RFC5594], when the
+ BitTorrent file sharing application came into widespread use in 2005,
+ sudden and unexpected growth in peer-to-peer traffic led to
+ complaints from ISP customers about the performance of delay-
+ sensitive traffic (Voice over IP (VoIP) and gaming). These
+ performance issues resulted from at least two causes:
+
+ * Many access networks for end users used underlying technologies
+ that are inherently asymmetric, favoring downstream bandwidth
+ (e.g., ADSL, cellular technologies, and most IEEE 802.11
+ variants), assuming that most users will need more downstream
+ bandwidth than upstream bandwidth. This is a good assumption for
+ client-server applications, such as streaming media or software
+ downloads, but BitTorrent rewarded peers that uploaded as much as
+ they downloaded, so BitTorrent users had much more symmetric usage
+ profiles, which interacted badly with these asymmetric access
+ network technologies.
+
+ * Some P2P systems also used distributed hash tables to organize
+ peers into a ring topology, where each peer knew its "next peer"
+ and "previous peer". There was no connection between the
+ application-level ring topology and the lower-level network
+ topology, so a peer's "next peer" might be anywhere on the
+ reachable Internet. Traffic models that expected most
+ communication to take place with a relatively small number of
+ servers were unable to cope with peer-to-peer traffic that was
+ much less predictable.
+
+ Especially as end users increase the use of video-based social
+ networking applications, it will be helpful for access network
+ providers to watch for increasing numbers of end users uploading
+ significant amounts of content.
+
+3.6.2. Impact of Global Pandemic
+
+ Early in 2020, the COVID-19 pandemic and resulting quarantines and
+ shutdowns led to significant changes in traffic patterns due to a
+ large number of people who suddenly started working and attending
+ school remotely and using more interactive applications (e.g.,
+ videoconferencing and streaming media). Subsequently, the Internet
+ Architecture Board (IAB) held a COVID-19 Network Impacts Workshop
+ [RFC9075] in November 2020. The following observations from the
+ workshop report are worth considering.
+
+ * Participants describing different types of networks reported
+ different kinds of impacts, but all types of networks saw impacts.
+
+ * Mobile networks saw traffic reductions, and residential networks
+ saw significant increases.
+
+ * Reported traffic increases from ISPs and Internet Exchange Points
+ (IXPs) over just a few weeks were as big as the traffic growth
+ over the course of a typical year, representing a 15-20% surge in
+ growth to land at a new normal that was much higher than
+ anticipated.
+
+ * At Deutscher Commercial Internet Exchange (DE-CIX) Frankfurt, the
+      world's largest IXP in terms of data throughput, the year 2020
+      saw the largest increase in peak traffic within a single year
+ since the IXP was founded in 1995.
+
+ * The usage pattern changed significantly as work-from-home and
+ videoconferencing usage peaked during normal work hours, which
+ would have typically been off-peak hours with adults at work and
+ children at school. One might expect that the peak would have had
+ more impact on networks if it had happened during typical evening
+ peak hours for streaming applications.
+
+ * The increase in daytime bandwidth consumption reflected both
+ significant increases in essential applications, such as
+ videoconferencing and virtual private networks (VPNs), and
+ entertainment applications as people watched videos or played
+ games.
+
+ * At the IXP level, it was observed that physical link utilization
+ increased. This phenomenon could probably be explained by a
+ higher level of uncacheable traffic, such as videoconferencing and
+ VPNs, from residential users as they stopped commuting and
+ switched to working at home.
+
+ Again, it will be helpful for streaming operators to monitor traffic
+ as described in Section 5.6, watching for sudden changes in
+ performance.
+
+4. Latency Considerations
+
+ Streaming media latency refers to the "glass-to-glass" time duration,
+ which is the delay between the real-life occurrence of an event and
+ the streamed media being appropriately played on an end user's
+ device. Note that this is different from the network latency
+ (defined as the time for a packet to cross a network from one end to
+ another end) because it includes media encoding/decoding and
+ buffering time and, for most cases, also the ingest to an
+ intermediate service, such as a CDN or other media distribution
+ service, rather than a direct connection to an end user.
+
+ The team working on this document found these rough categories to be
+ useful when considering a streaming media application's latency
+ requirements:
+
+ * ultra-low-latency (less than 1 second)
+
+ * low-latency live (less than 10 seconds)
+
+ * non-low-latency live (10 seconds to a few minutes)
+
+ * on-demand (hours or more)
+
+4.1. Ultra-Low-Latency
+
+ Ultra-low-latency delivery of media is defined here as having a
+ glass-to-glass delay target under 1 second.
+
+ Some media content providers aim to achieve this level of latency for
+ live media events. This introduces new challenges when compared to
+ the other latency categories described in Section 4, because ultra-
+ low-latency is on the same scale as commonly observed end-to-end
+ network latency variation, often due to bufferbloat [CoDel], Wi-Fi
+ error correction, or packet reordering. These effects can make it
+ difficult to achieve ultra-low-latency for many users and may require
+ accepting relatively frequent user-visible media artifacts. However,
+ for controlled environments that provide mitigations against such
+ effects, ultra-low-latency is potentially achievable with the right
+ provisioning and the right media transport technologies.
+
+ Most applications operating over IP networks and requiring latency
+ this low use the Real-time Transport Protocol (RTP) [RFC3550] or
+ WebRTC [RFC8825], which uses RTP as its media transport protocol,
+ along with several other protocols necessary for safe operation in
+ browsers.
+
+ It is worth noting that many applications for ultra-low-latency
+ delivery do not need to scale to as many users as applications for
+ low-latency and non-low-latency live delivery, which simplifies many
+ delivery considerations.
+
+ Recommended reading for applications adopting an RTP-based approach
+ also includes [RFC7656]. For increasing the robustness of the
+ playback by implementing adaptive playout methods, refer to [RFC4733]
+ and [RFC6843].
+
+4.1.1. Near-Real-Time Latency
+
+ Some Internet applications that incorporate media streaming have
+ specific interactivity or control-feedback requirements that drive
+ much lower glass-to-glass media latency targets than 1 second. These
+ include videoconferencing or voice calls; remote video gameplay;
+ remote control of hardware platforms like drones, vehicles, or
+ surgical robots; and many other envisioned or deployed interactive
+ applications.
+
+ Applications with latency targets in these regimes are out of scope
+ for this document.
+
+4.2. Low-Latency Live
+
+ Low-latency live delivery of media is defined here as having a glass-
+ to-glass delay target under 10 seconds.
+
+ This level of latency is targeted to have a user experience similar
+   to broadcast TV delivery.  A frequently cited problem with failing
+   to achieve this level of latency for live sporting events is the
+   resulting failure in user experience: crowds within earshot of one
+   another react audibly to an important play, or users learn of an
+   event in the match via some other channel (for example, social
+   media) before it has appeared on the screen showing the sporting
+   event.
+
+ Applications requiring low-latency live media delivery are generally
+ feasible at scale with some restrictions. This typically requires
+ the use of a premium service dedicated to the delivery of live media,
+ and some trade-offs may be necessary relative to what is feasible in
+ a higher-latency service. The trade-offs may include higher costs,
+ delivering a lower quality media, reduced flexibility for adaptive
+ bitrates, or reduced flexibility for available resolutions so that
+ fewer devices can receive an encoding tuned for their display. Low-
+ latency live delivery is also more susceptible to user-visible
+ disruptions due to transient network conditions than higher-latency
+ services.
+
+ Implementation of a low-latency live media service can be achieved
+ with the use of HTTP Live Streaming (HLS) [RFC8216] by using its low-
+ latency extension (called LL-HLS) [HLS-RFC8216BIS] or with Dynamic
+ Adaptive Streaming over HTTP (DASH) [MPEG-DASH] by using its low-
+ latency extension (called LL-DASH) [LL-DASH]. These extensions use
+ the Common Media Application Format (CMAF) standard [MPEG-CMAF] that
+ allows the media to be packaged into and transmitted in units smaller
+ than segments, which are called "chunks" in CMAF language. This way,
+ the latency can be decoupled from the duration of the media segments.
+ Without a CMAF-like packaging, lower latencies can only be achieved
+ by using very short segment durations. However, using shorter
+ segments means using more frequent intra-coded frames, and that is
+   detrimental to video encoding quality.  The CMAF standard allows
+   longer segments to still be used (improving encoding quality)
+   without penalizing latency.
+
+ While an LL-HLS client retrieves each chunk with a separate HTTP GET
+ request, an LL-DASH client uses the chunked transfer encoding feature
+   of HTTP [CMAF-CTE], which allows the LL-DASH client to fetch all
+ the chunks belonging to a segment with a single GET request. An HTTP
+ server can transmit the CMAF chunks to the LL-DASH client as they
+ arrive from the encoder/packager. A detailed comparison of LL-HLS
+ and LL-DASH is given in [MMSP20].
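+
+   The transfer pattern described above can be sketched as follows
+   (Python, standard library only; the URL and the decoder hand-off
+   are placeholders rather than a complete LL-DASH client): a single
+   GET is issued for a segment, and the response body is consumed in
+   pieces as the CMAF chunks stream in, instead of after the whole
+   segment has arrived.
+
+      # Sketch of LL-DASH-style retrieval: one GET per segment, with
+      # the body consumed progressively as it streams in.
+      from urllib.request import urlopen
+
+      SEGMENT_URL = "https://cdn.example.com/live/seg-1234.m4s"
+
+      def feed_to_decoder(data):
+          # Placeholder for appending to the player's source buffer.
+          print("received", len(data), "bytes")
+
+      def fetch_segment(url, read_size=16 * 1024):
+          with urlopen(url) as resp:
+              while True:
+                  # Returns up to read_size bytes as the (chunked)
+                  # response streams; empty bytes means end of body.
+                  data = resp.read(read_size)
+                  if not data:
+                      break
+                  feed_to_decoder(data)
+
+      # fetch_segment(SEGMENT_URL)  # needs a real LL-DASH origin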
+
+4.3. Non-Low-Latency Live
+
+ Non-low-latency live delivery of media is defined here as a live
+ stream that does not have a latency target shorter than 10 seconds.
+
+ This level of latency is the historically common case for segmented
+ media delivery using HLS and DASH. This level of latency is often
+ considered adequate for content like news. This level of latency is
+ also sometimes achieved as a fallback state when some part of the
+ delivery system or the client-side players do not support low-latency
+ live streaming.
+
+ This level of latency can typically be achieved at scale with
+ commodity CDN services for HTTP(s) delivery, and in some cases, the
+ increased time window can allow for the production of a wider range
+ of encoding options relative to the requirements for a lower-latency
+ service without the need for increasing the hardware footprint, which
+ can allow for wider device interoperability.
+
+4.4. On-Demand
+
+ On-demand media streaming refers to the playback of pre-recorded
+ media based on a user's action. In some cases, on-demand media is
+ produced as a by-product of a live media production, using the same
+ segments as the live event but freezing the manifest that describes
+ the media available from the media server after the live event has
+ finished. In other cases, on-demand media is constructed out of pre-
+ recorded assets with no streaming necessarily involved during the
+ production of the on-demand content.
+
+ On-demand media generally is not subject to latency concerns, but
+ other timing-related considerations can still be as important or even
+ more important to the user experience than the same considerations
+ with live events. These considerations include the startup time, the
+ stability of the media stream's playback quality, and avoidance of
+ stalls and other media artifacts during the playback under all but
+ the most severe network conditions.
+
+ In some applications, optimizations are available to on-demand media
+ but are not always available to live events, such as preloading the
+ first segment for a startup time that does not have to wait for a
+ network download to begin.
+
+5. Adaptive Encoding, Adaptive Delivery, and Measurement Collection
+
+ This section describes one of the best-known ways to provide a good
+ user experience over a given network path, but one thing to keep in
+ mind is that application-level mechanisms cannot provide a better
+ experience than the underlying network path can support.
+
+5.1. Overview
+
+ A simple model of media playback can be described as a media stream
+ consumer, a buffer, and a transport mechanism that fills the buffer.
+ The consumption rate is fairly static and is represented by the
+ content bitrate. The size of the buffer is also commonly a fixed
+ size. The buffer fill process needs to be at least fast enough to
+ ensure that the buffer is never empty; however, it also can have
+ significant complexity when things like personalization or
+ advertising insertion workflows are introduced.
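+
+   A toy version of that model (Python, with illustrative rates and
+   durations) makes the constraint concrete: the fill rate delivered
+   by the transport must keep pace with the fixed consumption rate,
+   or playback stalls.
+
+      # Toy playback-buffer model: a fixed consumption rate drains
+      # the buffer while a varying delivery rate fills it.
+      def simulate(delivery_mbps, content_mbps=6.0, buffer_s=30.0,
+                   start_s=2.0, step_s=1.0):
+          """Return how many seconds playback stalls."""
+          buffered_s = start_s          # media currently buffered
+          stalled_s = 0.0
+          for rate in delivery_mbps:    # one entry per step
+              buffered_s += step_s * (rate / content_mbps)  # fill
+              buffered_s = min(buffered_s, buffer_s)  # overrun cap
+              if buffered_s >= step_s:
+                  buffered_s -= step_s  # play one step of media
+              else:
+                  stalled_s += step_s   # buffer underrun: stall
+          return stalled_s
+
+      # Delivery is interrupted for four seconds mid-stream; the
+      # two-second startup buffer absorbs only part of the outage.
+      print(simulate([8, 8, 0, 0, 0, 0, 8, 8, 8, 8]))   # -> 2.0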
+
+ The challenges in filling the buffer in a timely way fall into two
+ broad categories:
+
+ * Content variation (also sometimes called a "bitrate ladder") is
+ the set of content renditions that are available at any given
+ selection point.
+
+ * Content selection comprises all of the steps a client uses to
+ determine which content rendition to play.
+
+ The mechanism used to select the bitrate is part of the content
+ selection, and the content variation is all of the different bitrate
+ renditions.
+
+ Adaptive bitrate streaming ("ABR streaming" or simply "ABR") is a
+ commonly used technique for dynamically adjusting the media quality
+ of a stream to match bandwidth availability. When this goal is
+ achieved, the media server will tend to send enough media that the
+ media player does not "stall", without sending so much media that the
+ media player cannot accept it.
+
+ ABR uses an application-level response strategy in which the
+ streaming client attempts to detect the available bandwidth of the
+ network path by first observing the successful application-layer
+ download speed; then, given the available bandwidth, the client
+ chooses a bitrate for each of the video, audio, subtitles, and
+ metadata (among a limited number of available options for each type
+ of media) that fits within that bandwidth, typically adjusting as
+ changes in available bandwidth occur in the network or changes in
+ capabilities occur during the playback (such as available memory,
+ CPU, display size, etc.).
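+
+   A minimal sketch of that selection step (Python; the ladder,
+   safety margin, and measurements are illustrative, not a
+   recommended algorithm) is:
+
+      # Minimal ABR selection: request the highest rendition that
+      # fits within a safety margin of the observed download rate.
+      RENDITIONS_KBPS = [400, 1200, 2500, 4500, 6500]  # ladder
+
+      def select_bitrate(measured_kbps, safety=0.8):
+          budget = measured_kbps * safety
+          fitting = [r for r in RENDITIONS_KBPS if r <= budget]
+          return max(fitting) if fitting else RENDITIONS_KBPS[0]
+
+      # Observed throughput fluctuates segment by segment.
+      for measured in (7000, 5200, 1800, 900):
+          print(measured, "kbps observed ->",
+                select_bitrate(measured), "kbps requested")
+
+   Real players add considerable refinement on top of this, such as
+   buffer-level awareness, hysteresis to avoid oscillating between
+   renditions, and separate budgets for video, audio, and text tracks.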
+
+5.2. Adaptive Encoding
+
+ Media servers can provide media streams at various bitrates because
+ the media has been encoded at various bitrates. This is a so-called
+ "ladder" of bitrates that can be offered to media players as part of
+ the manifest so that the media player can select among the available
+ bitrate choices.
+
+ The media server may also choose to alter which bitrates are made
+ available to players by adding or removing bitrate options from the
+ ladder delivered to the player in subsequent manifests built and sent
+ to the player. This way, both the player, through its selection of
+ bitrate to request from the manifest, and the server, through its
+ construction of the bitrates offered in the manifest, are able to
+ affect network utilization.
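+
+   The server-side counterpart can be sketched as a filter over the
+   ladder applied while building each manifest (Python; the ladder
+   contents and the cap policy are illustrative, and real packagers
+   express this in their manifest generation rather than in code of
+   this form):
+
+      # Sketch of server-side ladder shaping: offer only renditions
+      # at or below an operator-chosen cap for this player.
+      LADDER = [
+          {"height": 480,  "kbps": 1000},
+          {"height": 720,  "kbps": 4000},
+          {"height": 1080, "kbps": 7000},
+          {"height": 2160, "kbps": 16000},
+      ]
+
+      def renditions_for(player_cap_kbps):
+          offered = [r for r in LADDER
+                     if r["kbps"] <= player_cap_kbps]
+          return offered or LADDER[:1]   # always offer something
+
+      print([r["height"] for r in renditions_for(8000)])
+      # -> [480, 720, 1080]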
+
+5.3. Adaptive Segmented Delivery
+
+ Adaptive segmented delivery attempts to optimize its own use of the
+ path between a media server and a media client. ABR playback is
+ commonly implemented by streaming clients using HLS [RFC8216] or DASH
+ [MPEG-DASH] to perform a reliable segmented delivery of media over
+ HTTP. Different implementations use different strategies
+ [ABRSurvey], often relying on proprietary algorithms (called rate
+ adaptation or bitrate selection algorithms) to perform available
+ bandwidth estimation/prediction and the bitrate selection.
+
+ Many systems will do an initial probe or a very simple throughput
+ speed test at the start of media playback. This is done to get a
+ rough sense of the highest (total) media bitrate that the network
+ between the server and player will likely be able to provide under
+ initial network conditions. After the initial testing, clients tend
+ to rely upon passive network observations and will make use of
+ player-side statistics, such as buffer fill rates, to monitor and
+ respond to changing network conditions.
+
+ The choice of bitrate occurs within the context of optimizing for one
+ or more metrics monitored by the client, such as the highest
+ achievable audiovisual quality or the lowest chances for a
+ rebuffering event (playback stall).
+
+5.4. Advertising
+
+ The inclusion of advertising alongside or interspersed with streaming
+ media content is common in today's media landscape.
+
+ Some commonly used forms of advertising can introduce potential user
+ experience issues for a media stream. This section provides a very
+ brief overview of a complex and rapidly evolving space.
+
+ The same techniques used to allow a media player to switch between
+ renditions of different bitrates at segment boundaries can also be
+ used to enable the dynamic insertion of advertisements (hereafter
+ referred to as "ads"), but this does not mean that the insertion of
+ ads has no effect on the user's quality of experience.
+
+ Ads may be inserted with either Client-side Ad Insertion (CSAI) or
+ Server-side Ad Insertion (SSAI). In CSAI, the ABR manifest will
+ generally include links to an external ad server for some segments of
+ the media stream, while in SSAI, the server will remain the same
+ during ads but will include media segments that contain the
+ advertising. In SSAI, the media segments may or may not be sourced
+ from an external ad server like with CSAI.
+
+ In general, the more targeted the ad request is, the more requests
+ the ad service needs to be able to handle concurrently. If
+ connectivity is poor to the ad service, this can cause rebuffering
+ even if the underlying media assets (both content and ads) can be
+ accessed quickly. The less targeted the ad request is, the more
+ likely that ad requests can be consolidated and that ads can be
+ cached similarly to the media content.
+
+ In some cases, especially with SSAI, advertising space in a stream is
+ reserved for a specific advertiser and can be integrated with the
+ video so that the segments share the same encoding properties, such
+ as bitrate, dynamic range, and resolution. However, in many cases,
+ ad servers integrate with a Supply Side Platform (SSP) that offers
+ advertising space in real-time auctions via an Ad Exchange, with bids
+ for the advertising space coming from Demand Side Platforms (DSPs)
+ that collect money from advertisers for delivering the ads. Most
+ such Ad Exchanges use application-level protocol specifications
+ published by the Interactive Advertising Bureau [IAB-ADS], an
+ industry trade organization.
+
+ This ecosystem balances several competing objectives, and integrating
+ with it naively can produce surprising user experience results. For
+ example, ad server provisioning and/or the bitrate of the ad segments
+ might be different from that of the main content, and either of these
+ differences can result in playback stalls. For another example,
+ since the inserted ads are often produced independently, they might
+ have a different base volume level than the main content, which can
+ make for a jarring user experience.
+
+ Another major source of competing objectives comes from user privacy
+ considerations vs. the advertiser's incentives to target ads to user
+ segments based on behavioral data. Multiple studies, for example,
+ [BEHAVE] and [BEHAVE2], have reported large improvements in ad
+ effectiveness when using behaviorally targeted ads, relative to
+ untargeted ads. This provides a strong incentive for advertisers to
+ gain access to the data necessary to perform behavioral targeting,
+ leading some to engage in what is indistinguishable from a pervasive
+ monitoring attack [RFC7258] based on user tracking in order to
+ collect the relevant data. A more complete review of issues in this
+ space is available in [BALANCING].
+
+ On top of these competing objectives, this market historically has
+ had incidents of misreporting of ad delivery to end users for
+ financial gain [ADFRAUD]. As a mitigation for concerns driven by
+ those incidents, some SSPs have required the use of specific media
+ players that include features like reporting of ad delivery or
+ providing additional user information that can be used for tracking.
+
+ In general, this is a rapidly developing space with many
+ considerations, and media streaming operators engaged in advertising
+ may need to research these and other concerns to find solutions that
+ meet their user experience, user privacy, and financial goals. For
+ further reading on mitigations, [BAP] has published some standards
+ and best practices based on user experience research.
+
+5.5. Bitrate Detection Challenges
+
+   This kind of bandwidth-measurement system can run into several
+   problems caused by networking and transport protocol issues.
+   Because adaptive application-level response strategies often use
+   rates as observed by the application layer, transport-level
+   protocol behaviors that are opaque to the application can sometimes
+   produce surprising measurement values when the application-level
+   feedback loop interacts with a transport-level feedback loop.
+
+ A few specific examples of surprising phenomena that affect bitrate
+ detection measurements are described in the following subsections.
+ As these examples will demonstrate, it is common to encounter cases
+ that can deliver application-level measurements that are too low, too
+ high, and (possibly) correct but that vary more quickly than a lab-
+ tested selection algorithm might expect.
+
+ These effects and others that cause transport behavior to diverge
+ from lab modeling can sometimes have a significant impact on bitrate
+ selection and on user QoE, especially where players use naive
+ measurement strategies and selection algorithms that do not account
+ for the likelihood of bandwidth measurements that diverge from the
+ true path capacity.
+
+5.5.1. Idle Time between Segments
+
+ When the bitrate selection is chosen substantially below the
+ available capacity of the network path, the response to a segment
+ request will typically complete in much less absolute time than the
+ duration of the requested segment, leaving significant idle time
+ between segment downloads. This can have a few surprising
+ consequences:
+
+ * TCP slow-start, when restarting after idle, requires multiple RTTs
+ to re-establish a throughput at the network's available capacity.
+ When the active transmission time for segments is substantially
+ shorter than the time between segments, leaving an idle gap
+ between segments that triggers a restart of TCP slow-start, the
+ estimate of the successful download speed coming from the
+ application-visible receive rate on the socket can thus end up
+ much lower than the actual available network capacity. This, in
+ turn, can prevent a shift to the most appropriate bitrate.
+ [RFC7661] provides some mitigations for this effect at the TCP
+ transport layer for senders who anticipate a high incidence of
+ this problem.
+
+   *  In some mobile networks, the radio spectrum and scheduling time
+      assigned to a flow can be affected by idle time: the carrier
+      capacity assigned to a physical or virtual link can vary with
+      activity.  Depending on the idle time characteristics, this can
+      result in a lower available bitrate than would be achievable
+      with a steadier transmission in the same network.
+
+ Some receiver-side ABR algorithms, such as [ELASTIC], are designed to
+ try to avoid this effect.
+
+ Another way to mitigate this effect is by the help of two
+ simultaneous TCP connections, as explained in [MMSys11] for Microsoft
+ Smooth Streaming. In some cases, the system-level TCP slow-start
+ restart can also be disabled, for example, as described in
+ [OReilly-HPBN].
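+
+   The arithmetic behind the idle gap can be illustrated as follows
+   (Python, with example numbers only): when the selected bitrate
+   sits well below the path capacity, each segment download occupies
+   only a small fraction of the segment's playback duration, and the
+   connection then sits idle long enough to trigger the slow-start
+   restart described above.
+
+      # Illustration of idle time between segment downloads.
+      segment_duration_s = 4.0     # media time per segment
+      selected_kbps = 2_000        # bitrate the player chose
+      path_capacity_kbps = 20_000  # what the path could carry
+
+      segment_bits = selected_kbps * 1000 * segment_duration_s
+      download_s = segment_bits / (path_capacity_kbps * 1000)
+      idle_s = segment_duration_s - download_s
+
+      print(f"download {download_s:.1f} s, idle {idle_s:.1f} s")
+      # -> download 0.4 s, idle 3.6 s per segment; an idle gap of
+      # this length typically exceeds the retransmission timeout,
+      # after which the congestion window is reduced and the next
+      # download restarts from a lower rate.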
+
+5.5.2. Noisy Measurements
+
+ In addition to smoothing over an appropriate time scale to handle
+ network jitter (see [RFC5481]), ABR systems relying on measurements
+ at the application layer also have to account for noise from the in-
+ order data transmission at the transport layer.
+
+ For instance, in the event of a lost packet on a TCP connection with
+ SACK support (a common case for segmented delivery in practice), loss
+ of a packet can provide a confusing bandwidth signal to the receiving
+ application. Because of the sliding window in TCP, many packets may
+ be accepted by the receiver without being available to the
+ application until the missing packet arrives. Upon the arrival of
+ the one missing packet after retransmit, the receiver will suddenly
+ get access to a lot of data at the same time.
+
+ To a receiver measuring bytes received per unit time at the
+ application layer and interpreting it as an estimate of the available
+ network bandwidth, this appears as a high jitter in the goodput
+ measurement, presenting as a stall followed by a sudden leap that can
+ far exceed the actual capacity of the transport path from the server
+ when the hole in the received data is filled by a later
+ retransmission.
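+
+   The signal described above can be reproduced with a toy timeline
+   (Python; packet sizes, rates, and the retransmission delay are
+   made up): packets behind the loss are held by the transport until
+   the retransmission arrives, so the application-observed rate shows
+   a stall followed by a burst well above the actual path rate.
+
+      # Toy reconstruction of the head-of-line effect on an
+      # application-level rate measurement: a steady 1 Mbps packet
+      # stream in which one packet is retransmitted 300 ms late.
+      PKT_BITS = 10_000            # 1250-byte packets
+      arrival_ms = {seq: seq * 10 for seq in range(100)}
+      arrival_ms[40] += 300        # lost, retransmitted 300 ms late
+
+      # TCP releases data in order: a packet reaches the application
+      # only once every earlier packet has also arrived.
+      app_ms, latest = {}, 0
+      for seq in range(100):
+          latest = max(latest, arrival_ms[seq])
+          app_ms[seq] = latest
+
+      # Application-observed goodput in 100 ms buckets.
+      buckets = {}
+      for t in app_ms.values():
+          buckets[t // 100] = buckets.get(t // 100, 0) + PKT_BITS
+      for b in range(max(buckets) + 1):
+          kbps = buckets.get(b, 0) // 100   # bits/100 ms -> kbit/s
+          print(f"{b * 100} ms: {kbps} kbit/s")
+
+   On a path actually delivering a steady 1 Mbps, this prints roughly
+   1000 kbit/s per interval, then three intervals of 0 kbit/s during
+   the hole, then a single interval of 4000 kbit/s when the
+   retransmission releases the held data all at once.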
+
+5.5.3. Wide and Rapid Variation in Path Capacity
+
+ As many end devices have moved to wireless connections for the final
+ hop (such as Wi-Fi, 5G, LTE, etc.), new problems in bandwidth
+ detection have emerged.
+
+ In most real-world operating environments, wireless links can often
+ experience sudden changes in capacity as the end user device moves
+ from place to place or encounters new sources of interference.
+ Microwave ovens, for example, can cause a throughput degradation in
+ Wi-Fi of more than a factor of 2 while active [Micro].
+
+ These swings in actual transport capacity can result in user
+ experience issues when interacting with ABR algorithms that are not
+ tuned to handle the capacity variation gracefully.
+
+5.6. Measurement Collection
+
+ Media players use measurements to guide their segment-by-segment
+ adaptive streaming requests but may also provide measurements to
+ streaming media providers.
+
+ In turn, media providers may base analytics on these measurements to
+ guide decisions, such as whether adaptive encoding bitrates in use
+ are the best ones to provide to media players or whether current
+ media content caching is providing the best experience for viewers.
+
+   To that end, the Consumer Technology Association (CTA), which owns
+   the Web Application Video Ecosystem (WAVE) project, has published
+   two important specifications.
+
+ * CTA-2066: Streaming Quality of Experience Events, Properties and
+ Metrics
+
+ [CTA-2066] specifies a set of media player events, properties, QoE
+ metrics, and associated terminology for representing streaming media
+ QoE across systems, media players, and analytics vendors. While all
+ these events, properties, metrics, and associated terminology are
+ used across a number of proprietary analytics and measurement
+ solutions, they were used in slightly (or vastly) different ways that
+ led to interoperability issues. CTA-2066 attempts to address this
+ issue by defining common terminology and how each metric should be
+ computed for consistent reporting.
+
+ * CTA-5004: Web Application Video Ecosystem - Common Media Client
+ Data (CMCD)
+
+ Many assume that the CDNs have a holistic view of the health and
+ performance of the streaming clients. However, this is not the case.
+ The CDNs produce millions of log lines per second across hundreds of
+ thousands of clients, and they have no concept of a "session" as a
+ client would have, so CDNs are decoupled from the metrics the clients
+ generate and report. A CDN cannot tell which request belongs to
+ which playback session, the duration of any media object, the
+ bitrate, or whether any of the clients have stalled and are
+ rebuffering or are about to stall and will rebuffer. The consequence
+ of this decoupling is that a CDN cannot prioritize delivery for when
+ the client needs it most, prefetch content, or trigger alerts when
+ the network itself may be underperforming. One approach to couple
+ the CDN to the playback sessions is for the clients to communicate
+ standardized media-relevant information to the CDNs while they are
+ fetching data. [CTA-5004] was developed exactly for this purpose.
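+
+ As a non-normative sketch of the CMCD approach, the Python fragment
+ below attaches a small set of client-reported key-value pairs to a
+ segment request as a query argument.  The key names follow the
+ general CMCD pattern, but the exact key registry, encoding rules,
+ and transmission modes are defined in [CTA-5004] and should be
+ checked there; the URL and values here are hypothetical.
+
+    # Sketch only: attach client state (session id, buffer length,
+    # measured throughput) to a segment URL so the CDN can correlate
+    # requests into playback sessions.  See CTA-5004 for the
+    # normative syntax.
+
+    from urllib.parse import quote
+
+    def with_cmcd(url, session_id, buffer_ms, measured_kbps):
+        pairs = {
+            "sid": f'"{session_id}"',  # playback session identifier
+            "bl": buffer_ms,           # buffer length (milliseconds)
+            "mtp": measured_kbps,      # measured throughput (kbps)
+        }
+        items = sorted(pairs.items())
+        payload = ",".join(f"{k}={v}" for k, v in items)
+        sep = "&" if "?" in url else "?"
+        return f"{url}{sep}CMCD={quote(payload)}"
+
+    print(with_cmcd("https://cdn.example/seg42.m4s",
+                    "4f3a", buffer_ms=9200, measured_kbps=4800))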
+
+6. Transport Protocol Behaviors and Their Implications for Media
+ Transport Protocols
+
+ Within this document, the term "media transport protocol" is used to
+ describe any protocol that carries media metadata and media segments
+ in its payload, and the term "transport protocol" describes any
+ protocol that carries a media transport protocol, or another
+ transport protocol, in its payload. This is easier to understand if
+ the reader assumes a protocol stack that looks something like this:
+
+ Media Segments
+ ---------------------------
+ Media Format
+ ---------------------------
+ Media Transport Protocol
+ ---------------------------
+ Transport Protocol(s)
+
+ where
+
+ * "Media segments" would be something like the output of a codec or
+ some other source of media segments, such as closed-captioning,
+
+ * "Media format" would be something like an RTP payload format
+ [RFC2736] or an ISO base media file format (ISOBMFF) profile
+ [ISOBMFF],
+
+ * "Media transport protocol" would be something like RTP [RFC3550]
+ or DASH [MPEG-DASH], and
+
+ * "Transport protocol" would be a protocol that provides appropriate
+ transport services, as described in Section 5 of [RFC8095].
+
+ Not all possible streaming media applications follow this model, but
+ for the ones that do, it seems useful to distinguish between the
+ protocol layer that is aware it is transporting media segments and
+ underlying protocol layers that are not aware.
+
+ As described in the abstract of [RFC8095], the IETF has standardized
+ a number of protocols that provide transport services. Although
+ these protocols, taken in total, provide a wide variety of transport
+ services, Section 6 will distinguish between two extremes:
+
+ * transport protocols used to provide reliable, in-order media
+ delivery to an endpoint, typically providing flow control and
+ congestion control (Section 6.1), and
+
+ * transport protocols used to provide unreliable, unordered media
+ delivery to an endpoint, without flow control or congestion
+ control (Section 6.2).
+
+ Because newly standardized transport protocols, such as QUIC
+ [RFC9000], that are typically implemented in user space can evolve
+ their transport behavior more rapidly than currently used transport
+ protocols that are typically implemented in operating system kernel
+ space, this document includes a description of how the path
+ characteristics that streaming media providers may see are likely to
+ evolve; see Section 6.3.
+
+ It is worth noting explicitly that the transport protocol layer might
+ include more than one protocol. For example, a specific media
+ transport protocol might run over HTTP, or over WebTransport, which
+ in turn runs over HTTP.
+
+ It is worth noting explicitly that more complex network protocol
+ stacks are certainly possible -- for instance, when packets with this
+ protocol stack are carried in a tunnel or in a VPN, the entire packet
+ would likely appear in the payload of other protocols. If these
+ environments are present, streaming media operators may need to
+ analyze their effects on applications as well.
+
+6.1. Media Transport over Reliable Transport Protocols
+
+ The HLS [RFC8216] and DASH [MPEG-DASH] media transport protocols are
+ typically carried over HTTP, and HTTP used TCP as its only
+ standardized transport protocol until HTTP/3 [RFC9114]. These media
+ transport protocols use ABR response strategies as described in
+ Section 5 to respond to changing path characteristics, and underlying
+ transport protocols are also attempting to respond to changing path
+ characteristics.
+
+ The past success of the largely TCP-based Internet is evidence that
+ the various flow control and congestion control mechanisms TCP has
+ used to achieve equilibrium quickly, at a point where TCP senders do
+ not interfere with other TCP senders for sustained periods of time
+ [RFC5681], have worked well. The Internet has continued
+ to work even when the specific TCP mechanisms used to reach
+ equilibrium changed over time [RFC7414]. Because TCP provided a
+ common tool to avoid contention, even when significant TCP-based
+ applications like FTP were largely replaced by other significant TCP-
+ based applications like HTTP, the transport behavior remained safe
+ for the Internet.
+
+ Modern TCP implementations [RFC9293] continue to probe for available
+ bandwidth and "back off" when a network path is saturated but may
+ also work to avoid growing queues along network paths, which can
+ prevent older TCP senders from quickly detecting when a network path
+ is becoming saturated. Congestion control mechanisms, such as Copa
+ [COPA18] and Bottleneck Bandwidth and Round-trip propagation time
+ (BBR) [BBR-CONGESTION-CONTROL], make these decisions based on
+ measured path delays, assuming that if the measured path delay is
+ increasing, the sender is injecting packets onto the network path
+ faster than the network can forward them (or the receiver can accept
+ them), so the sender should adjust its sending rate accordingly.
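+
+ As a schematic, non-normative sketch of that delay-based idea (and
+ not the actual Copa or BBR algorithm), the Python fragment below
+ compares a smoothed RTT against the minimum RTT observed and backs
+ off when the implied queuing delay grows; all thresholds and RTT
+ samples are hypothetical.
+
+    # Schematic sketch of delay-based rate adjustment: rising RTT
+    # relative to the minimum RTT implies a standing queue, so the
+    # sender reduces its rate; otherwise it probes for more capacity.
+
+    def adjust_rate(rate_bps, min_rtt_ms, smoothed_rtt_ms,
+                    queue_threshold_ms=10):
+        queuing_delay = smoothed_rtt_ms - min_rtt_ms
+        if queuing_delay > queue_threshold_ms:
+            return rate_bps * 0.85   # standing queue: back off
+        return rate_bps * 1.05       # path looks idle: probe upward
+
+    rate = 4_000_000
+    for rtt in [22, 23, 25, 41, 55, 38, 24]:   # RTT samples (ms)
+        rate = adjust_rate(rate, min_rtt_ms=20, smoothed_rtt_ms=rtt)
+        print(f"rtt={rtt}ms -> rate={rate / 1e6:.2f} Mbps")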
+
+ Although common TCP behavior has changed significantly since the days
+ of [Jacobson-Karels] and [RFC2001], even with the addition of new
+ congestion controllers such as CUBIC [RFC8312], the common practice of
+ implementing TCP as part of an operating system kernel has acted to
+ limit how quickly TCP behavior can change. Even with the widespread
+ use of automated operating system update installation on many end-
+ user systems, streaming media providers could have a reasonable
+ expectation that they could understand TCP transport protocol
+ behaviors and that those behaviors would remain relatively stable in
+ the short term.
+
+6.2. Media Transport over Unreliable Transport Protocols
+
+ Because UDP does not provide any feedback mechanism to senders to
+ help limit impacts on other users, UDP-based application-level
+ protocols have been responsible for the decisions that TCP-based
+ applications have delegated to TCP, i.e., what to send, how much to
+ send, and when to send it. Because UDP itself has no transport-layer
+ feedback mechanisms, UDP-based applications that send and receive
+ substantial amounts of information are expected to provide their own
+ feedback mechanisms and to respond to the feedback the application
+ receives. This expectation is most recently codified as a Best
+ Current Practice [RFC8085].
+
+ In contrast to adaptive segmented delivery over a reliable transport
+ as described in Section 5.3, some applications deliver streaming
+ media segments using an unreliable transport and rely on a variety of
+ approaches, including:
+
+ * media encapsulated in a raw MPEG Transport Stream (MPEG-TS)
+ [MPEG-TS] over UDP, which makes no attempt to account for
+ reordering or loss in the transport,
+
+ * RTP [RFC3550], which can notice packet loss and repair some
+ limited reordering,
+
+ * the Stream Control Transmission Protocol (SCTP) [RFC9260], which
+ can use partial reliability [RFC3758] to recover from some loss
+ but can abandon recovery to limit head-of-line blocking, and
+
+ * the Secure Reliable Transport (SRT) [SRT], which can use forward
+ error correction and time-bound retransmission to recover from
+ loss within certain limits but can abandon recovery to limit head-
+ of-line blocking.
+
+ Under congestion and loss, approaches like those above generally
+ experience transient media artifacts more often, and delayed playback
+ less often, than reliable segment transport does.
+ Often, one of the key goals of using a UDP-based transport that
+ allows some unreliability is to reduce latency and better support
+ applications like videoconferencing or other live-action video with
+ interactive components, such as some sporting events.
+
+ Congestion avoidance strategies for deployments using unreliable
+ transport protocols vary widely in practice, ranging from remaining
+ entirely unresponsive to responding with strategies such as:
+
+ * feedback signaling to change encoder settings (as in [RFC5762]),
+
+ * fewer enhancement layers (as in [RFC6190]), and
+
+ * proprietary methods to detect QoE issues and turn off video to
+ allow less bandwidth-intensive media, such as audio, to be
+ delivered.
+
+ RTP relies on RTCP sender and receiver reports [RFC3550] as its own
+ feedback mechanism and even includes circuit breakers for unicast RTP
+ sessions [RFC8083] for situations when normal RTP congestion control
+ has not been able to react sufficiently to RTP flows sending at rates
+ that result in sustained packet loss.
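+
+ As a schematic, non-normative sketch of the circuit breaker concept
+ (and not the precise conditions of [RFC8083], which are defined in
+ terms of RTCP reporting intervals and a TCP-equivalent throughput
+ estimate), the fragment below stops transmission after several
+ consecutive receiver reports indicating heavy loss; the thresholds
+ are hypothetical.
+
+    # Schematic circuit breaker: cease transmission after a run of
+    # receiver reports showing sustained heavy packet loss.
+
+    LOSS_THRESHOLD = 0.10   # loss fraction considered "heavy"
+    TRIP_AFTER = 3          # consecutive heavy-loss reports
+
+    def should_cease_transmission(report_loss_fractions):
+        consecutive = 0
+        for loss in report_loss_fractions:
+            if loss > LOSS_THRESHOLD:
+                consecutive += 1
+            else:
+                consecutive = 0
+            if consecutive >= TRIP_AFTER:
+                return True    # sustained loss: trip the breaker
+        return False
+
+    print(should_cease_transmission([0.02, 0.15, 0.22, 0.18]))  # True
+    print(should_cease_transmission([0.02, 0.15, 0.04, 0.12]))  # False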
+
+ The notion of "circuit breakers" has also been applied to other UDP
+ applications in [RFC8084], such as tunneling packets over UDP that
+ are potentially not congestion controlled (for example,
+ "encapsulating MPLS in UDP", as described in [RFC7510]). If
+ streaming media segments are carried in tunnels encapsulated in UDP,
+ these media streams may encounter "tripped circuit breakers", with
+ resulting user-visible impacts.
+
+6.3. QUIC and Changing Transport Protocol Behavior
+
+ The QUIC protocol, developed from a proprietary protocol into an IETF
+ Standards Track protocol [RFC9000], behaves differently than the
+ transport protocols characterized in Sections 6.1 and 6.2.
+
+ Although QUIC provides an alternative to the TCP and UDP transport
+ protocols, QUIC is itself encapsulated in UDP. As noted elsewhere in
+ Section 7.1, the QUIC protocol encrypts almost all of its transport
+ parameters and all of its payload, so any intermediaries that network
+ operators may be using to troubleshoot HTTP streaming media
+ performance issues, perform analytics, or even intercept exchanges in
+ current applications will not work for QUIC-based applications
+ without making changes to their networks. Section 7 describes the
+ implications of media encryption in more detail.
+
+ While QUIC is designed as a general-purpose transport protocol and
+ can carry different application-layer protocols, the current
+ standardized mapping is for HTTP/3 [RFC9114], which describes how
+ QUIC transport services are used for HTTP. The convention is for
+ HTTP/3 to run over UDP port 443 [Port443], but this is not a strict
+ requirement.
+
+ When HTTP/3 is encapsulated in QUIC, which is then encapsulated in
+ UDP, streaming operators (and network operators) might see UDP
+ traffic patterns that are similar to HTTP(S) over TCP. UDP traffic
+ may be blocked on any port numbers that are not commonly used (UDP
+ port 53 for DNS being a common exception). Even when UDP ports are
+ not blocked and QUIC packets can flow, streaming operators (and
+ network operators) may severely rate-limit this traffic because they
+ do not expect to see legitimate high-bandwidth traffic, such as
+ streaming media, over the UDP ports that HTTP/3 is using.
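+
+ One non-normative sketch of a common client-side consequence is
+ shown below: attempt a fetch over HTTP/3 (QUIC over UDP) and fall
+ back to HTTP over TCP if no response arrives, as would happen when
+ UDP is blocked on the path.  The transport functions are
+ placeholders supplied by the caller, not a real API.
+
+    # Sketch only: sequential fallback from HTTP/3 to HTTP over TCP
+    # when UDP appears to be blocked or heavily rate-limited.
+
+    def fetch_segment(url, try_http3, try_tcp, timeout_s=2.0):
+        try:
+            return try_http3(url, timeout=timeout_s)
+        except TimeoutError:
+            # No answer over UDP; retry the same request over TCP.
+            return try_tcp(url, timeout=timeout_s)
+
+    def blocked_udp(url, timeout):
+        raise TimeoutError("no response over UDP")  # simulated block
+
+    def ok_tcp(url, timeout):
+        return f"fetched {url} over TCP"
+
+    print(fetch_segment("https://cdn.example/seg1.m4s",
+                        blocked_udp, ok_tcp))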
+
+ As noted in Section 5.5.2, because TCP provides a reliable, in-order
+ delivery service for applications, any packet loss for a TCP
+ connection causes head-of-line blocking so that no TCP segments
+ arriving after a packet is lost will be delivered to the receiving
+ application until retransmission of the lost packet has been
+ received, allowing in-order delivery to the application to continue.
+ As described in [RFC9000], QUIC connections can carry multiple
+ streams, and when packet losses do occur, only the streams carried in
+ the lost packet are delayed.
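+
+ The following non-normative sketch (with a hypothetical packet
+ layout) illustrates the difference: with a single ordered byte
+ stream, one lost packet holds back everything sent after it, while
+ with multiplexed streams only the stream whose data was in the lost
+ packet has to wait.
+
+    # Hypothetical packet layout: (packet number, stream id, data).
+    packets = [(1, "A", "A1"), (2, "B", "B1"),
+               (3, "A", "A2"), (4, "B", "B2")]
+    lost = {2}   # packet 2 is lost, awaiting retransmission
+
+    # Single ordered stream (TCP-like): nothing past the gap.
+    tcp_delivered = []
+    for num, _, data in packets:
+        if num in lost:
+            break
+        tcp_delivered.append(data)
+
+    # Multiplexed streams (QUIC-like): only stream "B" is blocked.
+    quic_delivered, blocked = [], set()
+    for num, sid, data in packets:
+        if num in lost:
+            blocked.add(sid)
+        elif sid not in blocked:
+            quic_delivered.append(data)
+
+    print("single ordered stream:", tcp_delivered)   # ['A1']
+    print("multiplexed streams:  ", quic_delivered)  # ['A1', 'A2']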
+
+ A QUIC extension [RFC9221] adds the capability for "unreliable"
+ delivery, similar to the service provided by UDP, but these
+ datagrams are still subject to the QUIC
+ connection's congestion controller, providing some transport-level
+ congestion avoidance measures, which UDP does not.
+
+ As noted in Section 6.1, there is an increasing interest in
+ congestion control algorithms that respond to delay measurements
+ instead of responding to packet loss. These algorithms may deliver
+ an improved user experience, but in some cases, they have not
+ responded to sustained packet loss that exhausts available buffers
+ along the end-to-end path, which may affect other users sharing that
+ path. The QUIC protocol provides a set of congestion control hooks
+ that can be used for algorithm agility, and [RFC9002] defines a basic
+ congestion control algorithm that is roughly similar to TCP NewReno
+ [RFC6582]. However, QUIC senders can and do unilaterally choose to
+ use different algorithms, such as loss-based CUBIC [RFC8312], delay-
+ based Copa or BBR, or even something completely different.
+
+ The Internet community does have experience with deploying new
+ congestion controllers without causing congestion collapse on the
+ Internet. As noted in [RFC8312], both the CUBIC congestion
+ controller and its predecessor BIC have significantly different
+ behavior from Reno-style congestion controllers, such as TCP NewReno
+ [RFC6582]; both were added to the Linux kernel to allow
+ experimentation and analysis, both were then selected as the default
+ TCP congestion controllers in Linux, and both were deployed globally.
+
+ The point mentioned in Section 6.1 about TCP congestion controllers
+ being implemented in operating system kernels is different with QUIC.
+ Although QUIC can be implemented in operating system kernels, one of
+ the design goals when this work was chartered was "QUIC is expected
+ to support rapid, distributed development and testing of features";
+ to meet this expectation, many implementers have chosen to implement
+ QUIC in user space, outside the operating system kernel, and to even
+ distribute QUIC libraries with their own applications. It is worth
+ noting that streaming operators using HTTP/3, carried over QUIC, can
+ expect more frequent deployment of new congestion controller behavior
+ than has been the case with HTTP/1 and HTTP/2, carried over TCP.
+
+ It is worth considering that if TCP-based HTTP traffic and UDP-based
+ HTTP/3 traffic are allowed to enter operator networks on roughly
+ equal terms, questions of fairness and contention will be heavily
+ dependent on interactions between the congestion controllers in use
+ for TCP-based HTTP traffic and UDP-based HTTP/3 traffic.
+
+7. Streaming Encrypted Media
+
+ "Encrypted Media" has at least three meanings:
+
+ * Media encrypted at the application layer, typically using some
+ sort of Digital Rights Management (DRM) system or other object
+ encryption/security mechanism and typically remaining encrypted at
+ rest when senders and receivers store it.
+
+ * Media encrypted by the sender at the transport layer and remaining
+ encrypted until it reaches the ultimate media consumer (in this
+ document, it is referred to as end-to-end media encryption).
+
+ * Media encrypted by the sender at the transport layer and remaining
+ encrypted until it reaches some intermediary that is _not_ the
+ ultimate media consumer but has credentials allowing decryption of
+ the media content. This intermediary may examine and even
+ transform the media content in some way, before forwarding re-
+ encrypted media content (in this document, it is referred to as
+ hop-by-hop media encryption).
+
+ This document focuses on media encrypted at the transport layer,
+ whether encryption is performed hop by hop or end to end. Because
+ media encrypted at the application layer will only be processed by
+ application-level entities, this encryption does not have transport-
+ layer implications. Of course, both hop-by-hop and end-to-end
+ encrypted transport may carry media that is, in addition, encrypted
+ at the application layer.
+
+ Each of these encryption strategies is intended to achieve a
+ different goal. For instance, application-level encryption may be
+ used for business purposes, such as avoiding piracy or enforcing
+ geographic restrictions on playback, while transport-layer encryption
+ may be used to prevent media stream manipulation or to protect
+ manifests.
+
+ This document does not take a position on whether those goals are
+ valid.
+
+ Both end-to-end and hop-by-hop media encryption have specific
+ implications for streaming operators. These are described in
+ Sections 7.2 and 7.3.
+
+7.1. General Considerations for Streaming Media Encryption
+
+ The use of strong encryption does provide confidentiality for
+ encrypted streaming media, from the sender to either the ultimate
+ media consumer or to an intermediary that possesses credentials
+ allowing decryption. This does prevent deep packet inspection (DPI)
+ by any on-path intermediary that does not possess credentials
+ allowing decryption. However, even encrypted content streams may be
+ vulnerable to traffic analysis. An on-path observer that can
+ identify that encrypted traffic contains a media stream could
+ "fingerprint" this encrypted media stream and then compare it against
+ "fingerprints" of known content. The protection provided by strong
+ encryption can be further lessened if a streaming media operator is
+ repeatedly encrypting the same content. "Identifying HTTPS-Protected
+ Netflix Videos in Real-Time" [CODASPY17] is an example of what is
+ possible when identifying HTTPS-protected videos over TCP transport,
+ based either on the length of entire resources being transferred or
+ on characteristic packet patterns at the beginning of a resource
+ being transferred. If traffic analysis is successful at identifying
+ encrypted content and associating it with specific users, this tells
+ an on-path observer what resource is being streamed, and by whom,
+ almost as certainly as examining decrypted traffic.
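+
+ As a schematic, non-normative sketch of the general fingerprinting
+ idea (and not the specific technique of [CODASPY17]), the fragment
+ below matches an observed sequence of transferred segment sizes,
+ visible on the wire despite encryption, against a library of size
+ fingerprints for known titles; all sizes are hypothetical.
+
+    # Schematic traffic-analysis sketch: match observed segment
+    # sizes against precomputed fingerprints of known content.
+
+    known_fingerprints = {
+        "title-a": [801_212, 799_004, 1_204_330, 650_118],
+        "title-b": [512_400, 512_433, 511_980, 513_001],
+    }
+
+    def matching_title(observed, tolerance=0.01):
+        for title, sizes in known_fingerprints.items():
+            if len(sizes) == len(observed) and all(
+                abs(o - s) <= tolerance * s
+                for o, s in zip(observed, sizes)
+            ):
+                return title
+        return None
+
+    # Sizes measured on the wire for an encrypted session.
+    print(matching_title([801_500, 798_800, 1_203_900, 650_400]))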
+
+ Because HTTPS has historically layered HTTP on top of TLS, which is
+ in turn layered on top of TCP, intermediaries have historically had
+ access to unencrypted TCP-level transport information, such as
+ retransmissions, and some carriers exploited this information in
+ attempts to improve transport-layer performance [RFC3135]. The most
+ recent standardized version of HTTPS, HTTP/3 [RFC9114], uses the QUIC
+ protocol [RFC9000] as its transport layer. QUIC relies on the TLS
+ 1.3 initial handshake [RFC8446] only for key exchange [RFC9001] and
+ encrypts almost all transport parameters itself, except for a few
+ invariant header fields. In the QUIC short header, the only
+ transport-level parameter that is sent "in the clear" is the
+ Destination Connection ID [RFC8999], and even in the QUIC long
+ header, the only transport-level parameters sent "in the clear" are
+ the version, Destination Connection ID, and Source Connection ID.
+ For these reasons, HTTP/3 is significantly more "opaque" than HTTPS
+ with HTTP/1 or HTTP/2.
+
+ [RFC9312] discusses the manageability of the QUIC transport protocol
+ that is used to encapsulate HTTP/3, focusing on the implications of
+ QUIC's design and wire image on network operations involving QUIC
+ traffic. It discusses what network operators can consider in some
+ detail.
+
+ More broadly, "Considerations around Transport Header
+ Confidentiality, Network Operations, and the Evolution of Internet
+ Transport Protocols" [RFC9065] describes the impact of increased
+ encryption of transport headers in general terms.
+
+ It is worth noting that considerations for heavily encrypted
+ transport protocols also come into play when streaming media is
+ carried over IP-level VPNs and tunnels, with the additional
+ consideration that an intermediary that does not possess credentials
+ allowing decryption will have no visibility into the source and
+ destination IP addresses of the packets being carried inside the
+ tunnel.
+
+7.2. Considerations for Hop-by-Hop Media Encryption
+
+ Hop-by-hop media encryption offers the benefits described in
+ Section 7.1 between the streaming media operator and authorized
+ intermediaries, among authorized intermediaries, and between
+ authorized intermediaries and the ultimate media consumer; however,
+ it does not provide these benefits end to end. The streaming media
+ operator and ultimate media consumer must trust the authorized
+ intermediaries, and if these intermediaries cannot be trusted, the
+ benefits of encryption are lost.
+
+ Although the IETF has put considerable emphasis on end-to-end
+ streaming media encryption, there are still important use cases that
+ require the insertion of intermediaries.
+
+ There are a variety of ways to involve intermediaries, and some are
+ much more intrusive than others.
+
+ From a streaming media operator's perspective, a number of
+ considerations are in play. The first question is likely whether the
+ streaming media operator intends that intermediaries are explicitly
+ addressed from endpoints or whether the streaming media operator is
+ willing to allow intermediaries to "intercept" streaming content
+ transparently, with no awareness or permission from either endpoint.
+
+ If a streaming media operator does not actively work to avoid
+ interception by on-path intermediaries, the effect will be
+ indistinguishable from "impersonation attacks", and endpoints cannot
+ be assured of any level of confidentiality and cannot trust that the
+ content received came from the expected sender.
+
+ Assuming that a streaming media operator does intend to allow
+ intermediaries to participate in content streaming and does intend to
+ provide some level of privacy for endpoints, there are a number of
+ possible tools, either already available or still being specified.
+ These include the following:
+
+ Server and Network Assisted DASH [MPEG-DASH-SAND]:
+ This specification introduces explicit messaging between DASH
+ clients and DASH-aware network elements or among various DASH-
+ aware network elements for the purpose of improving the efficiency
+ of streaming sessions by providing information about real-time
+ operational characteristics of networks, servers, proxies, caches,
+ CDNs, as well as a DASH client's performance and status.
+
+ "Double Encryption Procedures for the Secure Real-Time Transport
+ Protocol (SRTP)" [RFC8723]:
+ This specification provides a cryptographic transform for the SRTP
+ that provides both hop-by-hop and end-to-end security guarantees.
+
+ Secure Frames [SFRAME]:
+ [RFC8723] is closely tied to SRTP, and this close association
+ impeded widespread deployment, because it could not be used for
+ the most common media content delivery mechanisms. A more recent
+ proposal, Secure Frames [SFRAME], also provides both hop-by-hop
+ and end-to-end security guarantees but can be used with other
+ media transport protocols beyond SRTP.
+
+ A streaming media operator's choice of whether to involve
+ intermediaries requires careful consideration. As an example, when
+ ABR manifests were commonly sent unencrypted, some access network
+ operators would modify manifests during peak hours by removing high-
+ bitrate renditions to prevent players from choosing those renditions,
+ thus reducing the overall bandwidth consumed for delivering these
+ media streams and thereby reducing the network load and improving the
+ average user experience for their customers. Now that ubiquitous
+ encryption typically prevents this kind of modification, a streaming
+ media operator who used intermediaries in the past, and who now
+ wishes to maintain the same level of network health and user
+ experience, must choose between adding intermediaries that are
+ authorized to change the manifests and adding some other form of
+ complexity to their service.
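+
+ A simplified, non-normative sketch of the kind of manifest rewriting
+ described above is shown below.  It removes variant streams above a
+ bandwidth cap from an HLS-style multivariant playlist; the playlist
+ content is hypothetical, and the parsing is deliberately naive.
+
+    # Sketch only: drop renditions above a bandwidth cap from an
+    # HLS-style multivariant playlist, as an on-path intermediary
+    # could do when manifests were delivered unencrypted.
+
+    PLAYLIST = """#EXTM3U
+    #EXT-X-STREAM-INF:BANDWIDTH=1500000
+    low.m3u8
+    #EXT-X-STREAM-INF:BANDWIDTH=4000000
+    mid.m3u8
+    #EXT-X-STREAM-INF:BANDWIDTH=8000000
+    high.m3u8
+    """
+
+    def cap_bitrate(playlist, max_bps):
+        out, skip_next_uri = [], False
+        for raw in playlist.splitlines():
+            line = raw.strip()
+            if not line:
+                continue
+            if line.startswith("#EXT-X-STREAM-INF:"):
+                bw = int(line.split("BANDWIDTH=")[1].split(",")[0])
+                if bw > max_bps:
+                    skip_next_uri = True
+                    continue
+            elif skip_next_uri and not line.startswith("#"):
+                skip_next_uri = False
+                continue
+            out.append(line)
+        return "\n".join(out) + "\n"
+
+    # The 8 Mbps variant and its URI are removed from the output.
+    print(cap_bitrate(PLAYLIST, max_bps=5_000_000))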
+
+ Some resources that might inform other similar considerations are
+ further discussed in [RFC8824] (for WebRTC) and [RFC9312] (for HTTP/3
+ and QUIC).
+
+7.3. Considerations for End-to-End Media Encryption
+
+ End-to-end media encryption offers the benefits described in
+ Section 7.1 from the streaming media operator to the ultimate media
+ consumer.
+
+ End-to-end media encryption has become much more widespread in the
+ years since the IETF issued "Pervasive Monitoring Is an Attack"
+ [RFC7258] as a Best Current Practice, describing pervasive monitoring
+ as a much greater threat than previously appreciated. After the
+ Snowden disclosures, many content providers made the decision to use
+ HTTPS protection -- HTTP over TLS -- for most or all content being
+ delivered as a routine practice, rather than in exceptional cases for
+ content that was considered sensitive.
+
+ However, as noted in [RFC7258], there is no way to prevent pervasive
+ monitoring by an attacker while allowing monitoring by a more benign
+ entity who only wants to use DPI to examine HTTP requests and
+ responses to provide a better user experience. If a modern encrypted
+ transport protocol is used for end-to-end media encryption,
+ unauthorized on-path intermediaries are unable to examine transport
+ and application protocol behavior. As described in Section 7.2, only
+ an intermediary that is explicitly authorized by the streaming media
+ operator to examine packet payloads, rather than one that intercepts
+ packets and examines them without authorization, can continue these
+ practices.
+
+ [RFC7258] states that "[t]he IETF will strive to produce
+ specifications that mitigate pervasive monitoring attacks", so
+ streaming operators should expect the IETF's direction toward
+ preventing unauthorized monitoring of IETF protocols to continue for
+ the foreseeable future.
+
+8. Additional Resources for Streaming Media
+
+ The Media Operations (MOPS) community maintains a list of references
+ and resources; for further reading, see [MOPS-RESOURCES].
+
+9. IANA Considerations
+
+ This document has no IANA actions.
+
+10. Security Considerations
+
+ Security is an important consideration for streaming media
+ applications; the topic of media encryption is discussed in
+ Section 7. This document itself introduces no new security issues.
+
+11. Informative References
+
+ [ABRSurvey]
+ Bentaleb, A., Taani, B., Begen, A. C., Timmerer, C., and
+ R. Zimmermann, "A survey on bitrate adaptation schemes for
+ streaming media over HTTP", IEEE Communications Surveys &
+ Tutorials, vol. 21/1, pp. 562-585, Firstquarter 2019,
+ DOI 10.1109/COMST.2018.2862938,
+ <https://doi.org/10.1109/COMST.2018.2862938>.
+
+ [ADFRAUD] Sadeghpour, S. and N. Vlajic, "Ads and Fraud: A
+ Comprehensive Survey of Fraud in Online Advertising",
+ Journal of Cybersecurity and Privacy 1, no. 4, pp.
+ 804-832, DOI 10.3390/jcp1040039, December 2021,
+ <https://doi.org/10.3390/jcp1040039>.
+
+ [BALANCING]
+ Berger, D., "Balancing Consumer Privacy with Behavioral
+ Targeting", Santa Clara High Technology Law Journal, Vol.
+ 27, Issue 1, Article 2, 2010,
+ <https://digitalcommons.law.scu.edu/chtlj/vol27/iss1/2/>.
+
+ [BAP] Coalition for Better Ads, "Making Online Ads Better for
+ Everyone", <https://www.betterads.org/>.
+
+ [BBR-CONGESTION-CONTROL]
+ Cardwell, N., Cheng, Y., Yeganeh, S. H., Swett, I., and V.
+ Jacobson, "BBR Congestion Control", Work in Progress,
+ Internet-Draft, draft-cardwell-iccrg-bbr-congestion-
+ control-02, 7 March 2022,
+ <https://datatracker.ietf.org/doc/html/draft-cardwell-
+ iccrg-bbr-congestion-control-02>.
+
+ [BEHAVE] Yan, J., Liu, N., Wang, G., Zhang, W., Jiang, Y., and Z.
+ Chen, "How much can behavioral targeting help online
+ advertising?", WWW '09: Proceedings of the 18th
+ international conference on World wide web, pp. 261-270,
+ DOI 10.1145/1526709.1526745, April 2009,
+ <https://dl.acm.org/doi/abs/10.1145/1526709.1526745>.
+
+ [BEHAVE2] Goldfarb, A. and C. E. Tucker, "Online advertising,
+ behavioral targeting, and privacy", Communications of the
+ ACM, Volume 54, Issue 5, pp. 25-27,
+ DOI 10.1145/1941487.1941498, May 2011,
+ <https://dl.acm.org/doi/abs/10.1145/1941487.1941498>.
+
+ [CMAF-CTE] Bentaleb, A., Akcay, M., Lim, M., Begen, A., and R.
+ Zimmermann, "Catching the Moment With LoL+ in Twitch-Like
+ Low-Latency Live Streaming Platforms", IEEE Trans.
+ Multimedia, Vol. 24, pp. 2300-2314,
+ DOI 10.1109/TMM.2021.3079288, May 2021,
+ <https://doi.org/10.1109/TMM.2021.3079288>.
+
+ [CODASPY17]
+ Reed, A. and M. Kranch, "Identifying HTTPS-Protected
+ Netflix Videos in Real-Time", ACM CODASPY,
+ DOI 10.1145/3029806.3029821, March 2017,
+ <https://dl.acm.org/doi/10.1145/3029806.3029821>.
+
+ [CoDel] Nichols, K. and V. Jacobson, "Controlling queue delay",
+ Communications of the ACM, Volume 55, Issue 7, pp. 42-50,
+ DOI 10.1145/2209249.2209264, July 2012,
+ <https://doi.org/10.1145/2209249.2209264>.
+
+ [COPA18] Arun, V. and H. Balakrishnan, "Copa: Practical Delay-Based
+ Congestion Control for the Internet", USENIX NSDI, April
+ 2018, <https://web.mit.edu/copa/>.
+
+ [CTA-2066] Consumer Technology Association, "Streaming Quality of
+ Experience Events, Properties and Metrics", CTA-2066,
+ March 2020, <https://shop.cta.tech/products/streaming-
+ quality-of-experience-events-properties-and-metrics>.
+
+ [CTA-5004] Consumer Technology Association, "Web Application Video
+ Ecosystem - Common Media Client Data", CTA-5004, September
+ 2020, <https://shop.cta.tech/products/web-application-
+ video-ecosystem-common-media-client-data-cta-5004>.
+
+ [CVNI] Cisco, "Cisco Visual Networking Index: Forecast and
+ Trends, 2017–2022", 2018.
+
+ [ELASTIC] De Cicco, L., Caldaralo, V., Palmisano, V., and S.
+ Mascolo, "ELASTIC: A Client-Side Controller for Dynamic
+ Adaptive Streaming over HTTP (DASH)", Packet Video
+ Workshop, DOI 10.1109/PV.2013.6691442, December 2013,
+ <https://ieeexplore.ieee.org/document/6691442>.
+
+ [Encodings]
+ Apple Developer, "HTTP Live Streaming (HLS) Authoring
+ Specification for Apple Devices", June 2020,
+ <https://developer.apple.com/documentation/
+ http_live_streaming/
+ hls_authoring_specification_for_apple_devices>.
+
+ [HLS-RFC8216BIS]
+ Pantos, R., Ed., "HTTP Live Streaming 2nd Edition", Work
+ in Progress, Internet-Draft, draft-pantos-hls-rfc8216bis-
+ 11, 11 May 2022, <https://www.ietf.org/archive/id/draft-
+ pantos-hls-rfc8216bis-11.txt>.
+
+ [IAB-ADS] "IAB", <https://www.iab.com/>.
+
+ [ISOBMFF] ISO, "Information technology - Coding of audio-visual
+ objects - Part 12: ISO base media file format", ISO/
+ IEC 14496-12:2022, January 2022,
+ <https://www.iso.org/standard/83102.html>.
+
+ [Jacobson-Karels]
+ Jacobson, V. and M. Karels, "Congestion Avoidance and
+ Control", November 1988,
+ <https://ee.lbl.gov/papers/congavoid.pdf>.
+
+ [LL-DASH] DASH-IF, "Low-latency Modes for DASH", March 2020,
+ <https://dashif.org/docs/CR-Low-Latency-Live-r8.pdf>.
+
+ [Micro] Taher, T. M., Misurac, M. J., LoCicero, J. L., and D. R.
+ Ucci, "Microwave Oven Signal Interference Mitigation For
+ Wi-Fi Communication Systems", 2008 5th IEEE Consumer
+ Communications and Networking Conference, pp. 67-68,
+ DOI 10.1109/ccnc08.2007.21, January 2008,
+ <https://doi.org/10.1109/ccnc08.2007.21>.
+
+ [MMSP20] Durak, K. et al., "Evaluating the Performance of Apple's
+ Low-Latency HLS", IEEE MMSP,
+ DOI 10.1109/MMSP48831.2020.9287117, September 2020,
+ <https://ieeexplore.ieee.org/document/9287117>.
+
+ [MMSys11] Akhshabi, S., Begen, A. C., and C. Dovrolis, "An
+ experimental evaluation of rate-adaptation algorithms in
+ adaptive streaming over HTTP", ACM MMSys,
+ DOI 10.1145/1943552.1943574, February 2011,
+ <https://dl.acm.org/doi/10.1145/1943552.1943574>.
+
+ [MOPS-RESOURCES]
+ "rfc9317-additional-resources", September 2022,
+ <https://wiki.ietf.org/group/mops/rfc9317-additional-
+ resources>.
+
+ [MPEG-CMAF]
+ ISO, "Information technology - Multimedia application
+ format (MPEG-A) - Part 19: Common media application format
+ (CMAF) for segmented media", ISO/IEC 23000-19:2020, March
+ 2020, <https://www.iso.org/standard/79106.html>.
+
+ [MPEG-DASH]
+ ISO, "Information technology - Dynamic adaptive streaming
+ over HTTP (DASH) - Part 1: Media presentation description
+ and segment formats", ISO/IEC 23009-1:2022, August 2022,
+ <https://www.iso.org/standard/83314.html>.
+
+ [MPEG-DASH-SAND]
+ ISO, "Information technology - Dynamic adaptive streaming
+ over HTTP (DASH) - Part 5: Server and network assisted
+ DASH (SAND)", ISO/IEC 23009-5:2017, February 2017,
+ <https://www.iso.org/standard/69079.html>.
+
+ [MPEG-TS] ITU-T, "Information technology - Generic coding of moving
+ pictures and associated audio information: Systems", ITU-T
+ Recommendation H.222.0, June 2021,
+ <https://www.itu.int/rec/T-REC-H.222.0>.
+
+ [MPEGI] Boyce, J. M. et al., "MPEG Immersive Video Coding
+ Standard", Proceedings of the IEEE, Vol. 109, Issue 9, pp.
+ 1521-1536, DOI 10.1109/JPROC.2021.3062590,
+ <https://ieeexplore.ieee.org/document/9374648>.
+
+ [OReilly-HPBN]
+ Grigorik, I., "High Performance Browser Networking -
+ Chapter 2: Building Blocks of TCP", May 2021,
+ <https://hpbn.co/building-blocks-of-tcp/>.
+
+ [PCC] Schwarz, S. et al., "Emerging MPEG Standards for Point
+ Cloud Compression", IEEE Journal on Emerging and Selected
+ Topics in Circuits and Systems,
+ DOI 10.1109/JETCAS.2018.2885981, March 2019,
+ <https://ieeexplore.ieee.org/document/8571288>.
+
+ [Port443] IANA, "Service Name and Transport Protocol Port Number
+ Registry", <https://www.iana.org/assignments/service-
+ names-port-numbers>.
+
+ [RFC2001] Stevens, W., "TCP Slow Start, Congestion Avoidance, Fast
+ Retransmit, and Fast Recovery Algorithms", RFC 2001,
+ DOI 10.17487/RFC2001, January 1997,
+ <https://www.rfc-editor.org/info/rfc2001>.
+
+ [RFC2736] Handley, M. and C. Perkins, "Guidelines for Writers of RTP
+ Payload Format Specifications", BCP 36, RFC 2736,
+ DOI 10.17487/RFC2736, December 1999,
+ <https://www.rfc-editor.org/info/rfc2736>.
+
+ [RFC3135] Border, J., Kojo, M., Griner, J., Montenegro, G., and Z.
+ Shelby, "Performance Enhancing Proxies Intended to
+ Mitigate Link-Related Degradations", RFC 3135,
+ DOI 10.17487/RFC3135, June 2001,
+ <https://www.rfc-editor.org/info/rfc3135>.
+
+ [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
+ of Explicit Congestion Notification (ECN) to IP",
+ RFC 3168, DOI 10.17487/RFC3168, September 2001,
+ <https://www.rfc-editor.org/info/rfc3168>.
+
+ [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
+ Jacobson, "RTP: A Transport Protocol for Real-Time
+ Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
+ July 2003, <https://www.rfc-editor.org/info/rfc3550>.
+
+ [RFC3758] Stewart, R., Ramalho, M., Xie, Q., Tuexen, M., and P.
+ Conrad, "Stream Control Transmission Protocol (SCTP)
+ Partial Reliability Extension", RFC 3758,
+ DOI 10.17487/RFC3758, May 2004,
+ <https://www.rfc-editor.org/info/rfc3758>.
+
+ [RFC4733] Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF
+ Digits, Telephony Tones, and Telephony Signals", RFC 4733,
+ DOI 10.17487/RFC4733, December 2006,
+ <https://www.rfc-editor.org/info/rfc4733>.
+
+ [RFC5481] Morton, A. and B. Claise, "Packet Delay Variation
+ Applicability Statement", RFC 5481, DOI 10.17487/RFC5481,
+ March 2009, <https://www.rfc-editor.org/info/rfc5481>.
+
+ [RFC5594] Peterson, J. and A. Cooper, "Report from the IETF Workshop
+ on Peer-to-Peer (P2P) Infrastructure, May 28, 2008",
+ RFC 5594, DOI 10.17487/RFC5594, July 2009,
+ <https://www.rfc-editor.org/info/rfc5594>.
+
+ [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
+ Control", RFC 5681, DOI 10.17487/RFC5681, September 2009,
+ <https://www.rfc-editor.org/info/rfc5681>.
+
+ [RFC5762] Perkins, C., "RTP and the Datagram Congestion Control
+ Protocol (DCCP)", RFC 5762, DOI 10.17487/RFC5762, April
+ 2010, <https://www.rfc-editor.org/info/rfc5762>.
+
+ [RFC6190] Wenger, S., Wang, Y.-K., Schierl, T., and A.
+ Eleftheriadis, "RTP Payload Format for Scalable Video
+ Coding", RFC 6190, DOI 10.17487/RFC6190, May 2011,
+ <https://www.rfc-editor.org/info/rfc6190>.
+
+ [RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The
+ NewReno Modification to TCP's Fast Recovery Algorithm",
+ RFC 6582, DOI 10.17487/RFC6582, April 2012,
+ <https://www.rfc-editor.org/info/rfc6582>.
+
+ [RFC6817] Shalunov, S., Hazel, G., Iyengar, J., and M. Kuehlewind,
+ "Low Extra Delay Background Transport (LEDBAT)", RFC 6817,
+ DOI 10.17487/RFC6817, December 2012,
+ <https://www.rfc-editor.org/info/rfc6817>.
+
+ [RFC6843] Clark, A., Gross, K., and Q. Wu, "RTP Control Protocol
+ (RTCP) Extended Report (XR) Block for Delay Metric
+ Reporting", RFC 6843, DOI 10.17487/RFC6843, January 2013,
+ <https://www.rfc-editor.org/info/rfc6843>.
+
+ [RFC7258] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an
+ Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May
+ 2014, <https://www.rfc-editor.org/info/rfc7258>.
+
+ [RFC7414] Duke, M., Braden, R., Eddy, W., Blanton, E., and A.
+ Zimmermann, "A Roadmap for Transmission Control Protocol
+ (TCP) Specification Documents", RFC 7414,
+ DOI 10.17487/RFC7414, February 2015,
+ <https://www.rfc-editor.org/info/rfc7414>.
+
+ [RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black,
+ "Encapsulating MPLS in UDP", RFC 7510,
+ DOI 10.17487/RFC7510, April 2015,
+ <https://www.rfc-editor.org/info/rfc7510>.
+
+ [RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and
+ B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms
+ for Real-Time Transport Protocol (RTP) Sources", RFC 7656,
+ DOI 10.17487/RFC7656, November 2015,
+ <https://www.rfc-editor.org/info/rfc7656>.
+
+ [RFC7661] Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating
+ TCP to Support Rate-Limited Traffic", RFC 7661,
+ DOI 10.17487/RFC7661, October 2015,
+ <https://www.rfc-editor.org/info/rfc7661>.
+
+ [RFC8083] Perkins, C. and V. Singh, "Multimedia Congestion Control:
+ Circuit Breakers for Unicast RTP Sessions", RFC 8083,
+ DOI 10.17487/RFC8083, March 2017,
+ <https://www.rfc-editor.org/info/rfc8083>.
+
+ [RFC8084] Fairhurst, G., "Network Transport Circuit Breakers",
+ BCP 208, RFC 8084, DOI 10.17487/RFC8084, March 2017,
+ <https://www.rfc-editor.org/info/rfc8084>.
+
+ [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
+ Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
+ March 2017, <https://www.rfc-editor.org/info/rfc8085>.
+
+ [RFC8095] Fairhurst, G., Ed., Trammell, B., Ed., and M. Kuehlewind,
+ Ed., "Services Provided by IETF Transport Protocols and
+ Congestion Control Mechanisms", RFC 8095,
+ DOI 10.17487/RFC8095, March 2017,
+ <https://www.rfc-editor.org/info/rfc8095>.
+
+ [RFC8216] Pantos, R., Ed. and W. May, "HTTP Live Streaming",
+ RFC 8216, DOI 10.17487/RFC8216, August 2017,
+ <https://www.rfc-editor.org/info/rfc8216>.
+
+ [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
+ R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
+ RFC 8312, DOI 10.17487/RFC8312, February 2018,
+ <https://www.rfc-editor.org/info/rfc8312>.
+
+ [RFC8404] Moriarty, K., Ed. and A. Morton, Ed., "Effects of
+ Pervasive Encryption on Operators", RFC 8404,
+ DOI 10.17487/RFC8404, July 2018,
+ <https://www.rfc-editor.org/info/rfc8404>.
+
+ [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol
+ Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
+ <https://www.rfc-editor.org/info/rfc8446>.
+
+ [RFC8622] Bless, R., "A Lower-Effort Per-Hop Behavior (LE PHB) for
+ Differentiated Services", RFC 8622, DOI 10.17487/RFC8622,
+ June 2019, <https://www.rfc-editor.org/info/rfc8622>.
+
+ [RFC8723] Jennings, C., Jones, P., Barnes, R., and A.B. Roach,
+ "Double Encryption Procedures for the Secure Real-Time
+ Transport Protocol (SRTP)", RFC 8723,
+ DOI 10.17487/RFC8723, April 2020,
+ <https://www.rfc-editor.org/info/rfc8723>.
+
+ [RFC8824] Minaburo, A., Toutain, L., and R. Andreasen, "Static
+ Context Header Compression (SCHC) for the Constrained
+ Application Protocol (CoAP)", RFC 8824,
+ DOI 10.17487/RFC8824, June 2021,
+ <https://www.rfc-editor.org/info/rfc8824>.
+
+ [RFC8825] Alvestrand, H., "Overview: Real-Time Protocols for
+ Browser-Based Applications", RFC 8825,
+ DOI 10.17487/RFC8825, January 2021,
+ <https://www.rfc-editor.org/info/rfc8825>.
+
+ [RFC8834] Perkins, C., Westerlund, M., and J. Ott, "Media Transport
+ and Use of RTP in WebRTC", RFC 8834, DOI 10.17487/RFC8834,
+ January 2021, <https://www.rfc-editor.org/info/rfc8834>.
+
+ [RFC8835] Alvestrand, H., "Transports for WebRTC", RFC 8835,
+ DOI 10.17487/RFC8835, January 2021,
+ <https://www.rfc-editor.org/info/rfc8835>.
+
+ [RFC8999] Thomson, M., "Version-Independent Properties of QUIC",
+ RFC 8999, DOI 10.17487/RFC8999, May 2021,
+ <https://www.rfc-editor.org/info/rfc8999>.
+
+ [RFC9000] Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
+ Multiplexed and Secure Transport", RFC 9000,
+ DOI 10.17487/RFC9000, May 2021,
+ <https://www.rfc-editor.org/info/rfc9000>.
+
+ [RFC9001] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure
+ QUIC", RFC 9001, DOI 10.17487/RFC9001, May 2021,
+ <https://www.rfc-editor.org/info/rfc9001>.
+
+ [RFC9002] Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection
+ and Congestion Control", RFC 9002, DOI 10.17487/RFC9002,
+ May 2021, <https://www.rfc-editor.org/info/rfc9002>.
+
+ [RFC9065] Fairhurst, G. and C. Perkins, "Considerations around
+ Transport Header Confidentiality, Network Operations, and
+ the Evolution of Internet Transport Protocols", RFC 9065,
+ DOI 10.17487/RFC9065, July 2021,
+ <https://www.rfc-editor.org/info/rfc9065>.
+
+ [RFC9075] Arkko, J., Farrell, S., Kühlewind, M., and C. Perkins,
+ "Report from the IAB COVID-19 Network Impacts Workshop
+ 2020", RFC 9075, DOI 10.17487/RFC9075, July 2021,
+ <https://www.rfc-editor.org/info/rfc9075>.
+
+ [RFC9111] Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke,
+ Ed., "HTTP Caching", STD 98, RFC 9111,
+ DOI 10.17487/RFC9111, June 2022,
+ <https://www.rfc-editor.org/info/rfc9111>.
+
+ [RFC9114] Bishop, M., Ed., "HTTP/3", RFC 9114, DOI 10.17487/RFC9114,
+ June 2022, <https://www.rfc-editor.org/info/rfc9114>.
+
+ [RFC9221] Pauly, T., Kinnear, E., and D. Schinazi, "An Unreliable
+ Datagram Extension to QUIC", RFC 9221,
+ DOI 10.17487/RFC9221, March 2022,
+ <https://www.rfc-editor.org/info/rfc9221>.
+
+ [RFC9260] Stewart, R., Tüxen, M., and K. Nielsen, "Stream Control
+ Transmission Protocol", RFC 9260, DOI 10.17487/RFC9260,
+ June 2022, <https://www.rfc-editor.org/info/rfc9260>.
+
+ [RFC9293] Eddy, W., Ed., "Transmission Control Protocol (TCP)",
+ STD 7, RFC 9293, DOI 10.17487/RFC9293, August 2022,
+ <https://www.rfc-editor.org/info/rfc9293>.
+
+ [RFC9312] Kühlewind, M. and B. Trammell, "Manageability of the QUIC
+ Transport Protocol", RFC 9312, DOI 10.17487/RFC9312,
+ September 2022, <https://www.rfc-editor.org/info/rfc9312>.
+
+ [SFRAME] IETF, "Secure Frame (sframe)",
+ <https://datatracker.ietf.org/doc/draft-ietf-sframe-enc/>.
+
+ [SRT] Sharabayko, M., "SRT Protocol Overview", April 2020,
+ <https://datatracker.ietf.org/meeting/interim-2020-mops-
+ 01/materials/slides-interim-2020-mops-01-sessa-srt-
+ protocol-overview-00>.
+
+ [Survey360]
+ Yaqoob, A., Bi, T., and G. Muntean, "A Survey on Adaptive
+ 360° Video Streaming: Solutions, Challenges and
+ Opportunities", IEEE Communications Surveys & Tutorials,
+ Volume 22, Issue 4, DOI 10.1109/COMST.2020.3006999, July
+ 2020, <https://ieeexplore.ieee.org/document/9133103>.
+
+Acknowledgments
+
+ Thanks to Nancy Cam-Winget, Leslie Daigle, Roman Danyliw, Glenn Deen,
+ Martin Duke, Linda Dunbar, Lars Eggert, Mike English, Roni Even,
+ Aaron Falk, Alexandre Gouaillard, Erik Kline, Renan Krishna, Warren
+ Kumari, Will Law, Chris Lemmons, Kiran Makhjani, Sanjay Mishra, Mark
+ Nottingham, Dave Oran, Lucas Pardue, Tommy Pauly, Kyle Rose, Zahed
+ Sarker, Michael Scharf, John Scudder, Valery Smyslov, Matt Stock,
+ Éric Vyncke, and Robert Wilton for very helpful suggestions, reviews,
+ and comments.
+
+Authors' Addresses
+
+ Jake Holland
+ Akamai Technologies, Inc.
+ 150 Broadway
+ Cambridge, MA 02144
+ United States of America
+ Email: jakeholland.net@gmail.com
+
+
+ Ali Begen
+ Networked Media
+ Turkey
+ Email: ali.begen@networked.media
+
+
+ Spencer Dawkins
+ Tencent America LLC
+ United States of America
+ Email: spencerdawkins.ietf@gmail.com