Diffstat (limited to 'doc/rfc/rfc9318.txt')
-rw-r--r-- | doc/rfc/rfc9318.txt | 1724 |
1 files changed, 1724 insertions, 0 deletions
diff --git a/doc/rfc/rfc9318.txt b/doc/rfc/rfc9318.txt new file mode 100644 index 0000000..4c0c822 --- /dev/null +++ b/doc/rfc/rfc9318.txt @@ -0,0 +1,1724 @@ + + + + +Internet Architecture Board (IAB) W. Hardaker +Request for Comments: 9318 +Category: Informational O. Shapira +ISSN: 2070-1721 October 2022 + + + IAB Workshop Report: Measuring Network Quality for End-Users + +Abstract + + The Measuring Network Quality for End-Users workshop was held + virtually by the Internet Architecture Board (IAB) on September + 14-16, 2021. This report summarizes the workshop, the topics + discussed, and some preliminary conclusions drawn at the end of the + workshop. + + Note that this document is a report on the proceedings of the + workshop. The views and positions documented in this report are + those of the workshop participants and do not necessarily reflect IAB + views and positions. + +Status of This Memo + + This document is not an Internet Standards Track specification; it is + published for informational purposes. + + This document is a product of the Internet Architecture Board (IAB) + and represents information that the IAB has deemed valuable to + provide for permanent record. It represents the consensus of the + Internet Architecture Board (IAB). Documents approved for + publication by the IAB are not candidates for any level of Internet + Standard; see Section 2 of RFC 7841. + + Information about the current status of this document, any errata, + and how to provide feedback on it may be obtained at + https://www.rfc-editor.org/info/rfc9318. + +Copyright Notice + + Copyright (c) 2022 IETF Trust and the persons identified as the + document authors. All rights reserved. + + This document is subject to BCP 78 and the IETF Trust's Legal + Provisions Relating to IETF Documents + (https://trustee.ietf.org/license-info) in effect on the date of + publication of this document. Please review these documents + carefully, as they describe your rights and restrictions with respect + to this document. + +Table of Contents + + 1. Introduction + 1.1. Problem Space + 2. Workshop Agenda + 3. Position Papers + 4. Workshop Topics and Discussion + 4.1. Introduction and Overviews + 4.1.1. Key Points from the Keynote by Vint Cerf + 4.1.2. Introductory Talks + 4.1.3. Introductory Talks - Key Points + 4.2. Metrics Considerations + 4.2.1. Common Performance Metrics + 4.2.2. Availability Metrics + 4.2.3. Capacity Metrics + 4.2.4. Latency Metrics + 4.2.5. Measurement Case Studies + 4.2.6. Metrics Key Points + 4.3. Cross-Layer Considerations + 4.3.1. Separation of Concerns + 4.3.2. Security and Privacy Considerations + 4.3.3. Metric Measurement Considerations + 4.3.4. Towards Improving Future Cross-Layer Observability + 4.3.5. Efficient Collaboration between Hardware and Transport + Protocols + 4.3.6. Cross-Layer Key Points + 4.4. Synthesis + 4.4.1. Measurement and Metrics Considerations + 4.4.2. End-User Metrics Presentation + 4.4.3. Synthesis Key Points + 5. Conclusions + 5.1. General Statements + 5.2. Specific Statements about Detailed Protocols/Techniques + 5.3. Problem Statements and Concerns + 5.4. No-Consensus-Reached Statements + 6. Follow-On Work + 7. IANA Considerations + 8. Security Considerations + 9. Informative References + Appendix A. Program Committee + Appendix B. Workshop Chairs + Appendix C. Workshop Participants + IAB Members at the Time of Approval + Acknowledgments + Contributors + Authors' Addresses + +1. 
Introduction

   The Internet Architecture Board (IAB) holds occasional workshops designed to consider long-term issues and strategies for the Internet, and to suggest future directions for the Internet architecture.  This long-term planning function of the IAB is complementary to the ongoing engineering efforts performed by working groups of the Internet Engineering Task Force (IETF).

   The Measuring Network Quality for End-Users workshop [WORKSHOP] was held virtually by the Internet Architecture Board (IAB) on September 14-16, 2021.  This report summarizes the workshop, the topics discussed, and some preliminary conclusions drawn at the end of the workshop.

1.1.  Problem Space

   The Internet in 2021 is quite different from what it was 10 years ago.  Today, it is a crucial part of everyone's daily life.  People use the Internet for their social life, for their daily jobs, for routine shopping, and for keeping up with major events.  An increasing number of people can access a gigabit connection, which would have been hard to imagine a decade ago.  Additionally, thanks to improvements in security, people trust the Internet for banking transactions, purchasing goods, and everyday bill payments.

   At the same time, some aspects of the end-user experience have not improved as much.  Many users have typical connection latencies that remain at decade-old levels.  Despite significant reliability improvements in data center environments, end users still often see interruptions in service.  Despite algorithmic advances in the field of control theory, one still finds that the queuing delays in the last-mile equipment exceed the accumulated transit delays.  Transport improvements, such as QUIC, Multipath TCP, and TCP Fast Open, are still not fully supported in some networks.  Likewise, various advances in the security and privacy of user data, such as encrypted DNS to the local resolver, are not widely supported.

   One of the major factors behind this lack of progress is the popular perception that throughput is the sole measure of the quality of Internet connectivity.  With such a narrow focus, the Measuring Network Quality for End-Users workshop aimed to discuss various topics:

   *  What is user latency under typical working conditions?

   *  How reliable is connectivity across longer time periods?

   *  Do networks allow the use of a broad range of protocols?

   *  What services can be run by network clients?

   *  What kind of IPv4, NAT, or IPv6 connectivity is offered, and are there firewalls?

   *  What security mechanisms are available for local services, such as DNS?

   *  To what degree are the privacy, confidentiality, integrity, and authenticity of user communications guarded?

   Improving these aspects of network quality will likely depend on measuring and exposing metrics in a meaningful way to all involved parties, including to end users.  Such measurement and exposure of the right metrics will allow service providers and network operators to concentrate focus on their users' experience and will simultaneously empower users to choose the Internet Service Providers (ISPs) that can deliver the best experience based on their needs.

   With this in mind, the workshop also aimed to answer the following questions:

   *  What are the fundamental properties of a network that contribute to a good user experience?

   *  What metrics quantify these properties, and how can we collect such metrics in a practical way?
   *  What are the best practices for interpreting those metrics and incorporating them in a decision-making process?

   *  What are the best ways to communicate these properties to service providers and network operators?

   *  How can these metrics be displayed to users in a meaningful way?

2.  Workshop Agenda

   The Measuring Network Quality for End-Users workshop was divided into the following main topic areas; see further discussion in Sections 4 and 5:

   *  Introduction overviews and a keynote by Vint Cerf

   *  Metrics considerations

   *  Cross-layer considerations

   *  Synthesis

   *  Group conclusions

3.  Position Papers

   The following position papers were received for consideration by the workshop attendees.  The workshop's web page [WORKSHOP] contains archives of the papers, presentations, and recorded videos.

   *  Ahmed Aldabbagh.  "Regulatory perspective on measuring network quality for end users" [Aldabbagh2021]

   *  Al Morton.  "Dream-Pipe or Pipe-Dream: What Do Users Want (and how can we assure it)?" [Morton2021]

   *  Alexander Kozlov.  "The 2021 National Internet Segment Reliability Research"

   *  Anna Brunstrom.  "Measuring network quality - the MONROE experience"

   *  Bob Briscoe, Greg White, Vidhi Goel, and Koen De Schepper.  "A Single Common Metric to Characterize Varying Packet Delay" [Briscoe2021]

   *  Brandon Schlinker.  "Internet Performance from Facebook's Edge" [Schlinker2019]

   *  Christoph Paasch, Kristen McIntyre, Randall Meyer, Stuart Cheshire, and Omer Shapira.  "An end-user approach to the Internet Score" [McIntyre2021]

   *  Christoph Paasch, Randall Meyer, Stuart Cheshire, and Omer Shapira.  "Responsiveness under Working Conditions" [Paasch2021]

   *  Dave Reed and Levi Perigo.  "Measuring ISP Performance in Broadband America: A Study of Latency Under Load" [Reed2021]

   *  Eve M. Schooler and Rick Taylor.  "Non-traditional Network Metrics"

   *  Gino Dion.  "Focusing on latency, not throughput, to provide better internet experience and network quality" [Dion2021]

   *  Gregory Mirsky, Xiao Min, Gyan Mishra, and Liuyan Han.  "The error performance metric in a packet-switched network" [Mirsky2021]

   *  Jana Iyengar.  "The Internet Exists In Its Use" [Iyengar2021]

   *  Jari Arkko and Mirja Kuehlewind.  "Observability is needed to improve network quality" [Arkko2021]

   *  Joachim Fabini.  "Network Quality from an End User Perspective" [Fabini2021]

   *  Jonathan Foulkes.  "Metrics helpful in assessing Internet Quality" [Foulkes2021]

   *  Kalevi Kilkki and Benjamin Finley.  "In Search of Lost QoS" [Kilkki2021]

   *  Karthik Sundaresan, Greg White, and Steve Glennon.  "Latency Measurement: What is latency and how do we measure it?"

   *  Keith Winstein.  "Five Observations on Measuring Network Quality for Users of Real-Time Media Applications"

   *  Ken Kerpez, Jinous Shafiei, John Cioffi, Pete Chow, and Djamel Bousaber.  "Wi-Fi and Broadband Data" [Kerpez2021]

   *  Kenjiro Cho.  "Access Network Quality as Fitness for Purpose"

   *  Koen De Schepper, Olivier Tilmans, and Gino Dion.  "Challenges and opportunities of hardware support for Low Queuing Latency without Packet Loss" [DeSchepper2021]

   *  Kyle MacMillian and Nick Feamster.  "Beyond Speed Test: Measuring Latency Under Load Across Different Speed Tiers" [MacMillian2021]

   *  Lucas Pardue and Sreeni Tellakula.  "Lower-layer performance not indicative of upper-layer success" [Pardue2021]

   *  Matt Mathis. 
"Preliminary Longitudinal Study of Internet + Responsiveness" [Mathis2021] + + * Michael Welzl. "A Case for Long-Term Statistics" [Welzl2021] + + * Mikhail Liubogoshchev. "Cross-layer Cooperation for Better + Network Service" [Liubogoshchev2021] + + * Mingrui Zhang, Vidhi Goel, and Lisong Xu. "User-Perceived Latency + to Measure CCAs" [Zhang2021] + + * Neil Davies and Peter Thompson. "Measuring Network Impact on + Application Outcomes Using Quality Attenuation" [Davies2021] + + * Olivier Bonaventure and Francois Michel. "Packet delivery time as + a tie-breaker for assessing Wi-Fi access points" [Michel2021] + + * Pedro Casas. "10 Years of Internet-QoE Measurements. Video, + Cloud, Conferencing, Web and Apps. What do we Need from the + Network Side?" [Casas2021] + + * Praveen Balasubramanian. "Transport Layer Statistics for Network + Quality" [Balasubramanian2021] + + * Rajat Ghai. "Using TCP Connect Latency for measuring CX and + Network Optimization" [Ghai2021] + + * Robin Marx and Joris Herbots. "Merge Those Metrics: Towards + Holistic (Protocol) Logging" [Marx2021] + + * Sandor Laki, Szilveszter Nadas, Balazs Varga, and Luis M. + Contreras. "Incentive-Based Traffic Management and QoS + Measurements" [Laki2021] + + * Satadal Sengupta, Hyojoon Kim, and Jennifer Rexford. "Fine- + Grained RTT Monitoring Inside the Network" [Sengupta2021] + + * Stuart Cheshire. "The Internet is a Shared Network" + [Cheshire2021] + + * Toerless Eckert and Alex Clemm. "network-quality-eckert-clemm- + 00.4" + + * Vijay Sivaraman, Sharat Madanapalli, and Himal Kumar. "Measuring + Network Experience Meaningfully, Accurately, and Scalably" + [Sivaraman2021] + + * Yaakov (J) Stein. "The Futility of QoS" [Stein2021] + +4. Workshop Topics and Discussion + + The agenda for the three-day workshop was broken into four separate + sections that each played a role in framing the discussions. The + workshop started with a series of introduction and problem space + presentations (Section 4.1), followed by metrics considerations + (Section 4.2), cross-layer considerations (Section 4.3), and a + synthesis discussion (Section 4.4). After the four subsections + concluded, a follow-on discussion was held to draw conclusions that + could be agreed upon by workshop participants (Section 5). + +4.1. Introduction and Overviews + + The workshop started with a broad focus on the state of user Quality + of Service (QoS) and Quality of Experience (QoE) on the Internet + today. The goal of the introductory talks was to set the stage for + the workshop by describing both the problem space and the current + solutions in place and their limitations. + + The introduction presentations provided views of existing QoS and QoE + measurements and their effectiveness. Also discussed was the + interaction between multiple users within the network, as well as the + interaction between multiple layers of the OSI stack. Vint Cerf + provided a keynote describing the history and importance of the + topic. + +4.1.1. Key Points from the Keynote by Vint Cerf + + We may be operating in a networking space with dramatically different + parameters compared to 30 years ago. This differentiation justifies + reconsidering not only the importance of one metric over the other + but also reconsidering the entire metaphor. + + It is time for the experts to look at not only adjusting TCP but also + exploring other protocols, such as QUIC has done lately. It's + important that we feel free to consider alternatives to TCP. 
TCP is + not a teddy bear, and one should not be afraid to replace it with a + transport layer with better properties that better benefit its users. + + A suggestion: we should consider exercises to identify desirable + properties. As we are looking at the parametric spaces, one can + identify "desirable properties", as opposed to "fundamental + properties", for example, a low-latency property. An example coming + from the Advanced Research Projects Agency (ARPA): you want to know + where the missile is now, not where it was. Understanding drives + particular parameter creation and selection in the design space. + + When parameter values are changed in extreme, such as connectiveness, + alternative designs will emerge. One case study of note is the + interplanetary protocol, where "ping" is no longer indicative of + anything useful. While we look at responsiveness, we should not + ignore connectivity. + + Unfortunately, maintaining backward compatibility is painful. The + work on designing IPv6 so as to transition from IPv4 could have been + done better if the backward compatibility was considered. It is too + late for IPv6, but it is not too late to consider this issue for + potential future problems. + + IPv6 is still not implemented fully everywhere. It's been a long + road to deployment since starting work in 1996, and we are still not + there. In 1996, the thinking was that it was quite easy to implement + IPv6, but that failed to hold true. In 1996, the dot-com boom began, + where a lot of money was spent quickly, and the moment was not caught + in time while the market expanded exponentially. This should serve + as a cautionary tale. + + One last point: consider performance across multiple hops in the + Internet. We've not seen many end-to-end metrics, as successfully + developing end-to-end measurements across different network and + business boundaries is quite hard to achieve. A good question to ask + when developing new protocols is "will the new protocol work across + multiple network hops?" + + Multi-hop networks are being gradually replaced by humongous, flat + networks with sufficient connectivity between operators so that + systems become 1 hop, or 2 hops at most, away from each other (e.g., + Google, Facebook, and Amazon). The fundamental architecture of the + Internet is changing. + +4.1.2. Introductory Talks + + The Internet is a shared network built on IP protocols using packet + switching to interconnect multiple autonomous networks. The + Internet's departure from circuit-switching technologies allowed it + to scale beyond any other known network design. On the other hand, + the lack of in-network regulation made it difficult to ensure the + best experience for every user. + + As Internet use cases continue to expand, it becomes increasingly + more difficult to predict which network characteristics correlate + with better user experiences. Different application classes, e.g., + video streaming and teleconferencing, can affect user experience in + ways that are complex and difficult to measure. Internet utilization + shifts rapidly during the course of each day, week, and year, which + further complicates identifying key metrics capable of predicting a + good user experience. + + QoS initiatives attempted to overcome these difficulties by strictly + prioritizing different types of traffic. However, QoS metrics do not + always correlate with user experience. 
The utility of the QoS metric is further limited by the difficulties in building solutions with the desired QoS characteristics.

   QoE initiatives attempted to integrate the psychological aspects of how quality is perceived and create statistical models designed to optimize the user experience.  Although building such models requires significant effort, the QoE approach proved beneficial in certain application classes.  Unfortunately, generalizing the models proved to be difficult, and the question of how different applications affect each other when sharing the same network remains an open problem.

   The industry's focus on giving the end user more throughput/bandwidth led to remarkable advances.  In many places around the world, a home user enjoys gigabit speeds to their ISP.  This is so remarkable that it would have been brushed off as science fiction a decade ago.  However, the focus on increased capacity came at the expense of neglecting another important core metric: latency.  As a result, end users whose experience is negatively affected by high latency were advised to upgrade their equipment to get more throughput instead.  [MacMillian2021] showed that such an upgrade can sometimes lead to latency improvements, owing to the economics of overselling the "value-priced" data plans.

   As the industry continued to give end users more throughput, while mostly neglecting latency concerns, application designs started to employ various techniques for hiding latency and short service disruptions.  For example, a user's web browser performance experience is closely tied to the content in the browser's local cache.  While such techniques can clearly improve the user experience when using stale data is possible, this development further decouples user experience from core metrics.

   In the most recent 10 years, efforts by Dave Taht and the bufferbloat community have led to significant progress in updating queuing algorithms to reduce latencies under load compared to simpler FIFO queues.  Unfortunately, the home router industry has yet to implement these algorithms, mostly due to marketing and cost concerns.  Most home router manufacturers depend on System on a Chip (SoC) acceleration to create products with a desired throughput.  SoC manufacturers opt for simpler algorithms and aggressive aggregation, reasoning that a higher-throughput chip will have guaranteed demand.  Because consumers are offered choices primarily among different high-throughput devices, the perception that a higher throughput leads to a higher QoS continues to strengthen.

   The home router is not the only place that can benefit from clearer indications of acceptable performance for users.  Since users perceive the Internet via the lens of applications, it is important that we call upon application vendors to adopt solutions that stress lower latencies.  Unfortunately, while bandwidth is straightforward to measure, responsiveness is trickier.  Many applications have found a set of metrics that are helpful to their realm but do not generalize well and cannot become universally applicable.  Furthermore, due to the highly competitive application space, vendors may have economic reasons to avoid sharing their most useful metrics.

4.1.3.  Introductory Talks - Key Points

   1.  Measuring bandwidth is necessary but is not alone sufficient.
   2.  In many cases, Internet users don't need more bandwidth but rather need "better bandwidth", i.e., they need other connectivity improvements.

   3.  Users perceive the quality of their Internet connection based on the applications they use, which are affected by a combination of factors.  There's little value in exposing a typical user to the entire spectrum of possible reasons for the poor performance perceived in their application-centric view.

   4.  Many factors affecting user experience are outside the users' sphere of control.  It's unclear whether exposing users to these other factors will help them understand the state of their network performance.  In general, users prefer simple, categorical choices (e.g., "good", "better", and "best" options).

   5.  The Internet content market is highly competitive, and many applications develop their own "secret sauce".

4.2.  Metrics Considerations

   In the second agenda section, the workshop continued its discussion about metrics that can be used instead of or in addition to available bandwidth.  Several workshop attendees presented deep-dive studies on measurement methodology.

4.2.1.  Common Performance Metrics

   Losing Internet access entirely is, of course, the worst user experience.  Unfortunately, unless rebooting the home router restores connectivity, there is little a user can do other than contacting their service provider.  Nevertheless, there is value in the systematic collection of availability metrics on the client side; these can help the user's ISP localize and resolve issues faster while enabling users to better choose between ISPs.  One can measure availability directly by simply attempting connections from the client side to distant locations of interest.  For example, Ookla's [Speedtest] uses a large number of Android devices to measure network and cellular availability around the globe.  Ookla collects hundreds of millions of data points per day and uses these for accurate availability reporting.  An alternative approach is to derive availability from the failure rates of other tests.  For example, [FCC_MBA] and [FCC_MBA_methodology] use thousands of off-the-shelf routers, with measurement software developed by [SamKnows].  These routers perform an array of network tests and report availability based on whether test connections were successful or not.

   Measuring available capacity can be helpful to end users, but it is even more valuable for service providers and application developers.  High-definition video streaming requires significantly more capacity than any other type of traffic.  At the time of the workshop, video traffic constituted 90% of overall Internet traffic and contributed to 95% of the revenues from monetization (via subscriptions, fees, or ads).  As a result, video streaming services, such as Netflix, need to continuously cope with rapid changes in available capacity.  The ability to measure available capacity in real time is what the different adaptive bitrate (ABR) compression algorithms leverage to ensure the best possible user experience.  Measuring aggregated capacity demand allows ISPs to be ready for traffic spikes.  For example, during the end-of-year holiday season, the global demand for capacity has been shown to be 5-7 times higher than during other seasons.  For end users, knowledge of their capacity needs can help them select the best data plan given their intended usage.
In many cases, however, + end users have more than enough capacity, and adding more bandwidth + will not improve their experience -- after a point, it is no longer + the limiting factor in user experience. Finally, the ability to + differentiate between the "throughput" and the "goodput" can be + helpful in identifying when the network is saturated. + + In measuring network quality, latency is defined as the time it takes + a packet to traverse a network path from one end to the other. At + the time of this report, users in many places worldwide can enjoy + Internet access that has adequately high capacity and availability + for their current needs. For these users, latency improvements, + rather than bandwidth improvements, can lead to the most significant + improvements in QoE. The established latency metric is a round-trip + time (RTT), commonly measured in milliseconds. However, users often + find RTT values unintuitive since, unlike other performance metrics, + high RTT values indicate poor latency and users typically understand + higher scores to be better. To address this, [Paasch2021] and + [Mathis2021] present an inverse metric, called "Round-trips Per + Minute" (RPM). + + There is an important distinction between "idle latency" and "latency + under working conditions". The former is measured when the network + is underused and reflects a best-case scenario. The latter is + measured when the network is under a typical workload. Until + recently, typical tools reported a network's idle latency, which can + be misleading. For example, data presented at the workshop shows + that idle latencies can be up to 25 times lower than the latency + under typical working loads. Because of this, it is essential to + make a clear distinction between the two when presenting latency to + end users. + + Data shows that rapid changes in capacity affect latency. + [Foulkes2021] attempts to quantify how often a rapid change in + capacity can cause network connectivity to become "unstable" (i.e., + having high latency with very little throughput). Such changes in + capacity can be caused by infrastructure failures but are much more + often caused by in-network phenomena, like changing traffic + engineering policies or rapid changes in cross-traffic. + + Data presented at the workshop shows that 36% of measured lines have + capacity metrics that vary by more than 10% throughout the day and + across multiple days. These differences are caused by many + variables, including local connectivity methods (Wi-Fi vs. Ethernet), + competing LAN traffic, device load/configuration, time of day, and + local loop/backhaul capacity. These factor variations make measuring + capacity using only an end-user device or other end-network + measurement difficult. A network router seeing aggregated traffic + from multiple devices provides a better vantage point for capacity + measurements. Such a test can account for the totality of local + traffic and perform an independent capacity test. However, various + factors might still limit the accuracy of such a test. Accurate + capacity measurement requires multiple samples. + + As users perceive the Internet through the lens of applications, it + may be difficult to correlate changes in capacity and latency with + the quality of the end-user experience. For example, web browsers + rely on cached page versions to shorten page load times and mitigate + connectivity losses. In addition, social networking applications + often rely on prefetching their "feed" items. 
These techniques make the core in-network metrics less indicative of the users' experience and necessitate collecting data from the end-user applications themselves.

   It is helpful to distinguish applications that operate on a "fixed latency budget" from those that have more tolerance to latency variance.  Cloud gaming serves as an example application that requires a "fixed latency budget", as a sudden latency spike can decide the "win/lose" ratio for a player.  Companies that compete in the lucrative cloud gaming market make significant infrastructure investments, such as building entire data centers closer to their users.  These investments highlight that the economic benefit of reducing latency spikes outweighs the associated deployment costs.  On the other hand, applications that are more tolerant to latency spikes can continue to operate reasonably well through short spikes.  Yet, even those applications can benefit from consistently low latency as usage shifts.  For example, Video-on-Demand (VOD) apps can work reasonably well when the video is consumed linearly, but once the user tries to "switch a channel" or to "skip ahead", the user experience suffers unless the latency is sufficiently low.

   Finally, as applications continue to evolve, in-application metrics are gaining in importance.  For example, VOD applications can assess the QoE by application-specific metrics, such as whether the video player is able to use the highest possible resolution, whether the video is smooth or freezing, or other similar metrics.  Application developers can then effectively use these metrics to prioritize future work.  All popular video platforms (YouTube, Instagram, Netflix, and others) have developed frameworks to collect and analyze VOD metrics at scale.  One example is the Scuba framework used by Meta [Scuba].

   Unfortunately, in-application metrics can be challenging to use for comparative research purposes.  First, different applications often use different metrics to measure the same phenomena.  For example, application A may measure the smoothness of video via "mean time to rebuffer", while application B may rely on the "probability of rebuffering per second" for the same purpose.  A different challenge with in-application metrics is that VOD is a significant source of revenue for companies, such as YouTube, Facebook, and Netflix, which creates a proprietary incentive against exchanging the in-application data.  A final concern centers on the privacy issues resulting from in-application metrics that accurately describe the activities and preferences of an individual end user.

4.2.2.  Availability Metrics

   Availability is simply defined as whether or not a packet can be sent and then received by its intended recipient.  Availability is naively thought to be the simplest to measure, but it is more complex when considering that continual, instantaneous measurements would be needed to detect the smallest of outages.  Also difficult is determining the root cause of a failure: was the user's line down, was something broken in the middle of the network, or was it the service with which the user was attempting to communicate?

4.2.3.  Capacity Metrics

   If the network capacity does not meet user demands, the network quality will be impacted.  Once the capacity meets the demands, increasing capacity won't lead to further quality improvements.
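   As a purely illustrative sketch of this "capacity meets demand" idea (not drawn from any workshop contribution; the per-application demand figures, the headroom factor, and the measured rate below are hypothetical placeholders), the following Python fragment compares a capacity sample against the summed demand of the applications in use:

      # Illustrative only: decide whether a measured capacity sample
      # meets the aggregate demand of the active applications.
      APP_DEMAND_MBPS = {
          "hd_video_stream": 8.0,    # assumed bitrate of one HD stream
          "video_conference": 3.5,   # assumed two-way conferencing load
          "web_browsing": 2.0,       # assumed average browsing demand
      }

      def capacity_meets_demand(measured_mbps: float,
                                active_apps: list[str],
                                headroom: float = 1.2) -> bool:
          """True if measured capacity covers the summed application
          demand plus a safety headroom factor."""
          demand = sum(APP_DEMAND_MBPS[app] for app in active_apps)
          return measured_mbps >= demand * headroom

      # A 25 Mbit/s line comfortably covers one HD stream plus a video
      # conference; adding capacity beyond that point would not, by
      # itself, improve the experience of these two applications.
      print(capacity_meets_demand(25.0,
                                  ["hd_video_stream", "video_conference"]))

   Past that point, extra capacity helps only applications, such as large data transfers, that can actually absorb it.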
   The actual network connection capacity is determined by the equipment and the lines along the network path, and it varies throughout the day and across multiple days.  Studies involving DSL lines in North America indicate that over 30% of the DSL lines have capacity metrics that vary by more than 10% throughout the day and across multiple days.

   Some factors that affect the actual capacity are:

   1.  The presence of competing traffic, either in the LAN or in the WAN environments.  In the LAN setting, the competing traffic reflects the multiple devices that share the Internet connection.  In the WAN setting, the competing traffic often originates from the unrelated network flows that happen to share the same network path.

   2.  The capabilities of the equipment along the path of the network connection, including the data transfer rate and the amount of memory used for buffering.

   3.  Active traffic management measures, such as the traffic shapers and policers that are often used by network providers.

   There are other factors that can negatively affect the actual line capacities.

   Users' traffic demands follow their particular usage patterns and preferences.  For example, large data transfers can use any available capacity, while media streaming applications require limited capacity to function correctly.  Videoconferencing applications typically need less capacity than high-definition video streaming.

4.2.4.  Latency Metrics

   End-to-end latency is the time that a particular packet takes to traverse the network path from the user to their destination and back.  The end-to-end latency comprises several components:

   1.  The propagation delay, which reflects the path distance and the individual link technologies (e.g., fiber vs. satellite).  The propagation delay doesn't depend on the utilization of the network, to the extent that the network path remains constant.

   2.  The buffering delay, which reflects the time that segments spend in the memory of the network equipment that connects the individual network links, as well as in the memory of the transmitting endpoint.  The buffering delay depends on the network utilization, as well as on the algorithms that govern the queued segments.

   3.  The transport protocol delays, which reflect the time spent in retransmission and reassembly, as well as the time spent when the transport is "head-of-line blocked".

   4.  The application delay, which reflects the inefficiencies in the application layer; some of the workshop submissions explicitly called out this component.

   Typically, end-to-end latency is measured when the network is idle.  Results of such measurements mostly reflect the propagation delay but not other kinds of delay.  This report uses the term "idle latency" to refer to results achieved under idle network conditions.

   Alternatively, if the latency is measured when the network is under its typical working conditions, the results reflect multiple types of delays.  This report uses the term "working latency" to refer to such results.  Other sources use the term "latency under load" (LUL) as a synonym.
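   As a minimal sketch of how the two readings differ (the hostname and URL below are placeholders, TCP connect time is used as a crude stand-in for RTT, and real tools such as Apple's responsiveness test implement a far more careful methodology), the same probe can be run on an otherwise quiet connection and again while a bulk download keeps the link busy:

      import socket, threading, time, urllib.request

      TARGET = ("measurement.example.net", 443)          # placeholder
      BULK_URL = "https://measurement.example.net/large"  # placeholder

      def connect_rtt_ms(addr, samples=10):
          """Median TCP connect time in milliseconds (a crude RTT proxy)."""
          times = []
          for _ in range(samples):
              start = time.monotonic()
              with socket.create_connection(addr, timeout=5):
                  times.append((time.monotonic() - start) * 1000)
          return sorted(times)[len(times) // 2]

      def bulk_download():
          """Keep the downlink busy so that queues actually build up."""
          with urllib.request.urlopen(BULK_URL) as response:
              while response.read(1 << 16):
                  pass

      idle = connect_rtt_ms(TARGET)
      threading.Thread(target=bulk_download, daemon=True).start()
      time.sleep(2)                         # let the transfer ramp up
      working = connect_rtt_ms(TARGET)
      print(f"idle {idle:.0f} ms, working {working:.0f} ms")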
Data presented at the workshop reveals a substantial difference between the idle latency and the working latency.  Depending on the traffic direction and the technology type, the working latency is between 6 and 25 times higher than the idle latency:

   +============+============+==========+==========+============+=========+
   | Direction  | Technology | Working  | Idle     | Working -  | Working |
   |            | Type       | Latency  | Latency  | Idle       | / Idle  |
   |            |            | (ms)     | (ms)     | (ms)       | Ratio   |
   +============+============+==========+==========+============+=========+
   | Downstream | FTTH       | 148      | 10       | 138        | 15      |
   +------------+------------+----------+----------+------------+---------+
   | Downstream | Cable      | 103      | 13       | 90         | 8       |
   +------------+------------+----------+----------+------------+---------+
   | Downstream | DSL        | 194      | 10       | 184        | 19      |
   +------------+------------+----------+----------+------------+---------+
   | Upstream   | FTTH       | 207      | 12       | 195        | 17      |
   +------------+------------+----------+----------+------------+---------+
   | Upstream   | Cable      | 176      | 27       | 149        | 6       |
   +------------+------------+----------+----------+------------+---------+
   | Upstream   | DSL        | 686      | 27       | 659        | 25      |
   +------------+------------+----------+----------+------------+---------+

                                  Table 1

   While historically the tooling available for measuring latency focused on measuring the idle latency, there is a trend in the industry to start measuring the working latency as well, e.g., Apple's [NetworkQuality].

4.2.5.  Measurement Case Studies

   Workshop participants proposed several concrete methodologies for measuring network quality for end users.

   [Paasch2021] introduced a methodology for measuring working latency from the end-user vantage point.  The suggested method incrementally adds network flows between the user device and a server endpoint until a bottleneck capacity is reached.  From these measurements, a round-trip latency is measured and reported to the end user.  The authors chose to report results with the RPM metric.  The methodology has been implemented in Apple's macOS Monterey.

   [Mathis2021] applied the RPM metric to the results of more than 4 billion download tests that M-Lab performed from 2010 to 2021.  During this time frame, the M-Lab measurement platform underwent several upgrades that allowed the research team to compare the effect of different TCP congestion control algorithms (CCAs) on the measured end-to-end latency.  The study showed that the use of the cubic CCA leads to increased working latency, which is attributed to its tendency to fill large bottleneck queues.
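   For illustration, converting a measured round-trip time into the RPM metric is, at its simplest, an inversion of the RTT (the full responsiveness methodology aggregates many probes, so the helper below is only a sketch).  Using the Table 1 figures, an idle RTT of 10 ms corresponds to 6000 RPM, while a working RTT of 194 ms corresponds to roughly 309 RPM:

      def rpm(rtt_ms: float) -> float:
          """Round-trips Per Minute: how many round trips of the given
          duration fit into one minute (higher is better)."""
          return 60_000.0 / rtt_ms

      print(rpm(10.0), rpm(194.0))   # -> 6000.0  309.27...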
   [Schlinker2019] presented a large-scale study that aimed to establish a correlation between goodput and QoE on a large social network.  The authors performed the measurements at multiple data centers from which video segments of set sizes were streamed to a large number of end users.  The authors used the goodput and throughput metrics to determine whether particular paths were congested.

   [Reed2021] presented the analysis of working latency measurements collected as part of the Measuring Broadband America (MBA) program by the Federal Communication Commission (FCC).  The FCC does not include working latency in its yearly report but does offer it in the raw data files.  The authors used a subset of the raw data to identify important differences in the working latencies across different ISPs.

   [MacMillian2021] presented an analysis of working latency across multiple service tiers.  They found that, unsurprisingly, "premium" tier users experienced lower working latency compared to a "value" tier.  The data demonstrated that working latency varies significantly within each tier; one possible explanation is the difference in equipment deployed in the homes.

   These studies stressed the importance of measuring working latency.  At the time of this report, many home router manufacturers rely on hardware-accelerated routing that uses FIFO queues.  Measuring working latency on these devices and making consumers aware of the effect of choosing one manufacturer vs. another can help improve the home router situation.  The ideal test would be able to identify the working latency and pinpoint the source of the delay (home router, ISP, server side, or some network node in between).

   Another source of high working latency comes from network routers exposed to cross-traffic.  As [Schlinker2019] indicated, these can become saturated during the peak hours of the day.  Systematic testing of the working latency in routers under load can help improve both our understanding of latency and the impact of deployed infrastructure.

4.2.6.  Metrics Key Points

   The metrics for network quality can be roughly grouped into the following:

   1.  Availability metrics, which indicate whether the user can access the network at all.

   2.  Capacity metrics, which indicate whether the actual line capacity is sufficient to meet the user's demands.

   3.  Latency metrics, which indicate if the user gets the data in a timely fashion.

   4.  Higher-order metrics, which include both the network metrics, such as inter-packet arrival time, and the application metrics, such as the mean time between rebuffering for video streaming.

   The availability metrics can be seen as a derivative of either the capacity (zero capacity leading to zero availability) or the latency (infinite latency leading to zero availability).

   Key points from the presentations and discussions included the following:

   1.  Availability and capacity are "hygienic factors" -- unless an application is capable of using extra capacity, end users will see little benefit from using over-provisioned lines.

   2.  Working latency has a stronger correlation with the user experience than latency under an idle network load.  Working latency can exceed the idle latency by an order of magnitude.

   3.  The RPM metric is a stable metric, with higher values being better, that may be more effective when communicating latency to end users.

   4.  The relationship between throughput and goodput can be effective in finding saturation points, both in client-side [Paasch2021] and server-side [Schlinker2019] settings (see the illustrative sketch at the end of this section).

   5.  Working latency depends on the algorithm choice for addressing endpoint congestion control and router queuing.

   Finally, it was commonly agreed that the best metrics are those that are actionable.
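   As a rough sketch of key point 4 (purely illustrative; the efficiency threshold and the sample values are invented for this example and were not proposed at the workshop), a path can be flagged as saturated when goodput stops tracking throughput:

      def looks_saturated(throughput_mbps: float,
                          goodput_mbps: float,
                          efficiency_floor: float = 0.85) -> bool:
          """Flag saturation when useful data (goodput) falls well below
          what the link is carrying (throughput), e.g., because of
          retransmissions and protocol overhead under congestion."""
          if throughput_mbps <= 0:
              return False
          return (goodput_mbps / throughput_mbps) < efficiency_floor

      # (throughput, goodput) samples in Mbit/s
      for sample in [(95.0, 93.0), (98.0, 71.0)]:
          print(sample, looks_saturated(*sample))   # False, then True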
4.3.  Cross-Layer Considerations

   In the cross-layer segment of the workshop, participants presented material on and discussed how to accurately measure exactly where problems occur.  Discussion centered especially on the differences between physically wired and wireless connections and the difficulties of accurately determining problem spots when multiple different types of network segments are responsible for the quality.  As an example, [Kerpez2021] showed that the limited bandwidth of 2.4 GHz Wi-Fi is the most frequent bottleneck.  In comparison, the wider bandwidth of 5 GHz Wi-Fi was the bottleneck in only 20% of observations.

   The participants agreed that no single component of a network connection has all the data required to measure the effects of the network performance on the quality of the end-user experience.

   *  Applications that are running on the end-user devices have the best insight into their respective performance but have limited visibility into the behavior of the network itself and are unable to act based on their limited perspective.

   *  ISPs have good insight into QoS considerations but are not able to infer the effect of the QoS metrics on the quality of end-user experiences.

   *  Content providers have good insight into the aggregated behavior of the end users but lack insight into which aspects of network performance are leading indicators of user behavior.

   The workshop identified the need for a standard and extensible way to exchange network performance characteristics.  Such an exchange standard should address (at least) the following:

   *  A scalable way to capture the performance of multiple (potentially thousands of) endpoints.

   *  A data exchange format that prevents data manipulation, so that the different participants are not able to game the mechanisms.

   *  Preservation of end-user privacy.  In particular, federated learning approaches should be preferred so that no centralized entity has access to the whole picture.

   *  A transparent model for giving the different actors on a network connection an incentive to share the performance data they collect.

   *  An accompanying set of tools to analyze the data.

4.3.1.  Separation of Concerns

   Commonly, there is a tight coupling between collecting performance metrics, interpreting those metrics, and acting upon the interpretation.  Unfortunately, such a model is not the best for successfully exchanging cross-layer data, as:

   *  actors that are able to collect particular performance metrics (e.g., the TCP RTT) do not necessarily have the context necessary for a meaningful interpretation,

   *  the actors that have the context and the computational/storage capacity to interpret metrics do not necessarily have the ability to control the behavior of the network/application, and

   *  the actors that can control the behavior of networks and/or applications typically do not have access to complete measurement data.

   The participants agreed that it is important to separate the above three aspects, so that:

   *  the different actors that have the data, but not the ability to interpret and/or act upon it, can publish their measured data and

   *  the actors that have the expertise in interpreting and synthesizing performance data can publish the results of their interpretations.

4.3.2.  Security and Privacy Considerations

   Preserving the privacy of Internet end users is a difficult requirement to meet when addressing this problem space.  There is an intrinsic trade-off between collecting more data about user activities and infringing on their privacy while doing so.  Participants agreed that observability across multiple layers is necessary for an accurate measurement of the network quality, but doing so in a way that minimizes privacy leakage is an open question.
4.3.3.  Metric Measurement Considerations

   *  The following TCP protocol metrics have been found to be effective and are available for passive measurement:

      -  TCP connection latency, measured using selective acknowledgment (SACK) or acknowledgment (ACK) timing, and the timing between TCP retransmission events are good proxies for end-to-end RTT measurements.

      -  On the Linux platform, the tcp_info structure is the de facto standard for an application to inspect the performance of kernel-space networking.  However, there is no equivalent de facto standard for user-space networking.

   *  The QUIC and MASQUE protocols make passive performance measurements more challenging.

      -  An approach that uses federated measurement/hierarchical aggregation may be more valuable for these protocols.

      -  The QLOG format seems to be the most mature candidate for such an exchange.

4.3.4.  Towards Improving Future Cross-Layer Observability

   The ownership of the Internet is spread across multiple administrative domains, making measurement of end-to-end performance data difficult.  Furthermore, the immense scale of the Internet makes aggregation and analysis of this data difficult.  [Marx2021] presented a simple logging format that could potentially be used to collect and aggregate data from different layers.

   Another aspect of cross-layer collaboration that hampers measurement is that the majority of current algorithms do not explicitly provide performance data that can be used in cross-layer analysis.  The IETF community could be more diligent in identifying each protocol's key performance indicators and exposing them as part of the protocol specification.

   Despite all these challenges, it should still be possible to perform limited-scope studies in order to have a better understanding of how user quality is affected by the interaction of the different components that constitute the Internet.  Furthermore, the recent development of federated learning algorithms suggests that it might be possible to perform cross-layer performance measurements while preserving user privacy.

4.3.5.  Efficient Collaboration between Hardware and Transport Protocols

   With the advent of Low Latency, Low Loss, and Scalable throughput (L4S) congestion notification and control, there is an even higher need for the transport protocols and the underlying hardware to work in unison.

   At the time of the workshop, the typical home router uses a single FIFO queue that is large enough to allow amortizing the lower-layer header overhead across multiple transport PDUs.  These designs worked well with the cubic congestion control algorithm, yet the newer generation of algorithms can operate on much smaller queues.  To fully support latencies less than 1 ms, the home router needs to work efficiently on sequential transmissions of just a few segments, rather than being optimized for large packet bursts.

   Another design trait common in home routers is the use of packet aggregation to further amortize the overhead added by the lower-layer headers.  Specifically, multiple IP datagrams are combined into a single, large transfer frame.  However, this aggregation can add up to 10 ms to the packet sojourn delay.
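   A back-of-the-envelope calculation shows how an aggregation delay of this magnitude arises; the frame count and arrival rate below are illustrative assumptions rather than figures from the workshop:

      # Delay added while a device waits to fill an aggregate before
      # transmitting it.  All values are illustrative assumptions.
      frames_per_aggregate = 32        # IP datagrams combined per frame
      frame_size_bytes = 1500
      arrival_rate_mbps = 40.0         # rate at which datagrams arrive

      fill_bits = (frames_per_aggregate - 1) * frame_size_bytes * 8
      added_delay_ms = fill_bits / (arrival_rate_mbps * 1_000_000) * 1000
      print(f"first datagram waits ~{added_delay_ms:.1f} ms")   # ~9.3 ms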
+ + Following the famous "you can't improve what you don't measure" + adage, it is important to expose these aggregation delays in a way + that would allow identifying the source of the bottlenecks and making + hardware more suitable for the next generation of transport + protocols. + +4.3.6. Cross-Layer Key Points + + * Significant differences exist in the characteristics of metrics to + be measured and the required optimizations needed in wireless vs. + wired networks. + + * Identification of an issue's root cause is hampered by the + challenges in measuring multi-segment network paths. + + * No single component of a network connection has all the data + required to measure the effects of the complete network + performance on the quality of the end-user experience. + + * Actionable results require both proper collection and + interpretation. + + * Coordination among network providers is important to successfully + improve the measurement of end-user experiences. + + * Simultaneously providing accurate measurements while preserving + end-user privacy is challenging. + + * Passive measurements from protocol implementations may provide + beneficial data. + +4.4. Synthesis + + Finally, in the synthesis section of the workshop, the presentations + and discussions concentrated on the next steps likely needed to make + forward progress. Of particular concern is how to bring forward + measurements that can make sense to end users trying to select + between various networking subscription options. + +4.4.1. Measurement and Metrics Considerations + + One important consideration is how decisions can be made and what + actions can be taken based on collected metrics. Measurements must + be integrated with applications in order to get true application + views of congestion, as measurements over different infrastructure or + via other applications may return incorrect results. Congestion + itself can be a temporary problem, and mitigation strategies may need + to be different depending on whether it is expected to be a short- + term or long-term phenomenon. A significant challenge exists in + measuring short-term problems, driving the need for continuous + measurements to ensure critical moments and long-term trends are + captured. For short-term problems, workshop participants debated + whether an issue that goes away is indeed a problem or is a sign that + a network is properly adapting and self-recovering. + + Important consideration must be taken when constructing metrics in + order to understand the results. Measurements can also be affected + by individual packet characteristics -- differently sized packets + typically have a linear relationship with their delay. With this in + mind, measurements can be divided into a delay based on geographical + distances, a packet-size serialization delay, and a variable (noise) + delay. Each of these three sub-component delays can be different and + individually measured across each segment in a multi-hop path. + Variable delay can also be significantly impacted by external + factors, such as bufferbloat, routing changes, network load sharing, + and other local or remote changes in performance. Network + measurements, especially load-specific tests, must also be run long + enough to ensure that any problems associated with buffering, + queuing, etc. are captured. Measurement technologies should also + distinguish between upstream and downstream measurements, as well as + measure the difference between end-to-end paths and sub-path + measurements. + +4.4.2. 
End-User Metrics Presentation

   Determining end-user needs requires informative measurements and metrics.  How do we provide the users with the service they need or want?  Is it possible for users to even voice their desires effectively?  Typically, only high-level, simplistic answers like "reliability", "capacity", and "service bundling" are given in end-user surveys.  Technical requirements that operators can consume, like "low latency" and "congestion avoidance", are not terms known to and used by end users.

   Example metrics useful to end users might include the number of users supported by a service and the number of applications or streams that a network can support.  Example solutions to combat networking issues include incentive-based traffic management strategies (e.g., an application requesting lower latency may also mean accepting lower bandwidth).  User-perceived latency must be considered, not just network latency -- users experience application-to-server latency, whereas network-to-network measurements may only be studying the lowest-level latency.  Thus, picking the right protocol to use in a measurement is critical in order to match user experience (for example, users do not transmit data over ICMP, even though it is a common measurement tool).

   In-application measurements should consider how to measure different types of applications, such as video streaming, file sharing, multi-user gaming, and real-time voice communications.  Asking users what trade-offs they are willing to accept may be a helpful approach: would they rather have a network with low latency or a network with higher bandwidth?  Gamers may make different decisions than home office users or content producers, for example.

   Furthermore, how can users make these trade-offs in a fair manner that does not impact other users?  There is a tension between solutions in this space vs. the cost associated with solving these problems, as well as which customers are willing to front these improvement costs.

   Challenges in providing higher-priority traffic to users center around the willingness of networks to listen to client requests for better treatment, even though commercial interests may not flow to them without a cost incentive.  Shared mediums in general are subject to oversubscription, such that the number of users a network can support is either accurate only for an underutilized network or assumes an average bandwidth or other usage metric that fails to hold during utilization spikes.  Individual metrics are also affected by in-home devices, from cheap routers to microwaves, and by (multi-)user behaviors during tests.  Thus, a single metric alone or a single reading without context may not be useful in assisting a user or operator to determine where the problem source actually is.

   User comprehension of a network remains a challenging problem.  Multiple workshop participants argued for a single number (potentially calculated with a weighted aggregation formula) or a small number of measurements per expected usage (e.g., a "gaming" score vs. a "content producer" score).  Many agreed that some users may instead prefer to consume simplified or color-coded ratings (e.g., good/better/best, red/yellow/green, or bronze/gold/platinum).
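   To make the "single number plus color-coded rating" idea concrete, the following sketch aggregates a few normalized metrics with a weighted formula; the metric names, weights, and thresholds are invented for illustration and were not proposed at the workshop:

      # Illustrative weighted aggregation of normalized metrics
      # (each in 0..1, higher is better).  Weights and thresholds
      # are invented placeholders.
      WEIGHTS = {"availability": 0.4, "working_latency": 0.4, "capacity": 0.2}

      def internet_score(normalized: dict[str, float]) -> tuple[float, str]:
          """Collapse several normalized metrics into one score plus a
          color-coded rating."""
          score = sum(WEIGHTS[name] * normalized[name] for name in WEIGHTS)
          if score >= 0.8:
              rating = "green"
          elif score >= 0.5:
              rating = "yellow"
          else:
              rating = "red"
          return score, rating

      # Perfect availability, mediocre working latency, ample capacity.
      print(internet_score({"availability": 1.0,
                            "working_latency": 0.4,
                            "capacity": 0.9}))     # -> (0.74, 'yellow')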
4.4.3.  Synthesis Key Points

   *  Some proposed metrics:

      -  Round-trips Per Minute (RPM)

      -  users per network

      -  latency

      -  99th-percentile latency and bandwidth

   *  Median and mean measurements are distractions from the real problems.

   *  Shared network usage greatly affects quality.

   *  Long measurements are needed to capture all facets of potential network bottlenecks.

   *  Better-funded research in all these areas is needed for progress.

   *  End users will best understand a simplified score or ranking system.

5.  Conclusions

   During the final hour of the three-day workshop, statements that the group deemed to be summary statements were gathered.  Later, any statements that were in contention were discarded (listed further below for completeness).  For this document, the authors took the original list and divided it into rough categories, applied some suggested edits discussed on the mailing list, and further edited for clarity and to provide context.

5.1.  General Statements

   1.  Bandwidth is necessary but not alone sufficient.

   2.  In many cases, Internet users don't need more bandwidth but rather need "better bandwidth", i.e., they need other improvements to their connectivity.

   3.  We need both active and passive measurements -- passive measurements can provide historical debugging.

   4.  We need passive measurements to be continuous, archivable, and queryable, including reliability/connectivity measurements.

   5.  A really meaningful metric for users is whether their application will work properly or fail because the network lacks sufficient characteristics.

   6.  A useful metric for goodness must actually incentivize goodness -- good metrics should be actionable to help drive industries towards improvement.

   7.  A lower-latency Internet, however achieved, would benefit all end users.

5.2.  Specific Statements about Detailed Protocols/Techniques

   1.  Round-trips Per Minute (RPM) is a useful, consumable metric.

   2.  We need a usable tool that fills the current gap between network reachability, latency, and speed tests.

   3.  End users that want to be involved in QoS decisions should be able to voice their needs and desires.

   4.  Applications are needed that can perform and report good quality measurements in order to identify insufficient points in network access.

   5.  Research done by regulators indicates that users/consumers prefer a simple metric per application, which frequently resolves to whether the application will work properly or not.

   6.  New measurements and QoS or QoE techniques should not rely solely on reading TCP headers.

   7.  It is clear from developers of interactive applications and from network operators that lower latency is a strong factor in user QoE.  However, metrics are lacking to support this statement directly.

5.3.  Problem Statements and Concerns

   1.  Latency means and medians are distractions from better measurements.

   2.  It is frustrating to only measure network services without simultaneously improving those services.

   3.  Stakeholder incentives aren't aligned for easy wins in this space.  Incentives are needed to motivate improvements in public network access.  Measurements may be one step towards driving competitive market incentives.

   4.  For future-proof networking, it is important to measure the ecological impact of material and energy usage.

   5. 
We do not have incontrovertible evidence that any one metric + (e.g., latency or speed) is more important than others to + persuade device vendors to concentrate on any one optimization. + +5.4. No-Consensus-Reached Statements + + Additional statements were discussed and recorded that did not have + consensus of the group at the time, but they are listed here for + completeness: + + 1. We do not have incontrovertible evidence that bufferbloat is a + prevalent problem. + + 2. The measurement needs to support reporting localization in order + to find problems. Specifically: + + * Detecting a problem is not sufficient if you can't find the + location. + + * Need more than just English -- different localization + concerns. + + 3. Stakeholder incentives aren't aligned for easy wins in this + space. + +6. Follow-On Work + + There was discussion during the workshop about where future work + should be performed. The group agreed that some work could be done + more immediately within existing IETF working groups (e.g., IPPM, + DetNet, and RAW), while other longer-term research may be needed in + IRTF groups. + +7. IANA Considerations + + This document has no IANA actions. + +8. Security Considerations + + A few security-relevant topics were discussed at the workshop, + including but not limited to: + + * what prioritization techniques can work without invading the + privacy of the communicating parties and + + * how oversubscribed networks can essentially be viewed as a DDoS + attack. + +9. Informative References + + [Aldabbagh2021] + Aldabbagh, A., "Regulatory perspective on measuring + network quality for end-users", September 2021, + <https://www.iab.org/wp-content/IAB- + uploads/2021/09/2021-09-07-Aldabbagh-Ofcom-presentationt- + to-IAB-1v00-1.pdf>. + + [Arkko2021] + Arkko, J. and M. Kühlewind, "Observability is needed to + improve network quality", August 2021, + <https://www.iab.org/wp-content/IAB-uploads/2021/09/iab- + position-paper-observability.pdf>. + + [Balasubramanian2021] + Balasubramanian, P., "Transport Layer Statistics for + Network Quality", February 2021, <https://www.iab.org/wp- + content/IAB-uploads/2021/09/transportstatsquality.pdf>. + + [Briscoe2021] + Briscoe, B., White, G., Goel, V., and K. De Schepper, "A + Single Common Metric to Characterize Varying Packet + Delay", September 2021, <https://www.iab.org/wp-content/ + IAB-uploads/2021/09/single-delay-metric-1.pdf>. + + [Casas2021] + Casas, P., "10 Years of Internet-QoE Measurements Video, + Cloud, Conferencing, Web and Apps. What do we need from + the Network Side?", August 2021, <https://www.iab.org/wp- + content/IAB-uploads/2021/09/ + net_quality_internet_qoe_CASAS.pdf>. + + [Cheshire2021] + Cheshire, S., "The Internet is a Shared Network", August + 2021, <https://www.iab.org/wp-content/IAB-uploads/2021/09/ + draft-cheshire-internet-is-shared-00b.pdf>. + + [Davies2021] + Davies, N. and P. Thompson, "Measuring Network Impact on + Application Outcomes Using Quality Attenuation", September + 2021, <https://www.iab.org/wp-content/IAB-uploads/2021/09/ + PNSol-et-al-Submission-to-Measuring-Network-Quality-for- + End-Users-1.pdf>. + + [DeSchepper2021] + De Schepper, K., Tilmans, O., and G. Dion, "Challenges and + opportunities of hardware support for Low Queuing Latency + without Packet Loss", February 2021, <https://www.iab.org/ + wp-content/IAB-uploads/2021/09/Nokia-IAB-Measuring- + Network-Quality-Low-Latency-measurement-workshop- + 20210802.pdf>. + + [Dion2021] Dion, G., De Schepper, K., and O. 
Tilmans, "Focusing on + latency, not throughput, to provide a better internet + experience and network quality", August 2021, + <https://www.iab.org/wp-content/IAB-uploads/2021/09/Nokia- + IAB-Measuring-Network-Quality-Improving-and-focusing-on- + latency-.pdf>. + + [Fabini2021] + Fabini, J., "Network Quality from an End User + Perspective", February 2021, <https://www.iab.org/wp- + content/IAB-uploads/2021/09/Fabini-IAB- + NetworkQuality.txt>. + + [FCC_MBA] FCC, "Measuring Broadband America", + <https://www.fcc.gov/general/measuring-broadband-america>. + + [FCC_MBA_methodology] + FCC, "Measuring Broadband America - Open Methodology", + <https://www.fcc.gov/general/measuring-broadband-america- + open-methodology>. + + [Foulkes2021] + Foulkes, J., "Metrics helpful in assessing Internet + Quality", September 2021, <https://www.iab.org/wp-content/ + IAB-uploads/2021/09/ + IAB_Metrics_helpful_in_assessing_Internet_Quality.pdf>. + + [Ghai2021] Ghai, R., "Using TCP Connect Latency for measuring CX and + Network Optimization", February 2021, + <https://www.iab.org/wp-content/IAB-uploads/2021/09/ + xfinity-wifi-ietf-iab-v2-1.pdf>. + + [Iyengar2021] + Iyengar, J., "The Internet Exists In Its Use", August + 2021, <https://www.iab.org/wp-content/IAB-uploads/2021/09/ + The-Internet-Exists-In-Its-Use.pdf>. + + [Kerpez2021] + Shafiei, J., Kerpez, K., Cioffi, J., Chow, P., and D. + Bousaber, "Wi-Fi and Broadband Data", September 2021, + <https://www.iab.org/wp-content/IAB-uploads/2021/09/Wi-Fi- + Report-ASSIA.pdf>. + + [Kilkki2021] + Kilkki, K. and B. Finley, "In Search of Lost QoS", + February 2021, <https://www.iab.org/wp-content/IAB- + uploads/2021/09/Kilkki-In-Search-of-Lost-QoS.pdf>. + + [Laki2021] Nadas, S., Varga, B., Contreras, L.M., and S. Laki, + "Incentive-Based Traffic Management and QoS Measurements", + February 2021, <https://www.iab.org/wp-content/IAB- + uploads/2021/11/CamRdy- + IAB_user_meas_WS_Nadas_et_al_IncentiveBasedTMwQoS.pdf>. + + [Liubogoshchev2021] + Liubogoshchev, M., "Cross-layer Cooperation for Better + Network Service", February 2021, <https://www.iab.org/wp- + content/IAB-uploads/2021/09/Cross-layer-Cooperation-for- + Better-Network-Service-2.pdf>. + + [MacMillian2021] + MacMillian, K. and N. Feamster, "Beyond Speed Test: + Measuring Latency Under Load Across Different Speed + Tiers", February 2021, <https://www.iab.org/wp-content/ + IAB-uploads/2021/09/2021_nqw_lul.pdf>. + + [Marx2021] Marx, R. and J. Herbots, "Merge Those Metrics: Towards + Holistic (Protocol) Logging", February 2021, + <https://www.iab.org/wp-content/IAB-uploads/2021/09/ + MergeThoseMetrics_Marx_Jul2021.pdf>. + + [Mathis2021] + Mathis, M., "Preliminary Longitudinal Study of Internet + Responsiveness", August 2021, <https://www.iab.org/wp- + content/IAB-uploads/2021/09/Preliminary-Longitudinal- + Study-of-Internet-Responsiveness-1.pdf>. + + [McIntyre2021] + Paasch, C., McIntyre, K., Shapira, O., Meyer, R., and S. + Cheshire, "An end-user approach to an Internet Score", + September 2021, <https://www.iab.org/wp-content/IAB- + uploads/2021/09/Internet-Score-2.pdf>. + + [Michel2021] + Michel, F. and O. Bonaventure, "Packet delivery time as a + tie-breaker for assessing Wi-Fi access points", February + 2021, <https://www.iab.org/wp-content/IAB-uploads/2021/09/ + camera_ready_Packet_delivery_time_as_a_tie_breaker_for_ass + essing_Wi_Fi_access_points.pdf>. + + [Mirsky2021] + Mirsky, G., Min, X., Mishra, G., and L. 
Han, "The error + performance metric in a packet-switched network", February + 2021, <https://www.iab.org/wp-content/IAB-uploads/2021/09/ + IAB-worshop-Error-performance-measurement-in-packet- + switched-networks.pdf>. + + [Morton2021] + Morton, A. C., "Dream-Pipe or Pipe-Dream: What Do Users + Want (and how can we assure it)?", Work in Progress, + Internet-Draft, draft-morton-ippm-pipe-dream-01, 6 + September 2021, <https://datatracker.ietf.org/doc/html/ + draft-morton-ippm-pipe-dream-01>. + + [NetworkQuality] + Apple, "Network Quality", + <https://support.apple.com/en-gb/HT212313>. + + [Paasch2021] + Paasch, C., Meyer, R., Cheshire, S., and O. Shapira, + "Responsiveness under Working Conditions", Work in + Progress, Internet-Draft, draft-cpaasch-ippm- + responsiveness-01, 25 October 2021, + <https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm- + responsiveness-01>. + + [Pardue2021] + Pardue, L. and S. Tellakula, "Lower-layer performance is + not indicative of upper-layer success", February 2021, + <https://www.iab.org/wp-content/IAB-uploads/2021/09/Lower- + layer-performance-is-not-indicative-of-upper-layer- + success-20210906-00-1.pdf>. + + [Reed2021] Reed, D.P. and L. Perigo, "Measuring ISP Performance in + Broadband America: A Study of Latency Under Load", + February 2021, <https://www.iab.org/wp-content/IAB- + uploads/2021/09/Camera_Ready_-Measuring-ISP-Performance- + in-Broadband-America.pdf>. + + [SamKnows] "SamKnows", <https://www.samknows.com/>. + + [Schlinker2019] + Schlinker, B., Cunha, I., Chiu, Y., Sundaresan, S., and E. + Katz-Basset, "Internet Performance from Facebook's Edge", + February 2019, <https://www.iab.org/wp-content/IAB- + uploads/2021/09/Internet-Performance-from-Facebooks- + Edge.pdf>. + + [Scuba] Abraham, L. et al., "Scuba: Diving into Data at Facebook", + <https://research.facebook.com/publications/scuba-diving- + into-data-at-facebook/>. + + [Sengupta2021] + Sengupta, S., Kim, H., and J. Rexford, "Fine-Grained RTT + Monitoring Inside the Network", February 2021, + <https://www.iab.org/wp-content/IAB-uploads/2021/09/ + Camera_Ready__Fine- + Grained_RTT_Monitoring_Inside_the_Network.pdf>. + + [Sivaraman2021] + Sivaraman, V., Madanapalli, S., and H. Kumar, "Measuring + Network Experience Meaningfully, Accurately, and + Scalably", February 2021, <https://www.iab.org/wp-content/ + IAB-uploads/2021/09/CanopusPositionPaperCameraReady.pdf>. + + [Speedtest] + Ookla, "Speedtest", <https://www.speedtest.net>. + + [Stein2021] + Stein, Y., "The Futility of QoS", August 2021, + <https://www.iab.org/wp-content/IAB-uploads/2021/09/QoS- + futility.pdf>. + + [Welzl2021] + Welzl, M., "A Case for Long-Term Statistics", February + 2021, <https://www.iab.org/wp-content/IAB-uploads/2021/09/ + iab-longtermstats_cameraready.docx-1.pdf>. + + [WORKSHOP] IAB, "IAB Workshop: Measuring Network Quality for End- + Users, 2021", September 2021, + <https://www.iab.org/activities/workshops/network- + quality>. + + [Zhang2021] + Zhang, M., Goel, V., and L. Xu, "User-Perceived Latency to + Measure CCAs", September 2021, <https://www.iab.org/wp- + content/IAB-uploads/2021/09/User_Perceived_Latency-1.pdf>. + +Appendix A. 
Program Committee + + The program committee consisted of: + + Jari Arkko + Olivier Bonaventure + Vint Cerf + Stuart Cheshire + Sam Crowford + Nick Feamster + Jim Gettys + Toke Hoiland-Jorgensen + Geoff Huston + Cullen Jennings + Katarzyna Kosek-Szott + Mirja Kühlewind + Jason Livingood + Matt Mathis + Randall Meyer + Kathleen Nichols + Christoph Paasch + Tommy Pauly + Greg White + Keith Winstein + +Appendix B. Workshop Chairs + + The workshop chairs consisted of: + + Wes Hardaker + Evgeny Khorov + Omer Shapira + +Appendix C. Workshop Participants + + The following is a list of participants who attended the workshop + over a remote connection: + + Ahmed Aldabbagh + Jari Arkko + Praveen Balasubramanian + Olivier Bonaventure + Djamel Bousaber + Bob Briscoe + Rich Brown + Anna Brunstrom + Pedro Casas + Vint Cerf + Stuart Cheshire + Kenjiro Cho + Steve Christianson + John Cioffi + Alexander Clemm + Luis M. Contreras + Sam Crawford + Neil Davies + Gino Dion + Toerless Eckert + Lars Eggert + Joachim Fabini + Gorry Fairhurst + Nick Feamster + Mat Ford + Jonathan Foulkes + Jim Gettys + Rajat Ghai + Vidhi Goel + Wes Hardaker + Joris Herbots + Geoff Huston + Toke Høiland-Jørgensen + Jana Iyengar + Cullen Jennings + Ken Kerpez + Evgeny Khorov + Kalevi Kilkki + Joon Kim + Zhenbin Li + Mikhail Liubogoshchev + Jason Livingood + Kyle MacMillan + Sharat Madanapalli + Vesna Manojlovic + Robin Marx + Matt Mathis + Jared Mauch + Kristen McIntyre + Randall Meyer + François Michel + Greg Mirsky + Cindy Morgan + Al Morton + Szilveszter Nadas + Kathleen Nichols + Lai Yi Ohlsen + Christoph Paasch + Lucas Pardue + Tommy Pauly + Levi Perigo + David Reed + Alvaro Retana + Roberto + Koen De Schepper + David Schinazi + Brandon Schlinker + Eve Schooler + Satadal Sengupta + Jinous Shafiei + Shapelez + Omer Shapira + Dan Siemon + Vijay Sivaraman + Karthik Sundaresan + Dave Taht + Rick Taylor + Bjørn Ivar Teigen + Nicolas Tessares + Peter Thompson + Balazs Varga + Bren Tully Walsh + Michael Welzl + Greg White + Russ White + Keith Winstein + Lisong Xu + Jiankang Yao + Gavin Young + Mingrui Zhang + +IAB Members at the Time of Approval + + Internet Architecture Board members at the time this document was + approved for publication were: + + Jari Arkko + Deborah Brungard + Lars Eggert + Wes Hardaker + Cullen Jennings + Mallory Knodel + Mirja Kühlewind + Zhenbin Li + Tommy Pauly + David Schinazi + Russ White + Qin Wu + Jiankang Yao + +Acknowledgments + + The authors would like to thank the workshop participants, the + members of the IAB, and the program committee for creating and + participating in many interesting discussions. + +Contributors + + Thank you to the people that contributed edits to this document: + + Erik Auerswald + Simon Leinen + Brian Trammell + +Authors' Addresses + + Wes Hardaker + Email: ietf@hardakers.net + + + Omer Shapira + Email: omer_shapira@apple.com |