Diffstat (limited to 'doc/rfc/rfc6808.txt')
-rw-r--r--  doc/rfc/rfc6808.txt  1627
1 file changed, 1627 insertions, 0 deletions
diff --git a/doc/rfc/rfc6808.txt b/doc/rfc/rfc6808.txt
new file mode 100644
index 0000000..2846d8f
--- /dev/null
+++ b/doc/rfc/rfc6808.txt
@@ -0,0 +1,1627 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF) L. Ciavattone
+Request for Comments: 6808 AT&T Labs
+Category: Informational R. Geib
+ISSN: 2070-1721 Deutsche Telekom
+ A. Morton
+ AT&T Labs
+ M. Wieser
+ Technical University Darmstadt
+ December 2012
+
+
+ Test Plan and Results Supporting Advancement of
+ RFC 2679 on the Standards Track
+
+Abstract
+
+ This memo provides the supporting test plan and results to advance
+ RFC 2679 on one-way delay metrics along the Standards Track,
+ following the process in RFC 6576. Observing that the metric
+ definitions themselves should be the primary focus rather than the
+ implementations of metrics, this memo describes the test procedures
+ to evaluate specific metric requirement clauses to determine if the
+ requirement has been interpreted and implemented as intended. Two
+ completely independent implementations have been tested against the
+ key specifications of RFC 2679. This memo also provides direct input
+ for development of a revision of RFC 2679.
+
+Status of This Memo
+
+ This document is not an Internet Standards Track specification; it is
+ published for informational purposes.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Not all documents
+ approved by the IESG are a candidate for any level of Internet
+ Standard; see Section 2 of RFC 5741.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ http://www.rfc-editor.org/info/rfc6808.
+
+
+
+
+
+
+
+
+
+Ciavattone, et al. Informational [Page 1]
+
+RFC 6808 Standards Track Tests RFC 2679 December 2012
+
+
+Copyright Notice
+
+ Copyright (c) 2012 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (http://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Simplified BSD License text as described in Section 4.e of
+ the Trust Legal Provisions and are provided without warranty as
+ described in the Simplified BSD License.
+
+ This document may contain material from IETF Documents or IETF
+ Contributions published or made publicly available before November
+ 10, 2008. The person(s) controlling the copyright in some of this
+ material may not have granted the IETF Trust the right to allow
+ modifications of such material outside the IETF Standards Process.
+ Without obtaining an adequate license from the person(s) controlling
+ the copyright in such materials, this document may not be modified
+ outside the IETF Standards Process, and derivative works of it may
+ not be created outside the IETF Standards Process, except to format
+ it for publication as an RFC or to translate it into languages other
+ than English.
+
+
+
+Table of Contents
+
+ 1. Introduction ....................................................3
+ 1.1. Requirements Language ......................................5
+ 2. A Definition-Centric Metric Advancement Process .................5
+ 3. Test Configuration ..............................................5
+ 4. Error Calibration, RFC 2679 .....................................9
+ 4.1. NetProbe Error and Type-P .................................10
+ 4.2. Perfas+ Error and Type-P ..................................12
+ 5. Predetermined Limits on Equivalence ............................12
+ 6. Tests to Evaluate RFC 2679 Specifications ......................13
+ 6.1. One-Way Delay, ADK Sample Comparison: Same- and Cross-
+ Implementation ............................................13
+ 6.1.1. NetProbe Same-Implementation Results ...............15
+ 6.1.2. Perfas+ Same-Implementation Results ................16
+ 6.1.3. One-Way Delay, Cross-Implementation ADK
+ Comparison .........................................16
+ 6.1.4. Conclusions on the ADK Results for One-Way Delay ...17
+ 6.1.5. Additional Investigations ..........................17
+ 6.2. One-Way Delay, Loss Threshold, RFC 2679 ...................20
+ 6.2.1. NetProbe Results for Loss Threshold ................21
+ 6.2.2. Perfas+ Results for Loss Threshold .................21
+ 6.2.3. Conclusions for Loss Threshold .....................21
+ 6.3. One-Way Delay, First Bit to Last Bit, RFC 2679 ............21
+ 6.3.1. NetProbe and Perfas+ Results for Serialization .....22
+ 6.3.2. Conclusions for Serialization ......................23
+ 6.4. One-Way Delay, Difference Sample Metric ...................24
+ 6.4.1. NetProbe Results for Differential Delay ............24
+ 6.4.2. Perfas+ Results for Differential Delay .............25
+ 6.4.3. Conclusions for Differential Delay .................25
+ 6.5. Implementation of Statistics for One-Way Delay ............25
+ 7. Conclusions and RFC 2679 Errata ................................26
+ 8. Security Considerations ........................................26
+ 9. Acknowledgements ...............................................27
+ 10. References ....................................................27
+ 10.1. Normative References .....................................27
+ 10.2. Informative References ...................................28
+
+1. Introduction
+
+ The IETF IP Performance Metrics (IPPM) working group has considered
+ how to advance their metrics along the Standards Track since 2001,
+ with the initial publication of Bradner/Paxson/Mankin's memo
+ [METRICS-TEST]. The original proposal was to compare the performance
+ of metric implementations. This was similar to the usual procedures
+ for advancing protocols, which did not directly apply. It was found
+ to be difficult to achieve consensus on exactly how to compare
+ implementations, since there were many legitimate sources of
+ variation that would emerge in the results despite the best attempts
+ to keep the network paths equal, and because considerable variation
+ was allowed in the parameters (and therefore implementation) of each
+ metric. Flexibility in metric definitions, essential for
+ customization and broad appeal, made the comparison task quite
+ difficult.
+
+ A renewed work effort investigated ways in which the measurement
+ variability could be reduced and thereby simplify the problem of
+ comparison for equivalence.
+
+ The consensus process documented in [RFC6576] is that metric
+ definitions rather than the implementations of metrics should be the
+ primary focus of evaluation. Equivalent test results are deemed to
+ be evidence that the metric specifications are clear and unambiguous.
+ This is now the metric specification equivalent of protocol
+ interoperability. The [RFC6576] advancement process either produces
+ confidence that the metric definitions and supporting material are
+ clearly worded and unambiguous, or it identifies ways in which the
+ metric definitions should be revised to achieve clarity.
+
+ The metric RFC advancement process requires documentation of the
+ testing and results. [RFC6576] retains the testing requirement of
+ the original Standards Track advancement process described in
+ [RFC2026] and [RFC5657], because widespread deployment is
+ insufficient to determine whether RFCs that define performance
+ metrics result in consistent implementations.
+
+ The process also permits identification of options that were not
+ implemented, so that they can be removed from the advancing
+ specification (this is a similar aspect to protocol advancement along
+ the Standards Track). All errata must also be considered.
+
+ This memo's purpose is to implement the advancement process of
+ [RFC6576] for [RFC2679]. It supplies the documentation that
+ accompanies the protocol action request submitted to the Area
+ Director, including description of the test setup, results for each
+ implementation, evaluation of each metric specification, and
+ conclusions.
+
+ In particular, this memo documents the consensus on the extent of
+ tolerable errors when assessing equivalence in the results. The IPPM
+ working group agreed that the test plan and procedures should include
+ the threshold for determining equivalence, and that this aspect
+ should be decided in advance of cross-implementation comparisons.
+ This memo includes procedures for same-implementation comparisons
+ that may influence the equivalence threshold.
+
+ Although the conclusion reached through testing is that [RFC2679]
+ should be advanced on the Standards Track with modifications, the
+ revised text of RFC 2679 is not yet ready for review. Therefore,
+ this memo documents the information to support [RFC2679] advancement,
+   and the approval of a revision of RFC 2679 is left for future action.
+
+1.1. Requirements Language
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described in RFC 2119 [RFC2119].
+
+2. A Definition-Centric Metric Advancement Process
+
+ As a first principle, the process described in Section 3.5 of
+ [RFC6576] takes the fact that the metric definitions (embodied in the
+ text of the RFCs) are the objects that require evaluation and
+ possible revision in order to advance to the next step on the
+ Standards Track. This memo follows that process.
+
+3. Test Configuration
+
+ One metric implementation used was NetProbe version 5.8.5 (an earlier
+ version is used in AT&T's IP network performance measurement system
+ and deployed worldwide [WIPM]). NetProbe uses UDP packets of
+ variable size, and it can produce test streams with Periodic
+ [RFC3432] or Poisson [RFC2330] sample distributions.
+
+ The other metric implementation used was Perfas+ version 3.1,
+ developed by Deutsche Telekom [Perfas]. Perfas+ uses UDP unicast
+ packets of variable size (but also supports TCP and multicast). Test
+ streams with Periodic, Poisson, or uniform sample distributions may
+ be used.
+
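The two stream distributions named above can be illustrated with a short sketch (a minimal illustration in Python; the helper names are ours, not part of NetProbe or Perfas+). Per [RFC2330], a Poisson stream uses exponentially distributed inter-packet intervals at rate lambda, while a Periodic stream [RFC3432] uses fixed spacing.

```python
import random

def periodic_schedule(n, spacing_s):
    """Send times for a Periodic stream (RFC 3432): fixed spacing."""
    return [i * spacing_s for i in range(n)]

def poisson_schedule(n, lam, seed=None):
    """Send times for a Poisson stream (RFC 2330): exponentially
    distributed inter-packet intervals with rate lam (1/seconds)."""
    rng = random.Random(seed)
    times, t = [], 0.0
    for _ in range(n):
        t += rng.expovariate(lam)
        times.append(t)
    return times

# Example: 300-packet streams at 1 packet per second, as used for
# the test streams described later in this memo.
periodic = periodic_schedule(300, 1.0)
poisson = poisson_schedule(300, 1.0, seed=42)
```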
+ Figure 1 shows a view of the test path as each implementation's test
+ flows pass through the Internet and the Layer 2 Tunneling Protocol,
+ version 3 (L2TPv3) tunnel IDs (1 and 2), based on Figures 2 and 3 of
+ [RFC6576].
+
+
+
+
+
+
+
+
+
+
+
+
+
+Ciavattone, et al. Informational [Page 5]
+
+RFC 6808 Standards Track Tests RFC 2679 December 2012
+
+
+ +----+ +----+ +----+ +----+
+ |Imp1| |Imp1| ,---. |Imp2| |Imp2|
+ +----+ +----+ / \ +-------+ +----+ +----+
+ | V100 | V200 / \ | Tunnel| | V300 | V400
+ | | ( ) | Head | | |
+ +--------+ +------+ | |__| Router| +----------+
+ |Ethernet| |Tunnel| |Internet | +---B---+ |Ethernet |
+ |Switch |--|Head |-| | | |Switch |
+ +-+--+---+ |Router| | | +---+---+--+--+--+----+
+ |__| +--A---+ ( ) |Network| |__|
+ \ / |Emulat.|
+ U-turn \ / |"netem"| U-turn
+ V300 to V400 `-+-' +-------+ V100 to V200
+
+
+
+ Implementations ,---. +--------+
+ +~~~~~~~~~~~/ \~~~~~~| Remote |
+ +------->-----F2->-| / \ |->---. |
+ | +---------+ | Tunnel ( ) | | |
+ | | transmit|-F1->-| ID 1 ( ) |->. | |
+ | | Imp 1 | +~~~~~~~~~| |~~~~| | | |
+ | | receive |-<--+ ( ) | F1 F2 |
+ | +---------+ | |Internet | | | | |
+ *-------<-----+ F1 | | | | | |
+ +---------+ | | +~~~~~~~~~| |~~~~| | | |
+ | transmit|-* *-| | | |<-* | |
+ | Imp 2 | | Tunnel ( ) | | |
+ | receive |-<-F2-| ID 2 \ / |<----* |
+ +---------+ +~~~~~~~~~~~\ /~~~~~~| Switch |
+ `-+-' +--------+
+
+ Illustrations of a test setup with a bidirectional tunnel. The upper
+ diagram emphasizes the VLAN connectivity and geographical location.
+ The lower diagram shows example flows traveling between two
+ measurement implementations (for simplicity, only two flows are
+ shown).
+
+ Figure 1
+
+ The testing employs the Layer 2 Tunneling Protocol, version 3
+ (L2TPv3) [RFC3931] tunnel between test sites on the Internet. The
+ tunnel IP and L2TPv3 headers are intended to conceal the test
+ equipment addresses and ports from hash functions that would tend to
+ spread different test streams across parallel network resources, with
+ likely variation in performance as a result.
+
+
+
+
+
+Ciavattone, et al. Informational [Page 6]
+
+RFC 6808 Standards Track Tests RFC 2679 December 2012
+
+
+ At each end of the tunnel, one pair of VLANs encapsulated in the
+   tunnel is looped back so that test traffic is returned to each test
+ site. Thus, test streams traverse the L2TP tunnel twice, but appear
+ to be one-way tests from the test equipment point of view.
+
+ The network emulator is a host running Fedora 14 Linux [Fedora14]
+ with IP forwarding enabled and the "netem" Network emulator [netem]
+ loaded and operating as part of the Fedora Kernel 2.6.35.11.
+ Connectivity across the netem/Fedora host was accomplished by
+ bridging Ethernet VLAN interfaces together with "brctl" commands
+ (e.g., eth1.100 <-> eth2.100). The netem emulator was activated on
+   one interface (eth1) and operated only on test streams traveling in
+ one direction. In some tests, independent netem instances operated
+ separately on each VLAN.
+
+ The links between the netem emulator host and router and switch were
+ found to be 100baseTx-HD (100 Mbps half duplex) when the testing was
+ complete. Use of half duplex was not intended, but probably added a
+ small amount of delay variation that could have been avoided in full
+ duplex mode.
+
+   Each individual test was run with common packet rates (1 pps, 10 pps),
+ Poisson/Periodic distributions, and IP packet sizes of 64, 340, and
+ 500 Bytes. These sizes cover a reasonable range while avoiding
+ fragmentation and the complexities it causes, thus complying with the
+ notion of "standard formed packets" described in Section 15 of
+ [RFC2330].
+
+   For these tests, a stream of at least 300 packets was sent from
+ Source to Destination in each implementation. Periodic streams (as
+ per [RFC3432]) with 1 second spacing were used, except as noted.
+
+ With the L2TPv3 tunnel in use, the metric name for the testing
+ configured here (with respect to the IP header exposed to Internet
+ processing) is:
+
+ Type-IP-protocol-115-One-way-Delay-<StreamType>-Stream
+
+ With (Section 4.2 of [RFC2679]) Metric Parameters:
+
+ + Src, the IP address of a host (12.3.167.16 or 193.159.144.8)
+
+ + Dst, the IP address of a host (193.159.144.8 or 12.3.167.16)
+
+ + T0, a time
+
+ + Tf, a time
+
+ + lambda, a rate in reciprocal seconds
+
+ + Thresh, a maximum waiting time in seconds (see Section 3.8.2 of
+ [RFC2679] and Section 4.3 of [RFC2679])
+
+ Metric Units: A sequence of pairs; the elements of each pair are:
+
+ + T, a time, and
+
+ + dT, either a real number or an undefined number of seconds.
+
+ The values of T in the sequence are monotonic increasing. Note that
+ T would be a valid parameter to Type-P-One-way-Delay and that dT
+ would be a valid value of Type-P-One-way-Delay.
+
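The metric units above can be modeled directly; the following sketch (illustrative only, with hypothetical helper names) represents a sample as a sequence of (T, dT) pairs in which dT may be undefined:

```python
# A sample is a sequence of (T, dT) pairs: T is a send time and dT is
# a one-way delay in seconds, or None ("undefined") when the packet is
# lost or arrives after the waiting time Thresh.

def validate_sample(sample):
    """Check the requirement that the values of T in the sequence
    are monotonic increasing."""
    times = [t for t, _ in sample]
    return all(a < b for a, b in zip(times, times[1:]))

def finite_delays(sample):
    """Return only the defined (real-valued) delays, e.g., as input
    to sample statistics."""
    return [dt for _, dt in sample if dt is not None]

# One lost packet (dT undefined) among three singletons:
sample = [(0.0, 0.100), (1.0, None), (2.0, 0.102)]
```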
+ Also, Section 3.8.4 of [RFC2679] recommends that the path SHOULD be
+ reported. In this test setup, most of the path details will be
+ concealed from the implementations by the L2TPv3 tunnels; thus, a
+   more informative traceroute of the path can be conducted by the routers at
+ each location.
+
+ When NetProbe is used in production, a traceroute is conducted in
+ parallel with, and at the outset of, measurements.
+
+ Perfas+ does not support traceroute.
+
+ IPLGW#traceroute 193.159.144.8
+
+ Type escape sequence to abort.
+ Tracing the route to 193.159.144.8
+
+ 1 12.126.218.245 [AS 7018] 0 msec 0 msec 4 msec
+ 2 cr84.n54ny.ip.att.net (12.123.2.158) [AS 7018] 4 msec 4 msec
+ cr83.n54ny.ip.att.net (12.123.2.26) [AS 7018] 4 msec
+ 3 cr1.n54ny.ip.att.net (12.122.105.49) [AS 7018] 4 msec
+ cr2.n54ny.ip.att.net (12.122.115.93) [AS 7018] 0 msec
+ cr1.n54ny.ip.att.net (12.122.105.49) [AS 7018] 0 msec
+ 4 n54ny02jt.ip.att.net (12.122.80.225) [AS 7018] 4 msec 0 msec
+ n54ny02jt.ip.att.net (12.122.80.237) [AS 7018] 4 msec
+ 5 192.205.34.182 [AS 7018] 0 msec
+ 192.205.34.150 [AS 7018] 0 msec
+ 192.205.34.182 [AS 7018] 4 msec
+ 6 da-rg12-i.DA.DE.NET.DTAG.DE (62.154.1.30) [AS 3320] 88 msec 88 msec
+ 88 msec
+ 7 217.89.29.62 [AS 3320] 88 msec 88 msec 88 msec
+ 8 217.89.29.55 [AS 3320] 88 msec 88 msec 88 msec
+ 9 * * *
+
+ It was only possible to conduct the traceroute for the measured path
+ on one of the tunnel-head routers (the normal trace facilities of the
+ measurement systems are confounded by the L2TPv3 tunnel
+ encapsulation).
+
+4. Error Calibration, RFC 2679
+
+ An implementation is required to report on its error calibration in
+ Section 3.8 of [RFC2679] (also required in Section 4.8 for sample
+ metrics). Sections 3.6, 3.7, and 3.8 of [RFC2679] give the detailed
+ formulation of the errors and uncertainties for calibration. In
+ summary, Section 3.7.1 of [RFC2679] describes the total time-varying
+ uncertainty as:
+
+ Esynch(t)+ Rsource + Rdest
+
+ where:
+
+ Esynch(t) denotes an upper bound on the magnitude of clock
+ synchronization uncertainty.
+
+ Rsource and Rdest denote the resolution of the source clock and the
+ destination clock, respectively.
+
+ Further, Section 3.7.2 of [RFC2679] describes the total wire-time
+ uncertainty as:
+
+ Hsource + Hdest
+
+ referring to the upper bounds on host-time to wire-time for source
+ and destination, respectively.
+
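Combining the two expressions above gives an overall per-singleton error bound; a trivial sketch (function name and example values are ours, not from either implementation):

```python
def delay_error_bound(esynch_s, rsource_s, rdest_s, hsource_s, hdest_s):
    """Upper bound on one-way delay uncertainty: the time-varying
    term Esynch(t) + Rsource + Rdest (Section 3.7.1 of RFC 2679)
    plus the wire-time term Hsource + Hdest (Section 3.7.2)."""
    return esynch_s + rsource_s + rdest_s + hsource_s + hdest_s

# Example: 1 ms clock synchronization bound, 1 us clock resolutions,
# and 50 us host-time-to-wire-time bounds at each end.
bound = delay_error_bound(1e-3, 1e-6, 1e-6, 50e-6, 50e-6)
```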
+ Section 3.7.3 of [RFC2679] describes a test with small packets over
+ an isolated minimal network where the results can be used to estimate
+ systematic and random components of the sum of the above errors or
+ uncertainties. In a test with hundreds of singletons, the median is
+ the systematic error and when the median is subtracted from all
+ singletons, the remaining variability is the random error.
+
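The decomposition described above (median as systematic error, residual variability as random error) can be sketched as follows; the singleton values here are synthetic, not measured calibration data:

```python
from statistics import median

def calibration_errors(delays_us):
    """Split calibration singletons into a systematic component (the
    median, per Section 3.7.3 of RFC 2679) and random components
    (the residuals after subtracting the median from all singletons)."""
    systematic = median(delays_us)
    random_part = [d - systematic for d in delays_us]
    return systematic, random_part

# Hypothetical calibration singletons in microseconds:
delays = [89, 99, 110, 127, 205, 95, 112]
sys_err, rand_err = calibration_errors(delays)
```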
+ The test context, or Type-P of the test packets, must also be
+ reported, as required in Section 3.8 of [RFC2679] and all metrics
+ defined there. Type-P is defined in Section 13 of [RFC2330] (as are
+ many terms used below).
+
+4.1. NetProbe Error and Type-P
+
+ Type-P for this test was IP-UDP with Best Effort Differentiated
+ Services Code Point (DSCP). These headers were encapsulated
+ according to the L2TPv3 specifications [RFC3931]; thus, they may not
+ influence the treatment received as the packets traversed the
+ Internet.
+
+ In general, NetProbe error is dependent on the specific version and
+ installation details.
+
+ NetProbe operates using host-time above the UDP layer, which is
+ different from the wire-time preferred in [RFC2330], but it can be
+ identified as a source of error according to Section 3.7.2 of
+ [RFC2679].
+
+ Accuracy of NetProbe measurements is usually limited by NTP
+ synchronization performance (which is typically taken as ~+/-1 ms
+ error or greater), although the installation used in this testing
+ often exhibits errors much less than typical for NTP. The primary
+ stratum 1 NTP server is closely located on a sparsely utilized
+ network management LAN; thus, it avoids many concerns raised in
+ Section 10 of [RFC2330] (in fact, smooth adjustment, long-term drift
+ analysis and compensation, and infrequent adjustment all lead to
+ stability during measurement intervals, the main concern).
+
+ The resolution of the reported results is 1 us (us = microsecond) in
+ the version of NetProbe tested here, which contributes to at least
+ +/-1 us error.
+
+ NetProbe implements a timekeeping sanity check on sending and
+ receiving time-stamping processes. When a significant process
+ interruption takes place, individual test packets are flagged as
+ possibly containing unusual time errors, and they are excluded from
+ the sample used for all "time" metrics.
+
+ We performed a NetProbe calibration of the type described in Section
+ 3.7.3 of [RFC2679], using 64-Byte packets over a cross-connect cable.
+ The results estimate systematic and random components of the sum of
+ the Hsource + Hdest errors or uncertainties. In a test with 300
+ singletons conducted over 30 seconds (periodic sample with 100 ms
+ spacing), the median is the systematic error and the remaining
+ variability is the random error. One set of results is tabulated
+ below:
+
+ (Results from the "R" software environment for statistical computing
+ and graphics - http://www.r-project.org/ )
+ > summary(XD4CAL)
+ CAL1 CAL2 CAL3
+ Min. : 89.0 Min. : 68.00 Min. : 54.00
+ 1st Qu.: 99.0 1st Qu.: 77.00 1st Qu.: 63.00
+ Median :110.0 Median : 79.00 Median : 65.00
+ Mean :116.8 Mean : 83.74 Mean : 69.65
+ 3rd Qu.:127.0 3rd Qu.: 88.00 3rd Qu.: 74.00
+ Max. :205.0 Max. :177.00 Max. :163.00
+ >
+ NetProbe Calibration with Cross-Connect Cable, one-way delay values
+ in microseconds (us)
+
+ The median or systematic error can be as high as 110 us, and the
+ range of the random error is also on the order of 116 us for all
+ streams.
+
+ Also, anticipating the Anderson-Darling K-sample (ADK) [ADK]
+ comparisons to follow, we corrected the CAL2 values for the
+ difference between the means of CAL2 and CAL3 (as permitted in
+ Section 3.2 of [RFC6576]), and found strong support (for the Null
+ Hypothesis) that the samples are from the same distribution
+   (resolution of 1 us and alpha equal to 0.05 and 0.01).
+
+ > XD4CVCAL2 <- XD4CAL$CAL2 - (mean(XD4CAL$CAL2)-mean(XD4CAL$CAL3))
+ > boxplot(XD4CVCAL2,XD4CAL$CAL3)
+ > XD4CV2_ADK <- adk.test(XD4CVCAL2, XD4CAL$CAL3)
+ > XD4CV2_ADK
+ Anderson-Darling k-sample test.
+
+ Number of samples: 2
+ Sample sizes: 300 300
+ Total number of values: 600
+ Number of unique values: 97
+
+ Mean of Anderson Darling Criterion: 1
+ Standard deviation of Anderson Darling Criterion: 0.75896
+
+ T = (Anderson-Darling Criterion - mean)/sigma
+
+ Null Hypothesis: All samples come from a common population.
+
+ t.obs P-value extrapolation
+ not adj. for ties 0.71734 0.17042 0
+ adj. for ties -0.39553 0.44589 1
+ >
+ using [Rtool] and [Radk].
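The mean-shift correction and ADK comparison shown in the R session above can be approximated in Python using SciPy's anderson_ksamp in place of the [Radk] package (a sketch under the assumption that SciPy is available; the data here are synthetic stand-ins for CAL2 and CAL3, not the measured samples):

```python
import numpy as np
from scipy.stats import anderson_ksamp

rng = np.random.default_rng(1)
# Synthetic stand-ins for two calibration streams; one carries a
# constant offset, as a difference between stream means would.
cal3 = rng.normal(65.0, 8.0, 300)
cal2 = cal3 + rng.normal(0.0, 1.0, 300) + 14.0

# Mean-shift correction (as permitted in Section 3.2 of RFC 6576):
cal2_corrected = cal2 - (cal2.mean() - cal3.mean())

result = anderson_ksamp([cal2_corrected, cal3])
# A small statistic and a large significance level support the Null
# Hypothesis that both samples come from a common population.
```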
+
+4.2. Perfas+ Error and Type-P
+
+ Perfas+ is configured to use GPS synchronization and uses NTP
+ synchronization as a fall-back or default. GPS synchronization
+ worked throughout this test with the exception of the calibration
+ stated here (one implementation was NTP synchronized only). The time
+ stamp accuracy typically is 0.1 ms.
+
+ The resolution of the results reported by Perfas+ is 1 us (us =
+ microsecond) in the version tested here, which contributes to at
+ least +/-1 us error.
+
+ Port 5001 5002 5003
+ Min. -227 -226 294
+ Median -169 -167 323
+ Mean -159 -157 335
+ Max. 6 -52 376
+ s 102 102 93
+ Perfas+ Calibration with Cross-Connect Cable, one-way delay values in
+ microseconds (us)
+
+ The median or systematic error can be as high as 323 us, and the
+ range of the random error is also less than 232 us for all streams.
+
+5. Predetermined Limits on Equivalence
+
+ This section provides the numerical limits on comparisons between
+ implementations, in order to declare that the results are equivalent
+ and therefore, the tested specification is clear. These limits have
+ their basis in Section 3.1 of [RFC6576] and the Appendix of
+ [RFC2330], with additional limits representing IP Performance Metrics
+ (IPPM) consensus prior to publication of results.
+
+ A key point is that the allowable errors, corrections, and confidence
+ levels only need to be sufficient to detect misinterpretation of the
+ tested specification resulting in diverging implementations.
+
+ Also, the allowable error must be sufficient to compensate for
+ measured path differences. It was simply not possible to measure
+ fully identical paths in the VLAN-loopback test configuration used,
+ and this practical compromise must be taken into account.
+
+ For Anderson-Darling K-sample (ADK) comparisons, the required
+ confidence factor for the cross-implementation comparisons SHALL be
+ the smallest of:
+
+
+ o 0.95 confidence factor at 1 ms resolution, or
+
+ o the smallest confidence factor (in combination with resolution) of
+ the two same-implementation comparisons for the same test
+ conditions.
+
+ A constant time accuracy error of as much as +/-0.5 ms MAY be removed
+ from one implementation's distributions (all singletons) before the
+ ADK comparison is conducted.
+
+ A constant propagation delay error (due to use of different sub-nets
+ between the switch and measurement devices at each location) of as
+ much as +2 ms MAY be removed from one implementation's distributions
+ (all singletons) before the ADK comparison is conducted.
+
+ For comparisons involving the mean of a sample or other central
+ statistics, the limits on both the time accuracy error and the
+ propagation delay error constants given above also apply.
+
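The limits above can be expressed as a simple admission check on proposed constant corrections (an illustrative sketch; the function and constant names are ours):

```python
MAX_TIME_ACCURACY_CORRECTION_S = 0.0005  # +/-0.5 ms (either sign)
MAX_PROPAGATION_CORRECTION_S = 0.002     # +2 ms (one-sided)

def apply_corrections(singletons_s, time_corr_s=0.0, prop_corr_s=0.0):
    """Apply constant corrections to one implementation's delay
    singletons, enforcing the Section 5 limits before any ADK or
    central-statistic comparison."""
    if abs(time_corr_s) > MAX_TIME_ACCURACY_CORRECTION_S:
        raise ValueError("time accuracy correction exceeds +/-0.5 ms")
    if not (0.0 <= prop_corr_s <= MAX_PROPAGATION_CORRECTION_S):
        raise ValueError("propagation delay correction exceeds +2 ms")
    return [d - time_corr_s - prop_corr_s for d in singletons_s]
```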
+6. Tests to Evaluate RFC 2679 Specifications
+
+ This section describes some results from real-world (cross-Internet)
+ tests with measurement devices implementing IPPM metrics and a
+ network emulator to create relevant conditions, to determine whether
+ the metric definitions were interpreted consistently by implementors.
+
+ The procedures are slightly modified from the original procedures
+ contained in Appendix A.1 of [RFC6576]. The modifications include
+ the use of the mean statistic for comparisons.
+
+ Note that there are only five instances of the requirement term
+ "MUST" in [RFC2679] outside of the boilerplate and [RFC2119]
+ reference.
+
+6.1. One-Way Delay, ADK Sample Comparison: Same- and Cross-
+ Implementation
+
+ This test determines if implementations produce results that appear
+ to come from a common delay distribution, as an overall evaluation of
+ Section 4 of [RFC2679], "A Definition for Samples of One-way Delay".
+ Same-implementation comparison results help to set the threshold of
+ equivalence that will be applied to cross-implementation comparisons.
+
+ This test is intended to evaluate measurements in Sections 3 and 4 of
+ [RFC2679].
+
+
+ By testing the extent to which the distributions of one-way delay
+ singletons from two implementations of [RFC2679] appear to be from
+ the same distribution, we economize on comparisons, because comparing
+ a set of individual summary statistics (as defined in Section 5 of
+ [RFC2679]) would require another set of individual evaluations of
+ equivalence. Instead, we can simply check which statistics were
+ implemented, and report on those facts.
+
+ 1. Configure an L2TPv3 path between test sites, and each pair of
+ measurement devices to operate tests in their designated pair of
+ VLANs.
+
+ 2. Measure a sample of one-way delay singletons with two or more
+ implementations, using identical options and network emulator
+ settings (if used).
+
+ 3. Measure a sample of one-way delay singletons with *four*
+ instances of the *same* implementations, using identical options,
+ noting that connectivity differences SHOULD be the same as for
+ the cross-implementation testing.
+
+ 4. Apply the ADK comparison procedures (see Appendices A and B of
+ [RFC6576]) and determine the resolution and confidence factor for
+ distribution equivalence of each same-implementation comparison
+ and each cross-implementation comparison.
+
+ 5. Take the coarsest resolution and confidence factor for
+ distribution equivalence from the same-implementation pairs, or
+ the limit defined in Section 5 above, as a limit on the
+ equivalence threshold for these experimental conditions.
+
+ 6. Apply constant correction factors to all singletons of the sample
+ distributions, as described and limited in Section 5 above.
+
+ 7. Compare the cross-implementation ADK performance with the
+ equivalence threshold determined in step 5 to determine if
+ equivalence can be declared.
+
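Steps 4 through 7 above reduce to a comparison of confidence factors; a minimal sketch of the decision logic, using P-values as confidence factors (illustrative only, not code from the test campaign):

```python
def equivalence_threshold(same_impl_pvalues, floor_p=0.05):
    """Step 5: the equivalence threshold is the coarsest (smallest)
    same-implementation confidence, bounded by the Section 5 limit
    (0.95 confidence, i.e., a P-value of 0.05)."""
    return min(min(same_impl_pvalues), floor_p)

def is_equivalent(cross_impl_pvalue, same_impl_pvalues):
    """Step 7: declare equivalence when the cross-implementation
    comparison meets the threshold determined in step 5."""
    return cross_impl_pvalue >= equivalence_threshold(same_impl_pvalues)
```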
+ The common parameters used for tests in this section are:
+
+ o IP header + payload = 64 octets
+
+ o Periodic sampling at 1 packet per second
+
+ o Test duration = 300 seconds (March 29, 2011)
+
+
+ The netem emulator was set for 100 ms average delay, with uniform
+ delay variation of +/-50 ms. In this experiment, the netem emulator
+ was configured to operate independently on each VLAN; thus, the
+ emulator itself is a potential source of error when comparing streams
+ that traverse the test path in different directions.
+
+ In the result analysis of this section:
+
+ o All comparisons used 1 microsecond resolution.
+
+ o No correction factors were applied.
+
+ o The 0.95 confidence factor (1.960 for paired stream comparison)
+ was used.
+
+6.1.1. NetProbe Same-Implementation Results
+
+ A single same-implementation comparison fails the ADK criterion (s1
+ <-> sB). We note that these streams traversed the test path in
+ opposite directions, making the live network factors a possibility to
+ explain the difference.
+
+ All other pair comparisons pass the ADK criterion.
+
+ +------------------------------------------------------+
+ | | | | |
+ | ti.obs (P) | s1 | s2 | sA |
+ | | | | |
+ .............|.............|.............|.............|
+ | | | | |
+ | s2 | 0.25 (0.28) | | |
+ | | | | |
+ ...........................|.............|.............|
+ | | | | |
+ | sA | 0.60 (0.19) |-0.80 (0.57) | |
+ | | | | |
+ ...........................|.............|.............|
+ | | | | |
+ | sB | 2.64 (0.03) | 0.07 (0.31) |-0.52 (0.48) |
+ | | | | |
+ +------------+-------------+-------------+-------------+
+
+ NetProbe ADK results for same-implementation
+
+
+6.1.2. Perfas+ Same-Implementation Results
+
+ All pair comparisons pass the ADK criterion.
+
+ +------------------------------------------------------+
+ | | | | |
+ | ti.obs (P) | p1 | p2 | p3 |
+ | | | | |
+ .............|.............|.............|.............|
+ | | | | |
+ | p2 | 0.06 (0.32) | | |
+ | | | | |
+ .........................................|.............|
+ | | | | |
+ | p3 | 1.09 (0.12) | 0.37 (0.24) | |
+ | | | | |
+ ...........................|.............|.............|
+ | | | | |
+ | p4 |-0.81 (0.57) |-0.13 (0.37) | 1.36 (0.09) |
+ | | | | |
+ +------------+-------------+-------------+-------------+
+
+ Perfas+ ADK results for same-implementation
+
+6.1.3. One-Way Delay, Cross-Implementation ADK Comparison
+
+ The cross-implementation results are compared using a combined ADK
+ analysis [Radk], where all NetProbe results are compared with all
+ Perfas+ results after testing that the combined same-implementation
+ results pass the ADK criterion.
+
+ When 4 (same) samples are compared, the ADK criterion for 0.95
+ confidence is 1.915, and when all 8 (cross) samples are compared it
+ is 1.85.
+
+ Combination of Anderson-Darling K-Sample Tests.
+
+ Sample sizes within each data set:
+ Data set 1 : 299 297 298 300 (NetProbe)
+ Data set 2 : 300 300 298 300 (Perfas+)
+ Total sample size per data set: 1194 1198
+ Number of unique values per data set: 1188 1192
+ ...
+ Null Hypothesis:
+ All samples within a data set come from a common distribution.
+ The common distribution may change between data sets.
+
+ NetProbe ti.obs P-value extrapolation
+ not adj. for ties 0.64999 0.21355 0
+ adj. for ties 0.64833 0.21392 0
+ Perfas+
+ not adj. for ties 0.55968 0.23442 0
+ adj. for ties 0.55840 0.23473 0
+
+ Combined Anderson-Darling Criterion:
+ tc.obs P-value extrapolation
+ not adj. for ties 0.85537 0.17967 0
+ adj. for ties 0.85329 0.18010 0
+
+ The combined same-implementation samples and the combined cross-
+ implementation comparison all pass the ADK criterion at P>=0.18 and
+ support the Null Hypothesis (both data sets come from a common
+ distribution).
+
+   We also see that the paired ADK comparisons are rather stringent:
+   although the NetProbe s1-sB comparison failed, the combined data set
+   from four streams passed the ADK criterion easily.
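As an illustrative sketch only (not the analysis code used for this memo, which used the R package [Radk] via [Rtool]), the underlying Scholz-Stephens k-sample test [ADK] is also available in SciPy as scipy.stats.anderson_ksamp.  The four streams below are hypothetical stand-ins for s1, s2, sA, and sB, drawn from a common emulated-delay distribution:

```python
# Hypothetical sketch: k-sample Anderson-Darling test over four one-way
# delay streams (values in microseconds).  SciPy's anderson_ksamp
# implements the Scholz-Stephens k-sample test [ADK]; the combination
# of tests reported in this memo came from the R package [Radk].
import numpy as np
from scipy.stats import anderson_ksamp

rng = np.random.default_rng(0)
# 100 ms emulated mean delay with +/-2.5 ms uniform delay variation.
streams = [100_000 + 2_500 * rng.uniform(-1.0, 1.0, 300) for _ in range(4)]

result = anderson_ksamp(streams)
# Accept the Null Hypothesis (common distribution) when the normalized
# statistic falls below the 0.95-confidence criterion (1.915 for 4 samples).
print("ti.obs =", round(result.statistic, 3))
```

A stream shifted well outside the others' range would drive the statistic far above the criterion, which is how an outlier stream such as sB shows up in the paired comparisons.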
+
+6.1.4. Conclusions on the ADK Results for One-Way Delay
+
+ Similar testing was repeated many times in the months of March and
+ April 2011. There were many experiments where a single test stream
+ from NetProbe or Perfas+ proved to be different from the others in
+ paired comparisons (even same-implementation comparisons). When the
+   outlier stream was removed from the comparison, the remaining streams
+   passed the combined ADK criterion.  Also, applying correction factors
+   resulted in a higher rate of successful comparisons.
+
+ We conclude that the two implementations are capable of producing
+ equivalent one-way delay distributions based on their interpretation
+ of [RFC2679].
+
+6.1.5. Additional Investigations
+
+ On the final day of testing, we performed a series of measurements to
+ evaluate the amount of emulated delay variation necessary to achieve
+ successful ADK comparisons. The need for correction factors (as
+ permitted by Section 5) and the size of the measurement sample
+ (obtained as sub-sets of the complete measurement sample) were also
+ evaluated.
+
+ The common parameters used for tests in this section are:
+
+ o IP header + payload = 64 octets
+
+ o Periodic sampling at 1 packet per second
+
+ o Test duration = 300 seconds at each delay variation setting, for a
+ total of 1200 seconds (May 2, 2011 at 1720 UTC)
+
+ The netem emulator was set for 100 ms average delay, with (emulated)
+ uniform delay variation of:
+
+ o +/-7.5 ms
+
+ o +/-5.0 ms
+
+ o +/-2.5 ms
+
+ o 0 ms
+
+ In this experiment, the netem emulator was configured to operate
+ independently on each VLAN; thus, the emulator itself is a potential
+ source of error when comparing streams that traverse the test path in
+ different directions.
+
+ In the result analysis of this section:
+
+ o All comparisons used 1 microsecond resolution.
+
+ o Correction factors *were* applied as noted (under column heading
+ "mean adj"). The difference between each sample mean and the
+ lowest mean of the NetProbe or Perfas+ stream samples was
+ subtracted from all values in the sample. ("raw" indicates no
+ correction factors were used.) All correction factors applied met
+ the limits described in Section 5.
+
+ o The 0.95 confidence factor (1.960 for paired stream comparison)
+ was used.
+
+ When 8 (cross) samples are compared, the ADK criterion for 0.95
+ confidence is 1.85. The Combined ADK test statistic ("TC observed")
+ must be less than 1.85 to accept the Null Hypothesis (all samples in
+ the data set are from a common distribution).
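The "mean adj" correction described above can be sketched as follows (hypothetical data and helper name; the memo's analysis applied the same subtraction to the real measurement samples):

```python
# Sketch of the "mean adj" correction: the difference between each
# sample's mean and the lowest sample mean among all streams is
# subtracted from every value in that sample, leaving only the shape
# of the delay distribution (delay variation) to be compared.
def mean_adjust(samples):
    """Return mean-adjusted copies of the samples (delay values in us)."""
    lowest_mean = min(sum(s) / len(s) for s in samples)
    return [[v - (sum(s) / len(s) - lowest_mean) for v in s]
            for s in samples]

streams = [[100_100, 100_300, 100_200],   # mean = 100200 us (lowest)
           [100_700, 100_900, 100_800]]   # mean = 100800 us
adjusted = mean_adjust(streams)
# After adjustment, both streams share the lowest mean (100200 us).
```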
+
+ Emulated Delay Sub-Sample size
+ Variation 0ms
+ adk.combined (all) 300 values 75 values
+ Adj. for ties raw mean adj raw mean adj
+ TC observed 226.6563 67.51559 54.01359 21.56513
+ P-value 0 0 0 0
+ Mean std dev (all),us 719 635
+ Mean diff of means,us 649 0 606 0
+
+ Variation +/- 2.5ms
+ adk.combined (all) 300 values 75 values
+ Adj. for ties raw mean adj raw mean adj
+ TC observed 14.50436 -1.60196 3.15935 -1.72104
+ P-value 0 0.873 0.00799 0.89038
+ Mean std dev (all),us 1655 1702
+ Mean diff of means,us 471 0 513 0
+
+ Variation +/- 5ms
+ adk.combined (all) 300 values 75 values
+ Adj. for ties raw mean adj raw mean adj
+ TC observed 8.29921 -1.28927 0.37878 -1.81881
+ P-value 0 0.81601 0.29984 0.90305
+ Mean std dev (all),us 3023 2991
+ Mean diff of means,us 582 0 513 0
+
+ Variation +/- 7.5ms
+ adk.combined (all) 300 values 75 values
+ Adj. for ties raw mean adj raw mean adj
+ TC observed 2.53759 -0.72985 0.29241 -1.15840
+ P-value 0.01950 0.66942 0.32585 0.78686
+ Mean std dev (all),us 4449 4506
+ Mean diff of means,us 426 0 856 0
+
+
+ From the table above, we conclude the following:
+
+ 1. None of the raw or mean adjusted results pass the ADK criterion
+ with 0 ms emulated delay variation. Use of the 75 value sub-
+ sample yielded the same conclusion. (We note the same results
+ when comparing same-implementation samples for both NetProbe and
+ Perfas+.)
+
+ 2. When the smallest emulated delay variation was inserted (+/-2.5
+ ms), the mean adjusted samples pass the ADK criterion and the
+ high P-value supports the result. The raw results do not pass.
+
+ 3. At higher values of emulated delay variation (+/-5.0 ms and
+ +/-7.5 ms), again the mean adjusted values pass ADK. We also see
+ that the 75-value sub-sample passed the ADK in both raw and mean
+ adjusted cases. This indicates that sample size may have played
+ a role in our results, as noted in the Appendix of [RFC2330] for
+ Goodness-of-Fit testing.
+
+ We note that 150 value sub-samples were also evaluated, with ADK
+ conclusions that followed the results for 300 values. Also, same-
+ implementation analysis was conducted with results similar to the
+ above, except that more of the "raw" or uncorrected samples passed
+ the ADK criterion.
+
+6.2. One-Way Delay, Loss Threshold, RFC 2679
+
+ This test determines if implementations use the same configured
+ maximum waiting time delay from one measurement to another under
+ different delay conditions, and correctly declare packets arriving in
+ excess of the waiting time threshold as lost.
+
+ See the requirements of Section 3.5 of [RFC2679], third bullet point,
+ and also Section 3.8.2 of [RFC2679].
+
+ 1. configure an L2TPv3 path between test sites, and each pair of
+ measurement devices to operate tests in their designated pair of
+ VLANs.
+
+ 2. configure the network emulator to add 1.0 sec. one-way constant
+ delay in one direction of transmission.
+
+ 3. measure (average) one-way delay with two or more implementations,
+ using identical waiting time thresholds (Thresh) for loss set at
+ 3 seconds.
+
+   4. configure the network emulator to add 3 sec. one-way constant
+      delay in one direction of transmission, equivalent to 2 seconds of
+      additional one-way delay (or change the path delay while the test
+      is in progress, when there are sufficient packets at the first
+      delay setting).
+
+ 5. repeat/continue measurements.
+
+ 6. observe that the increase measured in step 5 caused all packets
+ with 2 sec. additional delay to be declared lost, and that all
+ packets that arrive successfully in step 3 are assigned a valid
+ one-way delay.
+
+ The common parameters used for tests in this section are:
+
+ o IP header + payload = 64 octets
+
+ o Poisson sampling at lambda = 1 packet per second
+
+ o Test duration = 900 seconds total (March 21, 2011)
+
+ The netem emulator was set to add constant delays as specified in the
+ procedure above.
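Poisson sampling at lambda = 1 packet per second means the inter-packet gaps are drawn from an exponential distribution with mean 1/lambda (see [RFC2330]).  A minimal sketch of such a send schedule (helper name and seed are illustrative, not either implementation's scheduler):

```python
# Sketch of a Poisson sampling schedule: exponentially distributed
# inter-packet gaps with mean 1/lambda, per the Poisson sampling
# method of [RFC2330].
import random

def poisson_schedule(lam_pps, duration_s, seed=42):
    """Hypothetical helper: packet send times (s) over one test run."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        t += rng.expovariate(lam_pps)   # exponential gap, mean 1/lambda
        if t >= duration_s:
            return times
        times.append(t)

sched = poisson_schedule(lam_pps=1.0, duration_s=900)
# Roughly lambda * duration packets are expected (~900 here).
```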
+
+6.2.1. NetProbe Results for Loss Threshold
+
+ In NetProbe, the Loss Threshold is implemented uniformly over all
+ packets as a post-processing routine. With the Loss Threshold set at
+ 3 seconds, all packets with one-way delay >3 seconds are marked
+ "Lost" and included in the Lost Packet list with their transmission
+ time (as required in Section 3.3 of [RFC2680]). This resulted in 342
+ packets designated as lost in one of the test streams (with average
+ delay = 3.091 sec.).
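The post-processing described above can be sketched as follows (function and variable names are hypothetical, not NetProbe's actual routine):

```python
# Sketch of a Loss Threshold post-processing routine: singletons are
# stored as <send_time, delay> pairs, and any packet whose one-way
# delay exceeds the waiting-time threshold is reclassified as lost,
# keeping its transmission time as Section 3.3 of [RFC2680] requires.
THRESH_S = 3.0  # waiting-time threshold used in this test

def apply_loss_threshold(singletons, thresh=THRESH_S):
    delays = [(t, d) for (t, d) in singletons if d <= thresh]
    lost = [t for (t, d) in singletons if d > thresh]
    return delays, lost

# <send_time, one-way delay> in seconds; the last two exceed Thresh.
sample = [(0.0, 1.001), (1.0, 1.002), (2.0, 3.091), (3.0, 3.090)]
delays, lost = apply_loss_threshold(sample)
```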
+
+6.2.2. Perfas+ Results for Loss Threshold
+
+ Perfas+ uses a fixed Loss Threshold that was not adjustable during
+ this study. The Loss Threshold is approximately one minute, and
+ emulation of a delay of this size was not attempted. However, it is
+ possible to implement any delay threshold desired with a post-
+ processing routine and subsequent analysis. Using this method, 195
+ packets would be declared lost (with average delay = 3.091 sec.).
+
+6.2.3. Conclusions for Loss Threshold
+
+ Both implementations assume that any constant delay value desired can
+ be used as the Loss Threshold, since all delays are stored as a pair
+ <Time, Delay> as required in [RFC2679]. This is a simple way to
+ enforce the constant loss threshold envisioned in [RFC2679] (see
+ specific section references above). We take the position that the
+ assumption of post-processing is compliant and that the text of the
+ RFC should be revised slightly to include this point.
+
+6.3. One-Way Delay, First Bit to Last Bit, RFC 2679
+
+ This test determines if implementations register the same relative
+ change in delay from one packet size to another, indicating that the
+ first-to-last time-stamping convention has been followed. This test
+ tends to cancel the sources of error that may be present in an
+ implementation.
+
+ See the requirements of Section 3.7.2 of [RFC2679], and Section 10.2
+ of [RFC2330].
+
+ 1. configure an L2TPv3 path between test sites, and each pair of
+ measurement devices to operate tests in their designated pair of
+ VLANs, and ideally including a low-speed link (it was not
+ possible to change the link configuration during testing, so the
+ lowest speed link present was the basis for serialization time
+ comparisons).
+
+ 2. measure (average) one-way delay with two or more implementations,
+ using identical options and equal size small packets (64-octet IP
+ header and payload).
+
+ 3. maintain the same path with additional emulated 100 ms one-way
+ delay.
+
+ 4. measure (average) one-way delay with two or more implementations,
+ using identical options and equal size large packets (500 octet
+ IP header and payload).
+
+ 5. observe that the increase measured between steps 2 and 4 is
+ equivalent to the increase in ms expected due to the larger
+ serialization time for each implementation. Most of the
+ measurement errors in each system should cancel, if they are
+ stationary.
+
+ The common parameters used for tests in this section are:
+
+ o IP header + payload = 64 octets
+
+   o  Periodic sampling at 1 packet per second
+
+ o Test duration = 300 seconds total (April 12)
+
+ The netem emulator was set to add constant 100 ms delay.
+
+6.3.1. NetProbe and Perfas+ Results for Serialization
+
+ When the IP header + payload size was increased from 64 octets to 500
+ octets, there was a delay increase observed.
+
+ Mean Delays in us
+ NetProbe
+ Payload s1 s2 sA sB
+ 500 190893 191179 190892 190971
+ 64 189642 189785 189747 189467
+ Diff 1251 1394 1145 1505
+
+ Perfas
+ Payload p1 p2 p3 p4
+ 500 190908 190911 191126 190709
+ 64 189706 189752 189763 190220
+ Diff 1202 1159 1363 489
+
+ Serialization tests, all values in microseconds
+
+   The typical delay increase when the larger packets were used was 1.1
+   to 1.5 ms (with one outlier).  These measurements indicate that a
+   link with approximately 3 Mbit/s capacity is present on the path.
+
+ Through investigation of the facilities involved, it was determined
+ that the lowest speed link was approximately 45 Mbit/s, and therefore
+ the estimated difference should be about 0.077 ms. The observed
+ differences are much higher.
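The serialization arithmetic behind these figures is straightforward; the sketch below reproduces both the ~0.077 ms expectation for the 45 Mbit/s link and the ~3 Mbit/s rate implied by the observed differences (function name is illustrative):

```python
# Serialization-time arithmetic: the delay difference between two
# packet sizes on the slowest link is (large - small) * 8 / link_rate.
def serialization_diff_ms(large_octets, small_octets, link_bps):
    return (large_octets - small_octets) * 8 / link_bps * 1000.0

# Expected difference on the ~45 Mbit/s link actually present.
expected_ms = serialization_diff_ms(500, 64, 45e6)   # ~0.077 ms

# Link rate implied by the ~1.25 ms differences actually observed.
implied_bps = (500 - 64) * 8 / 1.25e-3               # ~2.8 Mbit/s
```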
+
+ The unexpected large delay difference was also the outcome when
+ testing serialization times in a lab environment, using the NIST Net
+ Emulator and NetProbe [ADV-METRICS].
+
+6.3.2. Conclusions for Serialization
+
+ Since it was not possible to confirm the estimated serialization time
+ increases in field tests, we resort to examination of the
+ implementations to determine compliance.
+
+ NetProbe performs all time stamping above the IP layer, accepting
+ that some compromises must be made to achieve extreme portability and
+ measurement scale. Therefore, the first-to-last bit convention is
+ supported because the serialization time is included in the one-way
+ delay measurement, enabling comparison with other implementations.
+
+ Perfas+ is optimized for its purpose and performs all time stamping
+ close to the interface hardware. The first-to-last bit convention is
+ supported because the serialization time is included in the one-way
+ delay measurement, enabling comparison with other implementations.
+
+6.4. One-Way Delay, Difference Sample Metric
+
+ This test determines if implementations register the same relative
+ increase in delay from one measurement to another under different
+ delay conditions. This test tends to cancel the sources of error
+ that may be present in an implementation.
+
+ This test is intended to evaluate measurements in Sections 3 and 4 of
+ [RFC2679].
+
+ 1. configure an L2TPv3 path between test sites, and each pair of
+ measurement devices to operate tests in their designated pair of
+ VLANs.
+
+ 2. measure (average) one-way delay with two or more implementations,
+ using identical options.
+
+ 3. configure the path with X+Y ms one-way delay.
+
+ 4. repeat measurements.
+
+ 5. observe that the (average) increase measured in steps 2 and 4 is
+ ~Y ms for each implementation. Most of the measurement errors in
+ each system should cancel, if they are stationary.
+
+ In this test, X = 1000 ms and Y = 1000 ms.
+
+ The common parameters used for tests in this section are:
+
+ o IP header + payload = 64 octets
+
+ o Poisson sampling at lambda = 1 packet per second
+
+ o Test duration = 900 seconds total (March 21, 2011)
+
+ The netem emulator was set to add constant delays as specified in the
+ procedure above.
+
+6.4.1. NetProbe Results for Differential Delay
+
+ Average pre-increase delay, microseconds 1089868.0
+ Average post 1 s additional, microseconds 2089686.0
+ Difference (should be ~= Y = 1 s) 999818.0
+
+ Average delays before/after 1 second increase
+
+ The NetProbe implementation observed a 1 second increase with a 182
+ microsecond error (assuming that the netem emulated delay difference
+ is exact).
+
+ We note that this differential delay test has been run under lab
+ conditions and published in prior work [ADV-METRICS]. The error was
+ 6 microseconds.
+
+6.4.2. Perfas+ Results for Differential Delay
+
+ Average pre-increase delay, microseconds 1089794.0
+ Average post 1 s additional, microseconds 2089801.0
+ Difference (should be ~= Y = 1 s) 1000007.0
+
+ Average delays before/after 1 second increase
+
+ The Perfas+ implementation observed a 1 second increase with a 7
+ microsecond error.
+
+6.4.3. Conclusions for Differential Delay
+
+ Again, the live network conditions appear to have influenced the
+ results, but both implementations measured the same delay increase
+ within their calibration accuracy.
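The check applied in Sections 6.4.1 and 6.4.2 reduces to simple arithmetic on the tabulated means (helper name is illustrative):

```python
# Differential delay check: the measured increase in mean one-way delay
# should equal the emulated increase Y; the residual is the error.
Y_US = 1_000_000  # emulated delay increase, microseconds

def differential_error_us(mean_before, mean_after, y_us=Y_US):
    return (mean_after - mean_before) - y_us

# Values from the tables above, in microseconds.
netprobe_err = differential_error_us(1089868.0, 2089686.0)   # -182 us
perfas_err = differential_error_us(1089794.0, 2089801.0)     # +7 us
```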
+
+6.5. Implementation of Statistics for One-Way Delay
+
+ The ADK tests the extent to which the sample distributions of one-way
+ delay singletons from two implementations of [RFC2679] appear to be
+ from the same overall distribution. By testing this way, we
+ economize on the number of comparisons, because comparing a set of
+ individual summary statistics (as defined in Section 5 of [RFC2679])
+ would require another set of individual evaluations of equivalence.
+ Instead, we can simply check which statistics were implemented, and
+ report on those facts, noting that Section 5 of [RFC2679] does not
+ specify the calculations exactly, and gives only some illustrative
+ examples.
+
+ NetProbe Perfas+
+
+ 5.1. Type-P-One-way-Delay-Percentile yes no
+
+ 5.2. Type-P-One-way-Delay-Median yes no
+
+ 5.3. Type-P-One-way-Delay-Minimum yes yes
+
+ 5.4. Type-P-One-way-Delay-Inverse-Percentile no no
+
+ Implementation of Section 5 Statistics
+
+   Type-P-One-way-Delay-Inverse-Percentile is the only statistic
+   implemented by neither system, so it is a candidate for removal or
+   deprecation in a revision of RFC 2679 (this small discrepancy does
+   not affect candidacy for advancement).
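As a hedged sketch (not either implementation's code), the Section 5 statistics compared above can be computed from a finite sample of singletons as follows; per [RFC2679], an undefined (lost-packet) delay is treated as greater than any finite delay, so such singletons sort to the end and inflate high percentiles:

```python
# Illustrative sketch of the Section 5 statistics of [RFC2679]:
# percentile (5.1), median (5.2), and minimum (5.3) over a finite
# sample of one-way delay singletons.  Lost packets are represented
# here as infinite delay, matching the RFC's convention.
import math

def owd_percentile(delays, x):
    """Smallest y such that at least x% of singletons are <= y (5.1)."""
    s = sorted(delays)
    k = math.ceil(x / 100.0 * len(s))
    return s[max(k, 1) - 1]

def owd_median(delays):
    """Section 5.2: central value; mean of central pair if count is even."""
    s = sorted(delays)
    n = len(s)
    return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2

# Four received singletons (seconds) plus one lost packet.
sample = [0.100, 0.102, 0.103, 0.104, math.inf]
```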
+
+7. Conclusions and RFC 2679 Errata
+
+ The conclusions throughout Section 6 support the advancement of
+ [RFC2679] to the next step of the Standards Track, because its
+ requirements are deemed to be clear and unambiguous based on
+ evaluation of the test results for two implementations. The results
+ indicate that these implementations produced statistically equivalent
+ results under network conditions that were configured to be as close
+ to identical as possible.
+
+ Sections 6.2.3 and 6.5 indicate areas where minor revisions are
+ warranted in RFC 2679. The IETF has reached consensus on guidance
+ for reporting metrics in [RFC6703], and this memo should be
+ referenced in the revision to RFC 2679 to incorporate recent
+ experience where appropriate.
+
+ We note that there is currently one erratum with status "Held for
+ Document Update" for [RFC2679], and it appears this minor revision
+ and additional text should be incorporated in a revision of RFC 2679.
+
+ The authors that revise [RFC2679] should review all errata filed at
+ the time the document is being written. They should not rely upon
+ this document to indicate all relevant errata updates.
+
+8. Security Considerations
+
+ The security considerations that apply to any active measurement of
+ live networks are relevant here as well. See [RFC4656] and
+ [RFC5357].
+
+9. Acknowledgements
+
+ The authors thank Lars Eggert for his continued encouragement to
+ advance the IPPM metrics during his tenure as AD Advisor.
+
+ Nicole Kowalski supplied the needed CPE router for the NetProbe side
+ of the test setup, and graciously managed her testing in spite of
+ issues caused by dual-use of the router. Thanks Nicole!
+
+ The "NetProbe Team" also acknowledges many useful discussions with
+ Ganga Maguluri.
+
+10. References
+
+10.1. Normative References
+
+ [RFC2026] Bradner, S., "The Internet Standards Process -- Revision
+ 3", BCP 9, RFC 2026, October 1996.
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+ [RFC2330] Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
+ "Framework for IP Performance Metrics", RFC 2330,
+ May 1998.
+
+ [RFC2679] Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
+ Delay Metric for IPPM", RFC 2679, September 1999.
+
+ [RFC2680] Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
+ Packet Loss Metric for IPPM", RFC 2680, September 1999.
+
+ [RFC3432] Raisanen, V., Grotefeld, G., and A. Morton, "Network
+ performance measurement with periodic streams", RFC 3432,
+ November 2002.
+
+ [RFC4656] Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M.
+ Zekauskas, "A One-way Active Measurement Protocol
+ (OWAMP)", RFC 4656, September 2006.
+
+ [RFC5357] Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
+ Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
+ RFC 5357, October 2008.
+
+ [RFC5657] Dusseault, L. and R. Sparks, "Guidance on Interoperation
+ and Implementation Reports for Advancement to Draft
+ Standard", BCP 9, RFC 5657, September 2009.
+
+ [RFC6576] Geib, R., Morton, A., Fardid, R., and A. Steinmitz, "IP
+ Performance Metrics (IPPM) Standard Advancement Testing",
+ BCP 176, RFC 6576, March 2012.
+
+ [RFC6703] Morton, A., Ramachandran, G., and G. Maguluri, "Reporting
+ IP Network Performance Metrics: Different Points of View",
+ RFC 6703, August 2012.
+
+10.2. Informative References
+
+ [ADK] Scholz, F. and M. Stephens, "K-sample Anderson-Darling
+ Tests of fit, for continuous and discrete cases",
+ University of Washington, Technical Report No. 81,
+ May 1986.
+
+ [ADV-METRICS]
+ Morton, A., "Lab Test Results for Advancing Metrics on the
+ Standards Track", Work in Progress, October 2010.
+
+ [Fedora14] Fedora Project, "Fedora Project Home Page", 2012,
+ <http://fedoraproject.org/>.
+
+ [METRICS-TEST]
+ Bradner, S. and V. Paxson, "Advancement of metrics
+ specifications on the IETF Standards Track", Work
+ in Progress, August 2007.
+
+ [Perfas] Heidemann, C., "Qualitaet in IP-Netzen Messverfahren",
+ published by ITG Fachgruppe, 2nd meeting 5.2.3 (NGN),
+ November 2001, <http://www.itg523.de/oeffentlich/01nov/
+ Heidemann_QOS_Messverfahren.pdf>.
+
+ [RFC3931] Lau, J., Townsley, M., and I. Goyret, "Layer Two Tunneling
+ Protocol - Version 3 (L2TPv3)", RFC 3931, March 2005.
+
+ [Radk] Scholz, F., "adk: Anderson-Darling K-Sample Test and
+ Combinations of Such Tests. R package version 1.0.", 2008.
+
+ [Rtool] R Development Core Team, "R: A language and environment
+ for statistical computing. R Foundation for Statistical
+ Computing, Vienna, Austria. ISBN 3-900051-07-0", 2011,
+ <http://www.R-project.org/>.
+
+ [WIPM] AT&T, "AT&T Global IP Network", 2012,
+ <http://ipnetwork.bgtmo.ip.att.net/pws/index.html>.
+
+ [netem] The Linux Foundation, "netem", 2009,
+ <http://www.linuxfoundation.org/collaborate/workgroups/
+ networking/netem>.
+
+Authors' Addresses
+
+ Len Ciavattone
+ AT&T Labs
+ 200 Laurel Avenue South
+ Middletown, NJ 07748
+ USA
+
+ Phone: +1 732 420 1239
+ EMail: lencia@att.com
+
+
+ Ruediger Geib
+ Deutsche Telekom
+ Heinrich Hertz Str. 3-7
+ Darmstadt, 64295
+ Germany
+
+ Phone: +49 6151 58 12747
+ EMail: Ruediger.Geib@telekom.de
+
+
+ Al Morton
+ AT&T Labs
+ 200 Laurel Avenue South
+ Middletown, NJ 07748
+ USA
+
+ Phone: +1 732 420 1571
+ Fax: +1 732 368 1192
+ EMail: acmorton@att.com
+ URI: http://home.comcast.net/~acmacm/
+
+
+ Matthias Wieser
+ Technical University Darmstadt
+ Darmstadt,
+ Germany
+
+ EMail: matthias_michael.wieser@stud.tu-darmstadt.de
+