Diffstat (limited to 'doc/rfc/rfc8239.txt')
-rw-r--r--   doc/rfc/rfc8239.txt   1067
1 file changed, 1067 insertions(+), 0 deletions(-)
diff --git a/doc/rfc/rfc8239.txt b/doc/rfc/rfc8239.txt
new file mode 100644
index 0000000..7072467
--- /dev/null
+++ b/doc/rfc/rfc8239.txt
@@ -0,0 +1,1067 @@
+
+
+
+
+
+
+Internet Engineering Task Force (IETF) L. Avramov
+Request for Comments: 8239 Google
+Category: Informational J. Rapp
+ISSN: 2070-1721 VMware
+ August 2017
+
+
+ Data Center Benchmarking Methodology
+
+Abstract
+
+ The purpose of this informational document is to establish test and
+ evaluation methodology and measurement techniques for physical
+ network equipment in the data center. RFC 8238 is a prerequisite for
+ this document, as it contains terminology that is considered
+ normative. Many of these terms and methods may be applicable beyond
+ the scope of this document as the technologies originally applied in
+ the data center are deployed elsewhere.
+
+Status of This Memo
+
+ This document is not an Internet Standards Track specification; it is
+ published for informational purposes.
+
+ This document is a product of the Internet Engineering Task Force
+ (IETF). It represents the consensus of the IETF community. It has
+ received public review and has been approved for publication by the
+ Internet Engineering Steering Group (IESG). Not all documents
+ approved by the IESG are a candidate for any level of Internet
+ Standard; see Section 2 of RFC 7841.
+
+ Information about the current status of this document, any errata,
+ and how to provide feedback on it may be obtained at
+ http://www.rfc-editor.org/info/rfc8239.
+
+Copyright Notice
+
+ Copyright (c) 2017 IETF Trust and the persons identified as the
+ document authors. All rights reserved.
+
+ This document is subject to BCP 78 and the IETF Trust's Legal
+ Provisions Relating to IETF Documents
+ (http://trustee.ietf.org/license-info) in effect on the date of
+ publication of this document. Please review these documents
+ carefully, as they describe your rights and restrictions with respect
+ to this document. Code Components extracted from this document must
+ include Simplified BSD License text as described in Section 4.e of
+ the Trust Legal Provisions and are provided without warranty as
+ described in the Simplified BSD License.
+
+Table of Contents
+
+ 1. Introduction
+    1.1. Requirements Language
+    1.2. Methodology Format and Repeatability Recommendation
+ 2. Line-Rate Testing
+    2.1. Objective
+    2.2. Methodology
+    2.3. Reporting Format
+ 3. Buffering Testing
+    3.1. Objective
+    3.2. Methodology
+    3.3. Reporting Format
+ 4. Microburst Testing
+    4.1. Objective
+    4.2. Methodology
+    4.3. Reporting Format
+ 5. Head-of-Line Blocking
+    5.1. Objective
+    5.2. Methodology
+    5.3. Reporting Format
+ 6. Incast Stateful and Stateless Traffic
+    6.1. Objective
+    6.2. Methodology
+    6.3. Reporting Format
+ 7. Security Considerations
+ 8. IANA Considerations
+ 9. References
+    9.1. Normative References
+    9.2. Informative References
+ Acknowledgments
+ Authors' Addresses
+
+1. Introduction
+
+ Traffic patterns in the data center are not uniform and are
+ constantly changing. They are dictated by the nature and variety of
+ applications utilized in the data center. They can be largely
+ east-west traffic flows (server to server inside the data center) in
+ one data center and north-south (from the outside of the data center
+ to the server) in another, while others may combine both. Traffic
+ patterns can be bursty in nature and contain many-to-one,
+ many-to-many, or one-to-many flows. Each flow may also be small and
+ latency sensitive or large and throughput sensitive while containing
+ a mix of UDP and TCP traffic. All of these can coexist in a single
+ cluster and flow through a single network device simultaneously.
+ Benchmarking tests for network devices have long used [RFC1242],
+ [RFC2432], [RFC2544], [RFC2889], and [RFC3918], which have largely
+ focused on various latency attributes and throughput [RFC2889] of
+ the Device Under Test (DUT) being benchmarked.  These
+ standards are good at measuring theoretical throughput, forwarding
+ rates, and latency under testing conditions; however, they do not
+ represent real traffic patterns that may affect these networking
+ devices.
+
+ Currently, typical data center networking devices are
+ characterized by:
+
+ - High port density (48 ports or more).
+
+ - High speed (currently, up to 100 Gb/s per port).
+
+ - High throughput (line rate on all ports for Layer 2 and/or
+ Layer 3).
+
+ - Low latency (in the microsecond or nanosecond range).
+
+ - Low amount of buffer (in the MB range per networking device).
+
+ - Layer 2 and Layer 3 forwarding capability (Layer 3 not mandatory).
+
+ This document provides a methodology for benchmarking data center
+ physical network equipment DUTs, including congestion scenarios,
+ switch buffer analysis, microburst, and head-of-line blocking, while
+ also using a wide mix of traffic conditions. [RFC8238] is a
+ prerequisite for this document, as it contains terminology that is
+ considered normative.
+
+1.1. Requirements Language
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
+ "OPTIONAL" in this document are to be interpreted as described in
+ BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
+ capitals, as shown here.
+
+1.2. Methodology Format and Repeatability Recommendation
+
+ The following format is used in Sections 2 through 6 of this
+ document:
+
+ - Objective
+
+ - Methodology
+
+ - Reporting Format
+
+ For each test methodology described in this document, it is critical
+ that repeatability of the results be obtained. The recommendation is
+ to perform enough iterations of the given test and to make sure that
+ the result is consistent. This is especially important in the
+ context of the tests described in Section 3, as the buffering testing
+ has historically been the least reliable. The number of iterations
+ SHOULD be explicitly reported. The relative standard deviation
+ SHOULD be below 10%.
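+
+ As an illustration, the relative standard deviation across
+ iterations can be computed as in the following Python sketch.  The
+ sketch is not part of the methodology itself, and the sample values
+ in it are hypothetical:
+
+   import statistics
+
+   def relative_std_dev(samples):
+       """Relative standard deviation (percent) of iteration results."""
+       return (statistics.stdev(samples) / statistics.mean(samples)) * 100.0
+
+   # Hypothetical throughput readings (percent of line rate) from
+   # five iterations of the same test.
+   results = [99.98, 99.97, 99.98, 99.96, 99.98]
+   rsd = relative_std_dev(results)
+   print(f"iterations={len(results)}  RSD={rsd:.3f}%")
+   assert rsd < 10.0, "results are not repeatable enough"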
+
+2. Line-Rate Testing
+
+2.1. Objective
+
+ The objective of this test is to provide a "maximum rate" test for
+ the performance values for throughput, latency, and jitter. It is
+ meant to provide (1) the tests to perform and (2) methodology for
+ verifying that a DUT is capable of forwarding packets at line rate
+ under non-congested conditions.
+
+2.2. Methodology
+
+ A traffic generator SHOULD be connected to all ports on the DUT. Two
+ tests MUST be conducted: (1) a port-pair test [RFC2544] [RFC3918] and
+ (2) a test using a full-mesh DUT [RFC2889] [RFC3918].
+
+ For all tests, the traffic generator's sending rate MUST be less than
+ or equal to 99.98% of the nominal value of the line rate (with no
+ further Parts Per Million (PPM) adjustment to account for interface
+ clock tolerances), to ensure stressing of the DUT in reasonable
+ worst-case conditions (see [RFC8238], Section 5 for more details).
+ Test results at a lower rate MAY be provided for better understanding
+ of performance increase in terms of latency and jitter when the rate
+ is lower than 99.98%. The receiving rate of the traffic SHOULD be
+ captured during this test as a percentage of line rate.
+
+ The test MUST provide the statistics of minimum, average, and
+ maximum of the latency distribution, for the exact same iteration of
+ the test.
+
+ The test MUST provide the statistics of minimum, average, and maximum
+ of the jitter distribution, for the exact same iteration of the test.
+
+ Alternatively, when a traffic generator cannot be connected to all
+ ports on the DUT, a snake test MUST be used for line-rate testing,
+ excluding latency and jitter, as those would become irrelevant. The
+ snake test is performed as follows:
+
+ - Connect the first and last port of the DUT to a traffic generator.
+
+ - Connect, back to back and sequentially, all the ports in between:
+ port 2 to port 3, port 4 to port 5, etc., to port N-2 to port N-1,
+ where N is the total number of ports of the DUT.
+
+ - Configure port 1 and port 2 in the same VLAN X, port 3 and port 4
+ in the same VLAN Y, etc., and port N-1 and port N in the same
+ VLAN Z.
+
+ This snake test provides the capability to test line rate for Layer 2
+ and Layer 3 [RFC2544] [RFC3918] in instances where a traffic
+ generator with only two ports is available. Latency and jitter are
+ not to be considered for this test.
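+
+ For illustration only, the cabling and VLAN plan of the snake test
+ can be generated programmatically, as in the following Python sketch
+ (it assumes a DUT with an even number of ports, numbered 1 to N):
+
+   def snake_test_plan(n_ports):
+       """Return (cables, vlans) port pairs for the snake test."""
+       assert n_ports % 2 == 0, "assumes an even port count"
+       # Back-to-back cables: port 2 to 3, port 4 to 5, ..., N-2 to N-1.
+       cables = [(p, p + 1) for p in range(2, n_ports - 1, 2)]
+       # VLAN bridging: ports 1 and 2, ports 3 and 4, ..., N-1 and N.
+       vlans = [(p, p + 1) for p in range(1, n_ports, 2)]
+       return cables, vlans
+
+   cables, vlans = snake_test_plan(8)
+   print("cables:", cables)   # [(2, 3), (4, 5), (6, 7)]
+   print("vlans: ", vlans)    # [(1, 2), (3, 4), (5, 6), (7, 8)]
+
+ Ports 1 and N remain connected to the two traffic generator ports.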
+
+2.3. Reporting Format
+
+ The report MUST include the following:
+
+ - Physical-layer calibration information, as defined in [RFC8238],
+ Section 4.
+
+ - Number of ports used.
+
+ - Reading for "throughput received as a percentage of bandwidth",
+ while sending 99.98% of the nominal value of the line rate on each
+ port, for each packet size from 64 bytes to 9216 bytes. As
+ guidance, with a packet-size increment of 64 bytes between each
+ iteration being ideal, 256-byte and 512-byte packets are also
+ often used. The most common packet-size ordering for the report
+ is 64 bytes, 128 bytes, 256 bytes, 512 bytes, 1024 bytes,
+ 1518 bytes, 4096 bytes, 8000 bytes, and 9216 bytes.
+
+ The pattern for testing can be expressed using [RFC6985].
+
+ - Throughput needs to be expressed as a percentage of total
+ transmitted frames.
+
+ - Packet drops MUST be expressed as a count of packets and SHOULD be
+ expressed as a percentage of line rate.
+
+ - For latency and jitter, values are expressed in units of time
+ (usually microseconds or nanoseconds), reading across packet sizes
+ from 64 bytes to 9216 bytes.
+
+ - For latency and jitter, provide minimum, average, and maximum
+ values. If different iterations are done to gather the minimum,
+ average, and maximum values, this SHOULD be specified in the
+ report, along with a justification for why the information could
+ not have been gathered in the same test iteration.
+
+ - For jitter, a histogram describing the population of packets
+ measured per latency or latency buckets is RECOMMENDED.
+
+ - The tests for throughput, latency, and jitter MAY be conducted as
+ individual independent trials, with proper documentation provided
+ in the report, but SHOULD be conducted at the same time.
+
+ - The methodology assumes that the DUT has at least nine ports, as
+   certain tests in this document require nine or more ports.
+
+3. Buffering Testing
+
+3.1. Objective
+
+ The objective of this test is to measure the size of the buffer of a
+ DUT under a variety of conditions.  Buffer architectures
+ between multiple DUTs can differ and include egress buffering, shared
+ egress buffering SoC (Switch-on-Chip), ingress buffering, or a
+ combination thereof. The test methodology covers the buffer
+ measurement, regardless of buffer architecture used in the DUT.
+
+3.2. Methodology
+
+ A traffic generator MUST be connected to all ports on the DUT.  The
+ methodology for measuring buffering for a data center switch is based
+ on creating known congestion with a known, fixed packet size while
+ measuring the maximum latency value.  The maximum latency increases
+ until the first packet drop occurs; from that point on, the maximum
+ latency value remains constant.  This is the inflection point at
+ which the maximum latency changes to a constant value.  There MUST be
+ multiple ingress ports receiving a known number of frames of a known,
+ fixed size, destined for the same egress port, in order to create a
+ known congestion condition.  The number of packets sent from the
+ oversubscribing port up to the inflection point, minus one,
+ multiplied by the packet size, represents the maximum port buffer
+ size at the measured inflection point.
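+
+ The following Python sketch illustrates this computation; it is not
+ part of the methodology, and the trial data in it is hypothetical.
+ Each trial records the number of frames sent from the oversubscribing
+ port and the maximum latency measured:
+
+   def port_buffer_size(trials, packet_size):
+       """Estimate buffer size from (frames_sent, max_latency) trials.
+
+       trials must be sorted by frames_sent; the inflection point is
+       reached when the maximum latency stops increasing.
+       """
+       for prev, curr in zip(trials, trials[1:]):
+           if curr[1] <= prev[1]:               # latency has flattened
+               return (prev[0] - 1) * packet_size
+       return None                              # inflection not reached
+
+   # Hypothetical data: maximum latency (us) flattens at 15000 frames.
+   trials = [(5000, 110.0), (10000, 220.0), (15000, 330.0),
+             (20000, 330.0)]
+   print(port_buffer_size(trials, packet_size=64), "bytes")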
+
+ Note that the tests described in procedures 1), 2), 3), and 4) in
+ this section have iterations called "first iteration", "second
+ iteration", and "last iteration". The idea is to show the first
+ two iterations so the reader understands the logic of how to keep
+ incrementing the iterations. The last iteration shows the end state
+ of the variables.
+
+ 1) Measure the highest buffer efficiency.
+
+ o First iteration: Ingress port 1 sending 64-byte packets at line
+ rate to egress port 2, while port 3 is sending a known low
+ amount of oversubscription traffic (1% recommended) with the
+ same packet size of 64 bytes to egress port 2.  Measure the
+ buffer size as the number of frames sent from the oversubscribing
+ port up to the inflection point, multiplied by the frame size.
+
+ o Second iteration: Ingress port 1 sending 65-byte packets at
+ line rate to egress port 2, while port 3 is sending a known low
+ amount of oversubscription traffic (1% recommended) with the
+ same packet size of 65 bytes to egress port 2.  Measure the
+ buffer size as the number of frames sent from the oversubscribing
+ port up to the inflection point, multiplied by the frame size.
+
+ o Last iteration: Ingress port 1 sending packets of size B bytes
+ at line rate to egress port 2, while port 3 is sending a known
+ low amount of oversubscription traffic (1% recommended) with
+ the same packet size of B bytes to egress port 2.  Measure the
+ buffer size as the number of frames sent from the oversubscribing
+ port up to the inflection point, multiplied by the frame size.
+
+ The packet size B that yields the largest measured buffer size is
+ the size that allows the highest buffer efficiency.
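+
+ The sweep over packet sizes can be sketched as follows; the function
+ measure_buffer_size is a hypothetical stand-in for one iteration of
+ the procedure above, not a real API:
+
+   def find_most_efficient_packet_size(measure_buffer_size,
+                                       min_size=64, max_size=9216):
+       """Sweep packet size B; return the B giving the largest buffer."""
+       best_b, best_buffer = None, -1
+       for b in range(min_size, max_size + 1):
+           buffer_bytes = measure_buffer_size(b)   # one iteration at B
+           if buffer_bytes > best_buffer:
+               best_b, best_buffer = b, buffer_bytes
+       return best_b, best_buffer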
+
+ 2) Measure maximum port buffer size.
+
+ At fixed packet size B as determined in procedure 1), for a fixed
+ default Differentiated Services Code Point (DSCP) / Class of
+ Service (CoS) value of 0 and for unicast traffic, proceed with the
+ following:
+
+ o First iteration: Ingress port 1 sending line rate to egress
+ port 2, while port 3 is sending a known low amount of
+ oversubscription traffic (1% recommended) with the same packet
+ size to egress port 2. Measure the buffer size value by
+ multiplying the number of extra frames sent by the frame size.
+
+ o Second iteration: Ingress port 2 sending line rate to egress
+ port 3, while port 4 is sending a known low amount of
+ oversubscription traffic (1% recommended) with the same packet
+ size to egress port 3. Measure the buffer size value by
+ multiplying the number of extra frames sent by the frame size.
+
+ o Last iteration: Ingress port N-2 sending line rate to egress
+ port N-1, while port N is sending a known low amount of
+ oversubscription traffic (1% recommended) with the same packet
+ size to egress port N-1.  Measure the buffer size value by
+ multiplying the number of extra frames sent by the frame size.
+
+ This test series MAY be repeated using all different DSCP/CoS
+ values of traffic, and then using multicast traffic, in order to
+ find out if there is any DSCP/CoS impact on the buffer size.
+
+ 3) Measure maximum port pair buffer sizes.
+
+ o First iteration: Ingress port 1 sending line rate to egress
+ port 2, ingress port 3 sending line rate to egress port 4, etc.
+ Ingress port N-1 and port N will oversubscribe, at 1% of line
+ rate, egress port 2 and port 3, respectively. Measure the
+ buffer size value by multiplying the number of extra frames
+ sent by the frame size for each egress port.
+
+ o Second iteration: Ingress port 1 sending line rate to egress
+ port 2, ingress port 3 sending line rate to egress port 4, etc.
+ Ingress port N-1 and port N will oversubscribe, at 1% of line
+ rate, egress port 4 and port 5, respectively. Measure the
+ buffer size value by multiplying the number of extra frames
+ sent by the frame size for each egress port.
+
+ o Last iteration: Ingress port 1 sending line rate to egress
+ port 2, ingress port 3 sending line rate to egress port 4, etc.
+ Ingress port N-1 and port N will oversubscribe, at 1% of line
+ rate, egress port N-3 and port N-2, respectively. Measure the
+ buffer size value by multiplying the number of extra frames
+ sent by the frame size for each egress port.
+
+ This test series MAY be repeated using all different DSCP/CoS
+ values of traffic and then using multicast traffic.
+
+ 4) Measure maximum DUT buffer size with many-to-one ports.
+
+ o First iteration: Ingress ports 1,2,... N-1 each sending
+ [(1/[N-1])*99.98]+[1/[N-1]] % of line rate per port to egress
+ port N.
+
+ o Second iteration: Ingress ports 2,... N each sending
+ [(1/[N-1])*99.98]+[1/[N-1]] % of line rate per port to egress
+ port 1.
+
+ o Last iteration: Ingress ports N,1,2...N-2 each sending
+ [(1/[N-1])*99.98]+[1/[N-1]] % of line rate per port to egress
+ port N-1.
+
+ This test series MAY be repeated using all different CoS values of
+ traffic and then using multicast traffic.
+
+ Unicast traffic, and then multicast traffic, SHOULD be used in order
+ to determine the proportion of buffer for the documented selection of
+ tests. Also, the CoS value for the packets SHOULD be provided for
+ each test iteration, as the buffer allocation size MAY differ per CoS
+ value. It is RECOMMENDED that the ingress and egress ports be varied
+ in a random but documented fashion in multiple tests in order to
+ measure the buffer size for each port of the DUT.
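+
+ The per-port rate used in procedure 4) follows directly from the
+ formula above, as this small Python sketch shows (illustrative only):
+
+   def many_to_one_rate(n_ports):
+       """Percent of line rate each of the N-1 ingress ports sends."""
+       share = 1.0 / (n_ports - 1)
+       return share * 99.98 + share   # [(1/(N-1))*99.98] + [1/(N-1)]
+
+   for n in (8, 32, 48):
+       rate = many_to_one_rate(n)
+       print(f"N={n}: {rate:.4f}% per port, "
+             f"{rate * (n - 1):.2f}% aggregate on the egress port")
+
+ The aggregate slightly exceeds 100% of line rate, which creates the
+ intended congestion on the egress port.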
+
+3.3. Reporting Format
+
+ The report MUST include the following:
+
+ - The packet size that results in the most efficient buffer use,
+ along with the DSCP/CoS value.
+
+ - The maximum port buffer size for each port.
+
+ - The maximum DUT buffer size.
+
+ - The packet size used in the test.
+
+ - The amount of oversubscription, if different than 1%.
+
+ - The number of ingress and egress ports, along with their location
+ on the DUT.
+
+ - The repeatability of the test needs to be indicated: the number of
+ iterations of the same test and the percentage of variation
+ between results for each of the tests (min, max, avg).
+
+ The percentage of variation is a metric providing a sense of how big
+ the difference is between the measured value and the previous values.
+
+ For example, for a latency test where the minimum latency is
+ measured, the percentage of variation (PV) of the minimum latency
+ will indicate by how much this value has varied between the current
+ test executed and the previous one.
+
+ PV = ((x2-x1)/x1)*100, where x2 is the minimum latency value in the
+ current test and x1 is the minimum latency value obtained in the
+ previous test.
+
+ The same formula is used for maximum and average variations measured.
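+
+ As a worked example, the PV computation can be expressed as follows
+ (a minimal Python sketch; the latency values are hypothetical):
+
+   def percentage_of_variation(x1, x2):
+       """PV = ((x2 - x1) / x1) * 100, per the formula above."""
+       return ((x2 - x1) / x1) * 100.0
+
+   # Hypothetical minimum latency (us) from two consecutive runs.
+   previous_min, current_min = 1.05, 1.08
+   pv = percentage_of_variation(previous_min, current_min)
+   print(f"PV of minimum latency: {pv:+.2f}%")   # +2.86%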
+
+4. Microburst Testing
+
+4.1. Objective
+
+ The objective of this test is to find the maximum burst size, in
+ packets, that a DUT can sustain under various configurations.
+
+ This test provides additional methodology that supplements the tests
+ described in [RFC1242], [RFC2432], [RFC2544], [RFC2889], and
+ [RFC3918].
+
+ - All bursts should be sent with 100% intensity. Note: "Intensity"
+ is defined in [RFC8238], Section 6.1.1.
+
+ - All ports of the DUT must be used for this test.
+
+ - It is recommended that all ports be tested simultaneously.
+
+4.2. Methodology
+
+ A traffic generator MUST be connected to all ports on the DUT. In
+ order to cause congestion, two or more ingress ports MUST send bursts
+ of packets destined for the same egress port. The simplest of the
+ setups would be two ingress ports and one egress port (2 to 1).
+
+ The burst MUST be sent with an intensity (as defined in [RFC8238],
+ Section 6.1.1) of 100%, meaning that the burst of packets will be
+ sent with a minimum interpacket gap.  The number of packets contained
+ in the burst is the trial variable and is increased until non-zero
+ packet loss is measured.  The aggregate number of packets from all
+ the senders is used to calculate the maximum microburst size that the
+ DUT can sustain.
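+
+ As an illustration, the largest loss-free burst can be found with a
+ binary search over the burst size instead of the linear increase
+ described above, assuming packet loss is monotonic in burst size.
+ The function send_burst_and_count_drops is a hypothetical stand-in
+ for driving the traffic generator:
+
+   def max_microburst(send_burst_and_count_drops, limit=10**7):
+       """Largest burst size (in packets) measured with zero loss."""
+       lo, hi, best = 1, limit, 0
+       while lo <= hi:
+           mid = (lo + hi) // 2
+           if send_burst_and_count_drops(mid) == 0:   # no loss: go up
+               best, lo = mid, mid + 1
+           else:                                      # loss: go down
+               hi = mid - 1
+       return best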
+
+ It is RECOMMENDED that the ingress and egress ports be varied in
+ multiple tests in order to measure the maximum microburst capacity.
+
+ The intensity of a microburst (see [RFC8238], Section 6.1.1) MAY be
+ varied in order to obtain the microburst capacity at various
+ ingress rates.
+
+ It is RECOMMENDED that all ports on the DUT be tested simultaneously,
+ and in various configurations, in order to understand all the
+ combinations of ingress ports, egress ports, and intensities.
+
+ An example would be:
+
+ o First iteration: N-1 ingress ports sending to one egress port.
+
+ o Second iteration: N-2 ingress ports sending to two egress ports.
+
+ o Last iteration: Two ingress ports sending to N-2 egress ports.
+
+4.3. Reporting Format
+
+ The report MUST include the following:
+
+ - The maximum number of packets received per ingress port with the
+ maximum burst size obtained with zero packet loss.
+
+ - The packet size used in the test.
+
+ - The number of ingress and egress ports, along with their location
+ on the DUT.
+
+ - The repeatability of the test needs to be indicated: the number of
+ iterations of the same test and the percentage of variation
+ between results (min, max, avg).
+
+5. Head-of-Line Blocking
+
+5.1. Objective
+
+ Head-of-line blocking (HOLB) is a performance-limiting phenomenon
+ that occurs when packets are held up by the first packet ahead
+ waiting to be transmitted to a different output port. This is
+ defined in RFC 2889, Section 5.5 ("Congestion Control"). This
+ section expands on RFC 2889 in the context of data center
+ benchmarking.
+
+ The objective of this test is to understand the DUT's behavior in the
+ HOLB scenario and measure the packet loss.
+
+ The differences between this HOLB test and RFC 2889 are as follows:
+
+ - This HOLB test starts with eight ports in two groups of four ports
+ each, instead of four ports (as compared with Section 5.5 of
+ RFC 2889).
+
+ - This HOLB test shifts all the port numbers by one in a second
+ iteration of the test; this is new, as compared to the HOLB test
+ described in RFC 2889. The shifting port numbers continue until
+ all ports are the first in the group; the purpose of this is to
+ make sure that all permutations are tested in order to cover
+ differences in behavior in the SoC of the DUT.
+
+ - Another test within this HOLB test expands the group of ports,
+ such that traffic is divided among four ports instead of two
+ (25% instead of 50% per port).
+
+ - Section 5.3 lists requirements that supplement the requirements
+ listed in RFC 2889, Section 5.5.
+
+5.2. Methodology
+
+ In order to cause congestion in the form of HOLB, groups of
+ four ports are used. A group has two ingress ports and two
+ egress ports. The first ingress port MUST have two flows configured,
+ each going to a different egress port. The second ingress port will
+ congest the second egress port by sending line rate. The goal is to
+ measure if there is loss on the flow for the first egress port, which
+ is not oversubscribed.
+
+ A traffic generator MUST be connected to at least eight ports on the
+ DUT and SHOULD be connected using all the DUT ports.
+
+ Note that the tests described in procedures 1) and 2) in this section
+ have iterations called "first iteration", "second iteration", and
+ "last iteration". The idea is to show the first two iterations so
+ the reader understands the logic of how to keep incrementing the
+ iterations. The last iteration shows the end state of the variables.
+
+ 1) Measure two groups with eight DUT ports.
+
+ o First iteration: Measure the packet loss for two groups with
+ consecutive ports.
+
+ The composition of the first group is as follows:
+
+ Ingress port 1 sending 50% of traffic to egress port 3
+ and ingress port 1 sending 50% of traffic to egress port 4.
+ Ingress port 2 sending line rate to egress port 4.
+ Measure the amount of traffic loss for the traffic from ingress
+ port 1 to egress port 3.
+
+ The composition of the second group is as follows:
+
+ Ingress port 5 sending 50% of traffic to egress port 7
+ and ingress port 5 sending 50% of traffic to egress port 8.
+ Ingress port 6 sending line rate to egress port 8.
+ Measure the amount of traffic loss for the traffic from ingress
+ port 5 to egress port 7.
+
+ o Second iteration: Repeat the first iteration by shifting all
+ the ports from N to N+1.
+
+ The composition of the first group is as follows:
+
+ Ingress port 2 sending 50% of traffic to egress port 4
+ and ingress port 2 sending 50% of traffic to egress port 5.
+ Ingress port 3 sending line rate to egress port 5.
+ Measure the amount of traffic loss for the traffic from ingress
+ port 2 to egress port 4.
+
+ The composition of the second group is as follows:
+
+ Ingress port 6 sending 50% of traffic to egress port 8
+ and ingress port 6 sending 50% of traffic to egress port 9.
+ Ingress port 7 sending line rate to egress port 9.
+ Measure the amount of traffic loss for the traffic from ingress
+ port 6 to egress port 8.
+
+ o Last iteration: The first port of the first group is connected
+   to the last DUT port, and the last port of the second group is
+   connected to the seventh port of the DUT.
+
+ Measure the amount of traffic loss for the traffic from ingress
+ port N to egress port 2 and from ingress port 4 to egress port 6.
+
+ 2) Measure with N/4 groups with N DUT ports.
+
+ The traffic from the ingress port is split across four egress
+ ports (100/4 = 25%).
+
+ o First iteration: Expand to fully utilize all the DUT ports in
+ increments of four. Repeat the methodology of procedure 1)
+ with all the groups of ports possible to achieve on the device,
+ and measure the amount of traffic loss for each port group.
+
+ o Second iteration: Shift by +1 the start of each consecutive
+ port of the port groups.
+
+ o Last iteration: Shift by N-1 the start of each consecutive port
+ of the port groups, and measure the amount of traffic loss for
+ each port group.
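+
+ The shifted port groups used in procedures 1) and 2) can be
+ enumerated programmatically.  The sketch below is illustrative only;
+ it assumes ports numbered 1 to N that wrap around at N (the text
+ instead continues to port N+1 when more ports are available):
+
+   def holb_groups(n_ports, shift, group_size=4):
+       """Return (ingress, egress) port groups for a given shift.
+
+       The first half of each group are ingress ports and the second
+       half are egress ports, per Section 5.2.
+       """
+       def wrap(p):
+           return (p - 1) % n_ports + 1
+       groups = []
+       usable = n_ports - (n_ports % group_size)
+       for start in range(1, usable + 1, group_size):
+           ports = [wrap(start + shift + i) for i in range(group_size)]
+           half = group_size // 2
+           groups.append((ports[:half], ports[half:]))
+       return groups
+
+   print(holb_groups(8, shift=0))  # [([1, 2], [3, 4]), ([5, 6], [7, 8])]
+   print(holb_groups(8, shift=1))  # [([2, 3], [4, 5]), ([6, 7], [8, 1])]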
+
+5.3. Reporting Format
+
+ For each test, the report MUST include the following:
+
+ - The port configuration, including the number and location of
+ ingress and egress ports located on the DUT.
+
+ - If HOLB was observed in accordance with the HOLB test described in
+ Section 5.
+
+ - Percent of traffic loss.
+
+ - The repeatability of the test needs to be indicated: the number of
+ iterations of the same test and the percentage of variation
+ between results (min, max, avg).
+
+6. Incast Stateful and Stateless Traffic
+
+6.1. Objective
+
+ The objective of this test is to measure the values for TCP Goodput
+ [TCP-INCAST] and latency with a mix of large and small flows. The
+ test is designed to simulate a mixed environment of stateful flows
+ that require high rates of goodput and stateless flows that require
+ low latency. Stateful flows are created by generating TCP traffic,
+ and stateless flows are created using UDP traffic.
+
+6.2. Methodology
+
+ In order to simulate the effects of stateless and stateful traffic on
+ the DUT, there MUST be multiple ingress ports receiving traffic
+ destined for the same egress port. There also MAY be a mix of
+ stateful and stateless traffic arriving on a single ingress port.
+ The simplest setup would be two ingress ports receiving traffic
+ destined to the same egress port.
+
+ One ingress port MUST maintain a TCP connection to a receiver
+ connected to an egress port.  Traffic in the TCP stream MUST be sent
+ at the maximum rate allowed by the traffic generator.  While the TCP
+ traffic is flowing through the DUT, the stateless traffic is sent
+ destined to a receiver on the same egress port.  The stateless
+ traffic MUST be a microburst of 100% intensity.
+
+ It is RECOMMENDED that the ingress and egress ports be varied in
+ multiple tests in order to measure the maximum microburst capacity.
+
+ The intensity of a microburst MAY be varied in order to obtain the
+ microburst capacity at various ingress rates.
+
+ It is RECOMMENDED that all ports on the DUT be used in the test.
+
+ The tests described below have iterations called "first iteration",
+ "second iteration", and "last iteration". The idea is to show the
+ first two iterations so the reader understands the logic of how to
+ keep incrementing the iterations. The last iteration shows the
+ end state of the variables.
+
+ For example:
+
+ Stateful traffic port variation (TCP traffic):
+
+ TCP traffic needs to be generated for this test. During the
+ iterations, the number of egress ports MAY vary as well.
+
+ o First iteration: One ingress port receiving stateful TCP traffic
+ and one ingress port receiving stateless traffic destined to
+ one egress port.
+
+ o Second iteration: Two ingress ports receiving stateful TCP traffic
+ and one ingress port receiving stateless traffic destined to
+ one egress port.
+
+ o Last iteration: N-2 ingress ports receiving stateful TCP traffic
+ and one ingress port receiving stateless traffic destined to
+ one egress port.
+
+ Stateless traffic port variation (UDP traffic):
+
+ UDP traffic needs to be generated for this test. During the
+ iterations, the number of egress ports MAY vary as well.
+
+ o First iteration: One ingress port receiving stateful TCP traffic
+ and one ingress port receiving stateless traffic destined to
+ one egress port.
+
+ o Second iteration: One ingress port receiving stateful TCP traffic
+ and two ingress ports receiving stateless traffic destined to
+ one egress port.
+
+ o Last iteration: One ingress port receiving stateful TCP traffic
+ and N-2 ingress ports receiving stateless traffic destined to
+ one egress port.
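+
+ The two sequences of iterations above can be generated as in the
+ following illustrative Python sketch:
+
+   def incast_iterations(n_ports, vary="stateful"):
+       """Yield (tcp_ports, udp_ports) ingress counts per iteration.
+
+       vary="stateful":  TCP ingress ports grow from 1 to N-2.
+       vary="stateless": UDP ingress ports grow from 1 to N-2.
+       One egress port receives all traffic in every iteration.
+       """
+       for k in range(1, n_ports - 1):
+           yield (k, 1) if vary == "stateful" else (1, k)
+
+   for tcp, udp in incast_iterations(6, vary="stateful"):
+       print(f"{tcp} TCP ingress + {udp} UDP ingress -> 1 egress port")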
+
+6.3. Reporting Format
+
+ The report MUST include the following:
+
+ - Number of ingress and egress ports, along with designation of
+ stateful or stateless flow assignment.
+
+ - Stateful flow goodput.
+
+ - Stateless flow latency.
+
+ - The repeatability of the test needs to be indicated: the number of
+ iterations of the same test and the percentage of variation
+ between results (min, max, avg).
+
+7. Security Considerations
+
+ Benchmarking activities as described in this memo are limited to
+ technology characterization using controlled stimuli in a laboratory
+ environment, with dedicated address space and the constraints
+ specified in the sections above.
+
+ The benchmarking network topology will be an independent test setup
+ and MUST NOT be connected to devices that may forward the test
+ traffic into a production network or misroute traffic to the test
+ management network.
+
+ Further, benchmarking is performed on a "black-box" basis, relying
+ solely on measurements observable external to the DUT.
+
+ Special capabilities SHOULD NOT exist in the DUT specifically for
+ benchmarking purposes. Any implications for network security arising
+ from the DUT SHOULD be identical in the lab and in production
+ networks.
+
+8. IANA Considerations
+
+ This document does not require any IANA actions.
+
+9. References
+
+9.1. Normative References
+
+ [RFC1242] Bradner, S., "Benchmarking Terminology for Network
+ Interconnection Devices", RFC 1242, DOI 10.17487/RFC1242,
+ July 1991, <https://www.rfc-editor.org/info/rfc1242>.
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119,
+ DOI 10.17487/RFC2119, March 1997,
+ <https://www.rfc-editor.org/info/rfc2119>.
+
+ [RFC2544] Bradner, S. and J. McQuaid, "Benchmarking Methodology for
+ Network Interconnect Devices", RFC 2544,
+ DOI 10.17487/RFC2544, March 1999,
+ <https://www.rfc-editor.org/info/rfc2544>.
+
+ [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in
+ RFC 2119 Key Words", BCP 14, RFC 8174,
+ DOI 10.17487/RFC8174, May 2017,
+ <https://www.rfc-editor.org/info/rfc8174>.
+
+ [RFC8238] Avramov, L. and J. Rapp, "Data Center Benchmarking
+ Terminology", RFC 8238, DOI 10.17487/RFC8238, August 2017,
+ <https://www.rfc-editor.org/info/rfc8238>.
+
+9.2. Informative References
+
+ [RFC2432] Dubray, K., "Terminology for IP Multicast Benchmarking",
+ RFC 2432, DOI 10.17487/RFC2432, October 1998,
+ <https://www.rfc-editor.org/info/rfc2432>.
+
+ [RFC2889] Mandeville, R. and J. Perser, "Benchmarking Methodology
+ for LAN Switching Devices", RFC 2889,
+ DOI 10.17487/RFC2889, August 2000,
+ <https://www.rfc-editor.org/info/rfc2889>.
+
+ [RFC3918] Stopp, D. and B. Hickman, "Methodology for IP Multicast
+ Benchmarking", RFC 3918, DOI 10.17487/RFC3918,
+ October 2004, <https://www.rfc-editor.org/info/rfc3918>.
+
+ [RFC6985] Morton, A., "IMIX Genome: Specification of Variable Packet
+ Sizes for Additional Testing", RFC 6985,
+ DOI 10.17487/RFC6985, July 2013,
+ <https://www.rfc-editor.org/info/rfc6985>.
+
+ [TCP-INCAST]
+ Chen, Y., Griffith, R., Zats, D., Joseph, A., and R. Katz,
+ "Understanding TCP Incast and Its Implications for Big
+ Data Workloads", April 2012, <http://yanpeichen.com/
+ professional/usenixLoginIncastReady.pdf>.
+
+Acknowledgments
+
+ The authors would like to thank Al Morton and Scott Bradner for their
+ reviews and feedback.
+
+Authors' Addresses
+
+ Lucien Avramov
+ Google
+ 1600 Amphitheatre Parkway
+ Mountain View, CA 94043
+ United States of America
+
+ Email: lucien.avramov@gmail.com
+
+
+ Jacob Rapp
+ VMware
+ 3401 Hillview Ave.
+ Palo Alto, CA 94304
+ United States of America
+
+ Email: jhrapp@gmail.com