doc: Add RFC documents

author: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committer: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit: 4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree: e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc4963.txt
parent: ea76e11061bda059ae9f9ad130a9895cc85607db (diff)
1 files changed, 563 insertions, 0 deletions
diff --git a/doc/rfc/rfc4963.txt b/doc/rfc/rfc4963.txt
new file mode 100644
index 0000000..f02e72d
--- /dev/null
+++ b/doc/rfc/rfc4963.txt
@@ -0,0 +1,563 @@
+
+
+
+
+
+
+Network Working Group                                         J. Heffner
+Request for Comments: 4963                                     M. Mathis
+Category: Informational                                      B. Chandler
+                                                                     PSC
+                                                               July 2007
+
+
+               IPv4 Reassembly Errors at High Data Rates
+
+Status of This Memo
+
+   This memo provides information for the Internet community.  It does
+   not specify an Internet standard of any kind.  Distribution of this
+   memo is unlimited.
+
+Copyright Notice
+
+   Copyright (C) The IETF Trust (2007).
+
+Abstract
+
+   IPv4 fragmentation is not sufficiently robust for use under some
+   conditions in today's Internet.  At high data rates, the 16-bit IP
+   identification field is not large enough to prevent frequent
+   incorrectly assembled IP fragments, and the TCP and UDP checksums are
+   insufficient to prevent the resulting corrupted datagrams from being
+   delivered to higher protocol layers.  This note describes some easily
+   reproduced experiments demonstrating the problem, and discusses some
+   of the operational implications of these observations.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Heffner, et al.              Informational                      [Page 1]
+
+RFC 4963       IPv4 Reassembly Errors at High Data Rates       July 2007
+
+
+1.  Introduction
+
+   The IPv4 header was designed at a time when data rates were several
+   orders of magnitude lower than those achievable today.  This document
+   describes a consequent scale-related failure in the IP identification
+   (ID) field, where fragments may be incorrectly assembled at a rate
+   high enough that it is likely to invalidate assumptions about data
+   integrity failure rates.
+
+   That IP fragmentation results in inefficient use of the network has
+   been well documented [Kent87].  This note presents a different kind
+   of problem, which can result not only in significant performance
+   degradation, but also frequent data corruption.  This is especially
+   pertinent due to the recent proliferation of UDP bulk transport tools
+   that sometimes fragment every datagram.
+
+   Additionally, there is some network equipment that ignores the Don't
+   Fragment (DF) bit in the IP header to work around MTU discovery
+   problems [RFC2923].  This equipment indirectly exposes properly
+   implemented protocols and applications to corrupt data.
+
+2.  Wrapping the IP ID Field
+
+   The Internet Protocol standard [RFC0791] specifies:
+
+      "The choice of the Identifier for a datagram is based on the need
+      to provide a way to uniquely identify the fragments of a
+      particular datagram.  The protocol module assembling fragments
+      judges fragments to belong to the same datagram if they have the
+      same source, destination, protocol, and Identifier.  Thus, the
+      sender must choose the Identifier to be unique for this source,
+      destination pair and protocol for the time the datagram (or any
+      fragment of it) could be alive in the Internet."
+
+   Strict conformance to this standard limits transmissions in one
+   direction between any address pair to no more than 65536 packets per
+   protocol (e.g., TCP, UDP, or ICMP) per maximum packet lifetime.
+
+   Clearly, not all hosts follow this standard because it implies an
+   unreasonably low maximum data rate.  For example, a host sending
+   1500-byte packets with a 30-second maximum packet lifetime could send
+   at only about 26 Mbps before exceeding 65535 packets per packet
+   lifetime.  Or, filling a 1 Gbps interface with 1500-byte packets
+   requires sending 65536 packets in less than 1 second, an unreasonably
+   short maximum packet lifetime, being less than the round-trip time on
+   some paths.  This requirement is widely ignored.
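
The two figures in the paragraph above follow from simple arithmetic, sketched here. The function names are illustrative and do not come from the RFC; header overhead and link framing are ignored, so the results are approximate.

```python
ID_SPACE = 2 ** 16  # distinct values of the 16-bit IPv4 ID field


def max_rate_bps(packet_bytes: int, lifetime_s: float) -> float:
    """Highest rate that sends at most ID_SPACE packets per packet lifetime."""
    return ID_SPACE * packet_bytes * 8 / lifetime_s


def wrap_time_s(packet_bytes: int, rate_bps: float) -> float:
    """Time needed to send ID_SPACE packets at the given rate."""
    return ID_SPACE * packet_bytes * 8 / rate_bps


if __name__ == "__main__":
    # 1500-byte packets, 30-second maximum packet lifetime
    print(f"{max_rate_bps(1500, 30) / 1e6:.1f} Mbps")
    # 1500-byte packets filling a 1 Gbps interface
    print(f"{wrap_time_s(1500, 1e9):.3f} s")
```

The first call gives roughly 26 Mbps and the second gives a wrap time under one second, matching the text.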
+
+
+
+
+
+Heffner, et al.              Informational                      [Page 2]
+
+RFC 4963       IPv4 Reassembly Errors at High Data Rates       July 2007
+
+
+   Additionally, it is worth noting that reusing values in the IP ID
+   field once per 65536 datagrams is the best case.  Some
+   implementations randomize the IP ID to prevent leaking information
+   out of the kernel [Bellovin02], which causes reuse of the IP ID field
+   to occur probabilistically at all sending rates.
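
As a rough illustration of why randomized IDs collide at any rate, the sketch below assumes each ID is drawn independently and uniformly from the 16-bit space. That is an idealization and not a description of any particular implementation. It computes the chance that a new datagram's ID matches at least one of `n_live` earlier datagrams that may still be held for reassembly.

```python
ID_SPACE = 2 ** 16  # distinct values of the 16-bit IPv4 ID field


def reuse_probability(n_live: int, id_space: int = ID_SPACE) -> float:
    """P(new random ID equals at least one of n_live earlier random IDs)."""
    return 1.0 - (1.0 - 1.0 / id_space) ** n_live


if __name__ == "__main__":
    for n in (1, 100, 10_000):
        print(n, reuse_probability(n))
```

Under this assumption the probability is non-zero even for a single outstanding datagram, whereas a sequential counter cannot collide until it wraps.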
+
+   IP receivers store fragments in a reassembly buffer until all
+   fragments in a datagram arrive, or until the reassembly timeout
+   expires (15 seconds is suggested in [RFC0791]).  Fragments in a
+   datagram are associated with each other by their protocol number, the
+   value in their ID field, and by the source/destination address pair.
+   If a sender wraps the ID field in less than the reassembly timeout,
+   it becomes possible for fragments from different datagrams to be
+   incorrectly spliced together ("mis-associated"), and delivered to the
+   upper layer protocol.
+
+   A case of particular concern is when mis-association is self-
+   propagating.  This occurs, for example, when there is reliable
+   ordering of packets and the first fragment of a datagram is lost in
+   the network.  The rest of the fragments are stored in the fragment
+   reassembly buffer, and when the sender wraps the ID field, the first
+   fragment of the new datagram will be mis-associated with the rest of
+   the old datagram.  The new datagram will now be incomplete (since
+   it is missing its first fragment), so the rest of it will be saved in
+   the fragment reassembly buffer, forming a cycle that repeats every
+   65536 datagrams.  It is possible to have a number of simultaneous
+   cycles, bounded by the size of the fragment reassembly buffer.
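
The self-propagating cycle can be reproduced with a toy reassembly buffer. This is a minimal sketch under stated assumptions: every datagram has exactly two fragments, packets arrive in order, the sender uses a sequential ID counter, and no reassembly timeout is modelled. The addresses, the `simulate` function, and its parameters are hypothetical.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str, int, int]  # (source, destination, protocol, IP ID)


def simulate(num_datagrams: int, id_bits: int = 16, lost_first: int = 0) -> List[int]:
    """Return the sequence numbers of datagrams delivered with mixed fragments.

    The first fragment of datagram `lost_first` is dropped in the network.
    """
    id_space = 1 << id_bits
    buffer: Dict[Key, Dict[int, int]] = {}  # key -> {fragment index: datagram seq}
    corrupted: List[int] = []
    for seq in range(num_datagrams):
        key = ("10.0.0.1", "10.0.0.2", 17, seq % id_space)
        for frag in (0, 1):
            if seq == lost_first and frag == 0:
                continue  # this fragment never arrives
            slot = buffer.setdefault(key, {})
            slot[frag] = seq
            if len(slot) == 2:  # both fragments present: reassemble
                del buffer[key]
                if slot[0] != slot[1]:  # fragments come from different datagrams
                    corrupted.append(slot[0])
    return corrupted


if __name__ == "__main__":
    print(simulate(200_000))  # one loss, then one bad datagram per ID wrap
```

With a 2-bit ID space, `simulate(13, id_bits=2)` returns `[4, 8, 12]`: a single lost fragment produces one mis-associated datagram every time the ID wraps.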
+
+   IPv6 is considerably less vulnerable to this type of problem, since
+   its fragment header contains a 32-bit identification field [RFC2460].
+   Mis-association will only be a problem at packet rates 65536 times
+   higher than for IPv4.
+
+3.  Effects of Mis-Associated Fragments
+
+   When the mis-associated fragments are delivered, transport-layer
+   checksumming should detect these datagrams as incorrect and discard
+   them.  When the datagrams are discarded, it could create a
+   performance problem for loss-feedback congestion control algorithms,
+   particularly when a large congestion window is required, since it
+   will introduce a certain amount of non-congestive loss.
+
+   Transport checksums, however, may not be designed to handle such high
+   error rates.  The TCP/UDP checksum is only 16 bits in length.  If
+   these checksums follow a uniform random distribution, we expect mis-
+   associated datagrams to be accepted by the checksum at a rate of one
+   per 65536.  With only one mis-association cycle, we expect corrupt
+   data delivered to the application layer once per 2^32 datagrams.
+
+
+
+Heffner, et al.              Informational                      [Page 3]
+
+RFC 4963       IPv4 Reassembly Errors at High Data Rates       July 2007
+
+
+   This number can be significantly higher with multiple concurrent
+   cycles.
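
To put the 2^32 figure in terms of time, the following sketch converts it to an interval at a given data rate. It assumes uniformly distributed checksums, as in the text, and additionally assumes that the rate scales linearly with the number of concurrent cycles; the RFC only says the number can be significantly higher. The function names are illustrative.

```python
def datagrams_per_corruption(id_bits: int = 16, checksum_bits: int = 16,
                             cycles: int = 1) -> float:
    """Expected datagrams sent per corrupt datagram accepted by the checksum."""
    return 2 ** (id_bits + checksum_bits) / cycles


def seconds_per_corruption(rate_bps: float, datagram_bytes: int,
                           cycles: int = 1) -> float:
    """Expected time between accepted corruptions at a steady sending rate."""
    datagrams_per_s = rate_bps / (datagram_bytes * 8)
    return datagrams_per_corruption(cycles=cycles) / datagrams_per_s


if __name__ == "__main__":
    days = seconds_per_corruption(100e6, 1524) / 86400
    print(f"{days:.1f} days between corruptions with one cycle")
```

At 100 Mbps with 1524-byte datagrams and a single cycle, this works out to about six days between undetected corruptions.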
+
+   With non-random data, the TCP/UDP checksum may be even weaker still.
+   It is possible to construct datasets where mis-associated fragments
+   will always have the same checksum.  Such a case may be considered
+   unlikely, but is worth considering.  "Real" data may be more likely
+   than random data to cause checksum hot spots and increase the
+   probability of false checksum match [Stone98].  Also, some
+   applications or higher-level protocols may turn off checksumming to
+   increase speed, though this practice has been found to be dangerous
+   for other reasons when data reliability is important [Stone00].
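
One reason such datasets are easy to construct is that the Internet checksum is a one's complement sum of 16-bit words, so it does not depend on word order. The sketch below implements the sum in the style of RFC 1071 and shows a datagram whose tail has been replaced by a different tail with the same word sum. It omits the UDP pseudo-header, and the sample byte strings are made up for illustration.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    head = b"AAAA"                 # first fragment of the new datagram
    tail_new = b"\x12\x34\x56\x78"  # its own second fragment
    tail_old = b"\x56\x78\x12\x34"  # stale fragment with the same word sum
    assert head + tail_new != head + tail_old
    assert internet_checksum(head + tail_new) == internet_checksum(head + tail_old)
    print("mis-associated datagram passes the checksum")
```

Any pair of tails with equal word sums behaves the same way, which is how a dataset with constant checksums can defeat the check entirely.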
+
+4.  Experimental Observations
+
+   To test the practical impact of fragmentation on UDP, we ran a series
+   of experiments using a UDP bulk data transport protocol that was
+   designed to be used as an alternative to TCP for transporting large
+   data sets over specialized networks.  The tool, Reliable Blast UDP
+   (RBUDP), part of the QUANTA networking toolkit [QUANTA], was selected
+   because it has a clean interface which facilitated automated
+   experiments.  The decision to use RBUDP had little to do with the
+   details of the transport protocol itself.  Any UDP transport protocol
+   that does not have additional means to detect corruption, and that
+   could be configured to use IP fragmentation, would have the same
+   results.
+
+   In order to diagnose corruption on files transferred with the UDP
+   bulk transfer tool, we used a file format that included embedded
+   sequence numbers and MD5 checksums in each fragment of each datagram.
+   Thus, it was possible to distinguish random corruption from that
+   caused by mis-associated fragments.  We used two different types of
+   files.  One was constructed so that all the UDP checksums were
+   constant -- we will call this the "constant" dataset.  The other was
+   constructed so that UDP checksums were uniformly random -- the
+   "random" dataset.  All tests were done using 400 MB files, sent in
+   1524-byte datagrams so that they were fragmented on standard Fast
+   Ethernet with a 1500-byte MTU.
+
+   The UDP bulk file transport tool was used to send the datasets
+   between a pair of hosts at slightly less than the available data rate
+   (100 Mbps).  Near the beginning of each flow, a brief secondary flow
+   was started to induce packet loss in the primary flow.  Throughout
+   the life of the primary flow, we typically observed mis-association
+   rates on the order of a few hundredths of a percent.
+
+
+
+
+
+
+Heffner, et al.              Informational                      [Page 4]
+
+RFC 4963       IPv4 Reassembly Errors at High Data Rates       July 2007
+
+
+   Tests run with the "constant" dataset resulted in corruption on all
+   mis-associated fragments, that is, corruption on the order of a few
+   hundredths of a percent.  In sending approximately 10 TB of "random"
+   datasets, we observed 8847668 UDP checksum errors and 121 corruptions
+   of the data due to mis-associated fragments.
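
The "random" dataset figures can be compared against the uniform-checksum estimate from Section 3. This is only an illustrative check, not an analysis performed by the authors; it assumes that every mis-associated datagram either failed the UDP checksum or was counted as a corruption.

```python
CHECKSUM_SPACE = 2 ** 16  # possible values of the 16-bit UDP checksum

checksum_errors = 8847668  # reported UDP checksum errors
observed_corruptions = 121  # reported corruptions delivered to the application


def expected_corruptions(errors: int) -> float:
    """Expected false checksum matches if checksums are uniformly random."""
    return errors / CHECKSUM_SPACE


if __name__ == "__main__":
    print(f"expected about {expected_corruptions(checksum_errors):.0f}, "
          f"observed {observed_corruptions}")
```

The estimate is about 135, the same order as the 121 corruptions observed.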
+
+5.  Preventing Mis-Association
+
+   The most straightforward way to avoid mis-association is to avoid
+   fragmentation altogether by implementing Path MTU Discovery [RFC1191]
+   [RFC4821].  However, this is not always feasible for all
+   applications.  Further, as a work-around for MTU discovery problems
+   [RFC2923], some TCP implementations and communications gear provide
+   mechanisms to disable path MTU discovery by clearing or ignoring the
+   DF bit.  Doing so will expose all protocols using IPv4, even those
+   that participate in MTU discovery, to mis-association errors.
+
+   If IP fragmentation is in use, it may be possible to reduce the
+   timeout sufficiently so that mis-association will not occur.
+   However, there are a number of difficulties with such an approach.
+   Since the sender controls the rate of packets sent and the selection
+   of IP ID, while the receiver controls the reassembly timeout, there
+   would need to be some mutual assurance between each party as to
+   participation in the scheme.  Further, it is not generally possible
+   to set the timeout low enough so that a fast sender's fragments will
+   not be mis-associated, yet high enough so that a slow sender's
+   fragments will not be unconditionally discarded before it is possible
+   to reassemble them.  Therefore, the timeout and IP ID selection would
+   need to be done on a per-peer basis.  Also, it is likely NAT will
+   break any per-peer tables keyed by IP address.  It is not within the
+   scope of this document to recommend solutions to these problems,
+   though we believe a per-peer adaptive timeout is likely to prevent
+   mis-association under circumstances where it would most commonly
+   occur.
+
+   A case particularly worth noting is that of tunnels encapsulating
+   payload in IPv4.  To deal with difficulties in MTU Discovery
+   [RFC4459], tunnels may rely on fragmentation between the two
+   endpoints, even if the payload is marked with a DF bit [RFC4301].  In
+   such a mode, the two tunnel endpoints behave as IP end hosts, with
+   all tunneled traffic having the same protocol type.  Thus, the
+   aggregate rate of tunneled packets may not exceed 65536 per maximum
+   packet lifetime, or tunneled data becomes exposed to possible mis-
+   association.  Even protocols doing MTU discovery such as TCP will be
+   affected.  Operators of tunnels should ensure that the receiving
+   end's reassembly timeout is short enough that mis-association cannot
+   occur given the tunnel's maximum rate.
+
+
+
+
+Heffner, et al.              Informational                      [Page 5]
+
+RFC 4963       IPv4 Reassembly Errors at High Data Rates       July 2007
+
+
+6.  Mitigating Mis-Association
+
+   It is difficult to concisely describe all possible situations under
+   which fragments might be mis-associated.  Even if an end host
+   carefully follows the specification, ensuring unique IP IDs, the
+   presence of NATs or tunnels may expose applications to IP ID space
+   conflicts.  Further, devices in the network that the end hosts cannot
+   see or control, such as tunnels, may cause mis-association.  Even a
+   fragmenting application that sends at a low rate might possibly be
+   exposed when running simultaneously with a non-fragmenting
+   application that sends at a high rate.  As described above, the
+   receiver might implement measures to reduce or eliminate the
+   possibility of conflict, but there is no mechanism in place for a
+   sender to know what the receiver is doing in this respect.  As a
+   consequence, there is no general mechanism for an application that
+   is using IPv4 fragmentation to know if it is deterministically or
+   statistically protected from mis-associated fragments.
+
+   Under circumstances when it is impossible or impractical to prevent
+   mis-association, its effects may be mitigated by use of stronger
+   integrity checking at any layer above IP.  This is a natural side
+   effect of using cryptographic authentication.  For example, IPsec AH
+   [RFC4302] will discard any corrupted datagrams, preventing their
+   delivery to upper layers.  A stronger transport layer checksum such
+   as SCTP's, which is 32 bits in length [RFC2960], may help
+   significantly.  At the application layer, SSH message authentication
+   codes [RFC4251] will prevent delivery of corrupted data, though since
+   the TCP connection underneath is not protected, it is considered
+   invalid and the session is immediately terminated.  While stronger
+   integrity checking may prevent data corruption, it will not prevent
+   the potential performance impact described above of non-congestive
+   loss on congestion control at high congestion windows.
+
+   It should also be noted that mis-association is not the only possible
+   source of data corruption above the network layer [Stone00].  Most
+   applications for which data integrity is critically important should
+   implement strong integrity checking regardless of exposure to mis-
+   association.
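
As one possible shape for application-level integrity checking, the sketch below appends a SHA-256 digest to each message and verifies it on receipt. The `seal` and `unseal` names and the framing are assumptions made for illustration and are not prescribed by the RFC. A bare digest detects accidental corruption only; protection against deliberate tampering would need a keyed construction such as an HMAC.

```python
import hashlib

DIGEST_LEN = hashlib.sha256().digest_size  # 32 bytes


def seal(payload: bytes) -> bytes:
    """Append a SHA-256 digest of the payload."""
    return payload + hashlib.sha256(payload).digest()


def unseal(message: bytes) -> bytes:
    """Return the payload, or raise ValueError if the digest does not match."""
    if len(message) < DIGEST_LEN:
        raise ValueError("message too short")
    payload, digest = message[:-DIGEST_LEN], message[-DIGEST_LEN:]
    if hashlib.sha256(payload).digest() != digest:
        raise ValueError("integrity check failed")
    return payload


if __name__ == "__main__":
    sealed = seal(b"bulk data block")
    assert unseal(sealed) == b"bulk data block"
    damaged = b"X" + sealed[1:]  # simulate a spliced-in fragment
    try:
        unseal(damaged)
    except ValueError as exc:
        print("rejected:", exc)
```

A 256-bit digest makes an accidental match far less likely than with the 16-bit transport checksum, at the cost of extra bytes and computation per message.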
+
+   In general, applications that rely on IPv4 fragmentation should be
+   written with these issues in mind, as well as those issues documented
+   in [Kent87].  Applications that rely on IPv4 fragmentation while
+   sending at high speeds (the order of 100 Mbps or higher) and devices
+   that deliberately introduce fragmentation to otherwise unfragmented
+   traffic (e.g., tunnels) should be particularly cautious, and
+   introduce strong mechanisms to ensure data integrity.
+
+
+
+
+
+Heffner, et al.              Informational                      [Page 6]
+
+RFC 4963       IPv4 Reassembly Errors at High Data Rates       July 2007
+
+
+7.  Security Considerations
+
+   If a malicious entity knows that a pair of hosts are communicating
+   using a fragmented stream, it may be presented with an opportunity to
+   corrupt the flow.  By sending "high" fragments (those with offset
+   greater than zero) with a forged source address, the attacker can
+   deliberately cause corruption as described above.  Exploiting this
+   vulnerability requires only knowledge of the source and destination
+   addresses of the flow, its protocol number, and fragment boundaries.
+   It does not require knowledge of port or sequence numbers.
+
+   If the attacker has visibility of packets on the path, the attack
+   profile is similar to injecting full segments.  Using this attack
+   makes blind disruptions easier and might possibly be used to cause
+   degradation of service.  We believe only streams using IPv4
+   fragmentation are likely vulnerable.  Because of the nature of the
+   problems outlined in this document, the use of IPv4 fragmentation for
+   critical applications may not be advisable, regardless of security
+   concerns.
+
+8.  Informative References
+
+   [Kent87]     Kent, C. and J. Mogul, "Fragmentation considered
+                harmful", Proc. SIGCOMM '87 vol. 17, No. 5, October
+                1987.
+
+   [RFC2923]    Lahey, K., "TCP Problems with Path MTU Discovery", RFC
+                2923, September 2000.
+
+   [RFC0791]    Postel, J., "Internet Protocol", STD 5, RFC 791,
+                September 1981.
+
+   [RFC1191]    Mogul, J. and S. Deering, "Path MTU discovery", RFC
+                1191, November 1990.
+
+   [Stone98]    Stone, J., Greenwald, M., Partridge, C., and J. Hughes,
+                "Performance of Checksums and CRC's over Real Data",
+                IEEE/ACM Transactions on Networking vol. 6, No. 5,
+                October 1998.
+
+   [Stone00]    Stone, J. and C. Partridge, "When The CRC and TCP
+                Checksum Disagree", Proc. SIGCOMM 2000 vol. 30, No. 4,
+                October 2000.
+
+
+
+
+
+
+
+
+Heffner, et al.              Informational                      [Page 7]
+
+RFC 4963       IPv4 Reassembly Errors at High Data Rates       July 2007
+
+
+   [QUANTA]     He, E., Alimohideen, J., Eliason, J., Krishnaprasad, N.,
+                Leigh, J., Yu, O., and T. DeFanti, "Quanta: a toolkit
+                for high performance data delivery over photonic
+                networks", Future Generation Computer Systems Vol. 19,
+                No. 6, August 2003.
+
+   [Bellovin02] Bellovin, S., "A Technique for Counting NATted Hosts",
+                Internet Measurement Conference, Proceedings of the 2nd
+                ACM SIGCOMM Workshop on Internet Measurement, November
+                2002.
+
+   [RFC2460]    Deering, S. and R. Hinden, "Internet Protocol, Version 6
+                (IPv6) Specification", RFC 2460, December 1998.
+
+   [RFC2960]    Stewart, R., Xie, Q., Morneault, K., Sharp, C.,
+                Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M.,
+                Zhang, L., and V. Paxson, "Stream Control Transmission
+                Protocol", RFC 2960, October 2000.
+
+   [RFC4251]    Ylonen, T. and C. Lonvick, "The Secure Shell (SSH)
+                Protocol Architecture", RFC 4251, January 2006.
+
+   [RFC4301]    Kent, S. and K. Seo, "Security Architecture for the
+                Internet Protocol", RFC 4301, December 2005.
+
+   [RFC4302]    Kent, S., "IP Authentication Header", RFC 4302, December
+                2005.
+
+   [RFC4459]    Savola, P., "MTU and Fragmentation Issues with In-the-
+                Network Tunneling", RFC 4459, April 2006.
+
+   [RFC4821]    Mathis, M. and J. Heffner, "Packetization Layer Path MTU
+                Discovery", RFC 4821, March 2007.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Heffner, et al.              Informational                      [Page 8]
+
+RFC 4963       IPv4 Reassembly Errors at High Data Rates       July 2007
+
+
+Appendix A.  Acknowledgements
+
+   This work was supported by the National Science Foundation under
+   Grant No. 0083285.
+
+Authors' Addresses
+
+   John W. Heffner
+   Pittsburgh Supercomputing Center
+   4400 Fifth Avenue
+   Pittsburgh, PA  15213
+   US
+
+   Phone: 412-268-2329
+   EMail: jheffner@psc.edu
+
+
+   Matt Mathis
+   Pittsburgh Supercomputing Center
+   4400 Fifth Avenue
+   Pittsburgh, PA  15213
+   US
+
+   Phone: 412-268-3319
+   EMail: mathis@psc.edu
+
+
+   Ben Chandler
+   Pittsburgh Supercomputing Center
+   4400 Fifth Avenue
+   Pittsburgh, PA  15213
+   US
+
+   Phone: 412-268-9783
+   EMail: bchandle@gmail.com
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Heffner, et al.              Informational                      [Page 9]
+
+RFC 4963       IPv4 Reassembly Errors at High Data Rates       July 2007
+
+
+Full Copyright Statement
+
+   Copyright (C) The IETF Trust (2007).
+
+   This document is subject to the rights, licenses and restrictions
+   contained in BCP 78, and except as set forth therein, the authors
+   retain all their rights.
+
+   This document and the information contained herein are provided on an
+   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
+   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
+   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
+   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
+   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
+   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Intellectual Property
+
+   The IETF takes no position regarding the validity or scope of any
+   Intellectual Property Rights or other rights that might be claimed to
+   pertain to the implementation or use of the technology described in
+   this document or the extent to which any license under such rights
+   might or might not be available; nor does it represent that it has
+   made any independent effort to identify any such rights.  Information
+   on the procedures with respect to rights in RFC documents can be
+   found in BCP 78 and BCP 79.
+
+   Copies of IPR disclosures made to the IETF Secretariat and any
+   assurances of licenses to be made available, or the result of an
+   attempt made to obtain a general license or permission for the use of
+   such proprietary rights by implementers or users of this
+   specification can be obtained from the IETF on-line IPR repository at
+   http://www.ietf.org/ipr.
+
+   The IETF invites any interested party to bring to its attention any
+   copyrights, patents or patent applications, or other proprietary
+   rights that may cover technology that may be required to implement
+   this standard.  Please address the information to the IETF at
+   ietf-ipr@ietf.org.
+
+Acknowledgement
+
+   Funding for the RFC Editor function is currently provided by the
+   Internet Society.
+
+
+
+
+
+
+
+Heffner, et al.              Informational                     [Page 10]
+