summaryrefslogtreecommitdiff
path: root/doc/rfc/rfc1810.txt
diff options
context:
space:
mode:
Diffstat (limited to 'doc/rfc/rfc1810.txt')
-rw-r--r--doc/rfc/rfc1810.txt395
1 files changed, 395 insertions, 0 deletions
diff --git a/doc/rfc/rfc1810.txt b/doc/rfc/rfc1810.txt
new file mode 100644
index 0000000..77817f4
--- /dev/null
+++ b/doc/rfc/rfc1810.txt
@@ -0,0 +1,395 @@
+
+
+
+
+
+
+Network Working Group J. Touch
+Request for Comments: 1810 ISI
+Category: Informational June 1995
+
+
+ Report on MD5 Performance
+
+Status of this Memo
+
+ This memo provides information for the Internet community. This memo
+ does not specify an Internet standard of any kind. Distribution of
+ this memo is unlimited.
+
+Abstract
+
+ MD5 is an authentication algorithm, which has been proposed as the
+ default authentication option in IPv6. When enabled, the MD5
+ algorithm operates over the entire data packet, including header.
+ This RFC addresses how fast MD5 can be implemented in software and
+ hardware, and whether it supports currently available IP bandwidth.
+ MD5 can be implemented in existing hardware technology at 256 Mbps,
+ and in software at 87 Mbps. These rates cannot support current IP
+ rates, e.g., 100 Mbps TCP and 130 Mbps UDP over ATM. If MD5 cannot
+ support existing network bandwidth using existing technology, it will
+ not scale as network speeds increase in the future. This RFC is
+ intended to alert the IP community about the performance limitations
+ of MD5, and to suggest that alternatives be considered for use in
+ high speed IP implementations.
+
+Introduction
+
+ MD5 is an authentication algorithm, which has been proposed as one
+ authentication option in IPv6 [1]. RFC 1321 describes the MD5
+ algorithm and gives a reference implementation [3]. When enabled,
+ the MD5 algorithm operates over the entire data packet, including
+ header (with dummy values for volatile fields). This RFC addresses
+ how fast MD5 can be implemented in software and hardware, and whether
+ it supports currently available IP bandwidth.
+
+ This RFC considers the general issue of checksumming and security at
+ high speed in IPv6. IPv6 has no header checksum (which IPv4 has
+ [5]), but proposes an authentication digest over the entire body of
+ the packet (including header where volatile fields are zeroed) [1].
+ This RFC specifically addresses the performance of that
+ authentication mechanism.
+
+
+
+
+
+
+Touch Informational [Page 1]
+
+RFC 1810 Report on MD5 Performance June 1995
+
+
+Measurements
+
+ The performance of MD5 was measured. The code was an optimized
+ version of the MD5 reference implementation from the RFC [3], and is
+ available for anonymous FTP [7]. The following are the results of
+ the performance test "md5 -t", modified to prohibit on-chip caching
+ of the data block:
+
+ 87 Mbps DEC Alpha (190 Mhz)
+ 33 Mbps HP 9000/720
+ 48 Mbps IBM RS/6000 7006 (PPC 601 @80 Mhz)
+ 31 Mbps Intel i486/66 NetBSD
+ 44 Mbps Intel Pentium/90 NeXTStep
+ 52 Mbps SGI/IP-20 IRIX 5.2
+ 37 Mbps Sun SPARC-10/51, SPARC-20/50 SunOS 4.1.3
+ 57 Mbps Sun SPARC-20/71 SunOS 4.1.3
+
+ These rates do not keep up with currently available IP bandwidth,
+ e.g., 100 Mbps TCP and 130 Mbps UDP over a Fore SBA-200 ATM host
+ interface in a Sun SPARC-20/71.
+
+ Values as high as 100 Mbps have been reported for the DEC Alpha (190
+ Mhz). These values reflect on-chip caching of the data. It is not
+ clear at this time whether in-memory, off-chip cache, or on-chip
+ cache performance measures are more relevant to IP performance.
+
+Analysis of the MD5 Algorithm
+
+ The MD5 algorithm is a block-chained hashing algorithm. The first
+ block is hashed with an initial seed, resulting in a hash. The hash
+ is summed with the seed, and that result becomes the seed for the
+ next block. When the last block is computed, it's "next-seed' value
+ becomes the hash for the entire stream. Thus, the seed for block
+ depends on both the hash and the seed of its preceding block. As a
+ result, blocks cannot be hashed in parallel.
+
+ Each 16-word (64-byte) block is hashed via 64 basic steps, using a
+ 4-word intermediate hash, and collapsing the intermediate hash at the
+ end. The 64 steps are 16 groups of 4 steps, one step per
+ intermediate hash word. This RFC uses the following notation (as
+ from RFC-1321 [3]):
+
+ A,B,C,D intermediate hash words
+ X[i] input data block
+ T[i] sine table lookup
+ << i rotate i bits
+ F logical functions of 3 args
+
+
+
+
+Touch Informational [Page 2]
+
+RFC 1810 Report on MD5 Performance June 1995
+
+
+ The subscripts to X, I, and << are fixed for each step, and are
+ omitted here. There are four different logical functions, also
+ omitted. Each 4-step group looks like:
+
+ A = B + ((A + F(B,C,D) + X[i] + T[i]) << i)
+ D = A + ((D + F(A,B,C) + X[i] + T[i]) << i)
+ C = D + ((C + F(D,A,B) + X[i] + T[i]) << i)
+ B = C + ((B + F(C,D,A) + X[i] + T[i]) << i)
+
+ Note that this has the general form shown below. Due to the
+ complexity of the function 'f', these equations cannot be transformed
+ into a less serial set.
+
+ A = f(D); B = f(A); C = f(B); D = f(C)
+
+ Each steps is composed of two table lookups, one rotation, a 3-
+ component logical operation, and 4 additions. The best
+ parallelization possible leaves F(x,y,z) to the last step, waiting as
+ long as possible for the result from the previous step. The
+ resulting tree is shown below.
+
+ (t0) B* C C D X T
+ | | | | | |
+ | | | | | |
+ \/ \/ \ /
+ t1 op op A + X T
+ \ / \ / | |
+ \ / \ / | |
+ \/ \/ \ /
+ t2 op + (t0) B* C C D A +
+ \ / | | | | \ /
+ \ / \ | | / \ /
+ \ / \\// \/
+ t3 + t1 op +
+ | \ /
+ | \ /
+ | \ /
+ t4 << B* t2 + B*
+ \ / \ /
+ \ / << /
+ \ / \ /
+ t5 + t3 +
+ | |
+ | |
+ | |
+ A** A**
+
+ Binary operation tree Optimized hardware tree
+
+
+
+Touch Informational [Page 3]
+
+RFC 1810 Report on MD5 Performance June 1995
+
+
+ This diagram assumes that each operation takes one unit time. The
+ tree shows the items that depend on the previous step as B*, and the
+ item that the next step depends on as A**. Sequences of the binary
+ operation tree cannot be overlapped, but the optimized hardware tree
+ can (by one time step).
+
+ There are 4 steps processed per word of input, ignoring inter-block
+ processing. The speed of the overall algorithm depends on how fast
+ we can process these 4 steps, vs. the bandwidth of one word of input
+ being processed.
+
+ The binary tree takes 5 time units per step of the algorithm, and
+ permits at best 3-way parallelism (at time t1). In software, this
+ means it takes 5 * 4 = 20 instructions per word input. A computer
+ capable of M MIPS can support a data bandwidth of M/20 * 32 Mbps,
+ i.e., bits per second equal to 1.6x its MIPS rate. Therefore, a 100
+ MIPS machine can support a 160 Mbps stream.
+
+ Parallel software rate in Mbps = 1.6 * MIPS rate
+
+ This assumes that register reads and writes are overlapped with
+ computation entirely. Without any parallelism, there are 8
+ operations per step, and 4 steps per word, so 32 operations per word,
+ i.e., the data rate in Mbps would be identical to the MIPS rate:
+
+ Serial software rate in Mbps = MIPS rate
+
+ Predictions using SpecInt92 numbers as MIPS estimators can be
+ compared with measured rates [2]:
+
+ Spec- Predicted MD5
+ Int92 Upper-Bound Measured Machine
+ ------------------------------------------------------------
+ 122 122-195 87 Mbps DEC Alpha (190 Mhz)
+ 48 48- 77 33 Mbps HP 9000/720
+ 88 88-141 48 Mbps IBM RS/6000 7006 (PPC 601 @80 Mhz)
+ 32 32- 51 31 Mbps Intel i486/33 NetBSD
+ 90 90-144 44 Mbps Intel Pentium/90 NeXTStep
+ 90 90-144 52 Mbps SGI/IP-20 IRIX 5.2
+ 65 65-104 37 Mbps Sun SPARC-10/51 SunOS 4.1.3
+ 126 126-202 57 Mbps Sun SPARC-20/71 SunOS 4.1.3
+
+ The hardware rate takes 3 time units per step, i.e. 3 * 4 = 12 time
+ units per word of input. Hardware capable of doing an operation
+ (e.g., 32-bit addition) in N nanoseconds can support a data bandwidth
+ of 32/12/N bps, i.e., 2/3N bps.
+
+ Hardware rate in Mbps = 8/3N * 1,000
+
+
+
+Touch Informational [Page 4]
+
+RFC 1810 Report on MD5 Performance June 1995
+
+
+ For CMOS, an operation (32-bit addition, including register retrieval
+ and storage) costs about 5.2 ns (2.6 ns per add, 2 ns for latching)
+ [6]. There are 6 clocks through the most highly-parallelized
+ implementation, resulting in 31.2 ns per 32-bit word, or 256 Mbps
+ [6]. This will not keep pace with existing hardware, which is
+ capable of link speeds in excess of 622 Mbps (ATM).
+
+ By comparison, IPv4 uses the Internet Checksum [5]. This checksum
+ can be performed in 32-bit-wide units in excess of 1 Gbps in an
+ existing, low-cost PLD. The checksum can also be parallelized by
+ computing partial sums and reducing the result.
+
+One Proposed Solution
+
+ There are several ways to increase the performance of the IPv6
+ authentication mechanism. One is to increase the hardware
+ performance of MD5 by slightly modifying the algorithm, the other is
+ to propose a replacement algorithm. This RFC discusses briefly the
+ modification of MD5 for high-speed hardware implementation.
+ Alternate algorithms, capable of 3.5x the speed of MD5, have been
+ discussed elsewhere [6].
+
+ MD5 uses block chaining to ensure sensitivity to block order. Block
+ chaining also prevents arbitrary parallelism, which can be as much a
+ benefit to the spoofer as to the user. MD5 can be slightly altered
+ to accommodate a higher bandwidth data rate. There should be a
+ predetermined finite number of blocks processed from independent
+ seeds, such that the I-th block is part of the "I mod K"-th chain.
+ The resulting K digests themselves form a message, which can be MD5-
+ encoded using a single-block algorithm. This idea was proposed
+ independently by the author and by Burt Kaliski of RSA.
+
+ The goal is to support finite parallelism to provide adequate
+ bandwidth at current processing rates, without providing arbitrary
+ power for spoofing. It would require further analysis to ensure that
+ it provides an adequate level of security.
+
+ For current technology and network bandwidth, a minimum of 4-way
+ parallel chaining would suffice, and 16-way chaining would be
+ preferable. This would support network bandwidth of 1 Gbps with 4-
+ way chaining, in CMOS hardware. The chaining parallelism should be a
+ multiple of 4-way, to generate a complete block of digests (4 words
+ per digest, 16 words per block). This modification is believed to
+ achieve the goals of MD5, without the penalties of implementation of
+ the current MD5 algorithm.
+
+
+
+
+
+
+Touch Informational [Page 5]
+
+RFC 1810 Report on MD5 Performance June 1995
+
+
+Security Considerations
+
+ This entire document addresses a mechanism for providing security in
+ IPv6. MD5 is the proposed default optional authentication mechanism
+ for IPv6 traffic. This RFC specifically addresses the concern that
+ security mechanisms such as MD5 that cannot support high bandwidth
+ with available hardware will compromise their deployment, and
+ ultimately, the security of the systems they are intended to
+ maintain.
+
+ The IPv6 requirements document emphasizes that IPv6 implementations
+ should not compromise performance, compared to IPv4. This is
+ presumably despite IPv6's increased functionality. "Required
+ optional" components of IPv6 should be held to this same standard.
+ MD5 compromises performance, and so its use as a required default
+ option in IPv6 should be reconsidered.
+
+ The use of MD5 as the default to the required authentication option
+ may compromise security in high-bandwidth systems, because enabling
+ the option causes performance degradation, defeating its inclusion as
+ an IPv6 option. As a result, the authentication option may be
+ disabled entirely.
+
+ It is important to the use of authentication in high-performance
+ systems that an alternative mechanism be available in IPv6 from the
+ outset. This may require the specification of multiple "required"
+ authentication algorithms - one that's slower but believed strong,
+ and one that's faster but may inspire somewhat less confidence.
+
+Conclusions
+
+ MD5 cannot be implemented in existing technology at rates in excess
+ of 256 Mbps in hardware, or 86 Mbps in software. MD5 is a proposed
+ authentication option in IPv6, a protocol that should support
+ existing networking technology, which is capable of 130 Mbps UDP.
+
+ As a result, MD5 cannot be used to support IP authentication in
+ existing networks at existing rates. Although MD5 will support
+ higher bandwidth in the future due to technological advances, these
+ will be offset by similar advances in networking. If MD5 cannot
+ support existing network bandwidth using existing technology, it will
+ not be able to scale as network speeds increase in the future. This
+ RFC proposes that MD5 be modified to support a 16-way block chaining,
+ in order to allow existing technology (CMOS hardware) to support
+ existing networking rates (1 Gbps). It further proposes that
+ alternatives to MD5 be considered for use in high-speed networks.
+
+
+
+
+
+Touch Informational [Page 6]
+
+RFC 1810 Report on MD5 Performance June 1995
+
+
+Acknowledgements
+
+ The author would like to thank Steve Kent at BBN, Burt Kaliski,
+ Victor Chang, and Steve Burnett at RSA, Ran Atkinson at the NRL, and
+ the HPCC Division at ISI for reviewing the contents of this document.
+ Burt independently suggested the block-parallel modification proposed
+ here.
+
+References
+
+ [1] Atkinson, R., "IPv6 Authentication Header", Work in Progress,
+ Naval Research Lab, February 1995.
+
+ [2] DiMarco, J., "Spec Benchmark table, V. 4.12",
+ <ftp://ftp.cfd.toronto.edu/pub/spectable>.
+
+ [3] Rivest, R., "The MD5 Message-Digest Algorithm", RFC1321, MIT LCS
+ & RSA Data Security, Inc., April 1992.
+
+ [4] Partridge, C., and F. Kastenholz, "Technical Criteria for
+ Choosing IP The Next Generation (IPng)", RFC 1726, BBN Systems
+ and Technologies, FTP Software, December 1994.
+
+ [5] Postel, J., "Internet Protocol - DARPA Internet Program Protocol
+ Specification," STD 5, RFC 791, USC/Information Sciences
+ Institute, September 1981.
+
+ [6] Touch, J., "Performance Analysis fo MD5," to appear in ACM
+ Sigcomm '95, Boston.
+
+ [7] Touch, J., Optimized MD5 software, <ftp://ftp.isi.edu/pub/hpcc-
+ papers/touch/md5-opt.tar>.
+
+Author's Address
+
+ Joe Touch
+ Information Sciences Institute
+ University of Southern California
+ 4676 Admiralty Way
+ Marina del Rey, CA 90292-6695
+ USA
+
+ Phone: +1 310-822-1511 x151
+ Fax: +1 310-823-6714
+ URL: ftp://ftp.isi.edu/pub/hpcc-papers/touch
+ EMail: touch@isi.edu
+
+
+
+
+
+Touch Informational [Page 7]
+