Diffstat (limited to 'doc/rfc/rfc1794.txt')
-rw-r--r-- doc/rfc/rfc1794.txt 395
1 files changed, 395 insertions, 0 deletions
diff --git a/doc/rfc/rfc1794.txt b/doc/rfc/rfc1794.txt
new file mode 100644
index 0000000..53a9847
--- /dev/null
+++ b/doc/rfc/rfc1794.txt
@@ -0,0 +1,395 @@
+
+
+
+
+
+
+Network Working Group T. Brisco
+Request for Comments: 1794 Rutgers University
+Category: Informational April 1995
+
+
+ DNS Support for Load Balancing
+
+Status of this Memo
+
+ This memo provides information for the Internet community. This memo
+ does not specify an Internet standard of any kind. Distribution of
+ this memo is unlimited.
+
+1. Introduction
+
+ This RFC first chronicles a foray into the IETF DNS Working Group,
+ then discusses possible alternatives for providing or simulating
+ load balancing support in the DNS, and finally presents a flexible
+ solution for providing DNS support for balancing loads of many
+ types.
+
+2. History
+
+ The history of this probably dates back well before my own time - so
+ undoubtedly some holes are here. Hopefully they can be filled in by
+ other authors.
+
+ Initially, "load balancing" was intended to permit the Domain Name
+ System (DNS) [1] agents to support the concept of "clusters" (derived
+ from the VMS usage) of machines - where all machines were
+ functionally similar or the same, and it didn't particularly matter
+ which machine was picked - as long as the load of the processing was
+ reasonably well distributed across a series of actual different
+ hosts. Around 1986 a number of different schemes started surfacing
+ as hacks to the Berkeley Internet Name Domain server (BIND)
+ distribution. Probably the most widely distributed of these were the
+ "Shuffle Address" (SA) modifications by Bryan Beecher, or possibly
+ Marshall Rose's "Round Robin" code.
+
+ The SA records, however, did a round-robin ordering of the Address
+ resource records, and didn't do much with regard to the particular
+ loads on the target machines. Matt Madison (of TGV) implemented some
+ changes that used VMS facilities to review the system loads, and
+ return A RRs in the order of least-loaded to most loaded.
+
+ The problem with SAs was that load was not actually a factor, and
+ TGV's changes relied on VMS-specific facilities to order the records.
+ The SA RRs required changes to the DNS specification (in file syntax and in
+
+
+
+Brisco [Page 1]
+
+RFC 1794 DNS Support for Load Balancing April 1995
+
+
+ record processing). These were both viewed as drawbacks and not as
+ general solutions.
+
+ Most of the Internet waited in anticipation of an IETF approved
+ method for simulating "clusters".
+
+ Through a few IETF DNS Working Group sessions (Chaired by Rob Austein
+ of Epilogue), it was collectively agreed upon that a number of
+ criteria must be met:
+
+ A) Backwards compatibility with the existing DNS RFC.
+
+ B) Information changes frequently.
+
+ C) Multiple addresses should be sent out.
+
+ D) Must interact with other RRs appropriately.
+
+ E) Must be able to represent many types of "loads"
+
+ F) Must be fast.
+
+ (A) would ensure that the installed base of BIND and other DNS
+ implementations would continue to operate and interoperate properly.
+
+ (B) would permit very fast update times - to enable modeling of
+ real-time data. Five minutes was thought of as a normal interval,
+ though changes as fast as every sixty seconds could be imagined.
+
+ (C) would cover the possibility of a host's address being advertised
+ as optimal, yet the machine crashed during the period within the TTL
+ of the RR. The second-most preferable address would be advertised
+ second, the third-most preferable third, and so on. This would allow
+ a reasonable stab at recovery during machine failures.
+
+ (D) would ensure correct handling of all ancillary information - such
+ as MX, RP, and TXT information, as well as reverse lookup
+ information. It needed to be ensured that such processes as mail
+ handling continued to work in an unsurprising and predictable manner.
+
+ (E) would ensure the flexibility that everyone wished. Various
+ members of the DNS Working Group wished to represent a broad range
+ of "loads". Some "loads" were fairly eclectic - such as the
+ address ordering by the RTT to the host, some were pragmatic - such
+ as balancing the CPU load evenly across a series of hosts. All
+ represented valid concerns within their own context, and the idea of
+ having separate RR types for each was unthinkable (primarily, it
+ would violate goal A).
+
+
+
+Brisco [Page 2]
+
+RFC 1794 DNS Support for Load Balancing April 1995
+
+
+ (F) needed to ensure a few things, primarily that the time to
+ calculate the information to order the addressing information did not
+ exceed the TTL of the information distributed - i.e., that elements
+ with a TTL of five minutes didn't take six minutes to calculate.
+ Similarly, it seems a fairly clear goal in the DNS RFC that clients
+ should not be kept waiting - that request processing should continue
+ regardless of the state of any other processing occurring.
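+
+ As an illustration only (it is not part of the working group's
+ criteria), the ordering described for (C) and (E) can be sketched as
+ a sort over whatever "load" metric is in use. The function name, the
+ metric values, and the addresses below are hypothetical; the only
+ assumption is that a lower number means a more preferable host.
+
```python
def order_by_load(loads):
    """Return addresses sorted from most to least preferable.

    `loads` maps an address to a hypothetical load figure (CPU load,
    RTT, and so on); lower is assumed to mean more preferable.
    """
    # sorted() is stable, so addresses with equal load keep input order.
    return sorted(loads, key=lambda addr: loads[addr])


# Hypothetical sample data using documentation addresses.
example = {"192.0.2.1": 2.5, "192.0.2.2": 0.3, "192.0.2.3": 1.1}
ordered = order_by_load(example)
```
+
+ Because every address is still returned, a client that cannot reach
+ the first address can fall back to the next one, which is the
+ recovery behaviour that (C) asks for.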
+
+3. Possible Alternatives
+
+ During various discussions with the DNS Working Group and with the
+ Load Balancing Committee, it was noted that no existing solution
+ dealt with all wishes appropriately. One of the major successes of
+ the DNS is its flexibility - and it was felt that this needed to be
+ retained in all aspects. It was conceived that perhaps not only
+ address information would need to be changed rapidly, but other
+ records may also need to change rapidly (at least this could not be
+ ruled out - who knows what technologies lurk in the future).
+
+ Of primary concern to many was the ability to interact with older
+ implementations of DNS. The DNS is implemented widely now, and
+ changes to critical portions of the protocol could cause havoc for
+ years. It became rapidly apparent through conversations with Jon
+ Postel and Dave Crocker (Area Director) that modifications to the
+ protocol would be viewed dimly.
+
+4. A Flexible Model
+
+ During many hours of discussions, Rob Austein suggested that the
+ changes could be implemented without changes to the
+ protocol; if zone transfer behavior could be subtly changed, then the
+ zone transfer process could accommodate the changing of various RR
+ information. What was needed was a smarter program to do the zone
+ transfers. Pursuant to this, changes were made to BIND that would
+ permit the specification of the program to do the zone transfers for
+ particular zones.
+
+ There is no specification that a secondary has to receive updates
+ from its primary server in any specific manner - only that it needs
+ to check periodically, and obtain new zone copies when changes have
+ been made. Conceivably the zone transfer agent could obtain the
+ information from any number of sources (e.g., a load average daemon,
+ a round-robin sorter) and present the information back to the
+ nameserver for distribution.
+
+ A number of questions arose from this concept, and all seem to have
+ been dealt with accordingly. Primarily, the DNS protocol doesn't
+ guarantee ordering. While the DNS protocol doesn't guarantee
+
+
+
+Brisco [Page 3]
+
+RFC 1794 DNS Support for Load Balancing April 1995
+
+
+ ordering, it is clear that the ordering is predictive - that
+ information read in twice in the same order will be presented twice
+ in the same order to clients. Clients, of course, may reorder this
+ information, but that is deemed as a "local issue" as it is
+ configurable by the remote systems administrators (e.g., sortlists,
+ etc). The zone transfer agent would have to account for any "mis-
+ ordering" that may occur locally, but remote reordering (e.g., client
+ side sortlists) of RRs is impossible to predict. Since local
+ mis-ordering is consistent, the zone transfer agents could easily
+ account for this.
+
+ Secondarily, but perhaps more subtly, the problem arises that zone
+ transfers aren't used by primary nameservers, only by secondary
+ nameservers. To clarify this, the idea of "fast" or "volatile"
+ subzones must be dealt with. In a volatile environment (where
+ address or other RR ordering changes rapidly), the refresh rate of a
+ zone must be set very low, and the TTL of the RRs handed out must
+ similarly be very low. There is no use in handing out information
+ with TTLs of an hour, when the conditions for ordering the RRs
+ change by the minute. There must be a relatively close relationship
+ between the refresh rates and TTLs of the information. Of course,
+ with very low refresh rates, zone transfers between the primary and
+ secondary would have to occur frequently. Given that primary and
+ secondary nameservers should be topologically and geographically far
+ apart, moving that much data that frequently is seen as prohibitive.
+ Also, the longer the propagation time between the primary and
+ secondary, the larger the window in which circumstances can change -
+ thus invalidating the secondary's information. It is generally
+ thought that passing volatile information on to a secondary is fairly
+ useless - if secondaries want accurate information, then they should
+ calculate it themselves and not obtain it via zone transfers. This
+ avoids the problem of a secondary that loses contact with its primary
+ (while the targets of the volatile domain are still reachable) and
+ is left holding information that is growing stale.
+
+ What is essentially necessary is a secondary (with no primary) which
+ can calculate the necessary ordering of the RR data for itself (which
+ also avoids the problem of different versions of domain servers
+ predictively ordering RR information in different predictive
+ fashions). For a volatile zone, there is no primary DNS agent, but
+ rather a series of autonomous secondary agents. Each autonomous
+ secondary agent is, of course, capable of calculating the ordering or
+ content of the volatile RRs itself.
+
+
+
+
+
+
+
+
+Brisco [Page 4]
+
+RFC 1794 DNS Support for Load Balancing April 1995
+
+
+5. Implementation
+
+ With some help from Masataka Ohta (Tokyo Institute of Technology), I
+ implemented modifications to BIND to permit the specification of the
+ zone transfer program (zone transfer agent) for particular domains:
+
+ transfer <domain-name> <program-name>
+
+ Currently I define a separate subdomain that has a few hosts defined
+ in it - all volatile information. The zone has a refresh rate of
+ 300, and a minimum TTL of 300 indicated. The configuration file is
+ indicated as "volatile.hosts". Every 300 seconds a program "doAxfer"
+ is run to do the zone transfer. The program "doAxfer" reads the file
+ "volatile.hosts.template" and the file "volatile.hosts.list". The
+ addresses specified in volatile.hosts.list are rotated a random
+ number of times, and then substituted (in order) into
+ volatile.hosts.template to generate the file volatile.hosts. The
+ program "doAxfer" then exits with a value of 1 - to indicate to the
+ nameserver that the zone transfer was successful, and that the file
+ should be read in, and the information distributed. This results in
+ a host having multiple addresses, and the addresses are randomized
+ every five minutes (300 seconds).
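+
+ The rotate-and-substitute step described above can be sketched as
+ follows. This is not the original "doAxfer" program, whose source is
+ not given here; it is a minimal sketch under stated assumptions. In
+ particular, the "@ADDR@" placeholder, the function names, and the
+ one-address-per-line list format are hypothetical, since the memo
+ does not specify the template syntax.
+
```python
import random
from collections import deque
from pathlib import Path

MARKER = "@ADDR@"  # hypothetical placeholder; the memo does not define one


def rotate(addresses, steps):
    """Rotate the address list left by `steps` positions."""
    ring = deque(addresses)
    ring.rotate(-steps)  # a negative count rotates to the left
    return list(ring)


def render_zone(template, addresses):
    """Substitute addresses, in order, for successive markers."""
    zone = template
    for addr in addresses:
        zone = zone.replace(MARKER, addr, 1)  # fill one marker at a time
    return zone


def run(template_path, list_path, zone_path):
    """Regenerate the zone file and return the exit status to use."""
    template = Path(template_path).read_text()
    addresses = Path(list_path).read_text().split()
    # Any rotation count in [0, len) reaches every distinct rotation.
    steps = random.randrange(len(addresses)) if addresses else 0
    Path(zone_path).write_text(render_zone(template, rotate(addresses, steps)))
    return 1  # the memo's agent exits with 1 to signal a successful transfer
```
+
+ A caller would pass the result of run() to sys.exit(). If the
+ template holds more markers than there are addresses, the extra
+ markers are left in place by this sketch; a real agent would need to
+ treat that as an error.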
+
+ Two bugs continue to plague us in this endeavor. BIND currently
+ considers any TTL under 300 seconds as "irrational", and substitutes
+ in the value of 300 instead. This greatly hampers the functionality
+ of volatile zones. In the fastest of all cases - a 0 TTL -
+ information would be used once, and then thrown away. Presumably the
+ new RR information could be calculated every 5 seconds, and the RRs
+ handed out with a TTL of 0. It must be considered that one
+ limitation of the speed of a zone is going to be the ability of a
+ machine to calculate new information fast enough.
+
+ The other bug that also affects this is that, as with TTLs, BIND
+ considers any zone refresh rate under 15 minutes to be similarly
+ irrational. Obviously zone refresh rates of 15 minutes are
+ unacceptable for this sort of application.
+
+ For a work-around, the current code sets these same hard-coded values
+ to 60 seconds. Sixty seconds is still large enough to avoid any
+ residual bugs associated with small timer values, but is also short
+ enough to allow fast subzones to be of use.
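+
+ The effect of these floors can be illustrated with a small sketch.
+ It is not BIND source code; the constant and function names are
+ hypothetical, and only the numeric values (300 seconds, 15 minutes,
+ and the 60 second work-around) are taken from the text above.
+
```python
STOCK_MIN_TTL = 300          # seconds; stock floor for TTLs
STOCK_MIN_REFRESH = 15 * 60  # seconds; stock floor for zone refresh
WORKAROUND_MIN = 60          # seconds; floor used by the work-around


def effective(requested, floor):
    """Return the value the server would use for a requested timer.

    Values below the floor are treated as "irrational" and replaced
    by the floor; anything else is used unchanged.
    """
    return floor if requested < floor else requested


# A zone asking for a 60 second TTL and refresh under each policy.
stock_ttl = effective(60, STOCK_MIN_TTL)
stock_refresh = effective(60, STOCK_MIN_REFRESH)
patched_ttl = effective(60, WORKAROUND_MIN)
```
+
+ Under the stock floors a zone that asks for 60 seconds is served
+ with a 300 second TTL and a 900 second refresh; with the
+ work-around, the requested 60 seconds is honoured.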
+
+ This version of BIND is currently in release within Rutgers
+ University, operating in both "fast" and normal zones.
+
+
+
+
+
+
+Brisco [Page 5]
+
+RFC 1794 DNS Support for Load Balancing April 1995
+
+
+6. Performance
+
+ While the performance of fast zones isn't exactly stellar, it is not
+ much more than the normal CPU loads induced by BIND. Testing was
+ performed on a Sun Sparc-2 being used as a normal workstation, but no
+ resolvers were using the name server - essentially the nameserver was
+ idle. For a configuration with no fast subzones, BIND accrued 11 CPU
+ seconds in 24 hours. For a configuration with one fast zone, six
+ address records, and being refreshed every 300 seconds (5 minutes),
+ BIND accrued 1 minute 4 seconds CPU time. For the same previous
+ configuration, but being refreshed every sixty seconds, BIND accrued
+ 5 minutes and 38 seconds of CPU time.
+
+ As is no great surprise, the CPU load on the serving machine was
+ roughly proportional to the refresh frequency. The sixty second
+ refresh configuration used approximately five times as much CPU time
+ as did the 300 second refresh configuration. One can easily
+ extrapolate that the overall CPU utilization would scale linearly
+ with the number of zones and the refresh frequency. All of this
+ is based on a shell script that always indicated that a zone update
+ was necessary; a more intelligent program should realize when the
+ reordering of the RRs was unnecessary and avoid such periodic zone
+ reloads.
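+
+ The arithmetic behind the "approximately five times" figure can be
+ checked with a short calculation. The sketch assumes that all three
+ measurements cover the same 24 hour period (the text states this
+ explicitly only for the configuration with no fast subzones) and
+ that the 11 second idle figure can be subtracted as a baseline; the
+ variable names are illustrative.
+
```python
DAY = 24 * 60 * 60           # seconds in the measurement period
BASELINE = 11                # CPU seconds with no fast subzones
CPU = {300: 64, 60: 338}     # refresh interval -> CPU seconds (1m04s, 5m38s)


def refreshes_per_day(interval):
    """Number of zone refreshes in 24 hours at the given interval."""
    return DAY // interval


def cost_per_refresh(interval):
    """CPU seconds attributable to one refresh, above the idle baseline."""
    return (CPU[interval] - BASELINE) / refreshes_per_day(interval)


ratio = CPU[60] / CPU[300]   # 338 / 64 = 5.28125
```
+
+ Under these assumptions each refresh costs roughly 0.2 CPU seconds
+ at either interval, which is consistent with the load scaling with
+ the number of refreshes performed.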
+
+7. Acknowledgments
+
+ Most of the ideas in this document are the results of conversations
+ and proposals from many, many people - including, but not limited to,
+ Robert Austein, Stuart Vance, Masataka Ohta, Marshall Rose, and the
+ members of the IETF DNS Working Group.
+
+8. References
+
+ [1] Mockapetris, P., "Domain Names - Implementation and
+ Specification", STD 13, RFC 1035, USC/Information Sciences
+ Institute, November 1987.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Brisco [Page 6]
+
+RFC 1794 DNS Support for Load Balancing April 1995
+
+
+9. Security Considerations
+
+ Security issues are not discussed in this memo.
+
+10. Author's Address
+
+ Thomas P. Brisco
+ Associate Director for Network Operations
+ Rutgers University
+ Computing Services, Telecommunications Division
+ Hill Center for the Mathematical Sciences
+ Busch Campus
+ Piscataway, New Jersey 08855-0879
+ USA
+
+ Phone: +1-908-445-2351
+ EMail: brisco@rutgers.edu
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Brisco [Page 7]
+