From 4bfd864f10b68b71482b35c818559068ef8d5797 Mon Sep 17 00:00:00 2001 From: Thomas Voss Date: Wed, 27 Nov 2024 20:54:24 +0100 Subject: doc: Add RFC documents --- doc/rfc/rfc1296.txt | 507 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 507 insertions(+) create mode 100644 doc/rfc/rfc1296.txt (limited to 'doc/rfc/rfc1296.txt') diff --git a/doc/rfc/rfc1296.txt b/doc/rfc/rfc1296.txt new file mode 100644 index 0000000..988bb09 --- /dev/null +++ b/doc/rfc/rfc1296.txt @@ -0,0 +1,507 @@ + + + + + + +Network Working Group M. Lottor +Request for Comments: 1296 SRI International + Network Information Systems Center + January 1992 + + + Internet Growth (1981-1991) + +Status of this Memo + + This memo provides information for the Internet community. It does + not specify an Internet standard. Distribution of this memo is + unlimited. + +Abstract + + This document illustrates the growth of the Internet by examination + of entries in the Domain Name System (DNS) and pre-DNS host tables. + DNS entries are collected by a program called ZONE, which searches + the Internet and retrieves data from all known domains. Pre-DNS host + table data were retrieved from system archive tapes. Various + statistics are presented on the number of hosts and domains. + +Table of Contents + + Introduction.................................................... 1 + How ZONE Works.................................................. 2 + Problems with Data Collection................................... 3 + Scope of the Study.............................................. 3 + N. Results...................................................... 4 + N.1 Number of Internet Hosts.................................... 4 + N.2 Number of Domains........................................... 6 + N.3 Distribution of IP Addresses per Host....................... 7 + N.4 Distribution of Hosts by Top-level Domain................... 7 + N.5 Distribution of Hosts by Host Name.......................... 8 + Future Issues................................................... 8 + RFC References.................................................. 9 + Security Considerations......................................... 9 + Author's Address................................................ 9 + +Introduction + + This document provides statistics on the growth of the Internet by + examining the number of Internet hosts and domains over a 10-year + period. Before the Domain Name System was established, practically + all hosts on the Internet were registered with the Network + Information Center (SRI-NIC) and entries were placed in the Official + Host Table for each one. Data on the number of hosts for pre-DNS + + + +Lottor [Page 1] + +RFC 1296 Internet Growth (1981-1991) January 1992 + + + years comes from copies of the host table at selected times. The DNS + system was introduced around 1984 but took almost 4 years before it + was fully implemented on the Internet. However, by this time many + hosts were no longer registered in the Host Table. + + In 1986, the ZONE (Zealot Of Name Edification) program was written. + ZONE was originally intended to be used during the host-table-to-DNS + transition period. ZONE would "walk" the DNS tree and build a host + table of all the information it collected. This host table could + then be used by sites that had not yet made the DNS transition. + However, ZONE was never used for this purpose. Instead, it was found + to be useful for collecting statistics on the size of the domain + system and the Internet. + + ZONE could not collect complete data on the DNS until around 1988, + because early versions of BIND (the popular Unix DNS implementation) + had major problems with the zone transfer function of the DNS + protocol. ZONE has been used in varying ways ever since to collect + this information. In the first few years, it was used to produce a + wall-size chart of the domain tree. However, the number of domains + quickly outgrew the size of the wall and the charts were abandoned. + In later years, statistics on the number of hosts and domains were + extracted from the resulting host table, sometimes categorizing data + based on top-level domain names or on computer system type or + manufacturer. + + The time to gather the data also grew from hours to a week, and the + size of the host table produced soon reached 50 megabytes. In order + to reduce the amount of data collected, ZONE is now run in a mode + collecting only host names and IP addresses, ignoring protocol, host + information and MX record data. The host table is then groveled over + by some utilities (such as sort, uniq and grep) to produce the + statistics required. ZONE is currently run every 3 months at SRI. + +How ZONE Works + + ZONE maintains a list of domains and their servers and a flag + indicating whether information for a domain has been successfully + loaded from one of the servers. Because of another bug in BIND, ZONE + must be primed with a list of all the top-level domains and their + name servers. It then cycles through the domain list, attempting to + contact one of the servers for each domain not yet transferred. When + a server is contacted (via TCP), a Start of Authority (SOA) query is + first sent to make sure the server is authoritative for the domain + being requested. If so, then a zone transfer query (AXFR) is sent to + request all the resource records for the domain to be retrieved. + + When a name server record (NS) is received, the referenced domain and + + + +Lottor [Page 2] + +RFC 1296 Internet Growth (1981-1991) January 1992 + + + server are added to the list of domains to process. When host + records (A, CNAME, HINFO, MX) are received, they are added to an in- + core table of host information. The program ends when it has cycled + through the entire list of domains without receiving any new + information. It then dumps the table of host information to a + HOSTS.TXT format file. + +Problems with Data Collection + + For various reasons, some Internet sites do not allow zone transfers + of their domain servers. ZONE also eventually gives up trying to + transfer a domain after too many failures. The number of domains + that could not be zone transferred during the 1-Jan-92 ZONE run was + around 800 out of 17,000. Additionally, it is assumed that not all + hosts on the Internet are registered in a domain server. These + problems cause the statistics gathered by ZONE to be lower than the + actual amounts. + + Manual review of some of the data collected by ZONE also shows a lot + of random entries in the DNS. Misformatted entries may cause bogus + server or host records to appear. Many times a server is found to + not be authoritative for the domain listed. Sometimes entire domains + are renamed and their old entries left in place for a transition + period, thus causing each host within that domain to be counted + twice. These problems cause the results of ZONE to be higher than + the actual amounts. + + Manual scanning of the data indicates that the additional entries are + insignificant compared to the missing entries discussed earlier. + ZONE data can thus be viewed as the minimum number of Internet hosts, + and not the actual figures. + + A final problem with data collection is that of expense. Downloading + domain information from every domain on the Internet generates a + large amount of network traffic. It also puts an extra CPU load on + each domain server it must contact. An organized effort might be + considered to have only one such program doing this on the Internet + at regularly scheduled intervals to keep the problem of multiple data + collectors from occurring. + +Scope of the Study + + A problem with counting hosts and domains on the Internet is defining + what the Internet really is. Finding host entries in the DNS does + not necessarily indicate that the host is reachable from the + Internet. Many companies have mail gateways between the Internet and + their local nets, thus disallowing direct access. However, some of + these companies advertise all their hosts, and some advertise only + + + +Lottor [Page 3] + +RFC 1296 Internet Growth (1981-1991) January 1992 + + + the gateway. Are these hosts on the Internet or not? + + Furthermore, many domains in the DNS are just mail-forwarding (MX) + entries for off-Internet (such as Usenet) sites. Are these domains + really part of the Internet and should they be counted in an Internet + size study? + + For the purposes of this study, a host has been defined as a + [name(s),IP-address(es)] grouping discovered from the DNS. This + prevents us from counting a host with multiple names or addresses + more than once. However, this does not consider whether the host is + directly accessible or not. When ZONE counts the number of domains + it includes all domains referenced by an NS record in the DNS, thus + including MX-only domain sites in the final results. + +N. Results + + This section presents data from archive tapes of SRI-NIC from 1981 to + 1986, and statistics gathered by runs of ZONE from 1986 to 1992. + +N.1 Number of Internet Hosts + + The chart below shows the number of IP hosts on the Internet. These + are hosts with at least one IP address assigned. Data was collected + by ZONE except where noted. The following two sections are graphs of + the data in this chart. + + Date Hosts + + 08/81 213 Host table #152 + 05/82 235 Host table #166 + 08/83 562 Host table #300 + 10/84 1,024 Host table #392 + 10/85 1,961 Host table #485 + 02/86 2,308 Host table #515 + 11/86 5,089 + 12/87 28,174 + 07/88 33,000 + 10/88 56,000 + 01/89 80,000 + 07/89 130,000 + 10/89 159,000 + 10/90 313,000 + 01/91 376,000 + 07/91 535,000 + 10/91 617,000 + 01/92 727,000 + + + + +Lottor [Page 4] + +RFC 1296 Internet Growth (1981-1991) January 1992 + + + Number of Internet Hosts (linear) +800| +780| +760| +740| * +720| +700| +680| . +660| +640| +620| +600| T * +580| h +560| o +540| u +520| s * +500| a +480| n . +460| d +440| s +420| . +400| o +380| f +360| * +340| H . +320| o +300| s * +280| t +260| s . +240| . +220| . +200| . +180| . +160| +140| * +120| * +100| .. + 80| * + 60| . + 40| * + 20| ..*...* + 0|...*....*......*......*.....*.*....*... + ------------------------------------------------------------------- + 8 8 8 8 8 8 8 8 8 9 9 9 + 1 2 3 4 5 6 7 8 9 0 1 2 + Date + "*" = data point, "." = estimate +This graph is a linear plot of the number of Internet hosts. + + + +Lottor [Page 5] + +RFC 1296 Internet Growth (1981-1991) January 1992 + + + Number of Internet Hosts (logarithmic) + + + | 1000000 + | *.* + | ..*.*..* + | ... + | 100000 ..** + | *.* + H | ...* + o | .* + s | 10000 .. + t | .. + s | ....* + | ...*.* +1000| ...*.. + | ... + | ...* + | ..*....*... + 100|. + ------------------------------------------------------------------- + 8 8 8 8 8 8 8 8 8 9 9 9 + 1 2 3 4 5 6 7 8 9 0 1 2 + Date + + "*" = data point, "." = estimate + +This graph is a logarithmic plot of the number of Internet hosts. + +N.2 Number of Domains + + This chart shows the number of domains existing in the Internet + Domain Name System as collected by ZONE. + + Date Domains + + 07/88 900 + 10/88 1,280 + 01/89 2,600 + 07/89 3,900 + 10/89 4,800 + 10/90 9,300 + 01/91 11,200 + 07/91 16,000 + 10/91 18,000 + 01/92 17,000 + + + + + +Lottor [Page 6] + +RFC 1296 Internet Growth (1981-1991) January 1992 + + +N.3 Distribution of IP Addresses per Host + + This chart shows how many hosts have how many IP addresses. This + data was collected on 1-Jan-92 and only the first 10 entries are + shown. + + Addresses Hosts + + 1 715143 + 2 9015 + 3 1027 + 4 556 + 5 314 + 6 213 + 7 100 + 8 85 + 9 58 + 10 71 + +N.4 Distribution of Hosts by Top-level Domain + + This chart shows the number of hosts per top-level domain (top 40 + only) on 1-Jan-92. The percentage listed is the increase since 1- + Oct-91. Large variations are probably due to problems and variations + in the collection process; these figures are not meant to be + authoritative, but serve as reasonable estimates. + + 243020 edu 13% 13011 fr 4% 1791 dk 4% 357 be -5% + 181361 com 12% 12770 nl 21% 1662 es 15% 334 gr 14% + 46463 gov 13% 12647 ch 10% 1506 kr 9% 308 br 26% + 31622 au 19% 11994 fi 15% 1111 nz -16% 284 mx -5% + 31016 de 20% 10228 no 9% 1016 tw n/a 207 is 0% + 27492 mil 26% 8579 jp 6% 929 za n/a 146 pl 97% + 27052 ca 22% 4109 net -49% 784 pt n/a 127 us 25% + 19117 org 10% 3324 at 19% 484 sg 251% 25 tn 0% + 18984 uk 139% 2719 it 197% 448 hk 78% 24 hu 71% + 18473 se 34% 2020 il 14% 374 ie -7% 6 arpa 0% + + + + + + + + + + + + + + +Lottor [Page 7] + +RFC 1296 Internet Growth (1981-1991) January 1992 + + +N.5 Distribution of Hosts by Host Name + + This chart shows the distribution of hosts by their host name on 1- + Jan-92. The host name is defined to be the first part of a fully + qualified domain name. Only the top 100 names are shown. + +384 venus 204 mac4 172 mac9 155 pollux 138 chaos +356 pluto 201 hobbes 172 mac11 155 frodo 136 bart +323 mars 201 hermes 170 mac8 153 helios 135 pc5 +288 jupiter 198 thor 169 phoenix 152 mac17 135 larry +286 saturn 198 sirius 169 mac12 151 vega 135 cs +285 pc1 196 gw 169 hal 151 mac18 133 odin +282 zeus 195 calvin 168 snoopy 150 falcon 131 tiger +262 iris 194 mac5 168 mac13 150 bach 131 sparky +260 mercury 191 mac10 167 mac15 146 castor 131 ariel +259 mac1 190 fred 167 mac14 145 sol 130 sneezy +258 orion 189 titan 167 grumpy 145 dopey 128 mac +254 mac2 189 pc3 163 gandalf 144 mac20 127 sun1 +240 newton 186 opus 162 pc4 144 mac19 127 rocky +234 neptune 186 mac6 160 uranus 142 spock 126 pc6 +233 pc2 185 charon 159 mac16 142 euler 125 hydra +224 gauss 185 apollo 158 sleepy 141 mickey 125 homer +222 eagle 179 mac7 158 io 141 atlas 124 isis +213 mac3 179 athena 157 earth 140 maxwell 123 moe +209 merlin 177 alpha 156 europa 140 happy 123 delta +207 cisco 172 mozart 155 rigel 140 doc 122 pc10 + +Future Issues + + ZONE currently runs on a DECsystem-20 and is written in assembler. + The amount of data is quickly reaching the limits of the DEC-20 + section address space, and the hardware's ability to survive gets + slimmer each day. ZONE assembles all its data in core before dumping + it to disk. The implementation does this in order to be able to + match host nicknames with official names before dumping complete host + records. Sometimes a nickname can be in a different domain than the + official name, complicating simpler methods. + + A new version of ZONE needs to be written to run on a modern computer + system. A completely new architecture should be designed to handle + the enormous amount of data collected and expected in the future. + Data should be kept on disk so that a system crash will not wipe out + days of collection. Multiple zone transfers could be occurring in + parallel to reduce the time needed for data gathering. A new ZONE + might run continuously, cycling through the domain system on a cycle + lasting weeks to a month, updating a local database with statistics + collected for each domain. In this way, current statistics on the + size of the Internet would always be known. The resulting database + + + +Lottor [Page 8] + +RFC 1296 Internet Growth (1981-1991) January 1992 + + + may also be useful for other network information services. + +RFC References + + Libes, D., "Choosing a Name for Your Computer", RFC 1178, Integrated + Systems Group/NIST, August 1990. (Also FYI 5.) + + Mockapetris, P., "Domain Names - Implementation and Specification", + RFC 1035, USC/Information Sciences Institute, November 1987. + + Mockapetris, P., "Domain names - Concepts and Facilities", RFC 1034, + USC/Information Sciences Institute, November 1987. + + Lazear, W., "MILNET Name Domain Transition", RFC 1031, Mitre, + November 1987. + + Harrenstien, K. Stahl, M., and J. Feinler, "DoD Internet Host Table + Specification", SRI, October 1985. + + Postel, J., "Domain Name System Implementation Schedule - Revised", + RFC 921, USC/Information Sciences Institute, October 1984. + +Security Considerations + + Security issues are not discussed in this memo. + +Author's Address + + Mark K. Lottor + SRI International + Network Information Systems Center + 333 Ravenswood Avenue, EJ282 + Menlo Park, CA 94025 + + EMail: mkl@nisc.sri.com + + + + + + + + + + + + + + + + +Lottor [Page 9] + \ No newline at end of file -- cgit v1.2.3