Diffstat (limited to 'doc/rfc/rfc1273.txt')
-rw-r--r-- | doc/rfc/rfc1273.txt | 451 |
1 files changed, 451 insertions, 0 deletions
diff --git a/doc/rfc/rfc1273.txt b/doc/rfc/rfc1273.txt
new file mode 100644
index 0000000..fab1cef
--- /dev/null
+++ b/doc/rfc/rfc1273.txt
@@ -0,0 +1,451 @@
+
+
+
+
+
+
+Network Working Group                                        M. Schwartz
+Request for Comments: 1273                        University of Colorado
+                                                           November 1991
+
+
+                   A Measurement Study of Changes in
+                Service-Level Reachability in the Global
+              TCP/IP Internet: Goals, Experimental Design,
+               Implementation, and Policy Considerations
+
+Status of this Memo
+
+   This memo provides information for the Internet community. It does
+   not specify an Internet standard. Distribution of this memo is
+   unlimited.
+
+Abstract
+
+   In this report we discuss plans to carry out a longitudinal
+   measurement study of changes in service-level reachability in the
+   global TCP/IP Internet. We overview our experimental design,
+   considerations of network and remote site load, mechanisms used to
+   control the measurement collection process, and network appropriate
+   use and privacy issues, including our efforts to inform sites
+   measured by this study. A list of references and information on how
+   to contact the Principal Investigator are included.
+
+Introduction
+
+   The global TCP/IP Internet interconnects millions of individuals at
+   thousands of institutions worldwide, offering the potential for
+   significant collaboration through network services and electronic
+   information exchange. At the same time, such powerful connectivity
+   offers many avenues for security violations, as evidenced by a number
+   of well publicized events over the past few years. In response, many
+   sites have imposed mechanisms to limit their exposure to security
+   intrusions, ranging from disabling certain inter-site services, to
+   using external gateways that only allow electronic mail delivery, to
+   gateways that limit remote interactions via access control lists, to
+   disconnection from the Internet. While these measures are preferable
+   to the damage that could occur from security violations, taken to an
+   extreme they could eventually reduce the Internet to little more than
+   a means of supporting certain pre-approved point-to-point data
+   transfers. Such diminished functionality could hinder or prevent the
+   deployment of important new types of network services, impeding both
+   research and commercial advancement.
+
+   To understand the evolution of this situation, we have designed a
+
+
+
+Schwartz                                                        [Page 1]
+
+RFC 1273                  A Measurement Study              November 1991
+
+
+   study to measure changes in Internet service-level reachability over
+   a period of one year. The study considers upper layer service
+   reachability instead of basic IP connectivity because the former
+   indicates the willingness of organizations to participate in inter-
+   organizational computing, which will be an important component of
+   future wide area distributed applications.
+
+   The data we gather will contribute to Internet research and
+   engineering planning activities in a number of ways. The data will
+   indicate the mechanisms sites use to distance themselves from
+   Internet connectivity, the types of services that sites are willing
+   to run (and hence the type of distributed collaboration they are
+   willing to support), and variations in these characteristics as a
+   function of geographic location and type of institution (commercial,
+   educational, etc.). Understanding these trends will allow
+   application designers and network builders to more realistically plan
+   for how to support future wide area distributed applications such as
+   digital library systems, information services, wide area distributed
+   file systems, and conferencing and other collaboration-support
+   systems. The measurements will also be of general interest, as they
+   represent direct measurements of the evolution of a global electronic
+   society.
+
+   Clearly, a study of this nature and magnitude raises a number of
+   potential concerns. In this note we overview our experimental
+   design, considerations of network and remote site load, mechanisms
+   used to control the measurement collection process, and our efforts
+   to inform sites measured by this study, along with concomitant
+   network appropriate use and privacy issues.
+
+   A point we wish to stress from the outset is that this is not a study
+   of network security. The experiments do not attempt to probe the
+   security mechanisms of any machine on the network. The study is
+   concerned solely with the evolution of network connectivity and
+   service reachability.
+
+Experimental Design
+
+   The study consists of a set of runs of a program over the span of one
+   to two days each month, repeated bimonthly for a period of one year
+   (in January 1992, March 1992, May 1992, July 1992, September 1992,
+   and November 1992). Each program run attempts to connect to 13
+   different TCP services at each of approximately 12,700 Internet
+   domains worldwide, recording the failure/success status of each
+   attempt. The program will attempt no data transfers in either
+   direction. If a connection is successful, it is simply closed and
+   counted. (Note in particular that this means that the security
+   mechanism behind individual network services will not be tested.)
+
+
+
+Schwartz                                                        [Page 2]
+
+RFC 1273                  A Measurement Study              November 1991
+
+
+   The machines on which connections are attempted will be selected at
+   random from a large list of machines in the Internet, constrained
+   such that at most 1 to 3 machines are contacted in any particular
+   domain.
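As an illustration only, and not part of the added file, the connect-and-close probe described in the Experimental Design section might be sketched in Python as follows. The port list is taken from the memo's service table; the names `SERVICES` and `probe`, the outcome labels, and the 10-second timeout are assumptions, since the memo does not describe how the measurement program is implemented.

```python
import socket

# The 13 TCP services listed in the memo's table, keyed by port number.
SERVICES = {
    13: "daytime", 15: "netstat", 21: "FTP", 23: "telnet", 25: "SMTP",
    53: "Domain Naming System", 79: "finger", 111: "Sun portmap",
    513: "rlogin", 514: "rsh", 540: "UUCP", 543: "klogin",
    544: "krcmd, kshell",
}


def probe(host: str, port: int, timeout: float = 10.0) -> str:
    """Open a TCP connection and close it at once; no data is sent or read.

    Returns one of "connected", "refused", "timeout", or "error".
    The labels and the default timeout are assumptions, not from the memo.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        # Covers name-resolution failures, unreachable networks, and so on.
        return "error"
    sock.close()
    return "connected"
```

Because the TCP handshake alone shows whether a server is listening, the sketch never reads or writes on the socket, which matches the memo's statement that no data transfers are attempted in either direction.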
+
+   The services to which connections will be attempted are:
+
+   __________________________________________________________________
+   Port Number  Service                 Port Number  Service
+   ------------------------------------------------------------------
+   13           daytime                 111          Sun portmap
+   15           netstat                 513          rlogin
+   21           FTP                     514          rsh
+   23           telnet                  540          UUCP
+   25           SMTP                    543          klogin
+   53           Domain Naming System    544          krcmd, kshell
+   79           finger
+   __________________________________________________________________
+
+   This list was chosen to span a representative range of service
+   types, each of which can be expected to be found on any machine in a
+   site (so that probing random machines is meaningful). The one
+   exception is the Domain Naming System, for which the machines
+   to probe are selected from information obtained from the Domain
+   system itself. Only TCP services are tested, since the TCP
+   connection mechanism allows one to determine if a server is
+   running in an application-independent fashion.
+
+   As an aside, it would be possible to retrieve "Well Known
+   Service" records from the Domain Naming System, as a somewhat less
+   "invasive" measurement approach. However, these records are not
+   required for proper network operation, and hence are far from
+   complete or consistent in the Domain Naming System. The only way
+   to collect the data we want is to measure them in the fashion
+   described above.
+
+Network and Remote Site Load
+
+   The measurement software is quite careful to avoid generating
+   unnecessary internet packets, and to avoid congesting the internet
+   with too much concurrent activity. Once it has successfully
+   connected to a particular service in a domain, the software never
+   attempts to connect to that service on any machine in that domain
+   again, for the duration of the current measurement run (i.e., the
+   current 60 days). Once it has recorded 3 connection refusals at any
+   machines in that domain for a service, it does not try that service
+   at that domain again during the current measurement run. If it
+   experiences 3 timeouts on any machine in a domain, it gives up on the
+
+
+
+Schwartz                                                        [Page 3]
+
+RFC 1273                  A Measurement Study              November 1991
+
+
+   domain, possibly to be retried again a day later (to overcome
+   transient network problems). In the worst case there will be 3
+   connection failures for each service at 3 different machines, which
+   amounts to 37 connection requests per domain (3 for each of the 12
+   services other than the Domain Naming System, and one for the Domain
+   Naming System). However, the average will be much less than this.
+
+   To quantify the actual Internet load, we now present some
+   measurements from test runs of the measurement software that were
+   performed in August 1991. In total, 50,549 Domain Naming System
+   lookups were performed, and 73,760 connections were attempted. This
+   measurement run completed in approximately 10 hours, never initiating
+   more than 20 network operations (name lookups or connection attempts)
+   concurrently. The total NSFNET backbone load from all traffic
+   sources that month was approximately 5 billion packets. Therefore,
+   the traffic from our measurement study amounted to less than .5% of
+   this volume on the day that the measurements were collected. Since
+   the Internet contains several other backbones besides NSFNET, the
+   proportionate increase in total Internet traffic was significantly
+   less than .5%.
+
+   The cost to a remote site being measured is effectively zero. From
+   the above measurements, on average we attempted 5.7 connections per
+   remote domain. The cost of a connection open/close sequence is quite
+   small, particularly when compared to the cost of the many electronic
+   mail and news transmissions that most sites experience on a given
+   day.
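As an illustration only, and not part of the added file, the per-domain stopping rules and the worst-case count from the Network and Remote Site Load section could be tracked roughly as below. The class and function names are hypothetical, the outcome labels are assumptions, and the sketch counts timeouts across the whole domain, which is one possible reading of "3 timeouts on any machine in a domain".

```python
from collections import Counter

MAX_REFUSALS_PER_SERVICE = 3  # from the memo
MAX_TIMEOUTS_PER_DOMAIN = 3   # from the memo; counted domain-wide (assumption)


class DomainState:
    """Bookkeeping for one domain during one measurement run (hypothetical)."""

    def __init__(self) -> None:
        self.reached = set()        # services already connected to
        self.refusals = Counter()   # refusals seen per service
        self.timeouts = 0           # timeouts seen anywhere in the domain

    def should_probe(self, service: str) -> bool:
        """Return True if another attempt at this service is still allowed."""
        if self.timeouts >= MAX_TIMEOUTS_PER_DOMAIN:
            return False  # give up on the domain for now
        if service in self.reached:
            return False  # one success per service is enough
        return self.refusals[service] < MAX_REFUSALS_PER_SERVICE

    def record(self, service: str, outcome: str) -> None:
        """Record the outcome of one connection attempt."""
        if outcome == "connected":
            self.reached.add(service)
        elif outcome == "refused":
            self.refusals[service] += 1
        elif outcome == "timeout":
            self.timeouts += 1


def worst_case_requests(n_services: int = 13, dns_attempts: int = 1) -> int:
    """Worst case per domain: 3 failures for each non-DNS service, plus DNS."""
    return MAX_REFUSALS_PER_SERVICE * (n_services - 1) + dns_attempts
```

With the memo's figures, `worst_case_requests()` gives 3 x 12 + 1 = 37 connection requests per domain, the bound stated in the text.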
+
+Control Over Measurement Collection Process
+
+   The measurement software evolved from an earlier set of experiments
+   used to measure the reach of an experimental Internet white pages
+   tool called netfind [Schwartz & Tsirigotis 1991b], and has been
+   evolved and tested extensively over a period of two years. During
+   this time it has been used in a number of experiments of increasing
+   scale. The software uses several redundant checks and other
+   mechanisms to ensure that careful control is maintained over the
+   network operations that are performed [Schwartz & Tsirigotis 1991a].
+   In addition, we monitor the progress and network loading of the
+   measurements during the measurement runs, observing the log of
+   connection requests in progress as well as physical and transport
+   level network status (which indicate the amount of concurrent network
+   activity in progress). Finally, because the measurements are
+   controlled from a single centralized location, it is quite easy to
+   stop the measurements at any time.
+
+
+
+
+
+
+Schwartz                                                        [Page 4]
+
+RFC 1273                  A Measurement Study              November 1991
+
+
+Network Appropriate Use and Privacy Issues
+
+   When we performed our initial test runs of this study, we attempted
+   to inform site administrators at each study site about this study, by
+   posting a message on the USENET newsgroup "alt.security" and by
+   sending individual electronic mail messages to site administrators.
+   We also informed the Computer Emergency Response Team (CERT) at CMU
+   of the study. As a practical matter, informing all sites turned out
+   to be quite difficult. Part of the problem was that no channels
+   exist to allow such information to be easily disseminated.
+   Approximately half of the messages we sent to site administrators
+   were returned by remote mail systems as undeliverable. Moreover, the
+   network traffic and remote site administrative load caused by the
+   study announcement messages far outstripped the network and
+   administrative load required by the study itself. Some sites felt
+   that the announcement was an unnecessary imposition on their time.
+
+   In addition to these practical problems, a broad announcement of this
+   study could affect the measurements it attempts to gather. Some
+   sites would likely react to the announcement by changing the
+   reachability of their services. Asking for explicit permission from
+   sites would yield even worse methodological problems, as this would
+   have provided a self-selected study group consisting of sites that
+   are less likely to disconnect from the Internet.
+
+   In contrast with our attempts to announce the study, running the
+   study without announcing it caused only a small number of site
+   administrators to notice the traffic and inquire about it to either
+   the CERT or to one of the responsible network contacts at the
+   University of Colorado. The remote site administrator and network
+   overhead of announcing the study, coupled with the practical and
+   methodological problems of announcing the study, led us to prefer to
+   run the study without further broad announcements. Yet, to avoid
+   causing alarm at a site detecting our network measurement activity,
+   it makes sense to announce the study.
+
+   To resolve this problem, we discussed the study with the Internet
+   Activities Board, Internet Engineering Steering Group, National
+   Science Foundation, representatives of several U.S. regional
+   networks, and a number of individuals involved with network security,
+   including the Computer Emergency Response Team, members of the
+   Internet Engineering Task Force Security and Advisory Group, and a
+   member of the Lawrence Livermore National Laboratory Computer
+   Incident Advisory Capability. The first part of our efforts resulted
+   in the production of Internet Request For Comments (RFC) number 1262
+   [Cerf 1991]. Beyond this, we have agreed that the appropriate action
+   at this point is to announce the study well ahead of running it via
+   the current RFC, augmented with an electronic posting that briefly
+
+
+
+Schwartz                                                        [Page 5]
+
+RFC 1273                  A Measurement Study              November 1991
+
+
+   describes the study goals and methodology and points to this RFC.
+   That announcement will be posted to the Internet Engineering Task
+   Force mailing list, the comp.protocols.tcp-ip USENET bulletin board,
+   and the Computer Emergency Response Team's cert-tools mailing list.
+   Moreover, in case a site misses these announcements, we will run the
+   measurement software in a fashion intended to minimize the effort a
+   site administrator might expend to determine the nature of the
+   activity after detecting it. In particular, we will run the program
+   from an account called "testnet" on a machine with few other users
+   logged in. "Fingering" [Zimmerman 1990] this machine will indicate
+   the testnet login. "Fingering" the testnet login will return
+   information about this study.
+
+   The data collected by this study is somewhat sensitive to privacy and
+   security concerns, in the sense that it might be used as a "road map"
+   of accessible network services. We will treat the raw data as
+   private information, publishing measurements only in global
+   statistical terms, divorced from the actual sites that make up the
+   underlying data points. We previously carried out a study with much
+   larger privacy implications than the current study [Schwartz & Wood
+   1991], and successfully masked the data to protect individual
+   privacy.
+
+For Further Information
+
+   Information about the general research program within which this
+   study fits is available by anonymous FTP from latour.cs.colorado.edu,
+   in pub/RD.Papers. This directory contains a "README" file that
+   describes the overall research project (which focuses on resource
+   discovery), and includes a bibliography. Particularly relevant are:
+
+      o [Schwartz 1991b], a project overview;
+
+      o [Schwartz 1991a], about an earlier, simpler version of the
+        current study;
+
+      o [Schwartz & Tsirigotis 1991b], about the netfind white pages
+        tool;
+
+      o [Schwartz & Tsirigotis 1991a], which considers a number of
+        the techniques used in this experiment, including those for
+        controlling the progress of the measurements;
+
+   and
+
+      o [Schwartz & Wood 1991], about an earlier study we carried out
+        that raises significant potential privacy questions, for
+        which we carefully masked the underlying data, presenting the
+
+
+
+Schwartz                                                        [Page 6]
+
+RFC 1273                  A Measurement Study              November 1991
+
+
+        results without sacrificing individual privacy.
+
+   Also:
+
+      o [Cerf 1991], IAB guidelines for Internet measurement
+        activity.
+
+   Once the results of this study are complete, we will publish them in
+   a conference or journal, as well as by anonymous FTP.
+
+Communication With Principal Investigator
+
+   If you would like to have your site removed from this study, or you
+   would like to be added to the list of people who receive results from
+   this study, or you would like to communicate with the Principal
+   Investigator for some other reason, please send electronic mail to
+   schwartz@cs.colorado.edu.
+
+References
+
+   [Cerf 1991]
+      Cerf, V., Editor, "Guidelines for Internet Measurement
+      Activities", RFC 1262, IAB, October 1991.
+
+   [Schwartz & Tsirigotis 1991a]
+      Schwartz, M., and P. Tsirigotis, "Techniques for
+      Supporting Wide Area Distributed Applications", Technical
+      Report CU-CS-519-91, Department of Computer Science,
+      University of Colorado, Boulder, Colorado, February 1991;
+      Revised August 1991. Submitted for publication.
+
+   [Schwartz & Tsirigotis 1991b]
+      Schwartz, M., and P. Tsirigotis, "Experience with a
+      Semantically Cognizant Internet White Pages Directory
+      Tool", Journal of Internetworking: Research and Experience,
+      2(1), pp. 23-50, March 1991.
+
+   [Schwartz 1991a]
+      Schwartz, M., "The Great Disconnection?", Technical Report
+      CU-CS-521-91, Department of Computer Science, University of
+      Colorado, Boulder, Colorado, February 1991.
+
+   [Schwartz & Wood 1991]
+      Schwartz, M., and D. Wood, "A Measurement Study of
+      Organizational Properties in the Global Electronic Mail
+      Community", Technical Report CU-CS-482-90, Department of
+      Computer Science, University of Colorado, Boulder, Colorado,
+      August 1990; Revised July 1991. Submitted for publication.
+
+
+
+Schwartz                                                        [Page 7]
+
+RFC 1273                  A Measurement Study              November 1991
+
+
+   [Schwartz 1991b]
+      Schwartz, M., "Resource Discovery in the Global Internet",
+      Technical Report CU-CS-555-91, Department of Computer
+      Science, University of Colorado, Boulder, Colorado,
+      November 1991. Submitted for publication.
+
+   [Zimmerman 1990]
+      Zimmerman, D., "The Finger User Information Protocol",
+      RFC 1194, Center for Discrete Mathematics and Theoretical
+      Computer Science, November 1990.
+
+Security Considerations
+
+   Security issues are discussed in the "Network Appropriate Use and
+   Privacy Issues" section.
+
+Author's Address
+
+   Michael F. Schwartz
+   Department of Computer Science
+   Campus Box 430
+   University of Colorado
+   Boulder, Colorado 80309-0430
+
+   Phone: (303) 492-3902
+
+   EMail: schwartz@cs.colorado.edu
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Schwartz                                                        [Page 8]
+
\ No newline at end of file