Network Working Group                                      C. Partridge
Request for Comments: 1152                 BBN Systems and Technologies
                                                              April 1990

                            Workshop Report
              Internet Research Steering Group Workshop on
                       Very-High-Speed Networks

Status of this Memo

This memo is a report on a workshop sponsored by the Internet Research Steering Group. This memo is for information only. This RFC does not specify an Internet standard. Distribution of this memo is unlimited.

Introduction

The goal of the workshop was to gather together a small number of leading researchers on high-speed networks in an environment conducive to lively thinking. The hope is that by having such a workshop the IRSG has helped to stimulate new or improved research in the area of high-speed networks.

Attendance at the workshop was limited to fifty people, and attendees had to apply to get in. Applications were reviewed by a program committee, which accepted about half of them. A few key individuals were invited directly by the program committee, without application. The workshop was organized by Dave Clark and Craig Partridge.

This workshop report is derived from session writeups by each of the session chairmen, which were then reviewed by the workshop participants.

Session 1: Protocol Implementation (David D. Clark, Chair)

This session was concerned with what changes might be required in protocols in order to achieve very high-speed operation.

The session was introduced by David Clark (MIT LCS), who claimed that existing protocols would be sufficient to go at a gigabit per second, if that were the only goal. In fact, proposals for high-speed networks usually include other requirements as well, such as going long distances, supporting many users, supporting new services such as reserved bandwidth, and so on. Only by examining the detailed requirements can one understand and compare various proposals for protocols. A variety of techniques have been proposed to permit protocols to operate at high speeds, ranging from clever implementation to complete relayering of function. Clark asserted that currently even the basic problem to be solved is not clear, let alone the proper approach to the solution.

Mats Bjorkman (Uppsala University) described a project that involved the use of an outboard protocol processor to support high-speed operation. He asserted that his approach would permit accelerated processing of steady-state sequences of packets. Van Jacobson (LBL) reported results that suggest that existing protocols can operate at high speeds without the need for outboard processors. He also argued that resource reservation can be integrated into a connectionless protocol such as IP without losing the essence of the connectionless architecture. This is in contrast to a more commonly held belief that full connection setup will be necessary in order to support resource reservation. Jacobson said that he has an experimental IP gateway that supports resource reservation for specific packet sequences today.

Dave Borman (Cray Research) described high-speed execution of TCP on a Cray, where the overhead is most probably the system and I/O architecture rather than the protocol.
He believes that protocols such as TCP would be suitable for high-speed operation if the windows and sequence spaces were large enough. He reported that the current speed of a TCP transfer between the processors of a Cray Y-MP was over 500 Mbps. Jon Crowcroft (University College London) described the current network projects at UCL. He offered a speculation that congestion could be managed in very high-speed networks by returning to the sender any packets for which transmission capacity was not available.

Dave Feldmeier (Bellcore) reported on the Bellcore participation in the Aurora project, a joint experiment of Bellcore, IBM, MIT, and UPenn, which has the goal of installing and evaluating two sorts of switches at gigabit speeds between those four sites. Bellcore is interested in switch and protocol design, and Feldmeier and his group are designing and implementing a 1 Gbps transport protocol and network interface. The protocol processor will have special support for such things as forward error correction to deal with ATM cell loss in VLSI; a new FEC code and chip design have been developed to run at 1 Gbps.

Because of the large number of speakers, there was no general discussion after this session.

Session 2: High-Speed Applications (Keith Lantz, Chair)

This session focused on applications and the requirements they impose on the underlying networks. Keith Lantz (Olivetti Research California) opened by introducing the concept of the portable office - a world where a user is able to take her work with her wherever she goes. In such an office a worker can access the same services and the same people regardless of whether she is in the same building with those services and people, at home, or at a distant site (such as a hotel) - or whether she is equipped with a highly portable, multi-media workstation, which she can literally carry with her wherever she goes. Thus, portable should be interpreted as referring to portability of access to services rather than to portability of hardware. Although not coordinated in advance, each of the presentations in this session can be viewed as a perspective on the portable office.

The bulk of Lantz's talk focused on desktop teleconferencing - the integration of traditional audio/video teleconferencing technologies with workstation-based network computing so as to enable geographically distributed individuals to collaborate, in real time, using multiple media (in particular, text, graphics, facsimile, audio, and video) and all available computer-based tools, from their respective locales (i.e., office, home, or hotel). Such a facility places severe requirements on the underlying network. Specifically, it requires support for several data streams with widely varying bandwidths (from a few Kbps to 1 Gbps) but generally low delay, some with minimal jitter (i.e., isochronous), and all synchronized with each other (i.e., multi-channel or media synchronization). It appears that high-speed network researchers are paying insufficient attention to the last point, in particular. For example, the bulk of the research on ATM has assumed that channels have independent connection request and burst statistics; this is clearly not the case in the context of desktop teleconferencing.
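As a rough illustration of the media-synchronization requirement described above, the sketch below checks whether the playout points of several streams stay within a skew budget. The stream names, the 80 ms tolerance, and the helper function are invented for this example; they are not details from the talk.

    # Hypothetical sketch: checking inter-stream skew for a desktop
    # teleconference. Each stream reports the presentation time (in ms)
    # of the media unit it is about to play out; the conference counts
    # as "in sync" only if the spread across streams stays within a
    # skew budget.

    SKEW_BUDGET_MS = 80   # assumed tolerance, not a number from the workshop

    def in_sync(playout_points_ms: dict[str, float],
                budget_ms: float = SKEW_BUDGET_MS) -> bool:
        """Return True if all streams are within budget_ms of each other."""
        times = playout_points_ms.values()
        return max(times) - min(times) <= budget_ms

    streams = {"audio": 1000.0, "video": 1030.0, "shared_workspace": 995.0}
    print(in_sync(streams))    # True: the spread is only 35 ms
    streams["video"] = 1120.0
    print(in_sync(streams))    # False: video now lags audio by 120 ms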
Lantz also stressed the need for adaptive protocols, to accommodate situations where the capacity of the network is exceeded, or where it is necessary to interoperate with low-speed networks, or where human factors suggest that the quality of service should change (e.g., increasing or decreasing the resolution of a video image). Employing adaptive protocols suggests, first, that the interface to the network protocols must be hardware-independent and based only on quality of service. Second, a variety of code conversion services should be available, for example, to convert from one audio encoding scheme to another. Promising examples of adaptive protocols in the video domain include variable-rate constant-quality coding, layered or embedded coding, progressive transmission, and (most recently, at UC-Berkeley) the extension of the concepts of structured graphics to video, such that the component elements of the video image are kept logically separate throughout the production-to-presentation cycle.

Charlie Catlett (National Center for Supercomputing Applications) continued by analyzing a specific scientific application, simulation of a thunderstorm, with respect to its network requirements. The application was analyzed from the standpoint of identifying data flow and the interrelationships between the computational algorithms, the supercomputer CPU throughput, the nature and size of the data set, and the available network services (throughput, delay, etc).

Simulation and the visualization of results typically involve several steps:

1. Simulation

2. Tessellation (transform simulation data into three-dimensional geometric volume descriptions, or polygons)

3. Rendering (transform polygons into raster image)

For the thunderstorm simulation, the simulation and tessellation are currently done using a Cray supercomputer and the resulting polygons are sent to a Silicon Graphics workstation to be rendered and displayed. The simulation creates data at a rate of between 32 and 128 Mbps (depending on the number of Cray-2 processors working on the simulation) and the tessellation output data rate is typically in the range of 10 to 100 Mbps, varying with the complexity of the visualization techniques. The SGI workstation can display 100,000 polygons/sec, which for this example translates to up to 10 frames/sec. Analysis tools such as tracer particles and two-dimensional slices are used interactively at the workstation with pre-calculated polygon sets.

In the next two to three years, supercomputer speeds of 10-30 GFLOPS and workstation speeds of up to 1 GFLOPS and 1 million polygons/second display are projected to be available. Increased supercomputer power will yield a simulation data creation rate of up to several Gbps for this application. The increased workstation power will allow both tessellation and rendering to be done at the workstation. The use of shared window systems will allow multiple researchers on the network to collaborate on a simulation, with the possibility of each scientist using his or her own visualization techniques with the tessellation process running on his or her workstation. Further developments, such as network virtual memory, will allow the tessellation processes on the workstations to access variables directly in supercomputer memory.
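A back-of-the-envelope check of the figures quoted above; the 100-byte polygon record is an assumption introduced here for illustration, while the display rate and frame rate come from the talk.

    # Rough consistency check of the thunderstorm visualization numbers.
    # BYTES_PER_POLYGON is an assumed record size; the other figures are
    # the ones quoted in the talk.

    POLYGONS_PER_SEC = 100_000     # SGI workstation display rate
    FRAMES_PER_SEC = 10            # resulting interactive frame rate
    BYTES_PER_POLYGON = 100        # assumed size of one polygon record

    polygons_per_frame = POLYGONS_PER_SEC // FRAMES_PER_SEC
    tessellation_bps = POLYGONS_PER_SEC * BYTES_PER_POLYGON * 8

    print(f"polygons per frame:  {polygons_per_frame:,}")
    print(f"tessellation output: {tessellation_bps / 1e6:.0f} Mbps")
    # -> 10,000 polygons per frame and roughly 80 Mbps, which falls
    #    inside the 10 to 100 Mbps range reported for the tessellation
    #    output stream.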
Terry Crowley (BBN Systems and Technologies) continued the theme of collaboration, in the context of real-time video and audio, shared multimedia workspaces, multimedia and video mail, distributed file systems, scientific visualization, network access to video and image information, transaction processing systems, and transferring data and computational results between workstations and supercomputers. In general, such applications could help groups collaborate by directly providing communication channels (real-time video, shared multimedia workspaces), by improving and expanding on the kinds of information that can be shared (multimedia and video mail, supercomputer data and results), and by reducing replication and the complexity of sharing (distributed file systems, network access to video and image information).

Actual usage patterns for these applications are hard to predict in advance. For example, real-time video might be used for group conferencing, for video phone calls, for walking down the hall, or for providing a long-term shared viewport between remote locations in order to help establish community ties. Two characteristics of network traffic that we can expect are the need to provide multiple data streams to the end user and the need to synchronize these streams. These data streams will include real-time video, access to stored video, shared multimedia workspaces, and access to other multimedia data. A presentation involving multiple data streams must be synchronized in order to maintain cross-references between them (e.g., pointing actions within the shared multimedia workspace that are combined with a voice request to "delete this and save that"). While much traffic will be point-to-point, a significant amount of traffic will involve conferences between multiple sites. A protocol providing a multicast capability is critical.

Finally, Greg Watson (HP) presented an overview of ongoing work at the Hewlett-Packard Bristol lab. Their belief is that, while applications for high-speed networks employing supercomputers are the technology drivers, the economic drivers will be applications requiring moderate bandwidth (say 10 Mbps) that are used by everyone on the network.

They are investigating how multimedia workstations can assist distributed research teams - small teams of people who are geographically dispersed and who need to work closely on some area of research. Each workstation provides multiple video channels, together with some distributed applications running on personal computers. The bandwidth requirements per workstation are about 40 Mbps, assuming a certain degree of compression of the video channels. Currently the video is distributed as an analog signal over CATV equipment. Ideally it would all be carried over a single, unified wide-area network operating in the one-to-several Gbps range.

They have constructed a gigabit network prototype and are currently experimenting with uncompressed video carried over the same network as normal data traffic.

Session 3: Lightwave Technology and its Implications (Ira Richer, Chair)

Bob Kennedy (MIT) opened the session with a talk on network design in an era of excess bandwidth.
Kennedy's research is focused on multi-purpose networks in which bandwidth is not a scarce commodity, networks with bandwidths of tens of terahertz. Kennedy points out that a key challenge in such networks is that electronics cannot keep up with fiber speeds. He proposes that we consider all-optical networks (in which all signals are optical) with optoelectronic nodes or gateways capable of recognizing and capturing only traffic destined for them, using time, frequency, or code divisions of the huge bandwidth. The routing algorithms in such networks would be extremely simple to avoid having to convert fiber-optics into slower electronic pathways to do switching.

Rich Gitlin (AT&T Bell Labs) gave a talk on issues and opportunities in broadband telecommunications networks, with emphasis on the role of fiber optic and photonic technology. A three-level architecture for a broadband telecommunications network was presented. The network is B-ISDN/ATM (150 Mbps) based and consists of: customer premises equipment (PBXs, LANs, multimedia terminals) that access the network via a router/gateway, a Network Node (which is a high performance ATM packet switch) that serves both as a LAN-to-LAN interconnect and as a packet concentrator for traffic destined for CPE attached to other Network Nodes, and a backbone layer that interconnects the NODES via a Digital Cross-Connect System that provides reconfigurable SONET circuits between the NODES (the use of circuits minimizes delay and avoids the need for implementation of peak-transmission-rate packet switching). Within this framework, the most likely places for near-term application of photonics, apart from pure transport (i.e., 150 Mbps channels in a 2.4 Gbps SONET system), are in the Cross-Connect (a Wavelength Division Multiplexed based structure was described) and in next-generation LANs that provide Gigabit per second throughputs by use of multiple fibers, concurrent transmission, and new access mechanisms (such as store and forward).

A planned interlocation Bell Labs multimedia gigabit/sec research network, LuckyNet, was described that attempts to extend many of the above concepts to achieve its principal goals: provision of a gigabit per second capability to a heterogeneous user community, the stimulation of applications that require Gbps throughput (initial applications are video conferencing and LAN interconnect), and, to the extent possible, use of standards so that interconnection with other Gigabit testbeds is possible.

Session 4: High Speed Networks and the Phone System (David Tennenhouse, Chair)

David Tennenhouse (MIT) reported on the ATM workshop he hosted the two days previous to this workshop. His report will appear as part of the proceedings of his workshop.

Wally St. John (LANL) followed with a presentation on the Los Alamos gigabit testbed. This testbed is based on the High Performance Parallel Interface (HPPI) and on crossbar switch technology. LANL has designed its own 16x16 crossbar switch and has also evaluated the Network Systems 8x8 crossbar switch. Future plans for the network include expansion to the CASA gigabit testbed. The remote sites (San Diego Supercomputer Center, Caltech, and JPL) are configured similarly to the LANL testbed. The long-haul interface is from HPPI to/from SONET (using ATM if in time).
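Carrying HPPI traffic to remote sites quickly runs into the link's bandwidth-delay product, which is the issue taken up next. A rough sketch of that calculation follows; the 800 Mbps rate is HPPI's standard rate (quoted later in this report), while the fiber propagation speed of roughly two-thirds of c is an assumption added here.

    # Rough bandwidth-delay estimate for an 800 Mbps HPPI link extended
    # over distance. The propagation speed is an assumed figure for
    # illustration.

    HPPI_RATE_BPS = 800e6
    PROPAGATION_M_PER_S = 2e8      # roughly 2/3 of the speed of light in fiber

    def bytes_in_flight(distance_km: float) -> float:
        one_way_delay_s = distance_km * 1e3 / PROPAGATION_M_PER_S
        return HPPI_RATE_BPS * one_way_delay_s / 8

    for km in (0.025, 64, 3000):   # lab cable, HPPI's practical limit, long haul
        print(f"{km:g} km: {bytes_in_flight(km):,.0f} bytes in flight one way")
    # -> roughly a dozen bytes at 25 m, 32,000 bytes at 64 km, and about
    #    1.5 million bytes coast to coast, so a long-haul gateway needs
    #    buffering and flow control on a very different scale than the
    #    HPPI cable itself.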
Wally also discussed some of the problems related to building a HPPI-SONET gateway:

a) Flow control. The HPPI, by itself, is only readily extensible to 64 km because of the READY-type flow control used in the physical layer. The gateway will need to incorporate larger buffers and independent flow control.

b) Error-rate expectations. SONET is only specified to have a 1E-10 BER on a per hop basis. This is inadequate for long links. Those in the know say that SONET will be much better, but the designer is faced with the poor BER in the SONET spec.

c) Frame mapping. There are several interesting issues to be considered in finding a good mapping from the HPPI packet to the SONET frame. Some are what SONET STS levels will be available in what time frame, the availability of concatenated service, and the error rate issue.

Dan Helman (UCSC) talked about work he has been doing with Darrell Long to examine the interconnection of Internet networks via an ATM B-ISDN network. Since network interfaces and packet processing are the expensive parts of high-speed networks, they believe it doesn't make sense to use the ATM backbone only for transmission; it should be used for switching as well. Therefore gateways (either shared by a subnet or integrated with fast hosts) are needed to encapsulate or convert conventional protocols to ATM format. Gateways will be responsible for caching connections to recently accessed destinations. Since many short-lived low-bandwidth connections are foreseen (e.g., for mail and ftp), routing in the ATM network (to set up connections) should not be complicated - a form of static routing should be adequate. Connection performance can be monitored by the gateways. Connections are reestablished if unacceptable. All decision making can be done by gateways and route servers at low packet rates, rather than the high aggregate rate of the ATM network. One complicated issue to be addressed is how to transparently introduce an ATM backbone alongside the existing Internet.

Session 5: Distributed Systems (David Farber, Chair)

Craig Partridge (BBN Systems and Technologies) started this session by arguing that classic RPC does not scale well to gigabit-speed networks. The gist of his argument was that machines are getting faster and faster, while the round-trip delay of networks is staying relatively constant because we cannot send faster than the speed of light. As a result, the effective cost of doing a simple RPC, measured in instruction cycles spent waiting at the sending machine, will become extremely high (millions of instruction cycles spent waiting for the reply to an RPC). Furthermore, the methods currently used to improve RPC performance, such as futures and parallel RPC, do not adequately solve this problem. Future requests will have to be made much, much earlier if they are to complete by the time they are needed. Parallel RPC allows multiple threads, but doesn't solve the fact that each individual sequence of RPCs still takes a very long time.

Craig went on to suggest that there are at least two possible ways out of the problem. One approach is to try to do a lot of caching (to waste bandwidth to keep the CPU fed).
A limitation of this approach is that at some point the cache becomes so big that you have to keep it consistent with other systems' caches, and you suddenly find yourself doing synchronization RPCs to avoid doing normal RPCs (oops!). A more promising approach is to try to consolidate RPCs being sent to the same machine into larger operations which can be sent as a single transaction, run on the remote machine, and the result returned. (Craig noted that he is pursuing this approach in his doctoral dissertation at Harvard.)

Ken Schroder (BBN Systems and Technologies) gave a talk on the challenges of combining gigabit networks with wide-area heterogeneous distributed operating systems. Ken feels the key goals of wide area distributed systems will be to support large volume data transfers between users of conferencing and similar applications, and to deliver information to a large number of end users sharing services such as satellite image databases. These distributed systems will be motivated by the natural distribution of users, of information, and of expensive special-purpose computer resources.

Ken pointed to three of the key problems that must be addressed at the system level in these environments: how to provide high utilization; how to manage consistency and synchronization in the presence of concurrency and non-determinism; and how to construct scalable system and application services. Utilization is key only to high performance applications, where current systems would be limited by the cost of factors such as repeatedly copying messages, converting data representations and switching between application and operating system. Concurrency can be used to improve performance, but is also likely to occur in many programs inadvertently because of distribution. Techniques are required both to exploit concurrency when it is needed, and to limit it when non-determinism can lead to incorrect results. Extensive research on ensuring consistency and resolving resource conflicts has been done in the database area; however, distributed scheduling and the need for high availability despite partial system failures introduce special problems that require additional research. Service scalability will be required to support customer needs as the size of the user community grows. It will require attention both to ensuring that components do not break when they are subdivided across additional processors to support a larger user population, and to ensuring that the performance delivered to each user can be affordably maintained as new users are added.

In a bold presentation, Dave Cheriton (Stanford) made a sweeping argument that we are making a false dichotomy between distributed operating systems and networks. In a gigabit world, he argued, the major resource in the system is the network, and in a normal operating system we would expect such a critical resource to be managed by the operating system. Or, put another way, the gigabit network distributed operating system should manage the network. Cheriton went on to say that if a gigabit distributed operating system is managing the network, then it is perfectly reasonable to make the network very dumb (but fast) and put the system intelligence in the operating systems on the hosts that form the distributed system.
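Several talks in this session turned on the same arithmetic: a fast processor that waits out a wide-area round trip forfeits an enormous number of instructions. A rough version of that calculation is sketched below; the 1000-MIPS figure is borrowed from Mogul's hypothetical late-1990s workstation later in this report, and the distance and propagation speed are assumptions added here.

    # Instructions a caller could have executed while waiting for one
    # synchronous RPC reply. CPU speed follows Mogul's hypothetical
    # 1000-MIPS workstation; the distance and the ~2/3-of-c fiber
    # propagation speed are illustrative assumptions.

    CPU_MIPS = 1000
    DISTANCE_KM = 4800             # assumed coast-to-coast fiber path
    PROPAGATION_M_PER_S = 2e8

    round_trip_s = 2 * DISTANCE_KM * 1e3 / PROPAGATION_M_PER_S
    idle_instructions = CPU_MIPS * 1e6 * round_trip_s

    print(f"round trip:   {round_trip_s * 1e3:.0f} ms")
    print(f"instructions: {idle_instructions / 1e6:.0f} million")
    # -> roughly a 48 ms round trip and ~48 million instructions spent
    #    waiting for a single reply - the "millions of instruction
    #    cycles" Partridge refers to.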
In another talk on interprocess communication, Jonathan Smith (UPenn) again raised the problem of network delay limiting RPC performance. In contrast to Partridge's earlier talk, Smith argued that the appropriate approach is anticipation or caching. He justified his argument with a simple cost example. If a system is doing a page fetch between two systems which have a five millisecond round-trip network delay between them, the cost of fetching n pages is:

    5 msec + (n-1) * 32 usec

Thus the cost of fetching an additional page is only 32 usec, but underfetching and having to make another request to get a page you missed costs 5000 usec. Based on these arguments, Smith suggested that we re-examine work in virtual memory to see if there are comfortable ways to support distributed virtual memory with anticipation.

In the third talk on RPC in the session, Tommy Joseph (Olivetti), for reasons similar to those of Partridge and Smith, argued that we have to get rid of RPC and give programmers alternative programming paradigms. He sketched out ideas for asynchronous paradigms using causal consistency, in which systems ensure that operations happen in the proper order, without synchronizing through a single system.

Session 6: Hosts and Host Interfaces (Gary Delp, Chair)

Gary Delp (IBM Research) discussed several issues involved in the increase in speed of network attachment to hosts of increasing performance. These issues included:

- Media Access - There are aspects of media access that are best handled by dedicated silicon, but there are also aspects that are best left to a general-purpose processor.

- Compression - Some forms of compression/expansion may belong on the network interface; most will be application-specific.

- Forward Error Correction - The predicted major packet loss mode is packet drops due to internal network congestion, rather than bit errors, so forward error correction internal to a packet may not be useful. On the other hand, the latency cost of not being able to recover from bit errors is very high. Some proposals were discussed which suggest that FEC among packet groups, with dedicated hardware support, is the way to go.

- Encryption/Decryption - This is a computationally intensive task. Most agree that if it is done with all traffic, some form of hardware support is helpful. Where does it fit in the protocol stack?

- Application Memory Mapping - How much of the host memory structure should be exposed to the network interface? Virtual memory and paging complicate this issue considerably.

- Communication with Other Channel Controllers - Opinions were expressed that ranged from absolutely passive network interfaces to interfaces that run major portions of the operating system and bus arbitration codes.

- Blocking/Segmentation - The consensus is that B/S should occur wherever the transport layer is processed.

- Routing - This is related to communications with other controllers. A routing-capable interface can reduce the bus requirements by a factor of two.

- Intelligent participation in the host structure as a gateway, router, or bridge.
- Presentation Layer issues - All of the other overheads can be completely overshadowed by this issue if it is not solved well and integrated into the overall host architecture. This points out the need for some standardization of representation (IEEE floating point, etc.).

Eric Cooper (CMU) summarized some initial experience with Nectar, a high-speed fiber-optic LAN that has been built at Carnegie Mellon. Nectar consists of an arbitrary mesh of crossbar switches connected by means of 100 Mbps fiber-optic links. Hosts are connected to crossbar switches via communication processor boards called CABs. The CAB presents a memory-mapped interface to user processes and off-loads all protocol processing from the host.

Preliminary performance figures show that latency is currently limited by the number of VME operations required by the host-to-CAB shared memory interface in the course of sending and receiving a message. The bottleneck in throughput is the speed of the VME interface: although processes running on the CABs can communicate over Nectar at 70 Mbps, processes on the hosts are limited to approximately 25 Mbps.

Jeff Mogul (DEC Western Research Lab) made these observations: Although off-board protocol processors have been a popular means to connect a CPU to a network, they will be less useful in the future. In the hypothetical workstation of the late 1990s, with a 1000-MIPS CPU and a Gbps LAN, an off-board protocol processor will be of no use. The bottleneck will not be the computation required to implement the protocol, but the cost of moving the packet data into the CPU's cache and the cost of notifying the user process that the data is available. It will take far longer (hundreds of instruction cycles) to perform just the first cache miss (required to get the packet into the cache) than to perform all of the instructions necessary to implement IP and TCP (perhaps a hundred instructions).

A high-speed network interface for a reasonably-priced system must be designed with this cost structure in mind; it should also eliminate as many CPU interrupts as possible, since interrupts are also very expensive. It makes more sense to let a user process busy-wait on a network-interface flag register than to suspend it and then take an interrupt; the normal CPU scheduling mechanism is more efficient than interrupts if the network interactions are rapid.

David Greaves (Olivetti Research Ltd.) briefly described the need for a total functionality interface architecture that would allow the complete elimination of communication interrupts. He described the Cambridge high-speed ring as an ATM cell-like interconnect that currently runs at 500-1000 MBaud, and claims that ATM at that speed is a done deal. Dave Tennenhouse also commented that ATM at high speeds with parallel processors is not the difficult thing that several others have been claiming.

Bob Beach (Ultra Technologies) started his talk with the observation that networking could be really fast if only we could just get rid of the hosts. He then supported his argument with illustrations of 80-MByte/second transfers to frame buffers from Crays that drop to half that speed when the transfer is host-to-host.
Using null network layers and proprietary MAC layers, the Ultra Net system can communicate application-to-application with ISO TP4 as the transport layer at impressive speeds. The key to high-speed host interconnects has been found to be both large packets and large (on the order of one megabyte) channel transfer requests. Direct DMA interfaces exhibit much smaller transfer latencies.

Derek McAuley (University of Cambridge Computer Laboratory) described work of the Fairisle project, which is producing an ATM network based on fast packet switches. A RISC processor (12 MIPS) is used in the host interface to do segmentation/reassembly/demultiplexing. Line rates of up to 150 Mbps are possible even with this modest processor. Derek has promised that performance and requirement results from this system will be published in the spring.

Bryan Lyles (XEROX PARC) volunteered to give an abbreviated talk in exchange for discussion rights. He reported that Xerox PARC is interested in ATM technology and wants to install an ATM LAN at the earliest possible opportunity. Uses will include such applications as video where guaranteed quality of service (QOS) is required. ATM technology and the desire for guaranteed QOS place a number of new constraints on the host interface. In particular, they believe that they will be forced towards rate-based congestion control. Because of implementation issues and burst control in the ATM switches, the senders will be forced to do rate based control on a cell-by-cell basis.

Don Tolmie (Los Alamos National Laboratory) described the High-Performance Parallel Interface (HPPI) of ANSI task group X3T9.3. The HPPI is a standardized basic building block for implementing, or connecting to, networks at Gbps speeds, be they ring, hub, cross-bar, or long-haul based. The HPPI physical layer operates at 800 or 1600 Mbps over 25-meter twisted-pair copper cables in a point-to-point configuration. The HPPI physical layer has almost completed the standards process; a companion HPPI data framing standard is under way, and a Fiber Channel standard at comparable speeds is also being developed. Major companies have completed, or are working on, HPPI interfaces for supercomputers, high-end workstations, fiber-optic extenders, and networking components.

The discussion at the end of the session covered a range of topics. The appropriateness of outboard protocol processing was questioned. Several people agreed that outboarding on a Cray (or a machine of similar cost/performance) makes economic sense. Van Jacobson contended that for workstations, a simple memory-mapped network interface that provides packets visible to the host processor may well be the ideal solution.

Bryan Lyles reiterated several of his earlier points, asserting that when we talk about host interfaces and how to build them we should remember that we are really talking about process-to-process communication, not CPU-to-CPU communication. Not all processes run on the central CPU, e.g., graphics processors and multimedia. Outboard protocol processing may be a much better choice for these architectures.

This is especially true when we consider that memory/bus bandwidth is often a bottleneck. When our systems run out of bandwidth, we are forced towards a NUMA model and multiple buses to localize memory traffic.
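A quick calculation shows why memory and bus bandwidth keep coming up; the number of times each payload byte crosses the memory system is an assumption here, since it depends on the DMA scheme, the checksum pass, and how many copies an implementation makes.

    # Memory-system traffic generated by a 1 Gbps stream, as a function
    # of how many times each payload byte crosses the bus (DMA in,
    # checksum read, copy to user space, ...). The crossing counts are
    # illustrative assumptions, not measurements.

    LINK_RATE_BPS = 1e9

    for crossings in (2, 4, 6):
        bus_bytes_per_s = (LINK_RATE_BPS / 8) * crossings
        print(f"{crossings} crossings/byte -> "
              f"{bus_bytes_per_s / 1e6:.0f} MB/s of memory traffic")
    # Even a handful of crossings per byte turns a 1 Gbps link into
    # several hundred MB/s of memory traffic, which is why the
    # discussion keeps returning to copies, cache misses, and
    # NUMA-style partitioning.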
Because of QOS issues, the receiver must be able to tell the sender how fast it can send. Throwing away cells (packets) will not work because unwanted packets will still clog the receiver's switch interface and host interface, and require processing to throw away.

Session 7: Congestion Control (Scott Shenker, Chair)

The congestion control session had six talks. The first two talks were rather general, discussing new approaches and old myths. The other four talks discussed specific results on various aspects of packet (or cell) dropping: how to avoid drops, how to mitigate their impact on certain applications, a calculation of the end-to-end throughput in the presence of drops, and how rate-based flow control can reduce buffer usage. Thumbnail sketches of the talks follow.

In the first of the general talks, Scott Shenker (XEROX PARC) discussed how ideas from economics can be applied to congestion control. Using economics, one can articulate questions about the goals of congestion control, the minimal feedback necessary to achieve those goals, and the incentive structure of congestion control. Raj Jain (DEC) then discussed eight myths related to congestion control in high-speed networks. Among other points, Raj argued that (1) congestion problems will not become less important when memory, processors, and links become very fast and cheap, (2) window flow control is required along with rate flow control, and (3) source-based controls are required along with router-based control.

In the first of the more specific talks, Isidro Castineyra (BBN Communications Corporation) presented a back-of-the-envelope calculation on the effect of cell drops on end-to-end throughput. While at extremely low drop rates the retransmission strategies of go-back-n and selective retransmission produced similar end-to-end throughput, at higher drop rates selective retransmission achieved much higher throughput. Next, Tony DeSimone (AT&T) told us why high-speed networks are not just fast low-speed networks. If the buffer/window ratio is fixed, the drop rate decreases as the network speed increases. Also, data was presented which showed that adaptive rate control can greatly decrease buffer utilization. Jamal Golestani (Bellcore) then presented his work on stop-and-go queueing. This is a simple stalling algorithm implemented at the switches which guarantees no dropped packets and greatly reduces delay jitter. The algorithm requires prior bandwidth reservation and some flow control on sources, and is compatible with basic FIFO queues. In the last talk, Victor Frost (University of Kansas) discussed the impact of different dropping policies on the perceived quality of a voice connection. When the source marks the drop priority of cells and the switch drops low priority cells first, the perceived quality of the connection is much higher than when cells are dropped randomly.

Session 8: Switch Architectures (Dave Sincoskie, Chair)

Dave Mills (University of Delaware) presented work on a project now under way at the University of Delaware to study architectures and protocols for a high-speed network and packet switch capable of operation to the gigabit regime over distances spanning the country.
It is intended for applications involving very large, very fast, very bursty traffic typical of supercomputing, remote sensing, and visualizing applications. The network is assumed to be composed of fiber trunks, while the switch architecture is based on a VLSI baseband crossbar design which can be configured for speeds from 25 Mbps to 1 Gbps.

Mills' approach involves an externally switched architecture in which the timing and routing of flows between crossbar switches are determined by sequencing tables and counters in high-speed memory local to each crossbar. The switch program is driven by a reservation-TDMA protocol and distributed scheduling algorithm running in a co-located, general-purpose processor. The end-to-end customers are free to use any protocol or data format consistent with the timing of the network. His primary interest in the initial phases of the project is the study of appropriate reservation and scheduling algorithms. He expects these algorithms to have much in common with the PODA algorithm used in the SATNET and WIDEBAND satellite systems and with the algorithms being considered for the Multiple Satellite System (MSS).

John Robinson (JR, BBN Systems and Technologies) gave a talk called Beyond the Butterfly, which described work on a design for an ATM cell switch, known as MONET. The talk described strategies for buffering at the input and output interfaces to a switch fabric (crossbar or butterfly). The main idea was that cells should be introduced to the switch fabric in random sequence and to random fabric entry ports to avoid persistent traffic patterns having high cell loss in the switch fabric, where losses arise due to contention at output ports or within the switch fabric (in the case of a butterfly). Next, the relationship of this work to an earlier design for a large-scale parallel processor, the Monarch, was described. In closing, JR offered the claim that this class of switch is realizable in current technology (barely) for operation over SONET OC-48 2.4 Gbps links.

Dave Sincoskie (Bellcore) reported on two topics: recent switch construction at Bellcore, and high-speed processing of ATM cells carrying VC or DG information. Recent switch design has resulted in a switch architecture named SUNSHINE, a Batcher-banyan switch which uses recirculation and multiple output banyans to resolve contention and increase throughput. A paper on this switch will be published at ISS '90, and is available upon request from the author. One of the interesting traffic results from simulations of SUNSHINE shows that per-port output queues of up to 1,000 cells (packets) may be necessary for bursty traffic patterns. Also, Bill Marcus (at Bellcore) has recently produced Batcher-banyan (32x32) chips which test up to 170 Mb/sec per port.

The second point in this talk was that there is little difference in the switching processing of Virtual Circuit (VC) and Datagram (DG) traffic that has been previously broken into ATM cells at the network edge. The switch needs to do a header translation operation followed by some queueing (not necessarily FIFO). The header translation of the VC and DG cells differs mainly in the memory organization of the address translation tables (dense vs. sparse).
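A toy illustration of that dense-versus-sparse distinction follows; the table sizes, field values, and port names are invented for the example and are not taken from SUNSHINE. A VC identifier drawn from a small, densely assigned space can be translated with a directly indexed array, while datagram addresses drawn from a large, sparsely used space call for an associative structure such as a hash table.

    # Toy header-translation tables for one switch port. All sizes and
    # entries are invented for illustration.

    # Virtual circuits: VCIs are small integers assigned densely at
    # connection setup, so a directly indexed array works and a lookup
    # is a single memory reference.
    vc_table = [None] * 4096                 # index = incoming VCI
    vc_table[42] = ("output_port_3", 17)     # (output port, outgoing VCI)

    # Datagrams: addresses come from a huge, sparsely populated space,
    # so an associative (hashed) table is the natural organization.
    dg_table = {"128.89.0.7": "output_port_1"}

    def translate_vc(vci):
        return vc_table[vci]                 # dense: direct index

    def translate_dg(address):
        return dg_table.get(address)         # sparse: hash lookup

    print(translate_vc(42))                  # ('output_port_3', 17)
    print(translate_dg("128.89.0.7"))        # output_port_1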
The discussion after the presentations seemed to wander off the topic of switching, back to some of the source-routing vs. network routing issues discussed earlier in the day.

Session 9: Open Mike Night (Craig Partridge, Chair)

As an experiment, the workshop held an open mike session during the evening of the second day. Participants were invited to speak for up to five minutes on any subject of their choice. Minutes of this session are sketchy because the chair found himself preoccupied with keeping speakers roughly within their time limits.

Charlie Catlett (NCSA) showed a film of the thunderstorm simulations he discussed earlier.

Dave Cheriton (Stanford) made a controversial suggestion that perhaps one could manage congestion in the network simply by using a steep price curve, in which sending large amounts of data cost exponentially more than sending small amounts of data (thus leading people only to ask for large bandwidth when they needed it, and having them pay so much that we can afford to give it to them).

Guru Parulkar (Washington University, St. Louis) argued that the recent discussion on the appropriateness of existing protocols and the need for new protocols (protocol architecture) for gigabit networking lacks the right focus. The emphasis of the discussion should be on the right functionality for gigabit speeds: simpler per-packet processing, a combination of rate- and window-based flow control, smart retransmission strategies, appropriate partitioning of work among the host CPU and OS, outboard CPUs, and custom hardware, and so on. It is not surprising that the existing protocols can be modified to include this functionality. By the same token, it is not surprising that new protocols can be designed which take advantage of the lessons of existing protocols and also include other features necessary for gigabit speeds.

Raj Jain (DEC) suggested we look at new ways to measure protocol performance, suggesting our current metrics are insufficiently informative.

Dan Helman (UCSC) asked the group to consider, more carefully, who exactly the users of the network will be. Large consumers? Or many small consumers?

Session 10: Miscellaneous Topics (Bob Braden, Chair)

As its title implies, this session covered a variety of different topics relating to high-speed networking.

Jim Kurose (University of Massachusetts) described his studies of scheduling and discard policies for real-time (constrained delay) traffic. He showed that by enforcing local deadlines at switches along the path, it is possible to significantly reduce overall loss for such traffic. Since his results depend upon the traffic model assumptions, he ended with a plea for work on traffic models, stating that Poisson models can sometimes lead to results that are wrong by many orders of magnitude.

Nachum Shacham (SRI International) discussed the importance of error correction schemes that can recover lost cells, and as an example presented a simple scheme based upon longitudinal parity. He also showed a variant, diagonal parity, which allows a single missing cell to be recreated and its position in the stream determined.

Two talks concerned high-speed LANs.
Biswanath Mukherjee (UC Davis) surveyed the various proposals for fair scheduling on unidirectional bus networks, especially those that are distance insensitive, i.e., that can achieve 100% channel utilization independent of the bus length and data rate. He described in particular his own scheme, which he calls p-i persistent.

Howard Salwen (Proteon), speaking in place of Mehdi Massehi of IBM Zurich who was unable to attend, also discussed high-speed LAN technologies. At 100 Mbps, a token ring has a clear advantage, but at 1 Gbps, the speed of light kills 802.6, for example. He briefly described Massehi's reservation-based scheme, CRMA (Cyclic-Reservation Multiple-Access).

Finally, Yechiam Yemini (YY, Columbia University) discussed his work on a protocol silicon compiler. In order to exploit the potential parallelism, he is planning to use one processor per connection.

The session closed with a spirited discussion about the relative merits of building an experimental network versus simulating it. Proponents of simulation pointed out the high cost of building a prototype and the limitation on the solution space imposed by a particular hardware realization. Proponents of building suggested that artificial traffic can never explore the state space of a network as well as real traffic can, and that an experimental prototype is important for validating simulations.

Session 11: Protocol Architectures (Vint Cerf, Chair)

Nick Maxemchuk (AT&T Bell Labs) summarized the distinctions between circuit switching, virtual circuits, and datagrams. Circuits are good for (nearly) constant rate sources. Circuit switching dedicates resources for the entire period of service. You have to set up the resource allocation before using it. In a 1.7 Gbps network, a 3000-mile diameter consumes 10**7 bytes during the circuit set-up round-trip time, and potentially the same for circuit teardown. Some service requirements (file transfer, facsimile transmission) are far smaller than the wasted 2*10**7 bytes these circuit management delays impose. (Of course, these costs are not as dramatic if the allocated bandwidth is less than the maximum possible.)

Virtual circuits allow shared use of bandwidth (multiplexing) when the primary source of traffic is idle (as in Voice Time Assigned Speech Interpolation). The user notifies the network of planned usage.

Datagrams (DG) are appropriate when there is no prior knowledge of use statistics or usage is far less than the capacity wasted during circuit or virtual circuit set-up. One can adaptively route traffic among equivalent resources.

In gigabit ATMs, the high service speed and decreased cell size increase the relative burstiness of service requests. All of these characteristics combine to make DG service very attractive.

Maxemchuk then described a deflection routing notion in which traffic would be broken into units of fixed length and allowed into the network when capacity was available and routed out by any available channel, with preference being given to the channel on the better path. This idea is similar to the hot potato routing of Paul Baran's 1964 packet switching design. With buffering (one buffer), Maxemchuk achieved a theoretical 90% utilization. Large reassembly buffers provide for better throughput.
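A minimal sketch of the deflection decision just described; the two-channel switch, the port map, and the single-buffer rule are invented here for illustration and leave out the resequencing that the large reassembly buffers address.

    import random

    # Minimal deflection ("hot potato") routing decision: prefer the
    # channel on the better path; if it is busy, send the fixed-size
    # unit out any free channel rather than queueing it, and use the
    # single buffer only when nothing is free. Illustrative only.

    def route(preferred, channels, buffer_free):
        """channels maps channel name -> True if free this slot."""
        if channels.get(preferred):
            return preferred                        # better path is available
        free = [c for c, ok in channels.items() if ok]
        if free:
            return random.choice(free)              # deflect onto any free channel
        return "buffer" if buffer_free else "drop"  # one buffer, else lose it

    print(route("east", {"east": False, "north": True}, buffer_free=True))
    # -> 'north': the unit is deflected rather than held for the east channel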
Maxemchuk did not have an answer to the question: how do you make sure empty "slots" are available where needed? This is rather like the problem encountered by D. Davies at the UK National Physical Laboratory in his isarithmic network design, in which a finite number of crates are available for data transport throughout the network.

Guru Parulkar (Washington University, St. Louis) presented a broad view of an Internet architecture in which some portion of the system would operate at gigabit speeds. In his model, internet, transport, and application protocols would operate end to end. The internet functions would be reflected in gateways and in the host/net interface, as they are in the current Internet. However, the internet would support a new type of service called a congram, which aims at combining the strengths of both soft connections and datagrams.

In this architecture, a variable grade of service would be provided. Users could request congrams (UCON) or the system could set them up internally (Picons) to avoid end-to-end setup latency. The various grades of service could be requested, conceptually, by asserting various required (desired) levels of error control, throughput, delay, interarrival jitter, and so on. Gateways based on ATM switches, for example, would use packet processors at entry/exit to do internet-specific per packet processing, which may include fragmentation and reassembly of packets (into and out of ATM cells).

At the transport level, Parulkar argued for protocols which can provide application-oriented flow and error control with simple per packet processing. He also mentioned the notion of a generalized RPC (GRPC) in which code, data, and execution might be variously local or remote from the procedure initiator. GRPC can be implemented using network level virtual storage mechanisms.

The basic premise of Raj Yavatkar's presentation (University of Kentucky) was that processes requiring communication service would specify their needs in terms of peak and average data rate as well as defining burst parameters (frequency and size). Bandwidth for a given flow would be allocated at the effective data rate that is computed on the basis of the flow parameters. The effective data rate lies somewhere between the peak and average data rate, based on the burst parameters. Statistical multiplexing would take up the gap between peak and effective rate when a sudden burst of traffic arrives. Bounds on packet loss rate can be computed for a given set of flow parameters and corresponding effective data rate.

This presentation led to a discussion about deliberate disciplining of inter-process communication demands to match the requested flow (service) profile. This point was made in response to the observation that we often have little information about program behavior and might have trouble estimating the network service requirements of any particular program.

Architectural Discussion

An attempt was made to conduct a high-level discussion on various architectural questions. The discussion yielded a variety of opinions:
1. The Internet would continue to exist in a form similar to its current incarnation, and gateways would be required, at least to interface the existing facilities to the high speed packet switching environment.

2. Strong interest was expressed by some participants in access to raw (naked ATM) services. This would permit users to construct their own gigabit nets, at the IP level, at any rate. The extreme view of this was taken by David Cheriton, who would prefer to have control over routing decisions and other behavior of the ATM network.

3. The speed of light problem (latency, round-trip delay) is not going to go away and will have serious impact on control of the system. The optimistic view was taken, for example, by Craig Partridge and Van Jacobson, who felt that many of the existing network and communications management mechanisms used in the present Internet protocols would suffice, if suitably implemented, at higher speeds. A less rosy view was taken by David Clark, who observed (as did others) that many transactions would be serviced in much less than one round-trip time, so that any end-to-end controls would be largely useless.

4. For applications requiring fixed, periodic service, reservation of resources seemed reasonably attractive to many participants, as long as the service period dominated the set-up time (round-trip delay) by an appreciable margin.

5. There was much discussion throughout the workshop of congestion control and flow control. Although these problems were not new, they took on a somewhat newer character in the presence of much higher round-trip delays (measured in bits outstanding). One view is that end-to-end flow control is needed, in any case, to moderate sources sending to limited bandwidth receivers. End-to-end flow control may not, however, be sufficient to protect the interior of the network from congestion problems, so additional, intra-network means are needed to cope with congestion hot spots. Eventually such conditions have to be reflected to the periphery of the network to moderate traffic sources.

6. There was disagreement on the build or simulate question. One view was in favor of building network components so as to collect and understand live application data. Another view held that without some careful simulation, one might have little idea what to build (for example, Sincoskie's large buffer pool requirement was not apparent until the system was simulated).

Comments from Workshop Evaluation Forms

At the end of the IRSG workshop, we asked attendees to fill out an evaluation form. Of the fifty-one attendees, twenty-nine (56%) turned in a form.

The evaluation form asked attendees to answer two questions:

#1. Do you feel that having attended this workshop will help you in your work on high speed networks during the next year?

#2. What new ideas, questions, or issues, did you feel were brought up in the workshop?

In this section we discuss the answers we got to both questions.

Question #1

There was a satisfying unanimity of opinion on question #1. Twenty-four attendees answered yes, often strongly (e.g., "Absolutely" and "very much so").
Of the remaining five respondents, three said they expected it to have some effect on their research and only two said the workshop would have little or no effect.

Some forms had some additional notes about why the workshop helped them. Several people mentioned that there was considerable benefit to simply meeting and talking with people they hadn't met before. A few other people noted that the workshop had broadened their perspective, or improved their understanding of certain issues. A couple of people noted that they'd heard ideas they thought they could use immediately in their research.

Question #2

Almost everyone listed ideas they'd seen presented at the conference which were new to them.

It is clear that which new ideas were important was a matter of perspective - the workshop membership was chosen to represent a broad spectrum of specialties, and people in different specialties were intrigued by different ideas. However, there were some general themes raised in many questionnaires:

(1) Limitations of our traffic models. This particular subject was mentioned, in some form, on many forms. The attendees generally felt we didn't understand how network traffic would behave on a gigabit network, and were concerned that people might design (or worse, standardize) network protocols for high speed networks that would prove inadequate when used with real traffic. Questions were raised about closed-loop vs. open-loop traffic models and the effects of varying types of service. This concern led several people to encourage the construction of a high-speed testbed, so we can actually see some real traffic.

(2) Congestion control. Despite the limitations of our traffic models, respondents felt that congestion control at both switching elements and network wide was going to be even more important than today, due to the wider mix of traffic that will appear on gigabit networks. Most forms mentioned at least one of the congestion control talks as containing a new idea. The talks by Victor Frost, Jamal Golestani and Scott Shenker received the most praise. Some attendees were also interested in methods for keeping the lower-layer switching fabric from getting congested and mentioned the talks by Robinson and Maxemchuk as of interest.

(3) Effects of fixed delay. While the reviews were by no means unanimous, many people had come to the conclusion that the most serious problem in gigabit networking was not bandwidth, but delay. The workshop looked at this issue in several guises, and most people listed at least one aspect of fixed delay as a challenging new problem. Questions that people mentioned include:

- How to avoid a one round-trip set up delay, for less than one round-trip time's worth of data?

- How to recover from error without retransmission (and thus additional network delays)? Several people were intrigued by Nachum Shacham's work on error detection and recovery (a short illustration of the parity idea appears after this list).

- Should we use window flow-control or rate-based flow control when delays were long?

- Can we modify the idea of remote procedure calls to deal with the fact that delays are relatively long?
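As a small illustration of the longitudinal-parity idea attributed to Shacham above: one parity cell formed as the byte-wise XOR of a group of cells lets a receiver rebuild any single lost cell without a retransmission. The group size and payloads below are invented, and the position of the missing cell is simply assumed known (his diagonal-parity variant is what recovers that position).

    # Illustrative longitudinal parity over a group of equal-size cells.
    # Group size and payloads are invented; the lost cell's position is
    # assumed known (e.g., from sequence numbers or diagonal parity).

    def parity_cell(cells):
        out = bytearray(len(cells[0]))
        for cell in cells:
            for i, b in enumerate(cell):
                out[i] ^= b
        return bytes(out)

    def recover(cells_with_gap, parity):
        """Rebuild the single missing cell (marked None) from the rest."""
        present = [c for c in cells_with_gap if c is not None]
        return parity_cell(present + [parity])

    group = [b"cell-one", b"cell-two", b"cellthre", b"cellfour"]
    p = parity_cell(group)
    damaged = [group[0], None, group[2], group[3]]     # second cell lost
    assert recover(damaged, p) == b"cell-two"          # rebuilt, no resend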
A couple of attendees noted that while some of these problems looked similar to those of today, the subtle differences caused by operating a network at gigabit speeds led them to believe the actual approaches to solving these problems would have to be very different from those of today.

Security Considerations

Security issues are not discussed in this memo.

Author's Address

Craig Partridge
Bolt Beranek and Newman Inc.
50 Moulton Street
Cambridge, MA 02138

Phone: (617) 873-2459

EMail: craig@BBN.COM