Diffstat (limited to 'doc/rfc/rfc2391.txt')
-rw-r--r-- doc/rfc/rfc2391.txt 1011
1 files changed, 1011 insertions, 0 deletions
diff --git a/doc/rfc/rfc2391.txt b/doc/rfc/rfc2391.txt
new file mode 100644
index 0000000..3dc7b38
--- /dev/null
+++ b/doc/rfc/rfc2391.txt
@@ -0,0 +1,1011 @@
+
+
+
+
+
+
+Network Working Group P. Srisuresh
+Request for Comments: 2391 Lucent Technologies
+Category: Informational D. Gan
+ Juniper Networks, Inc.
+ August 1998
+
+
+ Load Sharing using IP Network Address Translation (LSNAT)
+
+Status of this Memo
+
+ This memo provides information for the Internet community. It does
+ not specify an Internet standard of any kind. Distribution of this
+ memo is unlimited.
+
+Copyright Notice
+
+ Copyright (C) The Internet Society (1998). All Rights Reserved.
+
+Preface
+
+ This document combines the idea of address translation described in
+ RFC 1631 with real-time load share algorithms to introduce Load Share
+ Network Address Translators (or simply, LSNATs). LSNATs would
+ transparently offload network load on a single server and distribute
+ the load across a pool of servers.
+
+Abstract
+
+ Network Address Translators (NATs) translate IP addresses in a
+ datagram, transparent to end nodes, while routing the datagram. NATs
+ have traditionally been used to allow private network domains to
+ connect to Global networks using as few as one globally unique IP
+ address. In this document, we extend the use of NATs to offer a load
+ share feature, where session load can be distributed across a pool of
+ servers instead of being directed to a single server. Load sharing is
+ beneficial to service providers and system administrators alike in
+ grappling with scalability of servers with increasing session load.
+
+1. Introduction
+
+ Traditionally, Network Address Translators, or simply NATs, were used
+ to connect private network domains to globally unique public domain
+ IP networks. Applications originate in private domains and NATs would
+ transparently translate datagrams belonging to these applications in
+
+
+
+
+
+
+Srisuresh & Gan Informational [Page 1]
+
+RFC 2391 LSNAT August 1998
+
+
+ either direction. This document combines the characteristic of
+ transparent address translation with real-time load share algorithms
+ to introduce Load Share Network Address Translators.
+
+ The problem of Load sharing or Load balancing is not new and goes
+ back many years. A variety of techniques were applied to address the
+ problem. Some were very ad hoc and platform specific, and some employed
+ clever schemes to reorder DNS resource records. REF [11] uses a DNS
+ zone transfer program in name servers to periodically shuffle the
+ order of resource records for server nodes based on a pre-determined
+ load balancing algorithm. The problem with this approach is that
+ reordering time periods can be very large, on the order of minutes, so
+ the order does not reflect real-time load variations on the servers.
+ Secondly, all hosts in the server pool are assumed to have equal
+ capability to offer all services. This may not often be the case. In
+ addition, there may be a requirement to support load balancing for a
+ few specific
+ services only. The load share approach outlined in this document
+ addresses both these concerns and offers a solution that does not
+ require changes to clients or servers and one that can be tailored to
+ individual services or for all services.
+
+ For the remainder of this document, we will refer to NAT routers that
+ provide load sharing support as LSNATs. Unlike traditional NATs,
+ LSNATs are not required to operate between private and public domain
+ routing realms alone. LSNATs also operate in a single routing realm
+ and provide load sharing functionality.
+
+ The need for Load sharing arises when a single server is not able to
+ cope with increasing demand for multiple sessions simultaneously.
+ Clearly, load sharing across multiple servers would enhance
+ responsiveness and scale well with session load. Popular applications
+ inundating servers would include Web browsers, remote login, file
+ transfer and mail applications.
+
+ When a client attempts to access a server through an LSNAT router,
+ the router selects a node in the server pool, based on a load share
+ algorithm, and redirects the request to that node. LSNATs pose no
+ restriction on the organization and rearrangement of nodes in server
+ pool. Nodes in a pool may be replaced, new nodes may be added and
+ others may be in transition. Changes of this kind to server pool can
+ be shielded from client nodes by making LSNAT router the focal point
+ for change management.
+
+ There are limitations to using LSNATs. Firstly, it is mandatory that
+ all requests and responses pertaining to a session between a client
+ and server be routed via the same LSNAT router. For this reason, we
+ recommend LSNATs to be operated on a single border router to a stub
+ domain in which the server pool would be confined. This would ensure
+ that all traffic directed to servers from clients outside the domain
+ and vice versa would necessarily traverse the LSNAT border router.
+ Later in the document, we will examine a special case of LSNAT setup,
+ which gets around the topological constraint on server pool. Another
+ limitation of LSNATs is the inability to switch loads between hosts
+ in the midst of sessions. This is because LSNATs measure load in
+ granularity of sessions. Once a session is assigned to a host, the
+ session cannot be moved to a different host till the end of that
+ session. Other limitations, inherent to NATs, as outlined in REF [1]
+ are also applicable to LSNATs.
+
+ As with traditional NATs, LSNATs have the disadvantage of taking away
+ the end-to-end significance of an IP address. The major advantage,
+ however, is that it can be installed without changes to clients or
+ servers.
+
+2. Terminology and concepts used
+
+2.1. TU ports, Server ports, Client ports
+
+ For the remainder of this document, we will refer to TCP/UDP ports
+ associated with an IP address simply as "TU ports".
+
+ For most TCP/IP hosts, TU port range 0-1023 is used by servers
+ listening for incoming connections. Clients trying to initiate a
+ connection typically select a TU port in the range of 1024-65535.
+ However, this convention is not universal and not always followed. It
+ is possible for client nodes to initiate connections using a TU port
+ number in the range of 0-1023, and there are applications listening
+ on TU port numbers in the range of 1024-65535.
+
+ A complete list of TU port services may be found in REF [2]. The TU
+ ports used by servers to listen for incoming connections are called
+ "Server Ports" and the TU ports used by clients to initiate a
+ connection to server are called "Client Ports".
+
+2.2. Session flow vs. Packet flow
+
+ Connection or session flows are different from packet flows. A
+ session flow indicates the direction in which the session was
+ initiated with reference to a network port. Packet flow is the
+ direction in which the packet has traversed with reference to a
+ network port. A session flow is uniquely identified by the direction
+ in which the first packet of that session traversed.
+
+ Take for example, a telnet session. The telnet session consists of
+ packet flows in both inbound and outbound directions. Outbound telnet
+ packets carry terminal keystrokes from the client and inbound telnet
+ packets carry screen displays from the telnet server. Performing
+ address translation for a telnet session would involve translation of
+ incoming as well as outgoing packets belonging to that session.
+
+ Packets belonging to a TCP/UDP session are uniquely identified by
+ the tuple of (source IP address, source TU port, target IP address,
+ target TU port). ICMP sessions that correlate queries and responses
+ using query id are uniquely identified by the tuple of (source IP
+ address, ICMP Query Identifier, target IP address). For lack of
+ well-known ways to distinguish, all other types of sessions are
+ lumped together and distinguished by the tuple of (source IP address,
+ IP protocol, target IP address).
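+
+ The three identification rules above can be sketched as follows. This
+ is only an illustration: the Packet type, the session_key function and
+ the leading tag in each key are hypothetical names introduced here,
+ not part of this document. The tag is an added assumption that keeps
+ the three kinds of key from colliding in a single table.
+

```python
from dataclasses import dataclass
from typing import Optional

# IANA-assigned IP protocol numbers.
ICMP, TCP, UDP = 1, 6, 17


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified view of the header fields of interest."""
    src: str
    dst: str
    protocol: int
    src_port: Optional[int] = None       # TU port, TCP/UDP only
    dst_port: Optional[int] = None       # TU port, TCP/UDP only
    icmp_query_id: Optional[int] = None  # ICMP query messages only


def session_key(pkt: Packet) -> tuple:
    """Return the tuple that identifies the session a packet belongs to."""
    if pkt.protocol in (TCP, UDP):
        # (source IP address, source TU port, target IP address, target TU port)
        return ("tu", pkt.src, pkt.src_port, pkt.dst, pkt.dst_port)
    if pkt.protocol == ICMP and pkt.icmp_query_id is not None:
        # (source IP address, ICMP Query Identifier, target IP address)
        return ("icmp", pkt.src, pkt.icmp_query_id, pkt.dst)
    # All other sessions are lumped together per IP protocol.
    return ("other", pkt.src, pkt.protocol, pkt.dst)
```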
+
+2.3. Start of session for TCP, UDP and others
+
+ The first packet of every TCP session tries to establish a session
+ and contains connection startup information. The first packet of a
+ TCP session may be recognized by the presence of SYN bit and absence
+ of ACK bit in the TCP flags. All TCP packets, with the exception of
+ the first packet must have the ACK bit set.
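+
+ As a small sketch of this test, assuming the caller has already
+ extracted the flags octet from the TCP header (the function name is
+ hypothetical; the bit values are the standard TCP flag positions):
+

```python
# Standard bit positions in the TCP flags octet.
TCP_SYN = 0x02
TCP_ACK = 0x10


def is_tcp_session_start(flags: int) -> bool:
    """True for the first packet of a TCP session: SYN set, ACK clear."""
    return bool(flags & TCP_SYN) and not (flags & TCP_ACK)
```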
+
+ The first packet of every session, be it a TCP session, UDP session,
+ ICMP query session or any other session, tries to establish a
+ session. However, there is no deterministic way of recognizing the
+ start of a UDP session or any other non-TCP session.
+
+ Start of session is significant with NATs, as a state describing
+ translation parameters for the session is established at the start
+ of session. Packets pertaining to the session cannot undergo
+ translation, unless a state is established by NAT at the start of
+ session.
+
+2.4. End of session for TCP, UDP and others
+
+ The end of a TCP session is detected when FIN is acknowledged by both
+ halves of the session or when either half receives RST bit in TCP
+ flags field. Within a short period (say, a couple of seconds) after
+ one of the session partners sets RST bit, the session can be safely
+ assumed to have been terminated.
+
+ For all other types of session, there is no deterministic way of
+ determining the end of session unless you know the application
+ protocol. Many heuristic approaches are used to terminate sessions.
+ You can make the assumption that TCP sessions that have not been used
+ for say, 24 hours, and non-TCP sessions that have not been used for
+ say, 1 minute, are terminated. Often this assumption works, but
+ sometimes it doesn't. These idle period session timeouts may vary
+ considerably across the board and may be made user configurable.
+
+ Another way to handle session terminations is to timestamp sessions
+ and keep them as long as possible and retire the longest idle session
+ when it becomes necessary.
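+
+ Both heuristics can be sketched with one small table. This is a
+ minimal illustration, not a prescribed design: the SessionTable class
+ and its method names are hypothetical, the timeouts are the example
+ values quoted above, and the caller supplies the clock value.
+

```python
TCP_IDLE_TIMEOUT = 24 * 60 * 60  # seconds; the "say, 24 hours" example
OTHER_IDLE_TIMEOUT = 60          # seconds; the "say, 1 minute" example


class SessionTable:
    """Tracks when each session was last used."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.last_used = {}  # session key -> (is_tcp, timestamp)

    def touch(self, key, is_tcp: bool, now: float) -> None:
        """Record activity; make room first if the table is full."""
        if key not in self.last_used and len(self.last_used) >= self.capacity:
            self.retire_longest_idle()
        self.last_used[key] = (is_tcp, now)

    def expire_idle(self, now: float) -> list:
        """First heuristic: drop sessions idle beyond their timeout."""
        expired = [
            key for key, (is_tcp, stamp) in self.last_used.items()
            if now - stamp > (TCP_IDLE_TIMEOUT if is_tcp else OTHER_IDLE_TIMEOUT)
        ]
        for key in expired:
            del self.last_used[key]
        return expired

    def retire_longest_idle(self):
        """Second heuristic: drop the session with the oldest timestamp."""
        key = min(self.last_used, key=lambda k: self.last_used[k][1])
        del self.last_used[key]
        return key
```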
+
+2.5. Basic Network Address Translation (Basic NAT)
+
+ Basic NAT is a method by which hosts in a private network domain are
+ allowed access to hosts in the external network transparently. A
+ block of external addresses are set aside for translating addresses
+ of private hosts as the private hosts originate sessions to
+ applications in external domain. Once an external address is bound by
+ the NAT device to a specific private address, that address binding
+ remains in place for all subsequent sessions originating from the
+ same private host. This binding may be terminated when there are no
+ sessions left to use the binding.
+
+2.6. Network Address Port Translation (NAPT)
+
+ Network Address Port Translation (NAPT) is a method by which hosts in
+ a private network domain are allowed simultaneous access to hosts in
+ the external network transparently using a single registered address.
+ This is made possible by multiplexing transport layer identifiers of
+ private hosts into the transport identifiers of the single assigned
+ external address. For this reason, only the applications based on TCP
+ and UDP protocols are supported by NAPT. ICMP query based
+ applications are also supported as the ICMP header carries a query
+ identifier that is used to correlate responses with requests.
+ Sessions other than TCP, UDP and ICMP query type are simply not
+ permitted from local nodes, serviced by a NAPT router.
+
+2.7. Load share
+
+ Load sharing for the purpose of this document is defined as the
+ spread of session load amongst a cluster of servers which are
+ functionally similar or the same. In other words, each of the nodes
+ in cluster can support a client session equally well with no
+ discernible difference in functionality. Once a node is assigned to
+ service a session, that session is bound to that node till
+ termination. Sessions are not allowed to swap between nodes in the
+ midst of session.
+
+ Load sharing may be applicable for all services, if all hosts in
+ server cluster carry the capability to carry out all services.
+ Alternately, load sharing may be limited to one or more specific
+ services alone and not to others.
+
+ Note, the term "Session load" used in the context of load share is
+ different from the term "system load" attributed to hosts by way of
+ CPU, memory and other resource usage on the system.
+
+3. Overview of Load sharing
+
+ While both traditional NATs and LSNATs perform address translations,
+ and provide transparent connectivity between end nodes, there are
+ distinctions between the two. Traditional NATs initiate translations
+ on outbound sessions, by binding a private address to a global
+ address (basic NAT) or by binding a tuple of private address and
+ transport identifier (such as TCP/UDP port or ICMP query ID) to a
+ tuple of global address and transport identifier. LSNATs, on the
+ other hand, initiate translations on inbound sessions, by binding
+ each session represented by a tuple such as (client address, client
+ TU port, virtual server address, server TU port) to one of server
+ pool nodes, selected based on a real-time load-share algorithm. A
+ virtual server address is a globally unique IP address that
+ identifies a physical server or a group of servers that can provide
+ similar or same functionality.
+
+ For the remainder of this document, we will refer to traditional NATs
+ simply as NATs and refer to LSNATs exclusively in the context of load
+ share, without implying traditional NAT functionality.
+
+ LSNATs are not limited to operate between private and public domain
+ routing realms. LSNATs may operate within a single routing realm with
+ globally unique IP addresses, just as well as between private and
+ public network domains. The only requirement is that server pool be
+ confined to a stub domain, accessible to clients outside the domain
+ through a single LSNAT border router. However, as you will notice
+ later, this topology limitation on server pool can be overcome under
+ certain configurations.
+
+ Load Share NAT operates as follows. A client attempts to access a
+ server by using the server virtual address. The LSNAT router
+ transparently redirects the request to one of the hosts in server
+ pool, selected using a real-time load sharing algorithm. Multiple
+ sessions may be initiated from the same client, and each session
+ could be directed to a different host based on load balance across
+ server pool hosts at the time. If load share is desired for just a
+ few specific services, the configuration on LSNAT could be defined to
+ restrict load share for just the services desired.
+
+ In the case where virtual server address is same as the interface
+ address of an LSNAT router, server applications (such as telnet) on
+ LSNAT router must be disabled for external access on that address.
+ This is the limitation to using address owned by LSNAT router as the
+ virtual server address.
+
+ Load share NAT operation is also applicable during individual server
+ upgrades as follows. Say a server that needs to be upgraded is
+ statically mapped to a backup server on the inbound. Subsequent to
+ this mapping, new session requests to the original server would be
+ redirected by LSNAT to the backup server. As an extension, it is
+ also possible to statically map a specific TU port service on a
+ server to that of a backup server.
+
+ We illustrate the operation of LSNAT in the following subsections,
+ where (a) servers are confined to a stub domain, and belong to
+ globally unique address space as shared by clients, (b) servers are
+ confined to private address space stub domain, and (c) servers are
+ not restrained by any topological limitations.
+
+3.1 Operation of LSNAT in a globally unique address space
+
+ In this section, we will illustrate the operation of LSNAT in a
+ globally unique address space. The border router with LSNAT enabled
+ on WAN link would perform load sharing and address translations for
+ inbound sessions. However, sessions outbound from the hosts in server
+ pool will not be subject to any type of translation, as all nodes
+ have globally unique IP addresses.
+
+ In the example below, servers S1 (172.85.0.1), S2(172.85.0.2) and
+ S3(172.85.0.3) form a server pool, confined to a stub domain. LSNAT
+ on the border router is enabled on the WAN link, such that the
+ virtual server address S(172.87.0.100) is mapped to the server pool
+ consisting of hosts S1, S2 and S3. When a client 198.76.29.7
+ initiates a HTTP session to the virtual server S, the LSNAT router
+ examines the load on hosts in server pool and selects a host, say S1
+ to service the request. The transparent address and TU port
+ translations performed by the LSNAT router become apparent as you
+ follow the down arrow line. IP packets on the return path go through
+ similar address translation. Suppose, we have another client
+ 198.23.47.2 initiating telnet session to the same virtual server S.
+ The LSNAT would determine that host S3 is a better choice to service
+ this session as S1 is busy with a session and redirect the session to
+ S3. The second session redirection path is delineated with colons.
+ The procedure continues for any number of sessions the same way.
+
+ Notice that this requires no changes to clients or servers. All the
+ configuration and mapping necessary would be limited just to the
+ LSNAT router.
+
+ \ | /
+ +---------------+
+ |Backbone Router|
+ +---------------+
+ WAN |
+ |
+ Stub domain border .......|.........
+ |
+ {s=198.76.29.7, 2745, v | {s=198.23.47.2, 3200,
+ d=172.87.0.100, 80 } v | d=172.87.0.100, 23 }
+ v +------------------+ :
+ v |Border Router with| :
+ v |LSNAT enabled on | :
+ v |WAN interface | :
+ v +------------------+ :
+ v | :
+ v | LAN :
+ ------v----------------------:---
+ {s=198.76.29.7, 2745, v | | |:{s=198.23.47.2, 3200,
+ d=172.85.0.1, 80 } | | | d=172.85.0.3, 23 }
+ +--+ +--+ +--+
+ |S1| |S2| |S3|
+ |--| |--| |--|
+ /____\ /____\ /____\
+ 172.85.0.1 172.85.0.2 172.85.0.3
+
+ Figure 1: Operation of LSNAT in Globally unique address space
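+
+ The translations shown in Figure 1 can be sketched as follows. This
+ is a simplified illustration: the Lsnat class and its method names
+ are hypothetical, and select_server uses a fewest-bound-sessions rule
+ only as a stand-in for whatever real-time load share algorithm the
+ router actually applies.
+

```python
class Lsnat:
    """Binds inbound sessions on a virtual server address to pool hosts."""

    def __init__(self, virtual_addr: str, pool: list):
        self.virtual_addr = virtual_addr
        self.pool = list(pool)
        self.load = {server: 0 for server in self.pool}
        # (client address, client TU port, service port) -> selected server
        self.bindings = {}

    def select_server(self) -> str:
        # Placeholder policy: fewest bound sessions, ties by pool order.
        return min(self.pool, key=lambda server: self.load[server])

    def inbound(self, src, sport, dst, dport):
        """Client to server: rewrite the destination address."""
        if dst != self.virtual_addr:
            return (src, sport, dst, dport)
        key = (src, sport, dport)
        if key not in self.bindings:
            server = self.select_server()
            self.bindings[key] = server
            self.load[server] += 1
        return (src, sport, self.bindings[key], dport)

    def outbound(self, src, sport, dst, dport):
        """Server to client: restore the virtual server address as source."""
        if self.bindings.get((dst, dport, sport)) == src:
            return (self.virtual_addr, sport, dst, dport)
        return (src, sport, dst, dport)


lsnat = Lsnat("172.87.0.100", ["172.85.0.1", "172.85.0.2", "172.85.0.3"])
first = lsnat.inbound("198.76.29.7", 2745, "172.87.0.100", 80)
reply = lsnat.outbound("172.85.0.1", 80, "198.76.29.7", 2745)
```
+
+ Here the first session is bound to S1, as in the figure. With this
+ placeholder policy a second session would go to S2 rather than S3;
+ the choice of S3 in the figure depends on the load share algorithm in
+ use.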
+
+3.2. Operation of LSNAT in conjunction with a private network
+
+ In this section, we will illustrate the operation of LSNAT in
+ conjunction with NAT on the same router. The NAT configuration is
+ required for translation of outbound sessions and could be either
+ Basic NAT or NAPT. The illustration below will assume NAPT on the
+ outbound and LSNAT on the inbound on WAN link.
+
+ Say, an organization has a private IP network and a WAN link to
+ backbone router. The private network's stub router is assigned a
+ globally valid address on the WAN link and the remaining nodes in the
+ organization have IP addresses that have only local significance. The
+ border router is NAPT configured on the outbound allowing access to
+ external hosts, using the single registered IP address.
+
+ In addition, say the organization has servers S1 (10.0.0.1),
+ S2(10.0.0.2) and S3 (10.0.0.3) that form a pool to provide inbound
+ access to external clients. This is made possible by enabling LSNAT
+ on the WAN link of the border router, such that virtual server
+ address S(198.76.28.4) is mapped to the server pool consisting of
+ hosts S1, S2 and S3. When an external client 198.76.29.7 initiates a
+ HTTP session to the virtual server S, the LSNAT router examines load
+ on hosts in server pool and selects a host, say S1 to service the
+ request. The transparent address and TU port translations performed
+ by the LSNAT router are apparent as you follow the down arrow line.
+ IP packets on the return path go through similar address translation.
+ Suppose, we have another client 198.23.47.2 initiating telnet session
+ to the same address. The LSNAT would determine that host S3 is a
+ better choice to service this session as S1 is busy with a session
+ and redirect the session to S3. The second session redirection path
+ is delineated with colons. The procedure continues for any number of
+ sessions the same way.
+
+ \ | /
+ +---------------+
+ |Backbone Router|
+ +---------------+
+ WAN |
+ |
+ Stub domain border ........|.........
+ |
+ {s=198.76.29.7, 2745, v | {s=198.23.47.2, 3200,
+ d=198.76.28.4, 80 }v | :d=198.76.28.4, 23 }
+ v+-------------------+:
+ v|Border Router with |:
+ v| LSNAT and NAPT |:
+ v|enabled on WAN link|:
+ v+-------------------+:
+ v | :
+ v | LAN :
+ ------v---------------------:------
+ {s=198.76.29.7, 2745, v | | | : {s=198.23.47.2, 3200,
+ d=10.0.0.1, 80 } | | | d=10.0.0.3, 23 }
+ +--+ +--+ +--+
+ |S1| |S2| |S3|
+ |--| |--| |--|
+ /____\ /____\ /____\
+ 10.0.0.1 10.0.0.2 10.0.0.3
+
+ Figure 2: Operation of LSNAT, in coexistence with NAPT
+
+ Once again, notice that this requires no changes to clients or
+ servers. The translation is completely transparent to end nodes.
+ Address mapping on the LSNAT performs load sharing and address
+ translations for inbound sessions. Sessions outbound from hosts in
+ server pool are subject to NAPT. Both NAT and LSNAT co-exist with
+ each other in the same router.
+
+3.3. Load Sharing with no topological restraints on servers
+
+ In this section, we will illustrate a configuration in which load
+ sharing can be accomplished on a router without enforcing topological
+ limitations on servers. In this configuration, virtual server address
+ will be owned by the router that supports load sharing. I.e., virtual
+ server address will be same as address of one of the interfaces of
+ load share router. We will distinguish this configuration from LSNAT
+ by referring to this as "Load Share Network Address Port Translation"
+ (LS-NAPT). Routers that support the LS-NAPT configuration will be
+ termed "LS-NAPT routers", or simply LS-NAPTs.
+
+ In an LSNAT router, inbound TCP/UDP sessions, represented by the
+ tuple of (client address, client TU port, virtual server address,
+ service port) are translated into a tuple of (client address, client
+ TU port, selected server address, service port). Translation is
+ carried out on all datagrams pertaining to the same session, in
+ either direction. An LS-NAPT router, by contrast, would translate the
+ same session into a tuple of (virtual server address, virtual server
+ TU port, selected server address, service port). Notice that LS-NAPT router
+ translates the client address and TU port with the address and TU
+ port of virtual server, which is same as the address of one of its
+ interfaces. By doing this, datagrams from clients as well as servers
+ are forced to bear the address of LS-NAPT router as the destination
+ address, thereby guaranteeing that the datagrams would necessarily
+ traverse the LS-NAPT router. As a result, there is no need to require
+ servers to be under topological constraints.
+
+ Take for example, figure 1 in section 3.1. Let us say the router on
+ which load sharing is enabled is not just a border router, but can be
+ any kind of router. Let us also say that the virtual server address S
+ (172.87.0.100) is same as the address of WAN link and LS-NAPT is
+ enabled on the WAN interface. Figure 3 summarizes the new router
+ configuration.
+
+ When a client 198.76.29.7 initiates a HTTP session to the virtual
+ server address S (i.e., address of the WAN interface), the LS-NAPT
+ router examines load on hosts in server pool and selects a host, say
+ S1 to service the request. Appropriately, the destination address is
+ translated to be S1 (172.85.0.1). Further, original client address
+ and TU port are replaced with the address and TU port of the WAN
+ link. As a result, destination addresses as well as source address
+ and source TU port are translated when the packet reaches S1, as can
+ be noticed from the down-arrow path. IP packets on the return path go
+ through similar translation. The second client 198.23.47.2 initiating
+ telnet session to the same virtual server address S is load share
+ directed to S3. This packet once again undergoes LS-NAPT translation,
+ just as with the first client. The data path and translations can be
+ noticed following the colon line. The procedure continues for any
+ number of sessions the same way. The translations made to datagrams
+ in either direction are completely transparent to end nodes.
+
+ \ | /
+ +---------------+
+ | Router |
+ +---------------+
+ WAN |
+ |
+ |
+ {s=198.76.29.7, 2745, v | {s=198.23.47.2, 3200,
+ d=198.76.28.4, 80 }v | 198.76.28.4 :d=198.76.28.4, 23 }
+ v +----------------+ :
+ v | A Router with | :
+ v | LS-NAPT enabled| :
+ v | on WAN link | :
+ v +----------------+ :
+ v | :
+ v LAN | :
+ ------v---------------------:------
+ {s=198.76.28.4, 7001, v| | |:{s=198.76.28.4,7002,
+ d=172.85.0.1, 80 } | | | d=172.85.0.3, 23 }
+ +--+ +--+ +--+
+ |S1| |S2| |S3|
+ |--| |--| |--|
+ /____\ /____\ /____\
+ 172.85.0.1 172.85.0.2 172.85.0.3
+
+ Figure 3: LS-NAPT configuration on a router
+
+ As you will notice, datagrams from clients as well as servers are
+ forced to be directed to the router, because they use WAN interface
+ address of router as the destination address in their datagrams. With
+ the assurance that all packets from clients and servers would
+ traverse the router, there is no longer a requirement for servers to
+ be confined to a stub domain and for LSNAT to be enabled only on
+ border router to the stub domain.
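+
+ The double translation of Figure 3 can be sketched as follows. The
+ LsNapt class, its method names, the round-robin selection and the
+ starting port 7001 are all illustrative assumptions; the figure only
+ shows that each session receives its own TU port on the router
+ address.
+

```python
import itertools


class LsNapt:
    """Rewrites both ends of an inbound session (LS-NAPT)."""

    def __init__(self, router_addr: str, pool: list, first_port: int = 7001):
        self.router_addr = router_addr      # also the virtual server address
        self.pool = list(pool)
        self.ports = itertools.count(first_port)
        self.next_index = 0
        self.by_client = {}  # (client, client port, service) -> (router port, server)
        self.by_port = {}    # (router port, server, service) -> (client, client port)

    def inbound(self, src, sport, dst, dport):
        """Client to server: replace source and destination."""
        if dst != self.router_addr:
            return (src, sport, dst, dport)
        key = (src, sport, dport)
        if key not in self.by_client:
            # Placeholder selection: plain round robin over the pool.
            server = self.pool[self.next_index % len(self.pool)]
            self.next_index += 1
            router_port = next(self.ports)
            self.by_client[key] = (router_port, server)
            self.by_port[(router_port, server, dport)] = (src, sport)
        router_port, server = self.by_client[key]
        return (self.router_addr, router_port, server, dport)

    def outbound(self, src, sport, dst, dport):
        """Server to router: restore the client as destination."""
        client, client_port = self.by_port[(dport, src, sport)]
        return (self.router_addr, sport, client, client_port)


napt = LsNapt("198.76.28.4", ["172.85.0.1", "172.85.0.2", "172.85.0.3"])
to_server = napt.inbound("198.76.29.7", 2745, "198.76.28.4", 80)
to_client = napt.outbound("172.85.0.1", 80, "198.76.28.4", 7001)
```
+
+ Because the server sees the router address as the source, its replies
+ are addressed to the router no matter where the server is located.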
+
+ The LS-NAPT configuration described in this section involves more
+ translations and hence is more complex compared to LSNAT
+ configurations described in the previous sections. While the
+ processing is complex, there are benefits to this configuration.
+ Firstly, it breaks down restraints on server topology. Secondly, it
+ scales with bandwidth expansion for client access. Even if Service
+ providers have one link today for client access, the LS-NAPT
+ configuration allows them to expand to more links in the future
+ guaranteeing the same LS-NAPT load share service on newer links.
+
+ The configuration is not without its limitations. Server applications
+ (such as telnet) on the router box would have to be disabled for the
+ interface address assigned to be virtual server address. Load sharing
+ would be limited to TCP and UDP applications only. Maximum
+ concurrently allowed sessions would be limited by the maximum allowed
+ TCP/UDP client ports on the same address. Assuming that ports 0-1023
+ must be set aside as well-known service ports, that would leave a
+ maximum of 63K TCP client ports and 63K of UDP client ports on the
+ LS-NAPT router to communicate with each load-share server. As a
+ result, LS-NAPT routers will not be able to concurrently support more
+ than a maximum of (63K * count of Load-share servers) TCP sessions
+ and (63K * count of Load-share servers) UDP sessions.
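+
+ The bound works out as follows, reading "63K" as 63 * 1024 = 64512,
+ the number of ports left once 0-1023 are set aside. The function name
+ is hypothetical.
+

```python
TOTAL_TU_PORTS = 65536   # ports 0-65535
WELL_KNOWN_PORTS = 1024  # ports 0-1023, set aside for services


def max_sessions_per_protocol(server_count: int) -> int:
    """Upper bound on concurrent TCP (or UDP) sessions through LS-NAPT."""
    client_ports = TOTAL_TU_PORTS - WELL_KNOWN_PORTS  # 64512, i.e. 63K
    return client_ports * server_count
```
+
+ With the three servers of Figure 3, this gives at most 193536
+ concurrent TCP sessions and the same number of UDP sessions.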
+
+4.0. Translation phases of a session in LSNAT router.
+
+ As with NATs, LSNATs must monitor the following three phases in
+ relation to Address translation.
+
+4.1. Session binding:
+
+ Session binding is the phase in which an incoming session is
+ associated with the address of a host in server pool. This
+ association essentially sets the translation parameters for all
+ subsequent datagrams pertaining to the session. For addresses that
+ have static mapping, the binding happens at startup time. Otherwise,
+ each incoming session is dynamically bound to a different host based
+ on a load sharing algorithm.
+
+4.2. Address lookup and translation:
+
+ Once session binding is established for a connection setup, all
+ subsequent packets belonging to the same connection will be subject
+ to session lookup for translation purposes.
+
+ For outbound packets of a session, the source IP address (and source
+ TU port, in case of TCP/UDP sessions) and related fields (such as IP,
+ TCP, UDP and ICMP header checksums) will undergo translation. For
+ inbound packets of a session, the destination IP address (and
+ destination TU port, in case of TCP/UDP sessions) and related fields
+ (such as IP, TCP, UDP and ICMP header checksums) will undergo
+ translation.
+
+ The header and payload modifications made to IP datagrams subject to
+ LSNAT will be exactly same as those subject to traditional NATs,
+ described in section 5.0 of REF [1]. Hence, the reader is urged to
+ refer to REF [1] for the packet translation process.
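+
+ As an illustration of why a checksum need not be recomputed from
+ scratch when an address is rewritten, the sketch below applies the
+ incremental update HC' = ~(~HC + ~m + m') described in RFC 1624. It
+ is offered as one well-known technique, not as the procedure given in
+ REF [1]; all function names and the sample header values are
+ hypothetical.
+

```python
import ipaddress


def fold(total: int) -> int:
    """Fold carries back in, giving a 16-bit one's complement sum."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(words: list) -> int:
    """Internet checksum over a list of 16-bit words."""
    return ~fold(sum(words)) & 0xFFFF


def adjust_checksum(old_checksum: int, old_words: list, new_words: list) -> int:
    """Update a checksum after old_words were replaced by new_words."""
    total = ~old_checksum & 0xFFFF
    for old, new in zip(old_words, new_words):
        total = fold(total + (~old & 0xFFFF) + new)
    return ~total & 0xFFFF


def addr_words(addr: str) -> list:
    """Split an IPv4 address into its two 16-bit words."""
    value = int(ipaddress.IPv4Address(addr))
    return [value >> 16, value & 0xFFFF]


# Sample IPv4 header words (checksum field zeroed) plus the source address.
head = [0x4500, 0x003C, 0x1C46, 0x4000, 0x4006, 0x0000] + addr_words("198.76.29.7")
old_dst = addr_words("172.87.0.100")  # virtual server address
new_dst = addr_words("172.85.0.1")    # selected server S1
adjusted = adjust_checksum(checksum(head + old_dst), old_dst, new_dst)
```
+
+ The adjusted value equals the checksum computed over the whole header
+ with the new destination address in place.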
+
+4.3. Session unbinding:
+
+ Session unbinding is the phase in which a server node is no longer
+ responsible for the session. Usually, session unbinding happens when
+ the end of session is detected. As described in the terminology
+ section, it is not always easy to determine end of session.
+
+5. Load share algorithms
+
+ Many algorithms are available to select a host from a pool of servers
+ to service a new session. The load distribution is based primarily on
+ (a) cost of accessing the network on which a server resides and load
+ on the network interface used to access the server, and (b) resource
+ availability and system load on the server. A variety of policies can
+ be adopted to distribute sessions across the servers in a server
+ pool.
+
+ For simplicity, we will consider two types of algorithms, based on
+ proximity between server nodes and LSNAT router. The higher the cost
+ of access to a server, the farther away the server is assumed
+ to be. The first kind of algorithms will assume that all server pool
+ members are at equal or nearly equal proximity to LSNAT router and
+ hence the load distribution can be based solely on resource
+ availability or system load on remote servers. Cost of network access
+ will be considered irrelevant. The second kind would assume that all
+ server pool members have equal resource availability and the criteria
+ for selection would be proximity to servers. In other words, we
+ consider algorithms which take into account the cost of network
+ access.
+
+5.1. Local Load share algorithms
+
+ Ideally speaking, the selection process would have precise knowledge
+ of real-time resource availability and system load for each host in
+ server pool, so that the selection of host with maximum unutilized
+ capacity would be the obvious choice. However, this is not so easy to
+ achieve.
+
+ We consider here two kinds of heuristic approaches to monitor session
+ load on server pool members. The first kind is where the load share
+ selector tracks system load on individual servers in non-intrusive
+ way. The second kind is where the individual members actively
+ participate in communicating with the load share selector, notifying
+ the selector of their load capacity.
+
+ Listed below are the most common selection algorithms adopted in the
+ non-intrusive category.
+
+ 1. Round-Robin algorithm
+ This is the simplest scheme, where a host is selected simply on a
+ round robin basis, without regard to load on the host.
+
+ 2. Least Load first algorithm
+ This is an improvement over the round-robin approach, in that the
+ host with the least number of sessions bound to it is selected to
+ service a new session. This approach is not without its caveats:
+ each session is assumed to consume as many resources as any other
+ session, independent of the type of service the session represents,
+ and all hosts in the server pool are assumed to be equally
+ resourceful.
+
+ 3. Least traffic first algorithm
+ A further improvement over the previous algorithm would be to
+ measure system load by tracking packet count or byte count
+ directed from or to each of the member hosts over a period of
+ time. Although packet count is not the same as system load, it is
+ a reasonable approximation.
+
+ 4. Least Weighted Load first approach
+ This would be an enhancement to the first two. This would allow
+ administrators to assign (a) weights to sessions, based on likely
+ resource consumption estimates of session types and (b) weights to
+ hosts based on resource availability.
+
+ For each server, the sum of the weights of all sessions assigned to
+ it, divided by the weight of the server, would be evaluated, and the
+ server with the least weighted load is selected for each new session.
+ Say, FTP sessions are assigned 5 times the weight (5x) of a telnet
+ session (x), and server S3 is assumed to be 3 times as resourceful
+ as server S1. Let us also say that S1 is assigned 1 FTP session
+ and 1 telnet session, whereas S3 is assigned 2 FTP sessions and 5
+ telnet sessions. When a new telnet session needs assignment, the
+ weighted load on S3 is evaluated to be (2*5x+5*x)/3 = 5x, and the
+ load on S1 is evaluated to be (1*5x+1*x)/1 = 6x. Server S3 is
+ selected to bind the new telnet session, as the weighted load on
+ S3 is smaller than that of S1.
+
+ 5. Ping to find the most responsive host.
+ Until now, the capacity of a member host has been determined
+ exclusively by the LSNAT using heuristic approaches. In reality, it
+ is impossible to predict system capacity remotely, without
+ interaction with member hosts. A prudent approach would be to
+ periodically ping member hosts and measure the response time to
+ determine how busy the hosts really are. The response time can then
+ be used in conjunction with the heuristics to select the host most
+ appropriate for the new session.
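
The non-intrusive selectors listed above can be sketched in a few lines. The following is a minimal illustration, not part of the original memo: the names Host, SESSION_WEIGHT, round_robin, least_load and least_weighted_load are hypothetical, and the weights reuse the worked example of algorithm 4 with x = 1.

```python
from dataclasses import dataclass, field
from itertools import cycle

# Hypothetical per-session-type weights (telnet = x = 1, FTP = 5x).
SESSION_WEIGHT = {"telnet": 1.0, "ftp": 5.0}


@dataclass
class Host:
    name: str
    weight: float = 1.0                           # relative resourcefulness
    sessions: list = field(default_factory=list)  # session types bound to host

    def weighted_load(self) -> float:
        """Sum of session weights divided by the host weight."""
        return sum(SESSION_WEIGHT[s] for s in self.sessions) / self.weight


def round_robin(hosts):
    """Algorithm 1: iterate over hosts in turn, ignoring load."""
    return cycle(hosts)


def least_load(hosts):
    """Algorithm 2: host with the fewest bound sessions."""
    return min(hosts, key=lambda h: len(h.sessions))


def least_weighted_load(hosts):
    """Algorithm 4: host with the smallest weighted load."""
    return min(hosts, key=Host.weighted_load)


# Worked example from algorithm 4: S3 is 3 times as resourceful as S1.
s1 = Host("S1", weight=1.0, sessions=["ftp", "telnet"])
s3 = Host("S3", weight=3.0, sessions=["ftp"] * 2 + ["telnet"] * 5)
chosen = least_weighted_load([s1, s3])
print(s1.weighted_load(), s3.weighted_load(), chosen.name)  # 6.0 5.0 S3
```

Note that least_load would pick S1 here (2 sessions against 7), which shows how the unweighted count can disagree with the weighted estimate.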
+
+ In the active category, we involve individual member hosts in the
+ resource utilization monitoring process. Agent software on each
+ node would notify the monitoring agent of resource availability.
+ Clearly, this would imply having an application program (one that
+ does not itself consume significant resources) run on each
+ member node. This strategy of involving member hosts in system load
+ monitoring is likely to yield the best results in the selection
+ process.
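
As a rough sketch of the active category, the selector below keeps the latest capacity report per host and ignores stale ones. The memo does not define a reporting protocol, so everything here is an assumption: the class LoadShareSelector, the report and select methods, the free-capacity scale of 0.0 to 1.0, and the 10-second staleness limit are all hypothetical.

```python
import time


class LoadShareSelector:
    """Picks the host reporting the most free capacity (hypothetical API)."""

    def __init__(self, max_age=10.0):
        self.max_age = max_age   # seconds before a report counts as stale
        self.reports = {}        # host name -> (free_capacity, timestamp)

    def report(self, host, free_capacity, now=None):
        """Record a report from a host agent; free_capacity is in [0.0, 1.0]."""
        stamp = time.monotonic() if now is None else now
        self.reports[host] = (free_capacity, stamp)

    def select(self, now=None):
        """Return the host with the most free capacity among fresh reports."""
        now = time.monotonic() if now is None else now
        fresh = {h: cap for h, (cap, ts) in self.reports.items()
                 if now - ts <= self.max_age}
        if not fresh:
            return None          # no usable reports; caller must fall back
        return max(fresh, key=fresh.get)


selector = LoadShareSelector(max_age=10.0)
selector.report("S1", 0.20, now=100.0)
selector.report("S2", 0.75, now=101.0)
selector.report("S3", 0.90, now=80.0)    # stale by the time of selection
best = selector.select(now=105.0)
print(best)  # S2
```

When no fresh report exists, a deployment would presumably fall back to one of the non-intrusive heuristics.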
+
+5.2. Distributed Load share algorithms
+
+ When server nodes are distributed geographically across different
+ areas and the cost of accessing them varies widely, the load share
+ selector could use that information in selecting a server to service
+ a new session. In order to do this, the load share selector would
+ need to consult the routing tables maintained by routing protocols
+ such as RIP and OSPF to find the cost of accessing a server.
+
+ All algorithms listed below are of the non-intrusive kind, where the
+ server nodes do not actively participate in notifying the load share
+ selector of their load capacity.
+
+ 1. Weighted Least Load first algorithm
+ The selection criteria would be based on (a) cost of access to
+ server, and (b) the number of sessions assigned to server. The
+ product of cost and session load for each server would be
+ evaluated to select the server with least weighted load for each
+ new session. Say, the cost of accessing server S1 is twice as much
+ as that of server S2. In that case, S1 will be assigned half as
+ much load as S2 during the distribution process, since the products
+ of cost and load are kept in balance. When a server is not
+ accessible due to network failure, the cost of access is set to
+ infinity and hence no further load can be assigned to that server.
+
+ 2. Weighted Least traffic first algorithm
+ An improvement over the previous algorithm would be
+ to measure network load by tracking packet count or byte
+ count directed from or to each of the member hosts over a
+ period of time. Although packet count is not the same as
+ system load, it is a reasonable approximation. So, the
+ product of cost and traffic load (over a fixed duration)
+ for each server would be evaluated to select the server
+ with least weighted traffic load for each new session.
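
The two distributed algorithms differ only in what is counted as load, so one selector can illustrate both. This is a minimal sketch under stated assumptions: the function name weighted_least_load, the dictionaries costs and loads, and the tie-break on lower cost are hypothetical; in practice the costs would come from the routing table.

```python
import math


def weighted_least_load(costs, loads):
    """Return the server minimizing cost * load, or None if none is reachable.

    costs: server -> cost of access (math.inf when unreachable)
    loads: server -> session count (algorithm 1) or traffic count (algorithm 2)
    Ties on the product are broken in favour of the cheaper server (assumption).
    """
    reachable = [s for s in costs if math.isfinite(costs[s])]
    if not reachable:
        return None
    return min(reachable, key=lambda s: (costs[s] * loads.get(s, 0), costs[s]))


# S1 costs twice as much as S2; S3 is unreachable (infinite cost).
costs = {"S1": 2.0, "S2": 1.0, "S3": math.inf}
loads = {"S1": 0, "S2": 0, "S3": 0}
for _ in range(30):
    server = weighted_least_load(costs, loads)
    loads[server] += 1
print(loads)  # {'S1': 10, 'S2': 20, 'S3': 0}
```

With these inputs the cheaper server ends up with twice the sessions of the costlier one, and the unreachable server receives none.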
+
+6. Dead host detection
+
+ As sessions are assigned to hosts, it is important to detect the
+ liveness of the hosts. Otherwise, sessions could simply be
+ black-holed into a dead host. Several heuristic approaches can be
+ adopted. Sending pings periodically would be one way to determine
+ liveness. Another approach would be to track datagrams originating
+ from a member host in response to new session assignments. If no
+ response is detected within a few seconds, declare the server dead
+ and do not assign new sessions to this host. The server can be
+ monitored again after a long pause (say, on the order of a few
+ minutes) by reassigning new sessions to it and monitoring response
+ times.
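
The response-tracking heuristic can be sketched as a small state tracker. This is only an illustration: the class HostMonitor, its method names, and the concrete values of 3 seconds and 180 seconds (standing in for "a few seconds" and "a few minutes") are assumptions, not values given in the memo.

```python
class HostMonitor:
    RESPONSE_TIMEOUT = 3.0   # assumed value for "a few seconds"
    RETRY_PAUSE = 180.0      # assumed value for "a few minutes"

    def __init__(self):
        self.pending = {}     # host -> time of oldest unanswered assignment
        self.dead_since = {}  # host -> time the host was declared dead

    def session_assigned(self, host, now):
        """Note that a new session was bound to host at time now."""
        self.pending.setdefault(host, now)

    def response_seen(self, host, now):
        """A datagram from host was observed; the host is alive."""
        self.pending.pop(host, None)
        self.dead_since.pop(host, None)

    def is_eligible(self, host, now):
        """True if new sessions may be assigned to host at time now."""
        started = self.pending.get(host)
        if (host not in self.dead_since and started is not None
                and now - started > self.RESPONSE_TIMEOUT):
            self.dead_since[host] = now          # declare the host dead
        if host in self.dead_since:
            if now - self.dead_since[host] >= self.RETRY_PAUSE:
                # Pause is over: clear state so a trial session can be assigned.
                del self.dead_since[host]
                self.pending.pop(host, None)
                return True
            return False
        return True


m = HostMonitor()
m.session_assigned("S1", now=0.0)
a = m.is_eligible("S1", now=1.0)     # still within the response timeout
b = m.is_eligible("S1", now=5.0)     # no response seen: declared dead
c = m.is_eligible("S1", now=100.0)   # still inside the retry pause
d = m.is_eligible("S1", now=185.0)   # pause elapsed: trial session allowed
m.session_assigned("S1", now=185.0)
m.response_seen("S1", now=185.5)
e = m.is_eligible("S1", now=190.0)   # host answered: alive again
print((a, b, c, d, e))  # (True, False, False, True, True)
```

A periodic ping, as mentioned above, could feed the same tracker by calling response_seen on each echo reply.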
+
+7. Miscellaneous
+
+ The IETF has been notified of potential Intellectual Property Rights
+ (IPR) issues with the technology described in this document.
+ Interested readers are requested to look on the IETF web page
+ (http://www.ietf.org) under the Intellectual Property Rights Notices
+ section for the current information.
+
+8. Security Considerations
+
+ All security considerations associated with NAT routers, described in
+ REF [1] are applicable to LSNAT routers as well.
+
+REFERENCES
+
+ [1] Egevang, K. and P. Francis, "The IP Network Address Translator
+ (NAT)", RFC 1631, May 1994.
+
+ [2] Reynolds, J., and J. Postel, "Assigned Numbers", STD 2, RFC 1700,
+ October 1994. See also: http://www.iana.org/numbers.html
+
+ [3] Braden, R., "Requirements for Internet Hosts -- Communication
+ Layers", STD 3, RFC 1122, October 1989.
+
+ [4] Braden, R., "Requirements for Internet Hosts -- Application and
+ Support", STD 3, RFC 1123, October 1989.
+
+ [5] Baker, F., "Requirements for IP Version 4 Routers", RFC 1812,
+ June 1995.
+
+ [6] Postel, J., and J. Reynolds, "File Transfer Protocol (FTP)", STD
+ 9, RFC 959, October 1985.
+
+ [7] Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
+ September 1981.
+
+ [8] Postel, J., "Internet Control Message Protocol (ICMP)", STD 5,
+ RFC 792, September 1981.
+
+ [9] Postel, J., "User Datagram Protocol (UDP)", STD 6, RFC 768,
+ August 1980.
+
+ [10] Mogul, J., and J. Postel, "Internet Standard Subnetting
+ Procedure", STD 5, RFC 950, August 1985.
+
+ [11] Brisco, T., "DNS Support for Load Balancing", RFC 1794, April
+ 1995.
+
+Authors' Addresses
+
+ Pyda Srisuresh
+ Lucent Technologies
+ 4464 Willow Road
+ Pleasanton, CA 94588-8519
+ U.S.A.
+
+ Voice: (925) 737-2153
+ Fax: (925) 737-2110
+ EMail: suresh@ra.lucent.com
+
+
+ Der-hwa Gan
+ Juniper Networks, Inc.
+ 385 Ravensdale Drive.
+ Mountain View, CA 94043
+ U.S.A.
+
+ Voice: (650) 526-8074
+ Fax: (650) 526-8001
+ EMail: dhg@juniper.net
+
+Full Copyright Statement
+
+ Copyright (C) The Internet Society (1998). All Rights Reserved.
+
+ This document and translations of it may be copied and furnished to
+ others, and derivative works that comment on or otherwise explain it
+ or assist in its implementation may be prepared, copied, published
+ and distributed, in whole or in part, without restriction of any
+ kind, provided that the above copyright notice and this paragraph are
+ included on all such copies and derivative works. However, this
+ document itself may not be modified in any way, such as by removing
+ the copyright notice or references to the Internet Society or other
+ Internet organizations, except as needed for the purpose of
+ developing Internet standards in which case the procedures for
+ copyrights defined in the Internet Standards process must be
+ followed, or as required to translate it into languages other than
+ English.
+
+ The limited permissions granted above are perpetual and will not be
+ revoked by the Internet Society or its successors or assigns.
+
+ This document and the information contained herein is provided on an
+ "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
+ TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
+ BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
+ HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
+ MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.