summaryrefslogtreecommitdiff
path: root/doc/rfc/rfc1045.txt
diff options
context:
space:
mode:
authorThomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committerThomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
treee3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc1045.txt
parentea76e11061bda059ae9f9ad130a9895cc85607db (diff)
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc1045.txt')
-rw-r--r--doc/rfc/rfc1045.txt7130
1 files changed, 7130 insertions, 0 deletions
diff --git a/doc/rfc/rfc1045.txt b/doc/rfc/rfc1045.txt
new file mode 100644
index 0000000..f42d07d
--- /dev/null
+++ b/doc/rfc/rfc1045.txt
@@ -0,0 +1,7130 @@
+
+
+Network Working Group David Cheriton
+Request for Comments: 1045 Stanford University
+ February 1988
+
+
+ VMTP: VERSATILE MESSAGE TRANSACTION PROTOCOL
+ Protocol Specification
+
+
+
+STATUS OF THIS MEMO
+
+This RFC describes a protocol proposed as a standard for the Internet
+community. Comments are encouraged. Distribution of this document is
+unlimited.
+
+
+OVERVIEW
+
+This memo specifies the Versatile Message Transaction Protocol (VMTP)
+[Version 0.7 of 19-Feb-88], a transport protocol specifically designed
+to support the transaction model of communication, as exemplified by
+remote procedure call (RPC). The full function of VMTP, including
+support for security, real-time, asynchronous message exchanges,
+streaming, multicast and idempotency, provides a rich selection to the
+VMTP user level. Subsettability allows the VMTP module for particular
+clients and servers to be specialized and simplified to the services
+actually required. Examples of such simple clients and servers include
+PROM network bootload programs, network boot servers, data sensors and
+simple controllers, to mention but a few examples.
+
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ Table of Contents
+
+1. Introduction 1
+
+ 1.1. Motivation 2
+ 1.1.1. Poor RPC Performance 2
+ 1.1.2. Weak Naming 3
+ 1.1.3. Function Poor 3
+ 1.2. Relation to Other Protocols 4
+ 1.3. Document Overview 5
+
+2. Protocol Overview 6
+
+ 2.1. Entities, Processes and Principals 7
+ 2.2. Entity Domains 9
+ 2.3. Message Transactions 10
+ 2.4. Request and Response Messages 11
+ 2.5. Reliability 12
+ 2.5.1. Transaction Identifiers 13
+ 2.5.2. Checksum 14
+ 2.5.3. Request and Response Acknowledgment 14
+ 2.5.4. Retransmissions 15
+ 2.5.5. Timeouts 15
+ 2.5.6. Rate Control 18
+ 2.6. Security 19
+ 2.7. Multicast 21
+ 2.8. Real-time Communication 22
+ 2.9. Forwarded Message Transactions 24
+ 2.10. VMTP Management 25
+ 2.11. Streamed Message Transactions 25
+ 2.12. Fault-Tolerant Applications 28
+ 2.13. Packet Groups 29
+ 2.14. Runs of Packet Groups 31
+ 2.15. Byte Order 32
+ 2.16. Minimal VMTP Implementation 33
+ 2.17. Message vs. Procedural Request Handling 33
+ 2.18. Bibliography 34
+
+3. VMTP Packet Formats 37
+
+ 3.1. Entity Identifier Format 37
+ 3.2. Packet Fields 38
+
+
+
+
+
+
+
+Cheriton [page i]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ 3.3. Request Packet 45
+ 3.4. Response Packet 47
+
+4. Client Protocol Operation 49
+
+ 4.1. Client State Record Fields 49
+ 4.2. Client Protocol States 51
+ 4.3. State Transition Diagrams 51
+ 4.4. User Interface 52
+ 4.5. Event Processing 53
+ 4.6. Client User-invoked Events 54
+ 4.6.1. Send 54
+ 4.6.2. GetResponse 56
+ 4.7. Packet Arrival 56
+ 4.7.1. Response 58
+ 4.8. Management Operations 61
+ 4.8.1. HandleNoCSR 62
+ 4.9. Timeouts 64
+
+5. Server Protocol Operation 66
+
+ 5.1. Remote Client State Record Fields 66
+ 5.2. Remote Client Protocol States 66
+ 5.3. State Transition Diagrams 67
+ 5.4. User Interface 69
+ 5.5. Event Processing 70
+ 5.6. Server User-invoked Events 71
+ 5.6.1. Receive 71
+ 5.6.2. Respond 72
+ 5.6.3. Forward 73
+ 5.6.4. Other Functions 74
+ 5.7. Request Packet Arrival 74
+ 5.8. Management Operations 78
+ 5.8.1. HandleRequestNoCSR 79
+ 5.9. Timeouts 82
+
+6. Concluding Remarks 84
+
+I. Standard VMTP Response Codes 85
+
+II. VMTP RPC Presentation Protocol 87
+
+
+
+
+
+
+
+
+Cheriton [page ii]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ II.1. Request Code Management 87
+
+III. VMTP Management Procedures 89
+
+ III.1. Entity Group Management 100
+ III.2. VMTP Management Digital Signatures 101
+
+IV. VMTP Entity Identifier Domains 102
+
+ IV.1. Domain 1 102
+ IV.2. Domain 3 104
+ IV.3. Other Domains 105
+ IV.4. Decentralized Entity Identifier Allocation 105
+
+V. Authentication Domains 107
+
+ V.1. Authentication Domain 1 107
+ V.2. Other Authentication Domains 107
+
+VI. IP Implementation 108
+
+VII. Implementation Notes 109
+
+ VII.1. Mapping Data Structures 109
+ VII.2. Client Data Structures 111
+ VII.3. Server Data Structures 111
+ VII.4. Packet Group transmission 112
+ VII.5. VMTP Management Module 113
+ VII.6. Timeout Handling 114
+ VII.7. Timeout Values 114
+ VII.8. Packet Reception 115
+ VII.9. Streaming 116
+ VII.10. Implementation Experience 117
+
+VIII. UNIX 4.3 BSD Kernel Interface for VMTP 118
+
+Index 120
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page iii]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ List of Figures
+
+ Figure 1-1: Relation to Other Protocols 4
+ Figure 3-1: Request Packet Format 45
+ Figure 3-2: Response Packet Format 47
+ Figure 4-1: Client State Transitions 52
+ Figure 5-1: Remote Client State Transitions 68
+ Figure III-1: Authenticator Format 92
+ Figure VII-1: Mapping Client Identifier to CSR 109
+ Figure VII-2: Mapping Server Identifiers 110
+ Figure VII-3: Mapping Group Identifiers 111
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page iv]
+
+RFC 1045 VMTP February 1988
+
+
+1. Introduction
+
+The Versatile Message Transaction Protocol (VMTP) is a transport
+protocol designed to support remote procedure call (RPC) and general
+transaction-oriented communication. By transaction-oriented
+communication, we mean that:
+
+ - Communication is request-response: A client sends a request
+ for a service to a server, the request is processed, and the
+ server responds. For example, a client may ask for the next
+ page of a file as the service. The transaction is terminated
+ by the server responding with the next page.
+
+ - A transaction is initiated as part of sending a request to a
+ server and terminated by the server responding. There are no
+ separate operations for setting up or terminating associations
+ between clients and servers at the transport level.
+
+ - The server is free to discard communication state about a
+ client between transactions without causing incorrect behavior
+ or failures.
+
+The term message transaction (or transaction) is used in the reminder of
+this document for a request-response exchange in the sense described
+above.
+
+VMTP handles the error detection, retransmission, duplicate suppression
+and, optionally, security required for transport-level end-to-end
+reliability.
+
+The protocol is designed to provide a range of behaviors within the
+transaction model, including:
+
+ - Minimal two packet exchanges for short, simple transactions.
+
+ - Streaming of multi-packet requests and responses for efficient
+ data transfer.
+
+ - Datagram and multicast communication as an extension of the
+ transaction model.
+
+Example Uses:
+
+ - Page-level file access - VMTP is intended as the transport
+ level for file access, allowing simple, efficient operation on
+ a local network. In particular, VMTP is appropriate for use
+ by diskless workstations accessing shared network file
+
+
+Cheriton [page 1]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ servers.
+
+ - Distributed programming - VMTP is intended to provide an
+ efficient transport level protocol for remote procedure call
+ implementations, distributed object-oriented systems plus
+ message-based systems that conform to the request-response
+ model.
+
+ - Multicast communication with groups of servers to: locate a
+ specific object within the group, update a replicated object,
+ synchronize the commitment of a distributed transaction, etc.
+
+ - Distributed real-time control with prioritized message
+ handling, including datagrams, multicast and asynchronous
+ calls.
+
+The protocol is designed to operate on top of a simple unreliable
+datagram service, such as is provided by IP.
+
+
+1.1. Motivation
+
+VMTP was designed to address three categories of deficiencies with
+existing transport protocols in the Internet architecture. We use TCP
+as the key current transport protocol for comparison.
+
+
+1.1.1. Poor RPC Performance
+
+First, current protocols provide poor performance for remote procedure
+call (RPC) and network file access. This is attributable to three key
+causes:
+
+ - TCP requires excessive packets for RPC, especially for
+ isolated calls. In particular, connection setup and clear
+ generates extra packets over that needed for VMTP to support
+ RPC.
+
+ - TCP is difficult to implement, speaking purely from the
+ empirical experience over the last 10 years. VMTP was
+ designed concurrently with its implementation, with focus on
+ making it easy to implement and providing sensible subsets of
+ its functionality.
+
+ - TCP handles packet loss due to overruns poorly. We claim that
+ overruns are the key source of packet loss in a
+ high-performance RPC environment and, with the increasing
+
+
+Cheriton [page 2]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ performance of networks, will continue to be the key source.
+ (Older machines and network interfaces cannot keep up with new
+ machines and network interfaces. Also, low-end network
+ interfaces for high-speed networks have limited receive
+ buffering.)
+
+VMTP is designed for ease of implementation and efficient RPC. In
+addition, it provides selective retransmission with rate-based flow
+control, thus addressing all of the above issues.
+
+
+1.1.2. Weak Naming
+
+Second, current protocols provide inadequate naming of transport-level
+endpoints because the names are based on IP addresses. For example, a
+TCP endpoint is named by an Internet address and port identifier.
+Unfortunately, this makes the endpoint tied to a particular host
+interface, not specifically the process-level state associated with the
+transport-level endpoint. In particular, this form of naming causes
+problems for process migration, mobile hosts and multi-homed hosts.
+VMTP provides host-address independent names, thereby solving the above
+mentioned problems.
+
+In addition, TCP provides no security and reliability guarantees on the
+dynamically allocated names. In particular, other than well-known
+ports, (host-addr, port-id)-tuples can change meaning on reboot
+following a crash. VMTP provides large identifiers with guarantee of
+stability, meaning that either the identifier never changes in meaning
+or else remains invalid for a significant time before becoming valid
+again.
+
+
+1.1.3. Function Poor
+
+TCP does not support multicast, real-time datagrams or security. In
+fact, it only supports pair-wise, long-term, streamed reliable
+interchanges. Yet, multicast is of growing importance and is being
+developed for the Internet (see RFC 966 and 988). Also, a datagram
+facility with the same naming, transmission and reception facilities as
+the normal transport level is a powerful asset for real-time and
+parallel applications. Finally, security is a basic requirement in an
+increasing number of environments. We note that security is natural to
+implement at the transport level to provide end-to-end security (as
+opposed to (inter)network level security). Without security at the
+transport level, a transport level protocol cannot guarantee the
+standard transport level service definition in the presence of an
+intruder. In particular, the intruder can interject packets or modify
+
+
+Cheriton [page 3]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+packets while updating the checksum, making mockery out of the
+transport-level claim of "reliable delivery".
+
+In contrast, VMTP provides multicast, real-time datagrams and security,
+addressing precisely these weaknesses.
+
+In general, VMTP is designed with the next generation of communication
+systems in mind. These communication systems are characterized as
+follows. RPC, page-level file access and other request-response
+behavior dominates. In addition, the communication substrate, both
+local and wide-area, provides high data rates, low error rates and
+relatively low delay. Finally, intelligent, high-performance network
+interfaces are common and in fact required to achieve performance that
+approximates the network capability. However, VMTP is also designed to
+function acceptably with existing networks and network interfaces.
+
+
+1.2. Relation to Other Protocols
+
+VMTP is a transport protocol that fits into the layered Internet
+protocol environment. Figure 1-1 illustrates the place of VMTP in the
+protocol hierarchy.
+
+
+ +-----------+ +----+ +-----------------+ +------+
+ |File Access| |Time| |Program Execution| |Naming|... Application
+ +-----------+ +----+ +-----------------+ +------+ Layer
+ | | | | |
+ +-----------+-----------+-------------+------+
+ |
+ +------------------+
+ | RPC Presentation | Presentation
+ +------------------+ Layer
+ |
+ +------+ +--------+
+ | TCP | | VMTP | Transport
+ +------+ +--------+ Layer
+ | |
+ +-----------------------------------+
+ | Internet Protocol & ICMP | Internetwork
+ +-----------------------------------+ Layer
+
+ Figure 1-1: Relation to Other Protocols
+
+The RPC presentation level is not currently defined in the Internet
+suite of protocols. Appendix II defines a proposed RPC presentation
+level for use with VMTP and assumed for the definition of the VMTP
+management procedures. There is also a need for the definition of the
+
+
+Cheriton [page 4]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+Application layer protocols listed above.
+
+If internetwork services are not required, VMTP can be used without the
+IP layer, layered directly on top of the network or data link layers.
+
+
+1.3. Document Overview
+
+The next chapter gives an overview of the protocol, covering naming,
+message structure, reliability, flow control, streaming, real-time,
+security, byte-ordering and management. Chapter 3 describes the VMTP
+packet formats. Chapter 4 describes the client VMTP protocol operation
+in terms of pseudo-code for event handling. Chapter 5 describes the
+server VMTP protocol operation in terms of pseudo-code for event
+handling. Chapter 6 summarizes the state of the protocol, some
+remaining issues and expected directions for the future. Appendix I
+lists some standard Response codes. Appendix II describes the RPC
+presentation protocol proposed for VMTP and used with the VMTP
+management procedures. Appendix III lists the VMTP management
+procedures. Appendix IV proposes initial approaches for handling entity
+identification for VMTP. Appendix V proposes initial authentication
+domains for VMTP. Appendix VI provides some details for implementing
+VMTP on top of IP. Appendix VII provides some suggestions on host
+implementation of VMTP, focusing on data structures and support
+functions. Appendix VIII describes a proposed program interface for
+UNIX 4.3 BSD and its descendants and related systems.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 5]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+2. Protocol Overview
+
+VMTP provides an efficient, reliable, optionally secure transport
+service in the message transaction or request-response model with the
+following features:
+
+ - Host address-independent naming with provision for multiple
+ forms of names for endpoints as well as associated (security)
+ principals. (See Sections 2.1, 2.2, 3.1 and Appendix IV.)
+
+ - Multi-packet request and response messages, with a maximum
+ size of 4 megaoctets per message. (Sections 2.3 and 2.14.)
+
+ - Selective retransmission. (Section 2.13.) and rate-based flow
+ control to reduce overrun and the cost of overruns. (Section
+ 2.5.6.)
+
+ - Secure message transactions with provision for a variety of
+ encryption schemes. (Section 2.6.)
+
+ - Multicast message transactions with multiple response messages
+ per request message. (Section 2.7.)
+
+ - Support for real-time communication with idempotent message
+ transactions with minimal server overhead and state (Section
+ 2.5.3), datagram request message transactions with no
+ response, optional header-only checksum, priority processing
+ of transactions, conditional delivery and preemptive handling
+ of requests (Section 2.8)
+
+ - Forwarded message transactions as an optimization for certain
+ forms of nested remote procedure calls or message
+ transactions. (Section 2.9.)
+
+ - Multiple outstanding (asynchronous) message transactions per
+ client. (Section 2.11.)
+
+ - An integrated management module, defined with a remote
+ procedure call interface on top of VMTP providing a variety of
+ communication services (Section 2.10.)
+
+ - Simple subset implementation for simple clients and simple
+ servers. (Section 2.16.)
+
+This chapter provides an overview of the protocol as introduction to the
+basic ideas and as preparation for the subsequent chapters that describe
+the packet formats and event processing procedures in detail.
+
+
+Cheriton [page 6]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+In overview, VMTP provides transport communication between network-
+visible entities via message transactions. A message transaction
+consists of a request message sent by the client, or requestor, to a
+group of server entities followed by zero or more response messages to
+the client, at most one from each server entity. A message is
+structured as a message control portion and a segment data portion. A
+message is transmitted as one or more packet groups. A packet group is
+one or more packets (up to a maximum of 32 packets) grouped by the
+protocol for acknowledgment, sequencing, selective retransmission and
+rate control.
+
+Entities and VMTP operations are managed using a VMTP management
+mechanism that is accessed through a procedural interface (RPC)
+implemented on top of VMTP. In particular, information about a remote
+entity is obtained and maintained using the Probe VMTP management
+operation. Also, acknowledgment information and requests for
+retransmission are sent as notify requests to the management module.
+(In the following description, reference to an "acknowledgment" of a
+request or a response refers to a management-level notify operation that
+is acknowledging the request or response.)
+
+
+2.1. Entities, Processes and Principals
+
+VMTP defines and uses three main types of identifiers: entity
+identifiers, process identifiers and principal identifiers, each 64-bits
+in length. Communication takes place between network-visible entities,
+typically mapping to, or representing, a message port or procedure
+invocation. Thus, entities are the VMTP communication endpoints. The
+process associated with each entity designates the agent behind the
+communication activity for purposes of resource allocation and
+management. For example, when a lock is requested on a file, the lock
+is associated with the process, not the requesting entity, allowing a
+process to use multiple entity identifiers to perform operations without
+lock conflict between these entities. The principal associated with an
+entity specifies the permissions, security and accounting designation
+associated with the entity. The process and principal identifiers are
+included in VMTP solely to make these values available to VMTP users
+with the security and efficiency provided by VMTP. Only the entity
+identifiers are actively used by the protocol.
+
+Entity identifiers are required to have three properties;
+
+Uniqueness Each entity identifier is uniquely defined at any given
+ time. (An entity identifier may be reused over time.)
+
+Stability An entity identifier does not change between valid
+
+
+Cheriton [page 7]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ meanings without suitable provision for removing
+ references to the entity identifier. Certain entity
+ identifiers are strictly stable, (i.e. never changing
+ meaning), typically being administratively assigned
+ (although they need not be bound to a valid entity at
+ all times), often called well-known identifiers. All
+ other entity identifiers are required to be T-stable,
+ not change meaning without having remained invalid for
+ at least a time interval T.
+
+Host address independent
+ An entity identifier is unique independent of the host
+ address of its current host. Moreover, an entity
+ identifier is not tied to a single Internet host
+ address. An entity can migrate between hosts, reside on
+ a mobile host that changes Internet addresses or reside
+ on a multi-homed host. It is up to the VMTP
+ implementation to determine and maintain up to date the
+ host addresses of entities with which it is
+ communicating.
+
+The stability of entity identifiers guarantees that an entity identifier
+represents the same logical communication entity and principal (in the
+security sense) over the time that it is valid. For example, if an
+entity identifier is authenticated as having the privileges of a given
+user account, it continues to have those privileges as long as it is
+continuously valid (unless some explicit notice is provided otherwise).
+Thus, a file server need not fully authenticate the entity on every file
+access request. With T-stable identifiers, periodically checking the
+validity of an entity identifier with period less than T seconds detects
+a change in entity identifier validity.
+
+A group of entities can form an entity group, which is a set of zero or
+more entities identified by a single entity identifier. For example,
+one can have a single entity identifier that identifies the group of
+name servers. An entity identifier representing an entity group is
+drawn from the same name space as entity identifiers. However, single
+entity identifiers are flagged as such by a bit in the entity
+identifier, indicating that the identifier is known to identify at most
+one entity. In addition to the group bit, each entity identifier
+includes other standard type flags. One flag indicates whether the
+identifier is an alias for an entity in another domain (See Section 2.2
+below.). Another flag indicates, for an entity group identifier,
+whether the identifier is a restricted group or not. A restricted group
+is one in which an entity can be added only by another entity with group
+management authorization. With an unrestricted group, an entity is
+allowed to add itself. If an entity identifier does not represent a
+
+
+Cheriton [page 8]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+group, a type bit indicates whether the entity uses big-endian or
+little-endian data representation (corresponding to Motorola 680X0 and
+VAX byte orders, respectively). Further specification of the format of
+entity identifiers is contained in Section 3.1 and Appendix IV.
+
+An entity identifier identifies a Client, a Server or a group of
+Servers <1>. A Client is always identified by a T-stable identifier. A
+server or group of servers may be identified by a a T-stable identifier
+(group or single entity) or by strictly stable (statically assigned)
+entity group identifier. The same T-stable identifier can be used to
+identify a Client and Server simultaneously as long as both are
+logically associated with the same entity. The state required for
+reliable, secure communication between entities is maintained in client
+state records (CSRs), which include the entity identifier of the Client,
+its principal, its current or next transaction identifier and so on.
+
+
+2.2. Entity Domains
+
+An entity domain is an administration or an administration mechanism
+that guarantees the three required entity identifier properties of
+uniqueness, stability and host address independence for the entities it
+administers. That is, entity identifiers are only guaranteed to be
+unique and stable within one entity domain. For example, the set of all
+Internet hosts may function as one domain. Independently, the set of
+hosts local to one autonomous network may function as a separate domain.
+Each entity domain is identified by an entity domain identifier, Domain.
+Only entities within the same domain may communicate directly via VMTP.
+However, hosts and entities may participate in multiple entity domains
+simultaneously, possibly with different entity identifiers. For
+example, a file server may participate in multiple entity domains in
+order to provide file service to each domain. Each entity domain
+specifies the algorithms for allocation, interpretation and mapping of
+entity identifiers.
+
+Domains are necessary because it does not appear feasible to specify one
+universal VMTP entity identification administration that covers all
+entities for all time. Domains limit the number of entities that need
+to be managed to maintain the uniqueness and stability of the entity
+
+_______________
+
+<1> Terms such as Client, Server, Request, Response, etc. are
+capitalized in this document when they refer to their specific meaning
+in VMTP.
+
+
+Cheriton [page 9]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+name space. Domains can also serve to separate entities of different
+security levels. For instance, allocation of a unclassified entity
+identifier cannot conflict with secret level entity identifiers because
+the former is interpreted only in the unclassified domain, which is
+disjoint from the secret domain.
+
+It is intended that there be a small number of domains. In particular,
+there should be one (or a few) domains per installation "type", rather
+than per installation. For example, the Internet is expected to use one
+domain per security level, resulting in at most 8 different domains.
+Cluster-based internetwork architectures, those with a local cluster
+protocol distinct from the wide-area protocol, may use one domain for
+local use and one for wide-area use.
+
+Additional details on the specification of specific domains is provided
+in Appendix IV.
+
+
+2.3. Message Transactions
+
+The message transaction is the unit of interaction between a Client that
+initiates the transaction and one or more Servers. A message
+transaction starts with a request message generated by a client. At
+the service interface, a server becomes involved with a transaction by
+receiving and accepting the request. A server terminates its
+involvement with a transaction by sending a response message. In a
+group message transaction, the server entity designated by the client
+corresponds to a group of entities. In this case, each server in the
+group receives a copy of the request. In the client's view, the
+transaction is terminated when it receives the response message or, in
+the case of a group message transaction, when it receives the last
+response message. Because it is normally impractical to determine when
+the last response message has been received. the current transaction is
+terminated by VMTP when the next transaction is initiated.
+
+Within an entity domain, a transaction is uniquely identified by the
+tuple (Client, Transaction, ForwardCount). where Transaction is a
+32-bit number and ForwardCount is a 4-bit value. A Client uses
+monotonically increasing Transaction identifiers for new message
+transactions. Normally, the next higher transaction number, modulo
+2**32, is used for the next message transaction, although there are
+cases in which it skips a small range of Transaction identifiers. (See
+the description of the STI control flag.) The ForwardCount is used when
+a message transaction is forwarded and is zero otherwise.
+
+A Client generates a stream of message transactions with increasing
+transaction identifiers, directed at a diversity of Servers. We say a
+
+
+Cheriton [page 10]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+Client has a transaction outstanding if it has invoked a message
+transaction, but has not received the last Response (or possibly any
+Response). Normally, a Client has only one transaction outstanding at a
+time. However, VMTP allows a Client to have multiple message
+transactions outstanding simultaneously, supporting streamed,
+asynchronous remote procedure call invocations. In addition, VMTP
+supports nested calls where, for example, procedure A calls procedure B
+which calls procedure C, each on a separate host with different client
+entity identifiers for each call but identified with the same process
+and principal.
+
+
+2.4. Request and Response Messages
+
+A message transaction consists of a request message and one or more
+Response messages. A message is structured as message control block
+(MCB) and segment data, passed as parameters, as suggested below.
+
+ +-----------------------+
+ | Message Control Block |
+ +-----------------------+
+ +-----------------------------------+
+ | segment data |
+ +-----------------------------------+
+
+In the request message, the MCB specifies control information about the
+request plus an optional data segment. The MCB has the following
+format:
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ + ServerEntityId (8 octets) +
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Flags | RequestCode |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ + CoresidentEntity (8 octets) +
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ > User Data (12 octets) <
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | MsgDelivery |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | SegmentSize |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+The ServerEntityId is the entity to which the Request MCB is to be sent
+(or was sent, in the case of reception). The Flags indicate various
+options in the request and response handling as well as whether the
+
+
+Cheriton [page 11]
+
+
+RFC 1045 VMTP February 1988
+
+
+CoresidentEntity, MsgDelivery and SegmentSize fields are in use. The
+RequestCode field specifies the type of Request. It is analogous to a
+packet type field of the Ethernet, acting as a switch for higher-level
+protocols. The CoresidentEntity field, if used, designates a subgroup
+of the ServerEntityId group to which the Request should be routed,
+namely those members that are co-resident with the specified entity (or
+entity group). The primary intended use is to specify the manager for a
+particular service that is co-resident with a particular entity, using
+the well-known entity group identifier for the service manager in the
+ServerEntityId field and the identifier for the entity in the
+CoresidentEntity field. The next 12 octets are user- or
+application-specified.
+
+The MsgDelivery field is optionally used by the RPC or user level to
+specify the portions of the segment data to transmit and on reception,
+the portions received. It provides the client and server with
+(optional) access to, and responsibility for, a simple selective
+transmission and reception facility. For example, a client may request
+retransmission of just those portions of the segment that it failed to
+receive as part of the original Response. The primary intended use is
+to support highly efficient multi-packet reading from a file server.
+Exploiting user-level selective retransmission using the MsgDelivery
+field, the file server VMTP module need not save multi-packet Responses
+for retransmission. Retransmissions, when needed, are instead handled
+directly from the file server buffers.
+
+The SegmentSize field indicates the size of the data segment, if
+present. The CoresidentEntity, MsgDelivery and SegmentSize fields are
+usable as additional user data if they are not otherwise used.
+
+The Flags field provides a simple mechanism for the user level to
+communicate its use of VMTP options with the VMTP module as well as for
+VMTP modules to communicate this use among themselves. The use of these
+options is generally fixed for each remote procedure so that an RPC
+mechanism using VMTP can treat the Flags as an integral part of the
+RequestCode field for the purpose of demultiplexing to the correct stub.
+
+A Response message control block follows the same format except the
+Response is sent from the Server to the Client and there is no
+Coresident Entity field (and thus 20 octets of user data).
+
+
+2.5. Reliability
+
+VMTP provides reliable, sequenced transfer of request and response
+messages as well as several variants, such as unreliable datagram
+requests. The reliability mechanisms include: transaction identifiers,
+
+
+Cheriton [page 12]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+checksums, positive acknowledgment of messages and timeout and
+retransmission of lost packets.
+
+
+2.5.1. Transaction Identifiers
+
+Each message transaction is uniquely identified by the pair (Client,
+Transaction). (We defer discussion of the ForwardCount field to Section
+2.9.) The 32-bit transaction identifier is initialized to a random
+value when the Client entity is created or allocated its entity
+identifier. The transaction identifier is incremented at the end of
+each message transaction. All Responses with the same specified
+(Client, Transaction) pair are associated with this Request.
+
+The transaction identifier is used for duplicate suppression at the
+Server. A Server maintains a state record for each Client for which it
+is processing a Request, identified by (Client, Transaction). A Request
+with the same (Client, Transaction) pair is discarded as a duplicate.
+(The ForwardCount field must also be equal.) Normally, this record is
+retained for some period after the Response is sent, allowing the Server
+to filter out subsequent duplicates of this Request. When a Request
+arrives and the Server does not have a state record for the sending
+Client, the Server takes one of three actions:
+
+ 1. The Server may send a Probe request, a simple query
+ operation, to the VMTP management module associated with the
+ requesting Client to determine the Client's current
+ Transaction identifier (and other information), initialize a
+ new state record from this information, and then process the
+ Request as above.
+
+ 2. The Server may reason that the Request must be a new request
+ because it does not have a state record for this Client if it
+ keeps these state records for the maximum packet lifetime of
+ packets in the network (plus the maximum VMTP retransmission
+ time) and it has not been rebooted within this time period.
+ That is, if the Request is not new either the Request would
+ have exceeded the maximum packet lifetime or else the Server
+ would have a state record for the Client.
+
+ 3. The Server may know that the Request is idempotent or can be
+ safely redone so it need not care whether the Request is a
+ duplicate or not. For example, a request for the current
+ time can be responded to with the current time without being
+ concerned whether the Request is a duplicate. The Response
+ is discarded at the Client if it is no longer of interest.
+
+
+
+Cheriton [page 13]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+2.5.2. Checksum
+
+Each VMTP packet contains a checksum to allow the receiver to detect
+corrupted packets independent of lower level checks. The checksum field
+is 32 bits, providing greater protection than the standard 16-bit IP
+checksum (in combination with an improved checksum algorithm). The
+large packets, high packet rates and general network characteristics
+expected in the future warrant a stronger checksum mechanism.
+
+The checksum normally covers both the VMTP header and the segment data.
+Optionally (for real-time applications), the checksum may apply only to
+the packet header, as indicated by the HCO control bit being set in the
+header. The checksum field is placed at the end of the packet to allow
+it to be calculated as part of a software copy or as part of a hardware
+transmission or reception packet processing pipeline, as expected in the
+next generation of network interfaces. Note that the number of header
+and data octets is an integral multiple of 8 because VMTP requires that
+the segment data be padded to be a multiple of 64 bits. The checksum
+field is appended after the padding, if any. The actual algorithm is
+described in Section 3.2.
+
+A zero checksum field indicates that no checksum was transmitted with
+the packet. VMTP may be used without a checksum only when there is a
+host-to-host error detection mechanism and the VMTP security facility is
+not being used. For example, one could rely on the Ethernet CRC if
+communication is restricted to hosts on the same Ethernet and the
+network interfaces are considered sufficiently reliable.
+
+
+2.5.3. Request and Response Acknowledgment
+
+VMTP assumes an unreliable datagram network and internetwork interface.
+To guarantee delivery of Requests and Response, VMTP uses positive
+acknowledgments, retransmissions and timeouts.
+
+A Request is normally acknowledged by receipt of a Response associated
+with the Request, i.e. with the same (Client, Transaction). With
+streamed message transactions, it may also be acknowledged by a
+subsequent Response that acknowledges previous Requests in addition to
+the transaction it explicitly identifies. A Response may be explicitly
+acknowledged by a NotifyVmtpServer operation requested of the manager
+for the Server. In the case of streaming, this is a cumulative
+acknowledgment, acknowledging all Responses with a lower transaction
+identifier as well.) In addition, with non-streamed communication, a
+subsequent Request from the same Client acknowledges Responses to all
+previous message transactions (at least in the sense that either the
+client received a Response or is no longer interested in Responses to
+
+
+Cheriton [page 14]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+those earlier message transactions). Finally, a client response timeout
+(at the server) acknowledges a Response at least in the sense that the
+server need not be prepared to retransmit the Response subsequently.
+Note that there is no end-to-end guarantee of the Response being
+received by the client at the application level.
+
+
+2.5.4. Retransmissions
+
+In general, a Request or Response is retransmitted periodically until
+acknowledged as above, up to some maximum number of retransmissions.
+VMTP uses parameters RequestRetries(Server) and ResponseRetries(Client)
+that indicate the number of retransmissions for the server and client
+respectively before giving up. We suggest the value 5 be used for both
+parameters based on our experience with VMTP and Internet packet loss.
+Smaller values (such as 3) could be used in low loss environments in
+which fast detection of failed hosts or communication channels is
+required. Larger values should be used in high loss environments where
+transport-level persistence is important.
+
+In a low loss environment, a retransmission only includes the MCB and
+not the segment data of the Request or Response, resulting in a single
+(short) packet on retransmission. The intended recipient of the
+retransmission can request selective retransmission of all or part of
+the segment data as necessary. The selective retransmission mechanism
+is described in Section 2.13.
+
+If a Response is specified as idempotent, the Response is neither
+retransmitted nor stored for retransmission. Instead, the Client must
+retransmit the Request to effectively get the Response retransmitted.
+The server VMTP module responds to retransmissions of the Request by
+passing the Request on to the server again to have it regenerate the
+Response (by redoing the operation), rather than saving a copy of the
+Response. Only Request packets for the last transaction from this
+client are passed on in this fashion; older Request packets from this
+client are discarded as delayed duplicates. If a Response is not
+idempotent, the VMTP module must ensure it has a copy of the Response
+for retransmission either by making a copy of the Response (either
+physically or copy-on-write) or by preventing the Server from continuing
+until the Response is acknowledged.
+
+
+2.5.5. Timeouts
+
+There is one client timer for each Client with an outstanding
+transaction. Similarly, there is one server timer for each Client
+transaction that is "active" at the server, i.e. there is a transaction
+
+
+Cheriton [page 15]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+record for a Request from the Client.
+
+When the client transmits a new Request (without streaming), the client
+timer is set to roughly the time expected for the Response to be
+returned. On timeout, the Request is retransmitted with the APG
+(Acknowledge Packet Group) bit set. The timeout is reset to the
+expected roundtrip time to the Server because an acknowledgment should
+be returned immediately unless a Response has been sent. The Request
+may also be retransmitted in response to receipt of a VMTP management
+operation indicating that selected portions of the Request message
+segment need to be retransmitted. With streaming, the timeout applies
+to the oldest outstanding message transaction in the run of outstanding
+message transactions. Without streaming, there is one message
+transaction in the run, reducing to the previous situation. After the
+first packet of a Response is received, the Client resets the timeout to
+be the time expected before the next packet in the Response packet group
+is received, assuming it is a multi-packet Response. If not, the timer
+is stopped. Finally, the client timer is used to timeout waiting for
+second and subsequent Responses to a multicast Request.
+
+The client timer is set at different times to four different values:
+
+TC1(Server) The expected time required to receive a Response from
+ the Server. Set on initial Request transmission plus
+ after its management module receives a NotifyVmtpClient
+ operation, acknowledging the Request.
+
+TC2(Server) The estimated round trip delay between the client and
+ the server. Set when retransmitting after receiving no
+ Response for TC1(Server) time and retransmitting the
+ Request with the APG bit set.
+
+TC3(Server) The estimated maximum expected interpacket time for
+ multi-packet Responses from the Server. Set when
+ waiting for subsequent Response packets within a packet
+ group before timing out.
+
+TC4 The time to wait for additional Responses to a group
+ Request after the first Response is received. This is
+ specified by the user level.
+
+These values are selected as follows. TC1 can be set to TC2 plus a
+constant, reflecting the time within which most servers respond to most
+requests. For example, various measurements of VMTP usage at Stanford
+indicate that 90 percent of the servers respond in less than 200
+milliseconds. Setting TC1 to TC2 + 200 means that most Requests receive
+a Response before timing out and also that overhead for retransmission
+
+
+Cheriton [page 16]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+for long running transactions is insignificant. A sophisticated
+implementation may make the estimation of TC1 further specific to the
+Server.
+
+TC2 may be estimated by measuring the time from when a Probe request is
+sent to the Server to when a response is received. TC2 can also be
+measured as the time between the transmission of a Request with the APG
+bit set to receipt of a management operation acknowledging receipt of
+the Request.
+
+When the Server is an entity group, TC1 and TC2 should be the largest of
+the values for the members of the group that are expected to respond.
+This information may be determined by probing the group on first use
+(and using the values for the last responses to arrive). Alternatively,
+one can resort to default values.
+
+TC3 is set initially to 10 times the transmission time for the maximum
+transmission unit (MTU) to be used for the Response. A sophisticated
+implementation may record TC3 per Server and refine the estimate based
+on measurements of actual interpacket gaps. However, a tighter estimate
+of TC3 only improves the reaction time when a packet is lost in a packet
+group, at some cost in unnecessary retransmissions when the estimate
+becomes overly tight.
+
+The server timer, one per active Client, takes on the following values:
+
+TS1(Client) The estimated maximum expected interpacket time. Set
+ when waiting for subsequent Request packets within a
+ packet group before timing out.
+
+TS2(Client) The time to wait to hear from a client before
+ terminating the server processing of a Request. This
+ limits the time spent processing orphan calls, as well
+ as limiting how out of date the server's record of the
+ Client state can be. In particular, TS2 should be
+ significantly less than the minimum time within which it
+ is reasonable to reuse a transaction identifier.
+
+TS3(Client) Estimated roundtrip time to the Client,
+
+TS4(Client) The time to wait after sending a Response (or last
+ hearing from a client) before discarding the state
+ associated with the Request which allows it to filter
+ duplicate Request packets and regenerate the Response.
+
+TS5(Client) The time to wait for an acknowledgment after sending a
+ Response before retransmitting the Response, or giving
+
+
+Cheriton [page 17]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ up (after some number of retransmissions).
+
+TS1 is set the same as TC3.
+
+The suggested value for TS2 is TC1 + 3*TC2 for this server, giving the
+Client time to timeout waiting for a Response and retransmit 3 Request
+packets, asking for acknowledgments.
+
+TS3 is estimated the same as TC1 except that refinements to the estimate
+use measurements of the Response-to-acknowledgment times.
+
+In the general case, TS4 is set large enough so that a Client issuing a
+series of closely-spaced Requests to the same Server reuses the same
+state record at the Server end and thus does not incur the overhead of
+recreating this state. (The Server can recreate the state for a Client
+by performing a Probe on the Client to get the needed information.) It
+should also be set low enough so that the transaction identifier cannot
+wrap around and so that the Server does not run out of CSR's. We
+suggest a value in the range of 500 milliseconds. However, if the
+Server accepts non-idempotent Requests from this Client without doing a
+Probe on the Client, the TS4 value for this CSR is set to at least 4
+times the maximum packet lifetime.
+
+TS5 is TS3 plus the expected time for transmission and reception of the
+Response. We suggest that the latter be calculated as 3 times the
+transmission time for the Response data, allowing time for reception,
+processing and transmission of an acknowledgment at the Client end. A
+sophisticated implementation may refine this estimate further over time
+by timing acknowledgments to Responses.
+
+
+2.5.6. Rate Control
+
+VMTP is designed to deal with the present and future problem of packet
+overruns. We expect overruns to be the major cause of dropped packets
+in the future. A client is expected to estimate and adjust the
+interpacket gap times so as to not overrun a server or intermediate
+nodes. The selective retransmission mechanism allows the server to
+indicate that it is being overrun (or some intermediate point is being
+overrun). For example, if the server requests retransmission of every
+Kth block, the client should assume overrun is taking place and increase
+the interpacket gap times. The client passes the server an indication
+of the interpacket gap desired for a response. The client may have to
+increase the interval because packets are being dropped by an
+intermediate gateway or bridge, even though it can handle a higher rate.
+A conservative policy is to increase the interpacket gap whenever a
+packet is lost as part of a multi-packet packet group.
+
+
+Cheriton [page 18]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+The provision of selective retransmission allows the rate of the client
+and the server to "push up" against the maximum rate (and thus lose
+packets) without significant penalty. That is, every time that packet
+transmission exceeds the rate of the channel or receiver, the recovery
+cost to retransmit the dropped packets is generally far less than
+retransmitting from the first dropped packet.
+
+The interpacket gap is expressed in 1/32nd's of the MTU packet
+transmission time. The minimum interpacket gap is 0 and the maximum gap
+that can be described in the protocol is 8 packet times. This places a
+limit on the slowest receivers that can be efficiently used on a
+network, at least those handling multi-packet Requests and Responses.
+This scheme also limits the granularity of adjustment. However, the
+granularity is relative to the speed of the network, as opposed to an
+absolute time. For entities on different networks of significantly
+different speed, we assume the interconnecting gateways can buffer
+packets to compensate<2>. With different network speeds and intermediary
+nodes subject to packet loss, a node must adjust the interpacket gap
+based on packet loss. The interpacket gap parameter may be of limited
+use.
+
+
+2.6. Security
+
+VMTP provides an (optional) secure mode that protects against the usual
+security threats of peeking, impostoring, message tampering and replays.
+Secure VMTP must be used to guarantee any of the transport-level
+reliability properties unless it is guaranteed that there are no
+intruders or agents that can modify packets and update the packet
+checksums. That is, non-secure VMTP provides no guarantees in the
+presence of an intelligent intruder.
+
+The design closely follows that described by Birrell [1]. Authenticated
+information about a remote entity, including an encryption/decryption
+key, is obtained and maintained using a VMTP management operation, the
+authenticated Probe operation, which is executed as a non-secure VMTP
+message transaction. If a server receives a secure Request for which
+the server has no entity state, it sends a Probe request to the VMTP
+
+_______________
+
+<2> Gateways must also employ techniques to preserve or intelligently
+modify (if appropriate) the interpacket gaps. In particular, they must
+be sure not to arbitrarily remove interpacket gaps as a result of their
+forwarding of packets.
+
+
+Cheriton [page 19]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+management module of the client, "challenging" it to provide an
+authenticator that both authenticates the client as being associated
+with a particular principal as well as providing a key for
+encryption/decryption. The principal can include a real and effective
+principal, as used in UNIX <3>. Namely, the real principal is the
+principal on whose behalf the Request is being performed whereas the
+effective principal is the principal of the module invoking the request
+or remote procedure call.
+
+Peeking is prevented by encrypting every Request and Response packet
+with a working Key that is shared between Client and Server.
+Impostoring and replays are detected by comparing the Transaction
+identifier with that stored in the corresponding entity state record
+(which is created and updated by VMTP as needed). Message tampering is
+detected by encryption of the packet including the Checksum field. An
+intruder cannot update the checksum after modifying the packet without
+knowing the Key. The cost of fully encrypting a packet is close to the
+cost of generating a cryptographic checksum (and of course, encryption
+is needed in the general case), so there is no explicit provision for
+cryptographic checksum without packet encryption.
+
+A Client determines the Principal of the Server and acquires an
+authenticator for this Server and Principal using a higher level
+protocol. The Server cannot decrypt the authenticator or the Request
+packets unless it is in fact the Principal expected by the Client.
+
+An encrypted VMTP packet is flagged by the EPG bit in the VMTP packet
+header. Thus, encrypted packets are easily detected and demultiplexed
+from unencrypted packets. An encrypted VMTP packet is entirely
+encrypted except for the Client, Version, Domain, Length and Packet
+Flags fields at the beginning of the packet. Client identifiers can be
+assigned, changed and used to have no real meaning to an intruder or to
+only communicate public information (such as the host Internet address).
+They are otherwise just a random means of identification and
+demultiplexing and do not therefore divulge any sensitive information.
+Further secure measures must be taken at the network or data link levels
+if this information or traffic behavior is considered sensitive.
+
+VMTP provides multiple authentication domains as well as an encryption
+qualifier to accommodate different encryption algorithms and their
+
+_______________
+
+<3> Principal group membership must be obtained, if needed, by a
+higher level protocol.
+
+
+Cheriton [page 20]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+corresponding security/performance trade-offs. (See Appendix V.) A
+separate key distribution and authentication protocol is required to
+handle generation and distribution of authenticators and keys. This
+protocol can be implemented on top of VMTP and can closely follow the
+Birrell design as well.
+
+Security is optional in the sense that messages may be secure or
+non-secure, even between consecutive message transactions from the same
+client. It is also optional in that VMTP clients and servers are not
+required to implement secure VMTP (although they are required to respond
+intelligently to attempts to use secure VMTP). At worst, a Client may
+fail to communicate with a Server if the Server insists on secure
+communication and the Client does not implement security or vice versa.
+However, a failure to communicate in this case is necessary from a
+security standpoint.
+
+
+2.7. Multicast
+
+The Server entity identifier in a message transaction can identify an
+entity group, in which case the Request is multicast to every Entity in
+this group (on a best-efforts basis). The Request is retransmitted
+until at least one Response is received (or an error timeout occurs)
+unless it is a datagram Request. The Client can receive multiple
+Responses to the Request.
+
+The VMTP service interface does not directly provide reliable multicast
+because it is expensive to provide, rarely needed by applications, and
+can be implemented by applications using the multiple Response feature.
+However, the protocol itself is adequate for reliable multicast using
+positive acknowledgments. In particular, a sophisticated Client
+implementation could maintain a list of members for each entity group of
+interest and retransmit the Request until acknowledged by all members.
+No modifications are required to the Server implementations.
+
+VMTP supports a simple form of subgroup addressing. If the CRE bit is
+set in a Request, the Request is delivered to the subgroup of entities
+in the Server group that are co-resident with one or more entities in
+the group (or individual entity) identified by the CoresidentEntity
+field of the Request. This is commonly used to send to the manager
+entity for a particular entity, where Server specifies the group of such
+managers. Co-resident means "using the same VMTP module", and logically
+on the same network host. In particular, a Probe request can be sent to
+the particular VMTP management module for an entity by specifying the
+VMTP management group as the Server and the entity in question as the
+CoResidentEntity.
+
+
+
+Cheriton [page 21]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+As an experimental aspect of the protocol, VMTP supports the Server
+sending a group Response which is sent to the Client as well as members
+of the destination group of Servers to which the original Request was
+sent. The MDG bit indicates whether the Client is a member of this
+group, allowing the Server module to determine whether separately
+addressed packet groups are required to send the Response to both the
+Client and the Server group. Normally, a Server accepts a group
+Response only if it has received the Request and not yet responded to
+the Client. Also, the Server must explicitly indicate it wants to
+accept group Responses. Logically, this facility is analogous to
+responding to a mail message sent to a distribution list by sending a
+copy of the Response to the distribution list.
+
+
+2.8. Real-time Communication
+
+VMTP provides three forms of support for real-time communication, in
+addition to its standard facilities, which make it applicable to a wide
+range of real-time applications. First, a priority is transmitted in
+each Request and Response which governs the priority of its handling.
+The priority levels are intended to correspond roughly to:
+
+ - urgent/emergency.
+
+ - important
+
+ - normal
+
+ - background.
+
+with additional gradations for each level. The interpretation and
+implementation of these priority levels is otherwise host-specific, e.g.
+the assignment to host processing priorities.
+
+Second, datagram Requests allow the Client to send a datagram to another
+entity or entity group using the VMTP naming, transmission and delivery
+mechanism, but without blocking, retransmissions or acknowledgment.
+(The client can still request acknowledgment using the APG bit although
+the Server does not expect missing portions of a multi-packet datagram
+Request to be retransmitted even if some are not received.) A datagram
+Request in non-streamed mode supersedes all previous Requests from the
+same Client. A datagram Request in stream mode is queued (if necessary)
+after previous datagram Requests on the same stream. (See Section
+2.11.)
+
+Finally, VMTP provides several control bit flags to modify the handling
+of Requests and Responses for real-time requirements. First, the
+
+
+Cheriton [page 22]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+conditional message delivery (CMD) flag causes a Request to be discarded
+if the recipient is not waiting for it when it arrives, similarly for
+the Response. This option allows a client to send a Request that is
+contingent on the server being able to process it immediately. The
+header checksum only (HCO) flag indicates that the checksum has been
+calculated only on the VMTP header and not on the data segment.
+Applications such as voice and video can avoid the overhead of
+calculating the checksum on data whose utility is insensitive to typical
+bit errors without losing protection on the header information.
+Finally, the No Retransmission (NRT) flag indicates that the recipient
+of a message should not ask for retransmission if part of the message is
+missing but rather either use what was received or discard it.
+
+None of these facilities introduce new protocol states. In fact, the
+total processing overhead in the normal case is a bit flag test for CMD,
+HCO or NRT plus assignment of priority on packet transmission and
+reception. (In fact, CMD and NRT are not tested in the normal case.)
+The additional code complexity is minimal. We feel that the overhead
+for providing these real-time facilities is minimal and that these
+facilities are both important and adequate for a wide class of real-time
+applications.
+
+Several of the normal facilities of VMTP appear useful for real-time
+applications. First, multicast is useful for distributed, replicated
+(fault-tolerant) real-time applications, allowing efficient state query
+and update for (for example) sensors and control state. Second, the DGM
+or idempotent flag for Responses has some real-time benefits, namely: a
+Request is redone to get the latest values when the Response is lost,
+rather than just returning the old values. The desirability of this
+behavior is illustrated by considering a request for the current time of
+day. An idempotent handling of this request gives better accuracy in
+returning the current time in the case that a retransmission is
+necessary. Finally, the request-response semantics (in the absence of
+streaming) of each new Request from a Client terminating the previous
+message transactions from that Client, if any, provides the "most recent
+is most important" handling of processing that most real-time
+applications require.
+
+In general, a key design goal of VMTP was provide an efficient
+general-purpose transport protocol with the features required for
+real-time communication. Further experience is required to determine
+whether this goal has been achieved.
+
+
+
+
+
+
+
+Cheriton [page 23]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+2.9. Forwarded Message Transactions
+
+A Server may invoke another Server to handle a Request. It is fairly
+common for the invocation of the second Server to be the last action
+performed by the first Server as part of handling the Request. For
+example, the original Server may function primarily to select a process
+to handle the Request. Also, the Server may simply check the
+authorization on the Request. Describing this situation in the context
+of RPC, a nested remote procedure call may be the last action in the
+remote procedure and the return parameters are exactly those of the
+nested call. (This situation is analogous to tail recursion.)
+
+As an optimization to support this case, VMTP provides a Forward
+operation that allows the server to send the nested Request to the other
+server and have this other server respond directly to the Client.
+
+If the message transaction being forwarded was not multicast, not secure
+or the two Servers are the same principal and the ForwardCount of the
+Request is less than the maximum forward count of 15, the Forward
+operation is implemented by the Server sending a Request onto the next
+Server with the forwarded Request identified by the same Client and
+Transaction as the original Request and a ForwardCount one greater than
+the Request received from the Client. In this case, the new Server
+responds directly to the Client. A forwarded Request is illustrated in
+the following figure.
+
+ +---------+ Request +----------+
+ | Client +---------------->| Server 1 |
+ +---------+ +----------+
+ ^ |
+ | | forwarded Request
+ | V
+ | Response +----------+
+ +----------------------| Server 2 |
+ +----------+
+
+If the message transaction does not meet the above requirements, the
+Server's VMTP module issues a nested call and simply maps the returned
+Response to a Response to original Request without further Server-level
+processing. In this case, the only optimization over a user-level
+nested call is one fewer VMTP service operation; the VMTP module handles
+the return to the invoking call directly. The Server may also use this
+form of forwarding when the Request is part of a stream of message
+transactions. Otherwise, it must wait until the forwarded message
+transaction completes before proceeding with the subsequent message
+transactions in the stream.
+
+
+
+Cheriton [page 24]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+Implementation of the user-level Forward operation is optional,
+depending on whether the server modules require this facility. Handling
+an incoming forwarded Request is a minor modification of handling a
+normal incoming Request. In particular, it is only necessary to examine
+the ForwardCount field when the Transaction of the Request matches that
+of the last message transaction received from the Client. Thus, the
+additional complexity in the VMTP module for the required forwarding
+support is minimal; the complexity is concentrated in providing a highly
+optimized user-level Forward primitive, and that is optional.
+
+
+2.10. VMTP Management
+
+VMTP management includes operations for creating, deleting, modifying
+and querying VMTP entities and entity groups. VMTP management is
+logically implemented by a VMTP management server module that is invoked
+using a message transaction addressed to the Server, VMTP_MANAGER_GROUP,
+a well-known group entity identifier, in conjunction with Coresident
+Entity mechanism introduced in Section 2.7. A particular Request may
+address the local module, the module managing a particular entity, the
+set of modules managing those entities contained in a specific group or
+all management modules, as appropriate.
+
+The VMTP management procedures are specified in Appendix III.
+
+
+2.11. Streamed Message Transactions
+
+Streamed message transactions refer to two or more message transactions
+initiated by a Client before it receives the response to the first
+message transaction, with each transaction being processed and responded
+to in order but asynchronous relative to the initiation of the
+transactions. A Client streams messages transactions, and thereby has
+multiple message transactions outstanding, by sending them as part of a
+single run of message transactions. A run of message transactions is a
+sequence of message transactions with the same Client and Server and
+consecutive Transaction identifiers, with all but the first and last
+Requests and Responses flagged with the NSR (Not Start Run) and NER
+(Not End Run) control bits. (Conversely, the first Request and
+Response does not have the NSR set and the last Request and Response
+does not have the NER bit set.) The message transactions in a run use
+
+
+
+
+
+
+
+
+Cheriton [page 25]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+consecutive transaction identifiers (except if the STI bit <4> is used
+in one, in which case the transaction identifier for the next message
+transaction is 256 greater, rather than 1).
+
+The Client retains a record for each outstanding transaction until it
+gets a Response or is timed out in error. The record provides the
+information required to retransmit the Request. On retransmission
+timeout, the client retransmits the last Request for which it has not
+received a Response the same as is done with non-streamed communication.
+(I.e. there need be only one timeout for all the outstanding message
+transactions associated with a single client.)
+
+The consecutive transaction identifiers within a run of message
+transactions are used as sequence numbers for error control. The Server
+handles each message transaction in the sequence specified by its
+transaction identifier. When it receives a message transaction that is
+not marked as the beginning of a run, it checks that it previously
+received a message transaction with the predecessor transaction
+identifier, either 1 less than the current one or 256 less if the
+previous one had the STI bit set. If not, the Server sends a
+NotifyVmtpClient operation to the Client's manager indicating either:
+(1) the first message transaction was not fully received, or else (2) it
+has no record of the last one received. If the NRT control flag is set,
+it does not await nor expect retransmission but proceeds with handling
+this Request. This flag is used primarily when datagram Requests are
+used as part of a stream of message transactions. If NRT was not
+specified, the Client must retransmit from the first message transaction
+not fully received (either at all or in part) before the Server can
+proceed with handling this run of Requests or else restart the run of
+message transactions.
+
+The Client expects to receive the Responses in a consecutive sequence,
+using the Transaction identifier to detect missing Responses. Thus, the
+Server must return Responses in sequence except possibly for some gaps,
+as follows. The Server can specify in the PGcount field in a Response,
+the number of consecutively previous Responses that this Response
+
+
+
+
+_______________
+
+<4> The STI bit is used by the Client to effectively allocate 255
+transaction identifiers for use by the Server in returning a large
+Response or stream of Responses.
+
+
+Cheriton [page 26]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+corresponds to, up to a maximum of 255 previous Responses <5>. Thus,
+for example, a Response with Transaction identifier 46 and PGcount 3
+represents Responses 43, 44, 45 and 46. This facility allows the Server
+to eliminate sending Responses to Requests that require no Response,
+effectively batching the Responses into one. It also allows the Server
+to effectively maintain strictly consecutive sequencing when the Client
+has skipped 256 Transaction identifiers using the STI bit and the Server
+does not have that many Responses to return.
+
+If the Client receives a Response that is not consecutive, it
+retransmits the Request(s) for which the Response(s) is/are missing
+(unless, of course, the corresponding Requests were sent as datagrams).
+The Client should wait at the end of a run of message transactions for
+the last one to complete.
+
+When a Server receives a Request with the NSR bit clear and a higher
+transaction identifier than it currently has for the Client, it
+terminates all processing and discards Responses associated with the
+previous Requests. Thus, a stream of message transactions is
+effectively aborted by starting a new run, even if the Server was in the
+middle of handling the previous run.
+
+Using a mixture of datagram and normal Requests as part of a stream of
+message transactions, particularly with the use of the NRT bit, can lead
+to complex behavior under packet loss. It is recommended that a run of
+message transactions be all of one type to avoid problems, i.e. all
+normal or all datagrams. Finally, when a Server forwards a Request that
+is part of a run, it must suspend further processing of the subsequent
+Requests until the forwarded Request has been handled, to preserve order
+of processing. The simplest handling of this situation is to use a real
+nested call when forwarding with streamed message transactions.
+
+Flow control of streamed message transactions relies on rate control at
+the Client plus receipt (or non-receipt) of management notify operations
+indicating the presence of overrunning. A Client must reduce the number
+of outstanding message transactions at the Server when it receives a
+NotifyVmtpServer operation with the MSGTRANS_OVERFLOW ResponseCode. The
+transact parameter indicates the last packet group that was accepted.
+
+
+_______________
+
+<5> PGcount actually corresponds to packet groups which are described
+in Section 2.13. This (simplified) description is accurate when there
+is one Request or Response per packet group.
+
+
+Cheriton [page 27]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+The implementation of multiple outstanding message transactions requires
+the ability to record, timeout and buffer multiple outstanding message
+transactions at the Client end as well as the Server end. However, this
+facility is optional for both the Client and the Server. Client systems
+with heavy-weight processes and high network access cost are most likely
+to benefit from this facility. Servers that serve a wide variety of
+client machines should implement streaming to accommodate these types of
+clients.
+
+
+2.12. Fault-Tolerant Applications
+
+One approach to fault-tolerant systems is to maintain a log of all
+messages sent at each node and replay the messages at a node when the
+node fails, after restarting it from the last checkpoint <6>. As an
+experimental facility, VMTP provides a Receive Sequence Number field in
+the NotifyVmtpClient and NotifyVmtpServer operations as well as the Next
+Receive Sequence (NRS) flag in the Response packet to allow a sender to
+log a receive sequence number with each message sent, allowing the
+packets to be replayed at a recovering node in the same sequence as they
+were originally received, thereby recovering to the same state as
+before.
+
+Basically, each sending node maintains a receive sequence number for
+each receiving node. On sending a Request to a node, it presume that
+the receive sequence number is one greater than the one it has recorded
+for that node. If not, the receiving node sends a notify operation
+indicating the receive sequence number assigned the Request. The NRS in
+the Response confirms that the Request message was the next receive
+sequence number, so the sender can detect if it failed to receive the
+notify operation in the previous case. With Responses, the packets are
+ordered by the Transaction identifier except for multicast message
+transactions, in which there may be multiple Responses with the same
+identification. In this case, NotifyVmtpServer operations are used to
+provide receive sequence numbers.
+
+This experimental extension of the protocol is focused on support for
+fault-tolerant real-time distributed systems required in various
+critical applications. It may be removed or extended, depending on
+further investigations.
+
+_______________
+
+<6> The sender-based logging is being investigated by Willy Zwaenepoel
+of Rice University.
+
+
+Cheriton [page 28]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+2.13. Packet Groups
+
+A message (whether Request or Response) is sent as one or more packet
+groups. A packet group is one or more packets, each containing the same
+transaction identification and message control block. Each packet is
+formatted as below with the message control block logically embedded in
+the VMTP header.
+
+ +------------------------------------++---------------------+
+ | VMTP Header || |
+ +------------+-----------------------|| segment data |
+ |VMTP Control| Message Control Block || |
+ +------------+-----------------------++---------------------+
+
+The some fields of the VMTP control portion of the packet and data
+segment portion can differ between packets within the same packet group.
+
+The segment data portion of a packet group represents up to 16
+kilooctets of the segment specified in the message control block. The
+portion contained in each packet is indicated by the PacketDelivery
+field contained in the VMTP header. The PacketDelivery field as a bit
+mask has a similar interpretation to the MsgDelivery field in that each
+bit corresponds to a segment data block of 512 octets. The
+PacketDelivery field limits a packet group to 16 kilooctets and a
+maximum of 32 VMTP packets (with a minimum of 1 packet). Data can be
+sent in fewer packets by sending multiple data blocks per packet. We
+require that the underlying datagram service support delivery of (at
+minimum) the basic 580 octet VMTP packet <7>. To illustrate the use of
+the PacketDelivery field, consider for example the Ethernet which has a
+MTU of 1536 octets. so one would send 2 512-octet segment data blocks
+per packet. (In fact, if a third block is last in the segment and less
+than 512 octets and fits in the packet without making it too big, an
+Ethernet packet could contain three data blocks. Thus, an Ethernet
+packet group for a segment of size 0x1D00 octets (14.5 blocks) and
+MsgDelivery 0x000074FF consists of 6 packets indicated as follows <8>.
+
+_______________
+
+<7> Note that with a 20 octet IP header, a VMTP packet is 600
+octets. We propose the convention that any host implementing VMTP
+implicitly agrees to accept IP/VMTP packets of at least 600 octets.
+
+<8> We use the C notation 0xHHHH to represent a hexadecimal number.
+
+
+Cheriton [page 29]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ Packet
+ Delivery 1 1 1 1 1 1 1 1 0 0 1 0 1 0 1 0 0 0 0 0 0 . . .
+ 0000 0400 0800 0C00 1000 1400 1800 1C00
+ +----+----+----+----+----+----+----+-+
+ Segment |....|....|....|....|....|....|....|.|
+ +----+----+----+----+----+----+----+-+
+ : : : : : : : / / :
+ v v v v v v v /| v
+ +----+----+----+----+ +----+ +---+
+ Packets | 1 | 2 | 3 | 4 | | 5 | | 6 |
+ +----+----+----+----+ +----+ +---+
+
+Each '.' is 256 octets of data. The PacketDelivery masks for the 6
+packets are: 0x00000003, 0x0000000C, 0x00000030, 0x000000C0, 0x00001400
+and 0x00006000, indicating the segment blocks contained in each of the
+packets. (Note that the delivery bits are in little endian order.)
+
+A packet group is sent as a single "blast" of packets with no explicit
+flow control. However, the sender should estimate and transmit at a
+rate of packet transmission to avoid congesting the network or
+overwhelming the receiver, as described in Section 2.5.6. Packets in a
+packet group can be sent in any order with no change in semantics.
+
+When the first packet of a packet group is received (assuming the Server
+does not decide to discard the packet group), the Server saves a copy of
+the VMTP packet header, indicates it is currently receiving a packet
+group, initializes a "current delivery mask" (indicating the data in the
+segment received so far) to 0, accepts this packet (updating the current
+delivery mask) and sets the timer for the packet group. Subsequent
+packets in the packet group update the current delivery mask.
+
+Reception of a packet group is terminated when either the current
+delivery mask indicates that all the packets in the packet group have
+been received or the packet group reception timer expires (set to TC3 or
+TS1). If the packet group reception timer expires, if the NRT bit is
+set in the Control flags then the packet group is discarded if not
+complete unless MDM is set. In this case, the MsgDelivery field in the
+message control block is set to indicate the segment data blocks
+actually received and the message control block and segment data
+received is delivered to application level.
+
+If NRT is not set and not all data blocks have been received, a
+NotifyVmtpClient (if a Request) or NotifyVmtpServer (if a Response) is
+sent back with a PacketDelivery field indicating the blocks received.
+The source of the packet group is then expected to retransmit the
+missing blocks. If not all blocks of a Request are received after
+RequestAckRetries(Client) retransmissions, the Request is discarded and
+
+
+Cheriton [page 30]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+a NotifyVmtpClient operation with an error response code is sent to the
+client's manager unless MDM is set. With a Response, there are
+ResponseAckRetries(Server) retransmissions and then, if MDM is not set,
+the requesting entity is returned the message control block with an
+indication of the amount of segment data received extending contiguously
+from the start of the segment. E.g. if the sender sent 6 512-octet
+blocks and only the first two and the last two arrived, the receiver
+would be told that 1024 octets were received. The ResponseCode field is
+set to BAD_REPLY_SEGMENT. (Note that VMTP is only able to indicate the
+specific segment blocks received if MDM is set.)
+
+The parameters RequestAckRetries(Client) and ResponseAckRetries(Server)
+could be set on a per-client and per-server basis in a sophisticated
+implementation based on knowledge of packet loss.
+
+If the APG flag is set, a NotifyVmtpClient or NotifyVmtpServer
+operation is sent back at the end of the packet group reception,
+depending on whether it is a Request or a Response.
+
+At minimum, a Server should check that each packet in the packet group
+contains the same Client, Server, Transaction identifier and SegmentSize
+fields. It is a protocol error for any field other than the Checksum,
+packet group control flags, Length and PacketDelivery in the VMTP header
+to differ between any two packets in one packet group. A packet group
+containing a protocol error of this nature should be discarded.
+
+Notify operations should be sent (or invoked) in the manager whenever
+there is a problem with a unicast packet. i.e. negative acknowledgments
+are always sent in this case. In the case of problems with multicast
+packets, the default is to send nothing in response to an error
+condition unless there is some clear reason why no other node can
+respond positively. For example, the packet might be a Probe for an
+entity that is known to have been recently existing on the receiving
+host but now invalid and could not have migrated. In this case, the
+receiving host responds to the Probe indicating the entity is
+nonexistent, knowing that no other host can respond to the Probe. For
+packets and packet groups that are received and processed without
+problems, a Notify operation is invoked only if the APG bit is set.
+
+
+2.14. Runs of Packet Groups
+
+A run of packet groups is a sequence of packet groups, all Request
+packets or all Response packets, with the same Client and consecutive
+transaction identifiers, all but the first and last packets flagged with
+the NSR (Not Start Run) and NER (Not End Run) control bits. When each
+packet group in the run corresponds to a single Request or Response, it
+
+
+Cheriton [page 31]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+is identical to a run of message transactions. (See Section 2.11)
+However, a Request message or a Response message may consists of up to
+256 packet groups within a run, for a maximum of 4 megaoctets of segment
+data. A message that is continued in the next packet group in the run
+is flagged in the current packet group by the CMG flag. Otherwise, the
+next packet group in the run (if any) is treated as a separate Request
+or Response.
+
+Normally, each Request and Response message is sent as a single packet
+group and each run consists of a single packet group. In this case
+neither NSR or NER are set. For multi-packet group messages, the
+PacketDelivery mask in the i-th packet group of a message corresponds to
+the portion of the segment offset by i-1 times 16 kilooctets,
+designating the the first packet group to have i = 1.
+
+
+2.15. Byte Order
+
+For purposes of transmission and reception, the MCB is treated as
+consisting of 8 32-bit fields and the segment is a sequence of bytes.
+VMTP transmits the MCB in big-endian order, performing byte-swapping, if
+necessary, before transmission. A little-endian host must byte-swap the
+MCB on reception. (The data segment is transmitted as a sequence of
+bytes with no reordering.) The byte order of the sender of a message is
+indicated by the LEE bit in the entity identifier for the sender, the
+Client field if a Request and the Server field if a Response. The
+sender and receiver of a message are required to agree in some higher
+level protocol (such as an RPC presentation protocol) on who does
+further swapping of the MCB and data segment if required by the types of
+the data actually being transmitted. For example, the segment data may
+contain a record with 8-bit, 16-bit and 32-bit fields, so additional
+transformation is required to move the segment from a host of one byte
+order to another.
+
+VMTP to date has used a higher-level presentation protocol in which
+segment data is sent in the native order of the sending host and
+byte-swapped as necessary by the receiving host. This approach
+minimizes the byte-swapping overhead between machines of common byte
+order (including when the communication is transparently local to one
+host), avoids a strong bias in the protocol to one byte-order, and
+allows for the sending entity to be sending to a group of hosts with
+different byte orders. (Note that the byte-swap overhead for the MCB is
+minimal.) The presentation-level overhead is minimal because most
+common operations, such as file access operations, have parameters that
+fit the MCB and data segment data types exactly.
+
+
+
+
+Cheriton [page 32]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+2.16. Minimal VMTP Implementation
+
+A minimal VMTP client needs to be able to send a Request packet group
+and receive a Response packet group as well as accept and respond to
+Requests sent to its management module, including Probe and NotifyClient
+operations. It may also require the ability to invoke Probe and Notify
+operations to locate a Server and acknowledge responses. (the latter
+only if it is involved in transactions that are not idempotent or
+datagram message transactions. However, a simple sensor, for example,
+can transmit VMTP datagram Requests indicating its current state with
+even less mechanism.) The minimal client thus requires very little code
+and is suitable as a basis for (e.g.) a network boot loader.
+
+A minimal VMTP server implements idempotent, non-encrypted message
+transactions, possibly with no segment data support. It should use an
+entity state record for each Request but need only retain it while
+processing the Request. Without segment data larger than a packet,
+there is no need for any timers, buffering (outside of immediate request
+processing) or queuing. In particular, it needs only as many records as
+message transactions it handles simultaneously (e.g. 1). The entity
+state record is required to recognize and respond to Request
+retransmissions during request processing.
+
+The minimal server need only receive Requests and and be able to send
+Response packets. It need have only a minimal management module
+supporting Probe operations. (Support for the NotifyVmtpClient
+operation is only required if it does not respond immediately to a
+Request.) Thus the VMTP support for say a time server, sensor, or
+actuator can be extremely simple. Note that the server need never issue
+a Probe operation if it uses the host address of the Request for the
+Response and does not require the Client information returned by the
+Probe operation. The minimal server should also support reception of
+forwarded Requests.
+
+
+2.17. Message vs. Procedural Request Handling
+
+A request-response protocol can be used to implement two forms of
+semantics on reception. With procedural handling of a Request, a
+Request is handled by a process associated with the Server that
+effectively takes on the identity of the calling process, treating the
+Request message as invoking a procedure, and relinquishing its
+association to the calling process on return. VMTP supports multiple
+nested calls spanning multiple machines. In this case, the distributed
+call stack that results is associated with a single process from the
+standpoint of authentication and resource management, using the
+ProcessId field supported by VMTP. The entity identifiers effectively
+
+
+Cheriton [page 33]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+link these call frames together. That is, the Client field in a Request
+is effectively the return link to the previous call frame.
+
+With message handling of a Request, a Request message is queued for a
+server process. The server process dequeues, reads, processes and
+responds to the Request message, executing as a separate process.
+Subsequent Requests to the same server are queued until the server asks
+to receive the next Request.
+
+Procedural semantics have the advantage of allowing each Request (up to
+the resource limits of the Server) to execute concurrently at the
+Server, with Request-specific synchronization. Message semantics have
+the advantage that Requests are serialized at the Server and that the
+request processing logically executes with the priority, protection and
+independent execution of a separate process. Note that procedural and
+message handling of a request appear no differently to the client
+invoking the message transaction, except possibly for differences in
+performance.
+
+We view the two Request handling approaches as appropriate under
+different circumstances. VMTP supports both models.
+
+
+2.18. Bibliography
+
+The basic protocol is similar to that used in the original form of the V
+kernel [3, 4] as well as the transport protocol of Birrell and
+Nelson's [2] remote procedure call mechanism. An earlier version of the
+protocol was described in SIGCOMM'86 [6]. The rate-based flow control
+is similar to the techniques of Netblt [9]. The support for idempotency
+draws, in part, on the favorable experience with idempotency in the V
+distributed system. Its use was originally inspired by the Woodstock
+File Server [11]. The multicast support draws on the multicast
+facilities in V [5] and is designed to work with, and is now implemented
+using, the multicast extensions to the Internet [8] described in RFC 966
+and 988. The secure version of the protocol is similar to that
+described by Birrell [1] for secure RPC. The use of runs of packet
+groups is similar to Fletcher and Watson's delta-T protocol [10]. The
+use of "management" operations implemented using VMTP in place of
+specialized packet types is viewed as part of a general strategy of
+using recursion to simplify protocol architectures [7].
+
+Finally, this protocol was designed, in part, to respond to the
+requirements identified by Braden in RFC 955. We believe that VMTP
+satisfies the requirements stated in RFC 955.
+
+
+
+
+Cheriton [page 34]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+
+[1] A.D. Birrell, "Secure Communication using Remote Procedure
+ Calls", ACM. Trans. on Computer Systems 3(1), February, 1985.
+
+
+[2] A. Birrell and B. Nelson, "Implementing Remote Procedure Calls",
+ ACM Trans. on Computer Systems 2(1), February, 1984.
+
+
+[3] D.R. Cheriton and W. Zwaenepoel, "The Distributed V Kernel and its
+ Performance for Diskless Workstations", In Proceedings of the 9th
+ Symposium on Operating System Principles, ACM, 1983.
+
+
+[4] D.R. Cheriton, "The V Kernel: A Software Base for Distributed
+ Systems", IEEE Software 1(2), April, 1984.
+
+
+[5] D.R. Cheriton and W. Zwaenepoel, "Distributed Process Groups in
+ the V Kernel", ACM Trans. on Computer Systems 3(2), May, 1985.
+
+
+[6] D.R. Cheriton, "VMTP: A Transport Protocol for the Next
+ Generation of Communication Systems", In Proceedings of
+ SIGCOMM'86, ACM, Aug 5-7, 1986.
+
+
+[7] D.R. Cheriton, "Exploiting Recursion to Simplify an RPC
+ Communication Architecture", in preparation, 1988.
+
+
+[8] D.R. Cheriton and S.E. Deering, "Host Groups: A Multicast
+ Extension for Datagram Internetworks", In 9th Data Communication
+ Symposium, IEEE Computer Society and ACM SIGCOMM, September, 1985.
+
+
+[9] D.D. Clark and M. Lambert and L. Zhang, "NETBLT: A Bulk Data
+ Transfer Protocol", Technical Report RFC 969, Defense Advanced
+ Research Projects Agency, 1985.
+
+
+[10] J.G. Fletcher and R.W. Watson, "Mechanism for a Reliable Timer-
+ based Protocol", Computer Networks 2:271-290, 1978.
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 35]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+
+
+[11] D. Swinehart and G. McDaniel and D. Boggs, "WFS: A Simple File
+ System for a Distributed Environment", In Proc. 7th Symp.
+ Operating Systems Principles, 1979.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 36]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+3. VMTP Packet Formats
+
+VMTP uses 2 basic packet formats corresponding to Request packets and
+Response packets. These packet formats are identical in most of the
+fields to simplify the implementation.
+
+We first describe the entity identifier format and the packet fields
+that are used in general, followed by a detailed description of each of
+the packet formats. These fields are described below in detail. The
+individual packet formats are described in the following subsections.
+The reader and VMTP implementor may wish to refer to Chapters 4 and 5
+for a description of VMTP event handling and only refer to this detailed
+description as needed.
+
+
+3.1. Entity Identifier Format
+
+The 64-bit non-group entity identifiers have the following substructure.
+
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |R| |L|R|
+ |A|0|E|E| Domain-specific structure
+ |E| |E|S|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ Domain-specific structure |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+The field meanings are as follows:
+
+RAE Remote Alias Entity - the entity identifier identifies
+ an entity that is acting as an alias for some entity
+ outside this entity domain. This bit is used by
+ higher-level protocols. For instance, servers may take
+ extra security and protection measures with aliases.
+
+GRP Group - 0, for non-group entity identifiers.
+
+LEE Little-Endian Entity - the entity transmits data in
+ little-endian (VAX) order.
+
+RES Reserved - must be 0.
+
+The 64-bit entity group identifiers have the following substructure.
+
+
+
+
+Cheriton [page 37]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |R| |U|R|
+ |A|1|G|E| Domain-specific structure
+ |E| |P|S|
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ Domain-specific structure |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+The field meanings are as follows:
+
+RAE Remote Alias Entity - same as for non-group entity
+ identifier.
+
+GRP Group - 1, for entity group identifiers.
+
+UGP Unrestricted Group - no restrictions are placed on
+ joining this group. I.e. any entity can join limited
+ only by implementation resources.
+
+RES Reserved - must be 0.
+
+The all-zero entity identifier is reserved and guaranteed to be
+unallocated in all domains. In addition, a domain may reserve part of
+the entity identifier space for statically allocated identifiers.
+However, this is domain-specific.
+
+Description of currently defined entity identifier domains is provided
+in Appendix IV.
+
+
+3.2. Packet Fields
+
+Client 64-bit identifier for the client entity associated with
+ this packet. The structure, allocation and binding of
+ this identifier is specific to the specified Domain. An
+ entity identifier always includes 4 types bits as
+ specified in Section 3.1.
+
+Version The 3-bit identifier specifying the version of the
+ protocol. Current version is version 0.
+
+Domain The 13-bit identifier specifying the naming and
+ administration domain for the client and server named in
+ the packet.
+
+
+
+Cheriton [page 38]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+Packet Flags: 3 bits. (The normal case has none of the flags set.)
+
+ HCO Header checksum only - checksum has only been calculated
+ on the header. This is used in some real-time
+ applications where the strict correctness of the data is
+ not needed.
+
+ EPG Encrypted packet group - part of a secure message
+ transaction.
+
+ MPG Multicast packet group - packet was multicast on
+ transmission.
+
+Length A 13-bit field that specifies the number of 32-bit words
+ in the segment data portion of the packet (if any),
+ excluding the checksum field. (Every VMTP packet is
+ required to be a multiple of 64 bits, possibly by
+ padding out the segment data.) The minimum legal Length
+ is 0, the maximum length is 4096 and it must be an even
+ number.
+
+Control Flags: 9 bits. (The normal case has none of the flags set.)
+
+ NRS Next Receive Sequence - the associated Request message
+ (in a Response) or previous Response (if a Request) was
+ received consecutive with the last Request from this
+ entity. That is, there was no interfering messages
+ received.
+
+ APG Acknowledge Packet Group - Acknowledge packet group on
+ receipt. If a Request, send back a Request to the
+ client's manager providing an update on the state of the
+ transaction as soon as the request packet group is
+ received, independent of the response being available.
+ If a Response, send an update to the server's manager as
+ soon as possible after response packet group is received
+ providing an update on the state of the transaction at
+ the client
+
+ NSR Not Start Run - 1 if this packet is not part of the
+ first packet group of a run of packet groups.
+
+ NER Not End Run - 1 if this packet is not part of the last
+ packet group of a run of packet groups.
+
+ NRT No Retransmission - do not ask for retransmissions of
+ this packet group if not all received within timeout
+
+
+Cheriton [page 39]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ period, just deliver or discard.
+
+ MDG Member of Destination Group - this packet is sent to a
+ group and the client is a member of this group.
+
+ CMG Continued Message - the message (Request or Response) is
+ continued in the next packet group. The next packet
+ group has to be part of the same run of packet groups.
+
+ STI Skip Transaction Identifiers - the next transaction
+ identifier that the Client plans to use is the current
+ transaction plus 256, if part of the same run and at
+ least this big if not. In a Request, this authorizes
+ the Server to send back up to 256 packet groups
+ containing the Response.
+
+ DRT Delay Response Transmission - set by request sender if
+ multiple responses are expected (as indicated by the MRD
+ flag in the RequestCode) and it may be overrun by
+ multiple responses. The responder(s) should then
+ introduce a short random delay in sending the Response
+ to minimize the danger of overrunning the Client. This
+ is normally only used for responding to multicast
+ Requests where the Client may be receiving a large
+ number of Responses, as indicated by the MRD flag in the
+ Request flags. Otherwise, the Response is sent
+ immediately.
+
+RetransmitCount:
+ 3 bits - the ordinal number of transmissions of this
+ packet group prior to this one, modulo 8. This field is
+ used in estimation of roundtrip times. This count may
+ wrap around during a message transaction. However, it
+ should be sufficient to match acknowledgments and
+ responses with a particular transmission.
+
+ForwardCount: 4 bits indicating the number of times this Request has
+ been forwarded. The original Request is always sent
+ with a ForwardCount of 0.
+
+Interpacket Gap: 8 bits.
+ Indicates the recommended time to use between subsequent
+ packet transmissions within a multi-packet packet group
+ transmission. The Interpacket Gap time is in 1/32nd of
+ a network packet transmission time for a packet of size
+ MTU for the node. (Thus, the maximum gap time is 8
+ packet times.)
+
+
+Cheriton [page 40]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+PGcount: 8 bits
+ The number of packet groups that this packet group
+ represents in addition to that specified by the
+ Transaction field. This is used in acknowledging
+ multiple packet groups in streamed communication.
+
+Priority 4-bit identifier for priority for the processing of this
+ request both on transmission and reception. The
+ interpretation is:
+
+ 1100 urgent/emergency
+
+ 1000 important
+
+ 0000 normal
+
+ 0100 background
+
+ Viewing the higher-order bit as a sign bit (with 1
+ meaning negative), low values are high priority and high
+ values are low priority. The low-order 2 bits indicate
+ additional (lower) gradations for each level.
+
+Function Code: 1 bit - types of VMTP packets. If the low-order bit of
+ the function code is 0, the packet is sent to the
+ Server, else it is sent to the Client.
+
+ 0 Request
+
+ 1 Response
+
+Transaction: 32 bits:
+ Identifier for this message transaction.
+
+PacketDelivery: 32 bits:
+ Delivery indicates the segment blocks contained in this
+ packet. Each bit corresponds to one 512-octet block of
+ segment data. A 1 bit in the i-th bit position
+ (counting the LSB as 0) indicates the presence of the
+ i-th segment block.
+
+Server: 64 bits
+ Entity identifier for the server or server group
+ associated with this transaction. This is the receiver
+ when a Request packet and the sender when a Response
+ packet.
+
+
+
+Cheriton [page 41]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+Code: 32 bits The Request Code and Response Code, set either at the
+ user level or VMTP level depending on use and packet
+ type. Both the Request and Response codes include 8
+ high-order bits from the following set of control bits:
+
+ CMD Conditional Message Delivery - only deliver the request
+ or response if the receiving entity is waiting for it at
+ the time of delivery, otherwise drop the message.
+
+ DGM DataGram Message - indicates that the message is being
+ sent as a datagram. If a Request message, do not wait
+ for reply, or retransmit. If a Response message, treat
+ this message transaction as idempotent.
+
+ MDM Message Delivery Mask - indicates that the MsgDelivery
+ field is being used. Otherwise, the MsgDelivery field
+ is available for general use.
+
+ SDA Segment Data Appended - segment data is appended to the
+ message control block, with the total size of the
+ segment specified by the SegmentSize field. Otherwise,
+ the segment data is null and the SegmentSize field is
+ not used by VMTP and available for user- or RPC-level
+ uses.
+
+ CRE CoResident Entity - indicates that the CoResidentEntity
+ field in the message should be interpreted by VMTP.
+ Otherwise, this field is available for additional user
+ data.
+
+ MRD Multiple Responses Desired - multiple Responses are
+ desired to to this Request if it is multicast.
+ Otherwise, the VMTP module can discard subsequent
+ Responses after the first Response.
+
+ PIC Public Interface Code - Values for Code with this bit
+ set are reserved for definition by the VMTP
+ specification and other standard protocols defined on
+ top of VMTP.
+
+ RES Reserved for future use. Must be 0.
+
+CoResidentEntity
+ 64-bit Identifier for an entity or group of entities
+ with which the Server entity or entities must be
+ co-resident, i.e. route only to entities (identified by
+ Server) on the same host(s) as that specified by
+
+
+Cheriton [page 42]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ CoResidentEntity, Only meaningful if CRE is set in the
+ Code field.
+
+User Data 12 octets Space in the header for the VMTP user to
+ specify user-specific control and data.
+
+MsgDelivery: 32 bits
+ The segment blocks being transmitted (in total) in this
+ packet group following the conventions for the
+ PacketDelivery field. This field is ignored by the
+ protocol and treated as an additional user data field if
+ MDM is 0. On transmission, the user level sets the
+ MsgDelivery to indicate those portions of the segment to
+ be transmitted. On receipt, the MsgDelivery field is
+ modified by the VMTP module to indicate the segment data
+ blocks that were actually received before the message
+ control block is passed to the user or RPC level. In
+ particular, the kernel does not discard the packet group
+ if segment data blocks are missing. A Server or Client
+ entity receiving a message with a MsgDelivery in use
+ must check the field to ensure adequate delivery and
+ retry the operation if necessary.
+
+SegmentSize: 32 bits
+ Size of segment in octets, up to a maximum of 16
+ kilooctets without streaming and 4 megaoctets with
+ streaming, if SDA is set. Otherwise, this field is
+ ignored by the protocol and treated as an additional
+ user data field.
+
+Segment Data: 0-16 kilooctets
+ 0 octets if SDA is 0, else the portion of the segment
+ corresponding to the Delivery Mask, limited by the
+ SegmentSize and the MTU, padded out to a multiple of 64
+ bits.
+
+Checksum: 32 bits.
+ The 32-bit checksum for the header and segment data.
+
+
+The VMTP checksum algorithm <9> develops a 32-bit checksum by computing
+
+_______________
+
+<9> This algorithm and description are largely due to Steve Deering of
+Stanford University.
+
+
+Cheriton [page 43]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+two 16-bit, ones-complement sums (like IP), each covering different
+parts of the packet. The packet is divided into clusters of 16 16-bit
+words. The first, third, fifth,... clusters are added to the first sum,
+and the second, fourth, sixth,... clusters are added to the second sum.
+Addition stops at the end of the packet; there is no need to pad out to
+a cluster boundary (although it is necessary that the packet be an
+integral multiple of 64 bits; padding octets may have any value and are
+included in the checksum and in the transmitted packet). If either of
+the resulting sums is zero, it is changed to 0xFFFF. The two sums are
+appended to the transmitted packet, with the first sum being transmitted
+first. Four bytes of zero in place of the checksum may be used to
+indicate that no checksum was computed.
+
+The 16-bit, ones-complement addition in this algorithm is the same as
+used in IP and, therefore, subject to the same optimizations. In
+particular, the words may be added up 32-bits at a time as long as the
+carry-out of each addition is added to the sum on the following
+addition, using an "add-with-carry" type of instruction. (64-bit or
+128-bit additions would also work on machines that have registers that
+big.)
+
+A particular weakness of this algorithm (shared by IP) is that it does
+not detect the erroneous swapping of 16-bit words, which may easily
+occur due to software errors. A future version of VMTP is expected to
+include a more secure algorithm, but such an algorithm appears to
+require hardware support for efficient execution.
+
+Not all of these fields are used in every packet. The specific packet
+formats are described below. If a field is not mentioned in the
+description of a packet type, its use is assumed to be clear from the
+above description.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 44]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+3.3. Request Packet
+
+The Request packet (or packet group) is sent from the client to the
+server or group of servers to solicit processing plus the return of zero
+or more responses. A Request packet is identified by a 0 in the LSB of
+the fourth 32-bit word in the packet.
+
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ + Client (8 octets) +
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |Ver | |H|E|M| |
+ |sion | Domain |C|P|P| Length |
+ | | |O|G|G| |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |N|A|N|N|N|M|C|S|D|Retra|Forward| Inter- | |R|R|R| |
+ |R|P|S|E|R|D|M|T|R|nsmit| Count | Packet | Prior |E|E|E|0|
+ |S|G|R|R|T|G|G|I|T|Count| | Gap | -ity |S|S|S| |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Transaction |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | PacketDelivery |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ + Server (8 octets) +
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |C|D|M|S|R|C|M|P| |
+ |M|G|D|D|E|R|R|I| RequestCode |
+ |D|M|M|A|S|E|D|C| |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ + CoResidentEntity (8 octets) +
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ > User Data (12 octets) <
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | MsgDelivery |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | SegmentSize |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ > segment data, if any <
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Checksum |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ Figure 3-1: Request Packet Format
+
+The fields of the Request packet are set according to the semantics
+described in Section 3.2 with the following qualifications.
+
+
+Cheriton [page 45]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+InterPacketGap The estimated interpacket gap time the client would like
+ for the Response packet group to be sent by the Server
+ in responding to this Request.
+
+Transaction Identifier for transaction, at least one greater than
+ the previously issued Request from this Client.
+
+Server Server to which this Request is destined.
+
+RequestCode Request code for this request, indicating the operation
+ to perform.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 46]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+3.4. Response Packet
+
+The Response packet is sent from the Server to the Client in response to
+a Request, identified by a 1 in the LSB of the fourth 32-bit word in the
+packet.
+
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ + Client (8 octets) +
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |Ver | |H|E|M| |
+ |sion | Domain |C|P|P| Length |
+ | | |O|G|G| |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |N|A|N|N|N|R|C|S|R|Retra|Forward| | |R|R|R| |
+ |R|P|S|E|R|E|M|T|E|nsmit| Count | PGcount | Prior |E|E|E|1|
+ |S|G|R|R|T|S|G|I|S|Count| | | -ity |S|S|S| |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Transaction |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | PacketDelivery |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ + Server (8 octets) +
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ |C|D|M|S|R|R|R|R| |
+ |M|G|D|D|E|E|E|E| ResponseCode |
+ |D|M|M|A|S|S|S|S| |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ > UserData (20 octets) <
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | MsgDelivery |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Segment Size |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ > segment data, if any <
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Checksum |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ Figure 3-2: Response Packet Format
+
+The fields of the Response packet are set according to the semantics
+described in Section 3.2 with the following qualifications.
+
+Client, Version, Domain, Transaction
+ Match those in the Request packet group to which this is
+
+
+Cheriton [page 47]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ a response.
+
+STI 1 if this Response is using one or more of the
+ transaction identifiers skipped by the Client after the
+ Request to which this is a Response. STI in the Request
+ essentially allocates up to 256 transaction identifiers
+ for the Server to use in a run of Response packet
+ groups.
+
+RetransmitCount The retransmit count from the last Request packet
+ received to which this is a response.
+
+ForwardCount The number of times the corresponding Request was
+ forwarded before this Response was generated.
+
+PGcount The number of consecutively previous packet groups that
+ this response is acknowledging in addition to the one
+ identified by the Transaction identifier.
+
+Server Server sending this response. This may differ from that
+ originally specified in the Request packet if the
+ original Server was a server group, or the request was
+ forwarded.
+
+The next two chapters describes the protocol operation using these
+packet formats, with the the Client and the Server portions described
+separately.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 48]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+4. Client Protocol Operation
+
+This chapter describes the operation of the client portion of VMTP in
+terms of the procedures for handling VMTP user events, packet reception
+events, management operations and timeout events. Note that the client
+portion of VMTP is separable from the server portion. It is feasible to
+have a node that only implements the client end of VMTP.
+
+To simplify the description, we define a client state record (CSR) plus
+some standard utility routines.
+
+
+4.1. Client State Record Fields
+
+In the following protocol description, there is one client state record
+(CSR) per (client,transaction) outstanding message transaction. Here is
+a suggested set of fields.
+
+Link Link to next CSR when queued in one of the transmission,
+ timeout or message queues.
+
+QueuePtr Pointer to queue head in which this CSR is contained or
+ NULL if none. Queue could be one of transmission queue,
+ timeout queue, server queue or response queue.
+
+ProcessIdentification
+ The process identification and address space.
+
+Priority Priority for processing, network service, etc.
+
+State One of the client states described below.
+
+FinishupFunc Procedure to be executed on the CSR when it is completes
+ its processing in transmission or timeout queues.
+
+TimeoutCount Time to remain in timeout queue.
+
+TimeoutLimit User-specified time after which the message transaction
+ is aborted. The timeout is infinite if set to zero.
+
+RetransCount Number of retransmissions since last hearing from the
+ Server.
+
+LastTransmitTime
+ The time at which the last packet was sent. This field
+ is used to calculate roundtrip times, using the
+ RetransmitCount to match the responding packet to a
+
+
+Cheriton [page 49]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ particular transmission. I.e. Response or management
+ NotifyVmtpClient operation to Request and a management
+ NotifyVmtpServer operation to a Response.
+
+TimetoLive Time to live to be used on transmission of IP packets.
+
+TransmissionMask
+ Bit mask indicating the portions of the segment to
+ transmit. Set before entering the transmission queue
+ and cleared incrementally as the 512-byte segment blocks
+ of the segment are transmitted.
+
+LocalClientLink Link to next CSR hashing to same hash index in the
+ ClientMap.
+
+LocalClient Entity identifier for client when this CSR is used to
+ send a Request packet.
+
+LocalTransaction
+ Transaction identifier for current message transaction
+ the local client has outstanding.
+
+LocalPrincipal Account identification, possibly including key and key
+ timeout.
+
+LocalDelivery Bit mask of segment blocks that have not been
+ acknowledged in the Request or have been received in the
+ Response, depending on the state.
+
+ResponseQueue Queue of CSR's representing the queued Responses for
+ this entity.
+
+VMTP Header Prototype VMTP header, used to generate and store the
+ header portion of a Request for transmission and
+ retransmission on timeout.
+
+SegmentDesc Description of the segment data associated with the CSR,
+ either the area storing the original Request data, the
+ area for receiving Request data, or the area storing the
+ Response data that is returned.
+
+HostAddr The network or internetwork host address to which the
+ Client last transmitted. This field also indicates the
+ type of the address, e.g. IP, Ethernet, etc.
+
+Note: the CSR can be combined with a light-weight process descriptor
+with considerable benefit if the process is designed to block when it
+
+
+Cheriton [page 50]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+issues a message transaction. In particular, by combining the two
+descriptors, the implementation saves time because it only needs to
+locate and queue one descriptor with various operations (rather than
+having to locate two descriptors). It also saves space, given that the
+VMTP header prototype provides space such as the user data field which
+may serve to store processor state for when the process is preempted.
+Non-preemptive blocking can use the process stack to store the processor
+state so only a program counter and stack pointer may be required in the
+process descriptor beyond what we have described. (This is the approach
+used in the V kernel.)
+
+
+4.2. Client Protocol States
+
+A Client State Record records the state of message transaction generated
+by this host, identified by the (Client, Transaction) values in the CSR.
+As a client originating a transaction, it is in one of the following
+states.
+
+AwaitingResponse
+ Waiting for a Response packet group to arrive with the
+ same (Client,Transaction) identification.
+
+ReceivingResponse
+ Waiting for additional packets in the Response packet
+ group it is currently receiving.
+
+"Other" Not waiting for a response, which can be Processing or
+ some other operating system state, or one of the Server
+ states if it also acts as a server.
+
+This covers all the states for a client.
+
+
+4.3. State Transition Diagrams
+
+The client state transitions are illustrated in Figure 4-1. The client
+goes into the state AwaitingResponse on sending a request unless it is a
+datagram request. In the AwaitingResponse state, it can timeout and
+retry and eventually give up and return to the processing state unless
+it receives a Response. (A NotifyVmtpClient operation resets the
+timeout but does not change the state.) On receipt of a single packet
+response, it returns to the processing state. Otherwise, it goes to
+ReceivingResponse state. After timeout or final response packet is
+received, the client returns to the processing state. The processing
+state also includes any other state besides those associated with
+issuing a message transaction.
+
+
+Cheriton [page 51]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ +------------+
+ | Processing |<--------------------|
+ | |<-------------| |
+ | |<---| | |
+ +|------^--^-+ Single Last |
+ Transmit | | Packet Response |
+ | | | Response Packet |
+ | | | | | |
+ +-DGM->+ Timeout | | Final timeout
+ | | | | |
+ +V-----------+ | +-----------+
+ | Awaiting |----+ | Receiving |->Response-+
+ | Response |->Response->| Response | |
+ | | (multi- | |<----------+
+ +-|--------^-+ packet) +----------^+
+ V | | |
+ +-Timeout+ +>Timeout+
+
+ Figure 4-1: Client State Transitions
+
+
+4.4. User Interface
+
+The RPC or user interface to VMTP is implementation-dependent and may
+use systems calls, functions or some other mechanism. The list of
+requests that follow is intended to suggest the basic functionality that
+should be available.
+
+Send( mcb, timeout, segptr, segsize )
+ Initiate a message transaction to the server and request
+ message specified by mcb and return a response in mcb,
+ if it is received within the specified timeout period
+ (or else return USER_TIMEOUT in the Code field). The
+ segptr parameter specifies the location from which the
+ segment data is sent and the location into which the
+ response data is to be delivered. The segsize field
+ indicates the maximum length of this area.
+
+GetResponse( responsemcb, timeout, segptr, segsize )
+ Get the next response sent to this client as part of the
+ current message transaction, returning the segment data,
+ if any, into the memory specified by segptr and segsize.
+
+This interface assumes that there is a client entity associated with the
+invoking process that is to be used with these operations. Otherwise,
+the client entity must be specified as an additional parameter.
+
+
+
+Cheriton [page 52]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+4.5. Event Processing
+
+The following events may occur in the VMTP client:
+
+ - User Requests
+
+ * Send
+
+ * GetResponse
+
+ - Packet Arrival
+
+ * Response Packet
+
+ * Request
+
+ The minimal Client implementation handles Request packets for
+ its VMTP management (server) module and sends NotifyVmtpClient
+ requests in response to others, indicating the specified
+ server does not exist.
+
+ - Management Operation - NotifyVmtpClient
+
+ - Timeouts
+
+ * Client Retransmission Timeout
+
+The handling of these events is described in detail in the following
+subsections.
+
+We first describe some conventions and procedures used in the
+description. A field of the received packet is indicated as (for
+example) p.Transaction, for the Transaction field. Optional portions of
+the code, such as the streaming handling code are prefixed with a "|" in
+the first column.
+
+MapClient( client )
+ Return pointer to CSR for client with the specified
+ clientId, else NULL.
+
+SendPacketGroup( csr )
+ Send the packet group (Request, Response) according to
+ that specified by the CSR.
+
+NotifyClient( csr, p, code )
+ Invoke the NotifyVmtpClient operation with the
+ parameters csr.RemoteClient, p.control,
+
+
+Cheriton [page 53]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ csr.ReceiveSeqNumber, csr.RemoteTransaction and
+ csr.RemoteDelivery, and code. If csr is NULL, use
+ p.Client, p.Transaction and p.PacketDelivery instead and
+ the global ReceiveSequenceNumber, if supported. This
+ function simplifies the description over calling
+ NotifyVmtpClient directly in the procedural
+ specification below. (See Appendix III.)
+
+NotifyServer( csr, p, code )
+ Invoke the NotifyVmtpServer operation with the
+ parameters p.Server, csr.LocalClient,
+ csr.LocalTransaction, csr.LocalDelivery and code. Use
+ p.Client, P.Transaction and 0 for the clientId, transact
+ and delivery parameters if csr is NULL. This function
+ simplifies the description over calling NotifyVmtpServer
+ directly in the procedural specification below. (See
+ Appendix III.)
+
+DGMset(p) True if DGM bit set in packet (or csr) else False.
+ (Similar functions are used for other bits.)
+
+Timeout( csr, timeperiod, func )
+ Set or reset timer on csr record for timeperiod later
+ and invoke func if the timeout expires.
+
+
+4.6. Client User-invoked Events
+
+A user event occurs when a VMTP user application invokes one of the VMTP
+interface procedures.
+
+
+4.6.1. Send
+
+Send( mcb, timeout, segptr, segsize )
+ map to main CSR for this client.
+ increment csr.LocalTransaction
+ Init csr and check parameters and segment if any.
+ Set SDA if sending appended data.
+ Flush queued replies from previous transaction, if any.
+ if local non-group server then
+ deliver locally
+ await response
+ return
+ if GroupId(server) then
+ Check for and deliver to local members.
+ if CRE request and non-group local CR entity then
+
+
+Cheriton [page 54]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ await response
+ return
+ endif
+ set MDG if member of this group.
+ endif
+ clear csr.RetransCount
+ set csr.TransmissionMask
+ set csr.TimeLimit to timeout
+ set csr.HostAddr for csr.Server
+ SendPacketGroup( csr )
+ if DGMset(csr) then
+ return
+ endif
+ set csr.State to AwaitingResponse
+ Timeout( rootcsr, TC1(csr.Server), LocalClientTimeout )
+ return
+end Send
+
+Notes:
+
+ 1. Normally, the HostAddr is extracted from the ServerHost
+ cache, which maps server entity identifiers to host
+ addresses. However, on cache miss, the client first queries
+ the network using the ProbeEntity operation, as specified in
+ Appendix III, determining the host address from the Response.
+ The ProbeEntity operation is handled as a separate message
+ transaction by the Client.
+
+The stream interface incorporates a parameter to pass a responseHandler
+procedure that is invoked when the message transaction completes.
+
+StreamSend( mcb, timeout, segptr, segsize, responseHandler )
+ map to main CSR for this client.
+| Allocate a new csr if root in use.
+| lastcsr := First csr for last request.
+| if STIset(lastcsr)
+| csr.LocalTransaction := lastcsr.LocalTransaction + 256
+| else
+| csr.LocalTransaction := lastcsr.LocalTransaction + 1
+ Init csr and check parameters and segment if any.
+ . . . ( rest is the same as for the normal Send)
+
+Notes:
+
+ 1. Each outstanding message transaction is represented by a CSR
+ queued on the root CSR for this client entity. The root CSR
+ is used to handle timeouts, etc. On timeout, the last packet
+
+
+Cheriton [page 55]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ from the last packet group is retransmitted (with or without
+ the segment data).
+
+
+4.6.2. GetResponse
+
+GetResponse( req, timeout, segptr, segsize )
+ csr := CurrentCSR;
+ if responses queued then return next response
+ (in req, segptr to max of segsize )
+ if timeout is zero then return KERNEL_TIMEOUT error
+ set state to AWAITING_RESPONSE
+ Timeout( csr, timeout, ReturnKernelTimeout );
+end GetResponse
+
+Notes:
+
+ 1. GetResponse is only used with multicast Requests, which is
+ the only case in which multiple (different) Responses should
+ be received.
+
+ 2. A response must remain queued until the next message
+ transaction is invoked to filter out duplicates of this
+ response.
+
+ 3. If the response is incomplete (only relevant if a
+ multi-packet response), then the client may wait for the
+ response to be fully received, including issuing requests for
+ retransmission (using NotifyVmtpServer operations) before
+ returning the response.
+
+ 4. As an optimization, a response may be stored in the CSR of
+ the client. In this case, the response must be transferred
+ to a separate buffer (for duplicate suppression) before
+ waiting for another response. Using this optimization, a
+ response buffer is not allocated in the common case of the
+ client receiving only one response.
+
+
+4.7. Packet Arrival
+
+In general, on packet reception, a packet is mapped to the client state
+record, decrypted if necessary using the key in the CSR. It then has
+its checksum verified and then is transformed to the right byte order.
+The packet is then processed fully relative to its packet function code.
+It is discarded immediately if it is addressed to a different domain
+than the domain(s) in which the receiving host participates.
+
+
+Cheriton [page 56]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+For each of the 2 packet types, we assume a procedure called with a
+pointer p to the VMTP packet and psize, the size of the packet in
+octets. Thus, generic packet reception is:
+
+if not LocalDomain(p.Domain) then return;
+
+csr := MapClient( p.Client )
+
+if csr is NULL then
+ HandleNoCsr( p, psize )
+ return
+
+if Secure(p) then
+ if SecureVMTP not supported then
+ { Assume a Request. }
+ if not Multicast(p) then
+ NotifyClient(NULL, p, SECURITY_NOT_SUPPORTED )
+ return
+ endif
+| Decrypt( csr.Key, p, psize )
+
+if p.Checksum not null then
+ if not VerifyChecksum(p, psize) then return;
+if OppositeByteOrder(p) then ByteSwap( p, psize )
+if psize not equal sizeof(VmtpHeader) + 4*p.Length then
+ NotifyClient(NULL, p, VMTP_ERROR )
+ return
+Invoke Procedure[p.FuncCode]( csr, p, psize )
+Discard packet and return
+
+Notes:
+
+ 1. The Procedure[p.FuncCode] refers to one of the 2 procedures
+ corresponding to the two different packet types of VMTP,
+ Requests and Responses.
+
+ 2. In all the following descriptions, a packet is discarded on
+ "return" unless otherwise stated.
+
+ 3. The procedure HandleNoCSR is a management routine that
+ allocates and initializes a CSR and processes the packet or
+ else sends an error indication to the sender of the packet.
+ This procedure is described in greater detail in Section
+ 4.8.1.
+
+
+
+
+
+Cheriton [page 57]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+4.7.1. Response
+
+This procedure handles incoming Response packets.
+
+HandleResponse( csr, p, psize )
+ if not LocalClient( csr ) then
+ if Multicast then return
+| if Migrated( p.Client ) then
+| NotifyServer(csr, p ENTITY_MIGRATED )
+| else
+ NotifyServer(csr, p, ENTITY_NOT_HERE )
+ return
+ endif
+
+ if NSRset(p) then
+ if Streaming not supported then
+ NotifyServer(csr, p, STREAMING_NOT_SUPPORTED )
+ return STREAMED_RESPONSE
+| Find csr corresponding to p.Transaction
+| if none found then
+| NotifyServer(csr, p, BAD_TRANSACTION_ID )
+| return
+ else
+ if csr.LocalTransaction not equal p.Transaction then
+ NotifyServer(csr, p, BAD_TRANSACTION_ID )
+ return
+ endif
+ Locate reply buffer rb for this p.Server
+ if found then
+ if rb.State is not ReceivingResponse then
+ { Duplicate }
+ if APGset(p) or NERset(p) then
+ { Send Response to stop response packets. }
+ NotifyServer(csr, p, RESPONSE_DISCARDED )
+ endif
+ return
+ endif
+ { rb.State is ReceivingRequest}
+ if new segment data then retain in CSR segment area.
+ if packetgroup not complete then
+ Timeout( rb, TC3(p.Server), LocalClientTimeout )
+ return;
+ endif
+ goto EndPacketGroup
+ endif
+ { Otherwise, a new response message. }
+
+
+
+Cheriton [page 58]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ if (NSRset(p) or NERset(p)) and NoStreaming then
+ NotifyServer(csr, p, VMTP_ERROR )
+ return
+| if NSRset(p) then
+| { Check consecutive with previous packet group }
+| Find last packet group CSR from p.Server.
+| if p.Transaction not
+| lastcsr.RemoteTransaction+1 mod 2**32 then
+| { Out of order packet group }
+| NotifyServer(csr, p, BAD_TRANSACTION_ID)
+| return
+| endif
+| if lastcsr not completed then
+| NotifyServer(lastcsr, p, RETRY )
+| endif
+| if CMG(lastcsr) then
+| Add segment data to lastcsr Response
+| Notify lastcsr with new packet group.
+| Clear lastcsr.VerifyInterval
+| else
+| if lastcsr available then
+| use it for this packet group
+| else allocate and initialize new CSR
+| Save message and segment data in new CSR area.
+| endif
+| else { First packet group }
+ Allocate and init reply buffer rb for this response.
+ if allocation fails then
+ NotifyServer(csr, p, BUSY )
+ return
+ Set rb.State to ReceivingResponse
+ Copy message and segment data to rb's segment area
+ and set rb.PacketDelivery to that delivered.
+ Save p.Server host address in ServerHost cache.
+ endif
+ if packetgroup not complete then
+ Timeout( rb, TS1(p.Client), LocalClientTimeout )
+ return;
+ endif
+endPacketGroup:
+ { We have received last packet in packet group. }
+ if APGset(p) then NotifyServer(csr, p, OK )
+| if NERset(p) and CMGset(p) then
+| Queue waiting for continuation packet group.
+| Timeout( rb, TC2(rb.Server), LocalClientTimeout )
+| return
+| endif
+
+
+Cheriton [page 59]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ { Deliver response message. }
+ Deliver response to Client, or queue as appropriate.
+end HandleResponse
+
+Notes:
+
+ 1. The mechanism for handling streaming is optional and can be
+ replaced with the tests for use of streaming. Note that the
+ server should never stream at the Client unless the Client
+ has streamed at the Server or has used the STI control bit.
+ Otherwise, streamed Responses are a protocol error.
+
+ 2. As an optimization, a Response can be stored into the CSR for
+ the Client rather than allocating a separate CSR for a
+ response buffer. However, if multiple responses are handled,
+ the code must be careful to perform duplicate detection on
+ the Response stored there as well as those queued. In
+ addition, GetResponse must create a queued version of this
+ Response before allowing it to be overwritten.
+
+ 3. The handling of Group Responses has been omitted for brevity.
+ Basically, a Response is accepted if there has been a Request
+ received locally from the same Client and same Transaction
+ that has not been responded to. In this case, the Response
+ is delivered to the Server or queued.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 60]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+4.8. Management Operations
+
+VMTP uses management operations (invoked as remote procedure calls) to
+effectively acknowledge packet groups and request retransmissions. The
+following routine is invoked by the Client's management module on
+request from the Server.
+
+NotifyVmtpClient( clientId,ctrl,receiveSeqNumber,transact,delivery,code)
+ Get csr for clientId
+ if none then return
+ if RemoteClient( csr ) and not NotifyVmtpRemoteClient then
+ return
+| else (for streaming)
+| Find csr with same LocalTransaction as transact
+| if csr is NULL then return
+ if csr.State not AwaitingResponse then return
+ if ctrl.PGcount then ack previous packet groups.
+ select on code
+ case OK:
+ Notify ack'ed segment blocks from delivery
+ Clear csr.RetransCount;
+ Timeout( csr, TC1(csr.Server), LocalClientTimeout )
+ return
+ case RETRY:
+ Set csr.TransmissionMask to missing segment blocks,
+ as specified by delivery
+ SendPacketGroup( csr )
+ Timeout( csr, TC1(csr.Server), LocalClientTimeout )
+ case RETRY_ALL
+ Set csr.TransmissionMask to retransmit all blocks.
+ SendPacketGroup( csr )
+ Timeout( csr, TC1(csr.Server), LocalClientTimeout )
+| if streaming then
+| Restart transmission of packet groups,
+| starting from transact+1
+ return
+ case BUSY:
+ if csr.TimeLimit exceeded then
+ Set csr.Code to USER_TIMEOUT
+ return Response to application
+ return;
+ Set csr.TransmissionMask for full retransmission
+ Clear csr.RetransCount
+ Timeout( csr, TC1(csr.Server), LocalClientTimeout )
+ return
+ case ENTITY_MIGRATED:
+ Get new host address for entity
+
+
+Cheriton [page 61]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ Set csr.TransmissionMask for full retransmission
+ Clear csr.RetransCount
+ SendPacketGroup( csr )
+ Timeout( csr, TC1(csr.Server), LocalClientTimeout )
+ return
+
+ case STREAMING_NOT_SUPPORTED:
+ Record that server does not support streaming
+ if CMG(csr) then forget this packet group
+ else resend Request as separate packet group.
+ return
+ default:
+ Set csr.Code to code
+ return Response to application
+ return;
+ endselect
+end NotifyVmtpClient
+
+Notes:
+
+ 1. The delivery parameter indicates the segment blocks received
+ by the Server. That is, a 1 bit in the i-th position
+ indicates that the i-th segment block in the segment data of
+ the Request was received. All subsequent NotifyVmtpClient
+ operations for this transaction should be set to acknowledge
+ a superset of the segment blocks in this packet. In
+ particular, the Client need not be prepared to retransmit the
+ segment data once it has been acknowledged by a Notify
+ operation.
+
+
+4.8.1. HandleNoCSR
+
+HandleNoCSR is called when a packet arrives for which there is no CSR
+matching the client field of the packet.
+
+HandleNoCSR( p, psize )
+ if Secure(p) then
+ if SecureVMTP not supported then
+ { Assume a Request }
+ if not Multicast(p) then
+ NotifyClient(NULL,p,SECURITY_NOT_SUPPORTED)
+ return
+ endif
+ HandleRequestNoCSR( p, psize )
+ return
+ endif
+
+
+Cheriton [page 62]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ if p.Checksum not null then
+ if not VerifyChecksum(p, psize) then return;
+ if OppositeByteOrder(p) then ByteSwap( p, psize )
+ if psize not equal sizeof(VmtpHeader) + 4*p.Length then
+ NotifyClient(NULL, p, VMTP_ERROR )
+ return
+
+ if p.FuncCode is Response then
+| if Migrated( p.Client ) then
+| NotifyServer(csr, p ENTITY_MIGRATED )
+| else
+ NotifyServer(csr, p, NONEXISTENT_ENTITY )
+ return
+ endif
+
+ if p.FuncCode is Request then
+ HandleRequestNoCSR( p, psize )
+ return
+end HandleNoCSR
+
+Notes:
+
+ 1. The node need only check to see if the client entity has
+ migrated if in fact it supports migration of entities.
+
+ 2. The procedure HandleRequestNoCSR is specified in Section
+ 5.8.1. In the minimal client version, it need only handle
+ Probe requests and can do so directly without allocating a
+ new CSR.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 63]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+4.9. Timeouts
+
+A client with a message transaction in progress has a single timer
+corresponding to the first unacknowledged request message. (In the
+absence of streaming, this request is also the last request sent.) This
+timeout is handled as follows:
+
+LocalClientTimeout( csr )
+ select on csr.State
+ case AwaitingResponse:
+ if csr.RetransCount > MaxRetrans(csr.Server) then
+ terminate Client's message transactions up to
+ and including the current message transaction.
+ set return code to KERNEL_TIMEOUT
+ return
+ increment csr.RetransCount
+ Resend current packet group with APG set.
+ Timeout( csr, TC2(csr.Server), LocalClientTimeout )
+ return
+ case ReceivingResponse:
+ if DGMset(csr) or csr.RetransCount > Max then
+ if MDMset(csr) then
+ Set MCB.MsgDeliveryMask to blocks received.
+ else
+ Set csr.Code to BAD_REPLY_SEGMENT
+ return to user Client
+ endif
+ increment csr.RetransCount
+ NotifyServer with RETRY
+ Timeout( csr, TC3(csr.Server), LocalClientTimeout )
+ return
+ end select
+end LocalClientTimeout
+
+Notes:
+
+ 1. A Client can only request retransmission of a Response if the
+ Response is not idempotent. If idempotent, it must
+ retransmit the Request. The Server should generally support
+ the MsgDeliveryMask for Requests that it treats as idempotent
+ and that require multi-packet Responses. Otherwise, there is
+ no selective retransmission for idempotent message
+ transactions.
+
+ 2. The current packet group is the last one transmitted. Thus,
+ with streaming, there may be several packet groups
+ outstanding that precede the current packet group.
+
+
+Cheriton [page 64]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ 3. The Request packet group should be retransmitted without the
+ segment data, resulting in a single short packet in the
+ retransmission. The Server must then send a
+ NotifyVmtpClient with a RETRY or RETRY_ALL code to get the
+ segment data transmitted as needed. This strategy minimizes
+ the overhead on the network and the server(s) for
+ retransmissions.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 65]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+5. Server Protocol Operation
+
+This section describes the operation of the server portion of the
+protocol in terms of the procedures for handling VMTP user events,
+packet reception events and timeout events. Each server is assumed to
+implement the client procedures described in the previous chapter.
+(This is not strictly necessary but it simplifies the exposition.)
+
+
+5.1. Remote Client State Record Fields
+
+The CSR for a server is extended with the following fields, in addition
+to the ones listed for the client version.
+
+RemoteClient Identifier for remote client that sent the Request that
+ this CSR is handling.
+
+RemoteClientLink
+ Link to next CSR hashing to same hash index in the
+ ClientMap.
+
+RemoteTransaction
+ Transaction identifier for Request from remote client.
+
+RemoteDelivery The segment blocks received so far as part of a Request
+ or yet to be acknowledged as part of a Response.
+
+VerifyInterval Time interval since there was confirmation that the
+ remote Client was still valid.
+
+RemotePrincipal Account identification, possibly including key and key
+ timeout for secure communication.
+
+
+5.2. Remote Client Protocol States
+
+A CSR in the server end is in one of the following states.
+
+AwaitingRequest Waiting for a Request packet group. It may be marked as
+ waiting on a specific Client, or on any Client.
+
+ReceivingRequest
+ Waiting to receive additional Request packets in a
+ multi-packet group Request.
+
+Responded The Response has been sent and the CSR is timing out,
+ providing duplicate suppression and retransmission (if
+
+
+Cheriton [page 66]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ the Response was not idempotent).
+
+ResponseDiscarded
+ Response has been acknowledged or has timed out so
+ cannot be retransmitted. However, duplicates are still
+ filtered and CSR can be reused for new message
+ transaction.
+
+Processing Executing on behalf of the Client.
+
+Forwarded The message transaction has been forwarded to another
+ Server that is to respond directly to the Client.
+
+
+5.3. State Transition Diagrams
+
+The CSR state transitions in the server are illustrated in Figure 5-1.
+The CSR generally starts in the AwaitingRequest state. On receipt of a
+Request, the Server either has an up-to-date CSR for the Client or else
+it sends a Probe request (as a separate VMTP message transaction) to the
+VMTP management module associated with the Client. In the latter case,
+the processing of the Request is delayed until a Response to the Probe
+request is received. At that time, the CSR information is brought up to
+date and the Request is processed. If the Request is a single-packet
+request, the CSR is then set in the Processing state to handle the
+request. Otherwise (a multi-packet Request), the CSR is put into the
+ReceivingResponse state, waiting to receive subsequent Request packets
+that constitute the Request message. It exits the ReceivingRequest
+state on timeout or on receiving the last Request packet. In the former
+case, the request is delivered with an indication of the portion
+received, using the MsgDelivery field if MDM is set. After request
+processing is complete, either the Response is sent and the CSR enters
+the Responded state or the message transaction is forwarded and the CSR
+enters the Forwarded state.
+
+In the Responded state, if the Response is not marked as idempotent, the
+Response is retransmitted on receipt of a retransmission of the
+corresponding Request, on receipt of a NotifyVmtpServer operation
+requesting retransmission or on timeout at which time APG is set,
+requesting an acknowledgment from the Client. The Response is
+retransmitted some maximum number of times at which time the Response is
+discarded and the CSR is marked accordingly. If a Request or a
+NotifyVmtpServer operation is received expecting retransmission of the
+Response after the CSR has entered the ResponseDiscarded state, a
+NotifyVmtpClient operation is sent back (or invoked in the Client
+management module) indicating that the response was discarded unless the
+Request was multicast, in which case no action is taken. After a
+
+
+Cheriton [page 67]
+
+
+RFC 1045 VMTP February 1988
+
+
+ (Retransmit Forwarded Request and NotifyVmtpClient)
+ Request/
+ Ack/
+ +Timeout+
+ V |
+ +-|-------^-+
+ | |
+ +-Time-| Forwarded |<-------------+
+ | out +-----------+ |
+ | |
+ | (Retransmit Response) |
+ | Request |
+ V Ack |
+ | +-Timeout-+ |
+ | V | |
+ +---------+ Ack/ +|---------^+ |
+ +-Time-|Response |<-Timeout--| Responded | |
+ | out |Discarded| +----^------+ |
+ | +---------+ | |
+ | +------------+ | |
+ | | |->-Send Response-+ |
+ | | |->-forward Request--------+
+ +->| Processing |<----------------------+
+ | | |<----------------+ |
+ | | |<---| | |
+ | +-|--------^-+ | Last |
+ | Receive | | Request |
+ | | Timeout Single Packet |
+ | | | Packet | Timeout
+ | | | Request ^ ^
+ | | | ^ +|-----|--+
+ | +-V--------|-+ | |Receiving|<-+Time
+ +->| Awaiting |->--+->Request->| Request |--+ out
+ | Request | | (multi- +---------+
+ +------|-----+ ^ packet)
+ Request |
+ | Response
+ Send Probe to
+ | Probe
+ +---V----+ |
+ |Awaiting| ^
+ |Response|-->--+
+ |to Probe|
+ +--------+
+
+ Figure 5-1: Remote Client State Transitions
+
+timeout corresponding to the time required to filter out duplicates, the
+
+Cheriton [page 68]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+CSR returns either to the AwaitingRequest state or to the Processing
+state. Note that "Ack" refers to acknowledgment by a Notify operation.
+
+A Request that is forwarded leaves the CSR in the Forwarded state. In
+the Forwarded state, the forwarded Request is retransmitted
+periodically, expecting NotifyRemoteClient operations back from the
+Server to which the Request was forwarded, analogous to the Client
+behavior in the AwaitingResponse state. In this state, a
+NotifyRemoteClient from this Server acknowledges the Request or asks
+that it be retransmitted or reports an error. A retransmission of the
+Request from the Client causes a NotifyVmtpClient to be returned to the
+Client if APG is set. The CSR leaves the Forwarded state after timing
+out in the absence of NotifyRemoteClient operations from the forward
+Server or on receipt of a NotifyRemoteClient operation indicating the
+forward Server has sent a Response and received an acknowledgement. It
+then enters the ResponseDiscarded state.
+
+Receipt of a new Request from the same Client aborts the current
+transaction, independent of its state, and initiates a new transaction
+unless the new Request is part of a run of message transactions. If it
+is part of a run of message transactions, the handling follows the state
+diagram except the new Request is not Processed until there has been a
+response sent to the previous transaction.
+
+
+5.4. User Interface
+
+The RPC or user interface to VMTP is implementation-dependent and may
+use systems calls, functions or some other mechanism. The list of
+requests that follow is intended to suggest the basic functionality that
+should be available.
+
+AcceptMessage( reqmcb, segptr, segsize, client, transid, timeout )
+ Accept a new Request message in the specified reqmcb
+ area, placing the segment data, if any, in the area
+ described by segptr and segsize. This returns the
+ Server in the entityId field of the reqmcb and actual
+ segment size in the segsize parameters. It also returns
+ the Client and Transaction for this message transaction
+ in the corresponding parameters. This procedure
+ supports message semantics for request processing. When
+ a server process executes this call, it blocks until a
+ Request message has been queued for the server.
+ AcceptMessage returns after the specified timeout period
+ if a message has not been received by that time.
+
+RespondMessage( responsemcb, client, transid, segptr )
+
+
+Cheriton [page 69]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ Respond to the client with the specified response
+ message and segment, again with message semantics.
+
+RespondCall( responsemcb, segptr )
+ Respond to the client with the specified response
+ message and segment, with remote procedure call
+ semantics. This procedure does not return. The
+ lightweight process that executes this procedure is
+ matched to a stack, program counter, segment area and
+ priority from the information provided in a
+ ModifyService call, as specified in Appendix III.
+
+ForwardMessage( requestmcb, transid, segptr, segsize, forwardserver )
+ Forward the client to the specified forwardserver with
+ the request specified in mcb.
+
+ForwardCall( requestmcb, segptr, segsize, forwardserver )
+ Forward the client transaction to the specified
+ forwardserver with the request specified by requestmcb.
+ This procedure does not return.
+
+GetRemoteClientId()
+ Return the entityId for the remote client on whose
+ behave the process is executing. This is only
+ applicable in the procedure call model of request
+ handling.
+
+GetForwarder( client )
+ Return the entity that forwarded this Request, if any.
+
+GetProcess( client )
+ Return an identifier for the process associated with
+ this client entity-id.
+
+GetPrincipal( client )
+ Return the principal associated with this client
+ entity-id.
+
+
+5.5. Event Processing
+
+The following events may occur in VMTP servers.
+
+ - User Requests
+
+ * Receive
+
+
+
+Cheriton [page 70]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ * Respond
+
+ * Forward
+
+ * GetForwarder
+
+ * GetProcess
+
+ * GetPrincipal
+
+ - Packet Arrival
+
+ * Request Packet
+
+ - Management Operations
+
+ * NotifyVmtpServer
+
+ - Timeouts
+
+ * Client State Record Timeout
+
+The handling of these events is described in detail in the following
+subsections. The conventions of the previous chapter are followed,
+including the use of the various subroutines in the description.
+
+
+5.6. Server User-invoked Events
+
+A user event occurs when a VMTP server invokes one of the VMTP interface
+procedures.
+
+
+5.6.1. Receive
+
+AcceptMessage(reqmcb, segptr, segsize, client, transid, timeout)
+ Locate server's request queue.
+ if request is queued then
+ Remember CSR associated with this Request.
+ return Request in reqmcb, segptr and segsize
+ and client and transaction id.
+ Wait on server's request queue for next request
+ up time timeout seconds.
+end ReceiveCall
+
+Notes:
+
+
+
+Cheriton [page 71]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ 1. If a multi-packet Request is partially received at the time
+ of the AcceptMessage, the process waits until it completes.
+
+ 2. The behavior of a process accepting a Request as a
+ lightweight thread is similar except that the process
+ executes using the Request data logically as part of the
+ requesting Client process.
+
+
+5.6.2. Respond
+
+RespondCall is described as one case of the Respond transmission
+procedure; RespondMessage is similar.
+
+RespondCall( responsemcb, responsesegptr )
+ Locate csr for this client.
+ Check segment data accessible, if any
+ if local client then
+ Handle locally
+ return
+ endif
+ if responsemcb.Code is RESPONSE_DISCARDED then
+ Mark as RESPONSE_DISCARDED
+ return
+ SendPacketGroup( csr )
+ set csr.State to Responded.
+ if DGM reply then { Idempotent }
+ release segment data
+ Timeout( csr, TS4(csr.Client), FreeCsr );
+ else { Await acknowledgement or new Request else ask for ack. }
+ Timeout( csr, TS5(csr.Client), RemoteClientTimeout )
+end RespondCall
+
+Notes:
+
+ 1. RespondMessage is similar except the Server process must be
+ synchronized with the release of the segment data (if any).
+
+ 2. The non-idempotent Response with segment data is sent first
+ without a request for an acknowledgement. The Response is
+ retransmitted after time TS5(client) if no acknowledgment or
+ new Request is received from the client in the meantime. At
+ this point, the APG bit is sent.
+
+ 3. The MCB of the Response is buffered in the client CSR, which
+ remains for TS4 seconds, sufficient to filter old duplicates.
+ The segment data (if any) must be retained intact until: (1)
+
+
+Cheriton [page 72]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ after transmission if idempotent or (2) after acknowledged or
+ timeout has occurred if not idempotent. Techniques such as
+ copy-on-write might be used to keep a copy of the Response
+ segment data without incurring the cost of a copy.
+
+
+5.6.3. Forward
+
+Forwarding is logically initiating a new message transaction between the
+Server (now acting as a Client) and the server to which the Request is
+forwarded. When the second server returns a Response, the same Response
+is immediately returned to the Client. The forwarding support in VMTP
+preserves these semantics while providing some performance optimizations
+in some cases.
+
+ForwardCall( req, segptr, segsize, forwardserver )
+ Locate csr for this client.
+ Check segment data accessible, if any
+
+ if local client or Request was multicast or secure
+ or csr.ForwardCount == 15 then
+ Handle as a new Send operation
+ return
+ if forwardserver is local then
+ Handle locally
+ return
+ Set csr.funccode to Request
+ Increment csr.ForwardCount
+ Set csr.State to Responded
+ SendPacketGroup( csr ) { To ForwardServer }
+ Timeout( csr, TS4(csr.Client), FreeAlien )
+end ForwardCall
+
+Notes:
+
+ 1. A Forward is logically a new call or message transaction. It
+ must be really implemented as a new message transaction if
+ the original Request was multicast or secure (with the
+ optional further refinement that it can be used with a secure
+ message transaction when the Server and ForwardServer are the
+ same principal and the Request was not multicast).
+
+ 2. A Forward operation is never handled as an idempotent
+ operation because it requires knowledge that the
+ ForwardServer will treat the forwarded operation as
+ idempotent as well. Thus, a Forward operation that includes
+ a segment should set APG on the first transmission of the
+
+
+Cheriton [page 73]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ forwarded Request to get an acknowledgement for this data.
+ Once the acknowledgement is received, the forwarding Server
+ can discard the segment data, leaving only the basic CSR to
+ handle retransmissions from the Client.
+
+
+5.6.4. Other Functions
+
+GetRemoteClient is a simple local query of the CSR. GetProcess and
+GetPrincipal also extract this information from the CSR. A server
+module may defer the Probe callback to the Client to get that
+information until it is requested by the Server (assuming it is not
+using secure communication and duplicate suppression is adequate without
+callback.) GetForwarder is implemented as a callback to the Client,
+using a GetRequestForwarder VMTP management operation. Additional
+management procedures for VMTP are described in Appendix III.
+
+
+5.7. Request Packet Arrival
+
+The basic packet reception follows that described for the Client
+routines. A Request packet is handled by the procedure HandleRequest.
+
+HandleRequest( csr, p, psize )
+
+ if LocalClient(csr) then
+ { Forwarded Request on local Client }
+ if csr.LocalTransaction != p.Transaction then return
+ if csr.State != AwaitingResponse then return
+ if p.ForwardCount < csr.ForwardCount then
+ Discard Request and return.
+ Find a CSR for Client as a remote Client.
+ if not found then
+ if packet group complete then
+ handle as a local message transaction
+ return
+ Allocate and init CSR
+ goto newTransaction
+ { Otherwise part of current transaction }
+ { Handle directly below. }n
+ if csr.RemoteTransaction = p.Transaction then
+ { Matches current transaction }
+ if OldForward(p.ForwardCount,csr.ForwardCount) then
+ return
+ if p.ForwardCount > csr.ForwardCount then
+ { New forwarded transaction }
+ goto newTransaction
+
+
+Cheriton [page 74]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ { Otherwise part of current transaction }
+ if csr.State = ReceivingRequest then
+ if new segment data then retain in CSR segment area.
+ if Request not complete then
+ Timeout( csr, TS1(p.Client), RemoteClientTimeout )
+ return;
+ endif
+ goto endPacketGroup
+ endif
+ if csr.State is Responded then
+ { Duplicate }
+ if csr.Code is RESPONSE_DISCARDED
+ and Multicast(p) then
+ return
+ endif
+ if not DGM(csr) then { Not idempotent }
+ if SegmentData(csr) then set APG
+ { Resend Response or Request, if Forwarded }
+ SendPacketGroup( csr )
+ timeout=if SegmentData(csr) then TS5(csr.Client)
+ else TS4(csr.Client)
+ Timeout( csr, timeout, RemoteClientTimeout )
+ return
+ { Else idempotent - fall thru to newTransaction }
+ else { Presume it is a retransmission }
+ NotifyClient( csr, p, OK )
+ return
+ else if OldTransaction(csr.RemoteTransact,p.Transaction) then
+ return
+ { Otherwise, a new message transaction. }
+newTransaction:
+ Abort handling of previous transactions for this Client.
+
+ if (NSRset(p) or NERset(p)) and NoStreaming then
+ NotifyClient( csr, p, STREAMING_NOT_SUPPORTED )
+ return
+| if NSRset(p) then { Streaming }
+| { Check that consecutive with previous packet group }
+| Find last packet group CSR from this client.
+| if p.Transaction not lastcsr.RemoteTransaction+1 mod 2**32
+| and not STIset(lastcsr) or
+| p.Transaction not lastcsr.RemoteTransaction+256 mod **32
+| then
+| { Out of order packet group }
+| NotifyClient(csr, p, BAD_TRANSACTION_ID )
+| return
+| endif
+
+
+Cheriton [page 75]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+| if lastcsr not completed then
+| NotifyClient( lastcsr, p, RETRY )
+| endif
+| if lastcsr available then use it for this packet group
+| else allocate and initialize new CSR
+| if CMG(lastcsr) then
+| Add segment data to lastcsr Request
+| Keep csr as record of this packet group.
+| Clear lastcsr.VerifyInterval
+| endif
+| else { First packet group }
+ if MultipleRemoteClients(csr) then ScavengeCsrs(p.Client)
+ Set csr.RemoteTransaction, csr.Priority
+ Copy message and segment data to csr's segment area
+ and set csr.PacketDelivery to that delivered.
+ Clear csr.PacketDelivery
+ Clear csr.VerifyInterval
+ SaveNetworkAddress( csr, p )
+ endif
+ if packetgroup not complete then
+ Timeout( csr, TS3(p.Client), RemoteClientTimeout )
+ return;
+ endif
+endPacketGroup:
+ { We have received complete packet group. }
+ if APG(p) then NotifyClient( csr, p, OK )
+ endif
+| if NERset(p) and CMG(p) then
+| Queue waiting for continuation packet group.
+| Timeout( csr, TS3(csr.Client), RemoteClientTimeout )
+| return
+| endif
+ { Deliver request message. }
+ if GroupId(csr.Server) then
+ For each server identified by csr.Server
+ Replicate csr and associated data segment.
+ if CMDset(csr) and Server busy then
+ Discard csr and data
+ else
+ Deliver or invoke csr for each Server.
+ if not DGMset(csr) then queue for Response
+ else Timeout( csr, TS4(csr.Client), FreeCsr )
+ endfor
+ else
+ if CMDset(csr) and Server busy then
+ Discard csr and data
+ else
+
+
+Cheriton [page 76]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ Deliver or invoke csr for this server.
+ if not DGMset(csr) then queue for Response
+ else Timeout( csr, TS4(csr.Client), FreeCsr )
+ endif
+end HandleRequest
+
+Notes:
+
+ 1. A Request received that specifies a Client that is a local
+ entity should be a Request forwarded by a remote server to a
+ local Server.
+
+ 2. An alternative structure for handling a Request sent to a
+ group when there are multiple local group members is to
+ create a remote CSR for each group member on reception of the
+ first packet and deliver a copy of each packet to each such
+ remote CSR as each packet arrives.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 77]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+5.8. Management Operations
+
+VMTP uses management operations (invoked as remote procedure calls) to
+effectively acknowledge packet groups and request retransmissions. The
+following routine is invoked by the Server's management module on
+request from the Client.
+
+NotifyVmtpServer(server,clientId,transact,delivery,code)
+ Find csr with same RemoteTransaction and RemoteClient
+ as clientId and transact.
+ if not found or csr.State not Responded then return
+ if DGMset(csr) then
+ if transmission of Response in progress then
+ Abort transmission
+ if code is migrated then
+ restart transmission with new host addr.
+ if Retry then Report protocol error
+ return
+ endif
+ select on code
+ case RETRY:
+ if csr.RetransCount > MaxRetrans(clientId) then
+ if response data segment then
+ Discard data and mark as RESPONSE_DISCARDED
+| if NERset(csr) and subsequent csr then
+| Deallocate csr and use later csr for
+| future duplicate suppression
+| endif
+ return
+ endif
+ increment csr.RetransCount
+ Set csr.TransmissionMask to missing segment blocks,
+ as specified by delivery
+ SendPacketGroup( csr )
+ Timeout( csr, TS3(csr.Client), RemoteClientTimeout )
+ case BUSY:
+ if csr.TimeLimit exceeded then
+ if response data segment then
+ Discard data and mark as RESPONSE_DISCARDED
+| if NERset(csr) and subsequent csr then
+| Deallocate csr and use later csr for
+| future duplicate suppression
+| endif
+ endif
+ endif
+ Set csr.TransmissionMask for full retransmission
+ Clear csr.RetransCount
+
+
+Cheriton [page 78]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ Timeout( csr, TS3(csr.Server), RemoteClientTimeout )
+ return
+
+ case ENTITY_MIGRATED:
+ Get new host address for entity
+ Set csr.TransmissionMask for full retransmission
+ Clear csr.RetransCount
+ SendPacketGroup( csr )
+ Timeout( csr, TS3(csr.Server), RemoteClientTimeout )
+ return
+
+ case default:
+ Abort transmission of Response if in progress.
+ if response data segment then
+ Discard data and mark as RESPONSE_DISCARDED
+ if NERset(csr) and subsequent csr then
+ Deallocate csr and use later csr for
+ future duplicate suppression
+ endif
+ return
+ endselect
+end NotifyVmtpServer
+
+Notes:
+
+ 1. A NotifyVmtpServer operation requesting retransmission of
+ the Response is acceptable only if the Response was not
+ idempotent. When the Response is idempotent, the Client must
+ be prepared to retransmit the Request to effectively request
+ retransmission of the Response.
+
+ 2. A NotifyVmtpServer operation may be received while the
+ Response is being transmitted. If an error return, as an
+ efficiency, the transmission should be aborted, as suggested
+ when the Response is a datagram.
+
+ 3. A NotifyVmtpServer operation indicating OK or an error
+ allows the Server to discard segment data and not provide for
+ subsequent retransmission of the Response.
+
+
+5.8.1. HandleRequestNoCSR
+
+When a Request is received from a Client for which the node has no CSR,
+the node allocates and initializes a CSR for this Client and does a
+callback to the Client's VMTP management module to get the Principal,
+Process and other information associated with this Client. It also
+
+
+Cheriton [page 79]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+checks that the TransactionId is correct in order to filter out
+duplicates.
+
+HandleRequestNoCSR( p, psize )
+| if Secure(p) then
+| Allocate and init CSR
+| SaveSourceHostAddr( csr, p )
+| ProbeRemoteClient( csr, p, AUTH_PROBE )
+| if no response or error then
+| delete CSR
+| return
+| Decrypt( csr.Key, p, psize )
+| if p.Checksum not null then
+| if not VerifyChecksum(p, psize) then return;
+| if OppositeByteOrder(p) then ByteSwap( p, psize )
+| if psize not equal sizeof(VmtpHeader) + 4*p.Length then
+| NotifyClient(NULL, p, VMTP_ERROR )
+| return
+| HandleRequest( csr, p, psize )
+| return
+ if Server does not exist then
+ NotifyClient( csr, p, NONEXISTENT_ENTITY )
+ return
+ endif
+ if security required by server then
+ NotifyClient(csr, p, SECURITY_REQUIRED )
+ return
+ endif
+ Allocate and init CSR
+ SaveSourceHostAddr( csr, p );
+ if server requires Authentication then
+ ProbeRemoteClient( csr, p, AUTH_PROBE )
+ if no response or error then
+ delete CSR
+ return
+ endif
+ { Setup immediately as a new message transaction }
+ set csr.Server to p.Server
+ set csr.RemoteTransaction to p.Transaction-1
+
+ HandleRequest( csr, p, psize )
+ endif
+
+Notes:
+
+ 1. A Probe request is always handled as a Request not requiring
+ authentication so it never generates a callback Probe to the
+
+
+Cheriton [page 80]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ Client.
+
+ 2. If the Server host retains remote client CSR's for longer
+ than the maximum packet lifetime and the Request
+ retransmission time, and the host has been running for at
+ least that long, then it is not necessary to do a Probe
+ callback unless the Request is secure. A Probe callback can
+ take place when the Server asks for the Process or
+ PrincipalId associated with the Client.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 81]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+5.9. Timeouts
+
+The server must implement a timeout for remote client CSRs. There is a
+timeout for each CSR in the server.
+
+RemoteClientTimeout( csr )
+ select on csr.State
+ case Responded:
+ if RESPONSE_DISCARDED then
+ mark as timed out
+ Make a candidate for reuse.
+ return
+ if csr.RetransCount > MaxRetrans(Client) then
+ discard Response
+ mark CSR as RESPONSE_DISCARDED
+ Timeout(csr, TS4(Client), RemoteClientTimeout)
+ return
+ increment csr.RetransCount
+ { Retransmit Response or forwarded Request }
+ Set APG to get acknowledgement.
+ SendPacketGroup( csr )
+ Timeout( csr, TS3(Client), RemoteClientTimeout )
+ return
+ case ReceivingRequest:
+ if csr.RetransCount > MaxRetrans(csr.Client)
+ or DGMset(csr) or NRTset(csr) then
+ Modify csr.segmentSize and csr.MsgDelivery
+ to indicate packets received.
+ if MDMset(csr) then
+ Invoke processing on Request
+ return
+ else
+ discard Request and reuse CSR
+ (Note: Need not remember Request discarded.)
+ return
+ increment csr.RetransCount
+ NotifyClient( csr, p, RETRY )
+ Timeout( csr, TS3(Client), RemoteClientTimeout )
+ return
+ default:
+ Report error - invalid state for RemoteClientTimeout
+ endselect
+end RemoteClientTimeout
+
+Notes:
+
+ 1. When a CSR in the Responded state times out after discarding
+
+
+Cheriton [page 82]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ the Response, it can be made available for reuse, either by
+ the same Client or a different one. The CSR should be kept
+ available for reuse by the Client for as long as possible to
+ avoid unnecessary callback Probes.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 83]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+6. Concluding Remarks
+
+This document represents a description of the current state of the VMTP
+design. We are currently engaged in several experimental
+implementations to explore and refine all aspects of the protocol.
+Preliminary implementations are running in the UNIX 4.3BSD kernel and in
+the V kernel.
+
+Several issues are still being discussed and explored with this
+protocol. First, the size of the checksum field and the algorithm to
+use for its calculation are undergoing some discussion. The author
+believes that the conventional 16-bit checksum used with TCP and IP is
+too weak for future high-speed networks, arguing for at least a 32-bit
+checksum. Unfortunately, there appears to be limited theory covering
+checksum algorithms that are suitable for calculation in software.
+
+Implementation of the streaming facilities of VMTP is still in progress.
+This facility is expected to be important for wide-area, long delay
+communication.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 84]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+I. Standard VMTP Response Codes
+
+The following are the numeric values of the response codes used in VMTP.
+
+0 OK
+
+1 RETRY
+
+2 RETRY_ALL
+
+3 BUSY
+
+4 NONEXISTENT_ENTITY
+
+5 ENTITY_MIGRATED
+
+6 NO_PERMISSION
+
+7 NOT_AWAITING_MSG
+
+8 VMTP_ERROR
+
+9 MSGTRANS_OVERFLOW
+
+10 BAD_TRANSACTION_ID
+
+11 STREAMING_NOT_SUPPORTED
+
+12 NO_RUN_RECORD
+
+13 RETRANS_TIMEOUT
+
+14 USER_TIMEOUT
+
+15 RESPONSE_DISCARDED
+
+16 SECURITY_NOT_SUPPORTED
+
+17 BAD_REPLY_SEGMENT
+
+18 SECURITY_REQUIRED
+
+19 STREAMED_RESPONSE
+
+20 TOO_MANY_RETRIES
+
+21 NO_PRINCIPAL
+
+
+Cheriton [page 85]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+22 NO_KEY
+
+23 ENCRYPTION_NOT_SUPPORTED
+
+24 NO_AUTHENTICATOR
+
+25-63 Reserved for future VMTP assignment.
+
+Other values of the codes are available for use by higher level
+protocols. Separate protocol documents will specify further standard
+values.
+
+Applications are free to use values starting at 0x00800000 (hex) for
+application-specific return values.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 86]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+II. VMTP RPC Presentation Protocol
+
+For complete generality, the mapping of the procedures and the
+parameters onto VMTP messages should be defined by a RPC presentation
+protocol. In the absence of an accepted standard protocol, we define an
+RPC presentation protocol for VMTP as follows.
+
+Each procedure is assigned an identifying Request Code. The Request
+code serves effectively the same as a tag field of variant record,
+identifying the format of the Request and associated Response as a
+variant of the possible message formats.
+
+The format of the Request for a procedure is its Request Code followed
+by its parameters sequentially in the message control block until it is
+full.
+
+The remaining parameters are sent as part of the message segment data
+formatted according to the XDR protocol (RFC ??). In this case, the
+size of the segment is specified in the SegmentSize field.
+
+The Response for a procedure consists of a ResponseCode field followed
+by the return parameters sequentially in the message control block,
+except if there is a parameter returned that must be transmitted as
+segment data, its size is specified in the SegmentSize field and the
+parameter is stored in the SegmentData field.
+
+Attributes associated with procedure definitions should indicate the
+Flags to be used in the RequestCode. Request Codes are assigned as
+described below.
+
+
+II.1. Request Code Management
+
+Request codes are divided into Public Interface Codes and
+application-specific, according to whether the PIC value is set. An
+interface is a set of request codes representing one service or module
+function. A public interface is one that is to be used in multiple
+independently developed modules. In VMTP, public interface codes are
+allocated in units of 256 structured as
+
+ +-------------+----------------+-------------------+
+ | ControlFlags| Interface | Version/Procedure |
+ +-------------+----------------+-------------------+
+ 8 bits 16 bits 8 bits
+
+An interface is free to allocate the 8 bits for version and procedure as
+desired. For example, all 8 bits can be used for procedures. A module
+requiring more than 256 Version/Procedure values can be allocated
+
+Cheriton [page 87]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+multiple Interface values. They need not be consecutive Interface
+values.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 88]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+III. VMTP Management Procedures
+
+Standard procedures are defined for VMTP management, including creation,
+deletion and query of entities and entity groups, probing to get
+information about entities, and updating message transaction information
+at the client or the server.
+
+The procedures are implemented by the VMTP manager that constitutes a
+portion of every complete VMTP module. Each procedure is invoked by
+sending a Request to the VMTP manager that handles the entity specified
+in the operation or the local manager. The Request sent using the
+normal Send operation with the Server specified as the well-known entity
+group VMTP_MANGER_GROUP, using the CoResident Entity mechanism to direct
+the request to the specific manager that should handle the Request.
+(The ProbeEntity operation is multicast to the VMTP_MANAGER_GROUP if the
+host address for the entity is not known locally and the host address is
+determined as the host address of the responder. For all other
+operations, a ProbeEntity operation is used to determine the host
+address if it is not known.) Specifying co-resident entity 0 is
+interpreted as the co-resident with the invoking process. The
+co-resident entity identifier may also specify a group in which case,
+the Request is sent to all managers with members in this group.
+
+The standard procedures with their RequestCode and parameters are listed
+below with their semantics. (The RequestCode range 0xVV000100 to
+0xVV0001FF is reserved for use by the VMTP management routines, where VV
+is any choice of control flags with the PIC bit set. The flags are set
+below as required for each procedure.)
+
+0x05000101 - ProbeEntity(CREntity, entityId, authDomain) -> (code,
+ <staterec>)
+ Request and return information on the specified entity
+ in the specified authDomain, sending the Request to the
+ VMTP management module coresident with CREntity. An
+ error return is given if the requested information
+ cannot be provided in the specified authDomain. The
+ <staterec> returned is structured as the following
+ fields.
+
+ Transaction identifier
+ The current or next transaction
+ identifier being used by the probed
+ entity.
+
+ ProcessId: 64 bits
+ Identifier for client process. The
+ meaning of this is specified as part of
+
+
+Cheriton [page 89]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ the Domain definition.
+
+ PrincipalId The identifier for the principal or
+ account associated with the process
+ specified by ProcessId. The meaning of
+ this field is specified as part of the
+ Domain definition.
+
+ EffectivePrincipalId
+ The identifier for the principal or
+ account associated with the Client port,
+ which may be different from the
+ PrincipalId especially if this is an
+ nested call. The meaning of this field
+ is specified as part of the Domain
+ definition.
+
+ The code field indicates whether this is an error
+ response or not. The codes and their interpretation
+ are:
+
+ OK
+ No error. Probe was completed OK.
+
+ NONEXISTENT_ENTITY
+ Specified entity does not exist.
+
+ ENTITY_MIGRATED
+ The entity has migrated and is no longer at the host to
+ which the request was sent.
+
+ NO_PERMISSION
+ Entity has refused to provide ProbeResponse.
+
+ VMTP_ERROR
+ The Request packet group was in error relative to the
+ VMTP protocol specification.
+
+ "default"
+ Some type of error - discard ProbeResponse.
+
+0x0D000102 - AuthProbeEntity(CREntity,entityId,authDomain,randomId) ->
+ (code,ProbeAuthenticator,EncryptType,EntityAuthenticator)
+
+ Request authentication of the entity specified by
+ entityId from the VMTP manager coresident with CREntity
+ in authDomain authentication domain, returning the
+
+
+Cheriton [page 90]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ information contained in the return parameters. The
+ fields are set the same as that specified for the basic
+ ProbeResponse except as noted below.
+
+ ProbeAuthenticator
+ 20 bytes consisting of the EntityId, the
+ randomId and the probed Entity's current
+ Transaction value plus a 32-bit checksum
+ for these two fields (checksummed using
+ the standard packet Checksum algorithm),
+ all encrypted with the Key supplied in
+ the Authenticator.
+
+ EncryptType An identifier that identifies the
+ variant of encryption method being used
+ by the probed Entity for packets it
+ transmits and packets it is able to
+ receive. (See Appendix V.) The
+ high-order 8 bits of the EncryptType
+ contain the XOR of the 8 octets of the
+ PrincipalId associated with private key
+ used to encrypt the EntityAuthenticator.
+ This value is used by the requestor or
+ Client as an aid in locating the key to
+ decrypt the authenticator.
+
+ EntityAuthenticator
+ (returned as segment data) The
+ ProcessId, PrincipalId,
+ EffectivePrincipal associated with the
+ ProbedEntity plus the private
+ encryption/decryption key and its
+ lifetime limit to be used for
+ communication with the Entity. The
+ authenticator is encrypted with a
+ private key associated with the Client
+ entity such that it can be neither read
+ nor forged by a party not trusted by the
+ Client Entity. The format of the
+ Authenticator in the message segment is
+ shown in detail in Figure III-1.
+
+ Key: 64 bits Encryption key to be used for encrypting
+ and decrypting packets sent to and
+ received from the probed Entity. This
+ is the "working" key for packet
+ transmissions. VMTP only uses private
+
+
+Cheriton [page 91]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ +-----------------------------------------------+
+ | ProcessId (8 octets) |
+ +-----------------------------------------------+
+ | PrincipalId (8 octets) |
+ +-----------------------------------------------+
+ | EffectivePrincipalId (8 octets) |
+ +-----------------------------------------------+
+ | Key (8 octets) |
+ +-----------------------------------------------+
+ | KeyTimeLimit |
+ +-----------------------------------------------+
+ | AuthDomain |
+ +-----------------------------------------------+
+ | AuthChecksum |
+ +-----------------------------------------------+
+
+ Figure III-1: Authenticator Format
+
+ key encryption for data transmission.
+
+ KeyTimeLimit: 32 bits
+ The time in seconds since Dec. 31st,
+ 1969 GMT at which one should cease to
+ use the Key.
+
+ AuthDomain: 32 bits
+ The authentication domain in which to
+ interpret the principal identifiers.
+ This may be different from the
+ authDomain specified in the call if the
+ Server cannot provide the authentication
+ information in the request domain.
+
+ AuthChecksum: 32 bits
+ Contains the checksum (using the same
+ Checksum algorithm as for packet) of
+ KeyTimeLimit, Key, PrincipalId and
+ EffectivePrincipalId.
+
+ Notes:
+
+ 1. A authentication Probe Request and Response
+ are sent unencrypted in general because it is
+ used prior to there being a secure channel.
+ Therefore, specific fields or groups of
+ fields checksummed and encrypted to prevent
+ unauthorized modification or forgery. In
+
+
+Cheriton [page 92]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ particular, the ProbeAuthenticator is
+ checksummed and encrypted with the Key.
+
+ 2. The ProbeAuthenticator authenticates the
+ Response as responding to the Request when
+ its EntityId, randomId and Transaction values
+ match those in the Probe request. The
+ ProbeAutenticator is bound to the
+ EntityAutenticator by being encrypted by the
+ private Key contained in that authenticator.
+
+ 3. The authenticator is encrypted such that it
+ can be decrypted by a private key, known to
+ the Client. This authenticator is presumably
+ obtained from a key distribution center that
+ the Client trusts. The AuthChecksum prevents
+ undetected modifications to the
+ authenticator.
+
+0x05000103 - ProbeEntityBlock( entityId ) -> ( code, entityId )
+ Check whether the block of 256 entity identifiers
+ associated with this entityId are in use. The entityId
+ returned should match that being queried or else the
+ return value should be ignored and the operation redone.
+
+0x05000104 - QueryVMTPNode( entityId ) -> (code, MTU, flags, authdomain,
+ domains, authdomains, domainlist)
+ Query the VMTP management module for entityId to get
+ various module- or node-wide parameters, including: (1)
+ MTU - Maximum transmission unit or packet size handled
+ by this node. (2) flags- zero or more of the following
+ bit fields:
+
+ 1 Handles streamed Requests.
+
+ 2 Can issue streamed message transactions
+ for clients.
+
+ 4 Handles secure Requests.
+
+ 8 Can issue secure message transactions.
+
+ The authdomain indicates the primary authentication
+ domain supported. The domains and authdomains
+ parameters indicate the number of entity domains and
+ authentication domains supported by this node, which are
+ listed in the data segment parameter domainlist if
+
+
+Cheriton [page 93]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ either parameter is non-zero. (All the entity domains
+ precede the authentication domains in the data segment.)
+
+0x05000105 - GetRequestForwarder( CREntity, entityId1 ) -> (code,
+ entityId2, principal, authDomain)
+ Return the forwarding server's entity identifer and
+ principal for the forwarder of entityId1. CREntity
+ should be zero to get the local VMTP management module.
+
+0x05000106 - CreateEntity( entityId1 ) -> ( code, entityId2 )
+ Create a new entity and return its entity identifier in
+ entityId2. The entity is created local to the entity
+ specified in entityId1 and local to the requestor if
+ entityId1 is 0.
+
+0x05000107 - DeleteEntity( entityId ) -> ( code )
+ Delete the entity specified by entityId, which may be a
+ group. If a group, the deletion is only on a best
+ efforts basis. The client must take additional measures
+ to ensure complete deletion if required.
+
+0x0D000108 -QueryEntity( entityId ) -> ( code, descriptor )
+ Return a descriptor of entityId in arg of a maximum of
+ segmentSize bytes.
+
+0x05000109 - SignalEntity( entityId, arg )->( code )
+ Send the signal specified by arg to the entity specified
+ by entityId. (arg is 32 bits.)
+
+0x0500010A - CreateGroup(CREntity,entityGroupId,entityId,perms)->(code)
+ Request that the VMTP manager local to CREntity create
+ an new entity group, using the specified entityGroupId
+ with entityId as the first member and permissions
+ "perms", a 32-bit field described later. The invoker is
+ registered as a manager of the new group, giving it the
+ permissions to add or remove members. (Normally
+ CREntity is 0, indicating the VMTP manager local to the
+ requestor.)
+
+0x0500010B - AddToGroup(CREntity, entityGroupId, entityId,
+ perms)->(code)
+ Request that the VMTP manager local to CREntity add the
+ specified entityId to the entityGroupId with the
+ specified permissions. If entityGroupId specifies a
+ restricted group, the invoker must have permission to
+ add members to the group, either because the invoker is
+
+
+Cheriton [page 94]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ a manager of the group or because it was added to the
+ group with the required permissions. If CREntity is 0,
+ then the local VMTP manager checks permissions and
+ forwards the request with CREntity set to entityId and
+ the entityId field set to a digital signature (see
+ below) of the Request by the VMTP manager, certifying
+ that the Client has the permissions required by the
+ Request. (If entityGroupId specifies an unrestricted
+ group, the Request can be sent directly to the handling
+ VMTP manager by setting CREntity to entityId.)
+
+0x0500010C - RemoveFromGroup(CREntity, entityGroupId, entityId)->(code)
+ Request that the VMTP manager local to CREntity remove
+ the specified entityId from the group specified by
+ entityGroupId. Normally CREntity is 0, indicating the
+ VMTP manager local to the requestor. If CREntity is 0,
+ then the local VMTP manager checks permissions and
+ forwards the request with CREntity set to entityId and
+ the entityId field a digital signature of the Request by
+ the VMTP manager, certifying that the Client has the
+ permissions required by the Request.
+
+0x0500010D - QueryGroup( entityId )->( code, record )...
+ Return information on the specified entity. The
+ Response from each responding VMTP manager is (code,
+ record). The format of the record is (memberCount,
+ member1, member2, ...). The Responses are returned on a
+ best efforts basis; there is no guarantee that responses
+ from all managers with members in the specified group
+ will be received.
+
+0x0500010E - ModifyService(entityId,flags,count,pc,threadlist)->(code,
+ count)
+ Modify the service associated with the entity specified
+ by entityId. The flags may indicate a message service
+ model, in which case the call "count" parameter
+ indicates the maximum number of queued messages desired;
+ the return "count" parameter indicates the number of
+ queued message allowed. Alternatively, the "flags"
+ parameters indicates the RPC thread service model, in
+ which case "count" threads are requested, each with an
+ inital program counter as specified and stack, priority
+ and message receive area indicated by the threadlist.
+ In particular, "threadlist" consists of "count" records
+ of the form
+ (priority,stack,stacksize,segment,segmentsize), each one
+ assigned to one of the threads. Flags defined for the
+
+
+Cheriton [page 95]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ "flags" parameter are:
+
+ 1 THREAD_SERVICE - otherwise the message
+ model.
+
+ 2 AUTHENTICATION_REQUIRED - Sent a Probe
+ request to determine principal
+ associated with the Client, if not
+ known.
+
+ 4 SECURITY_REQUIRED - Request must be
+ encrypted or else reject.
+
+ 8 INCREMENTAL - treat the count value as
+ an increment (or decrement) relative to
+ the current value rather than an
+ absolute value for the maximum number of
+ queued messages or threads.
+
+ In the thread model, the count must be a positive
+ increment or else 0, which disables the service. Only a
+ count of 0 terminates currently queued requests or
+ in-progress request handling.
+
+0x4500010F -
+ NotifyVmtpClient(client,cntrl,recSeq,transact,delivery,code)->()
+
+ Update the state associated with the transaction
+ specified by client and transact, an entity identifier
+ and transaction identifier, respectively. This
+ operation is normally used only by another VMTP
+ management module. (Note that it is a datagram
+ operation.) The other parameters are as follows:
+
+ ctrl A 32-bit value corresponding to 4th
+ 32-bit word of the VMTP header of a
+ Response packet that would be sent in
+ response to the Request that this is
+ responding to. That is, the control
+ flags, ForwardCount, RetransmitCount and
+ Priority fields match those of the
+ Request. (The NRS flag is set if the
+ receiveSeqNumber field is used.) The
+ PGCount subfield indicates the number of
+ previous Request packet groups being
+ acknowledged by this Notify operation.
+ (The bit fields that are reserved in
+
+
+Cheriton [page 96]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ this word in the header are also
+ reserved here and must be zero.)
+
+ recSeq Sequence number of reception at the
+ Server if the NRS flag is set in the
+ ctrl parameter, otherwise reserved and
+ zero. (This is used for sender-based
+ logging of message activity for replay
+ in case of failure - an optional
+ facility.)
+
+ delivery Indicates the segment blocks of the
+ packet group have been received at the
+ Server.
+
+ code indicates the action the client should
+ take, as described below.
+
+ The VMTP management module should take action on this
+ operation according to the code, as specified below.
+
+ OK Do nothing at this time, continue
+ waiting for the response with a reset
+ timer.
+
+ RETRY Retransmit the request packet group
+ immediately with at least the segment
+ blocks that the Server failed to
+ receive, the complement of those
+ indicated by the delivery parameter.
+
+ RETRY_ALL Retransmit the request packet group
+ immediately with at least the segment
+ blocks that the Server failed to
+ receive, as indicated by the delivery
+ field plus all subsequently transmitted
+ packets that are part of this packet
+ run. (The latter is applicable only for
+ streamed message transactions.)
+
+ BUSY The server was unable to accept the
+ Request at this time. Retry later if
+ desired to continue with the message
+ transaction.
+
+ NONEXISTENT_ENTITY
+ Specified Server entity does not exist.
+
+
+Cheriton [page 97]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ ENTITY_MIGRATED The server entity has migrated and is no
+ longer at the host to which the request
+ was sent. The Server should attempt to
+ determine the new host address of the
+ Client using the VMTP management
+ ProbeEntity operation (described
+ earlier).
+
+ NO_PERMISSION Server has not authorized reception of
+ messages from this client.
+
+ NOT_AWAITING_MSG
+ The conditional message delivery bit was
+ set for the Request packet group and the
+ Server was not waiting for it so the
+ Request packet group was discarded.
+
+ VMTP_ERROR The Request packet group was in error
+ relative to the VMTP protocol
+ specification.
+
+ BAD_TRANSACTION_ID
+ Transaction identifier is old relative
+ to the transaction identifier held for
+ the Client by the Server.
+
+ STREAMING_NOT_SUPPORTED
+ Server does not support multiple
+ outstanding message transactions from
+ the same Client, i.e. streamed message
+ transactions.
+
+ SECURITY_NOT_SUPPORTED
+ The Request was secure and this Server
+ does not support security.
+
+ SECURITY_REQUIRED
+ The Server is refusing the Request
+ because it was not encrypted.
+
+ NO_RUN_RECORD Server has no record of previous packets
+ in this run of packet groups. This can
+ occur if the first packet group is lost
+ or if the current packet group is sent
+ significantly later than the last one
+ and the Server has discarded its client
+ state record.
+
+
+Cheriton [page 98]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+0x45000110 - NotifyVmtpServer(server,client,transact,delivery,code)->()
+ Update the server state associated with the transaction
+ specified by client and transact, an entity identifier
+ and transaction identifier, respectively. This
+ operation is normally used only by another VMTP
+ management module. (Note that it is a datagram
+ operation.) The other parameters are as follows:
+
+ delivery Indicates the segment blocks of the
+ Response packet group that have been
+ received at the Client.
+
+ code indicates the action the Server should
+ take, as listed below.
+
+ The VMTP management module should take action on this
+ operation according to the code, as specified below.
+
+ OK Client is satisfied with Response data.
+ The Server can discard the response
+ data, if any.
+
+ RETRY Retransmit the Response packet group
+ immediately with at least the segment
+ blocks that the Client failed to
+ receive, as indicated by the delivery
+ parameter. (The delivery parameter
+ indicates those segment blocks received
+ by the Client).
+
+ RETRY_ALL Retransmit the Response packet group
+ immediately with at least the segment
+ blocks that the Client failed to
+ receive, as indicated by the (complement
+ of) the delivery parameter. Also,
+ retransmit all Response packet groups
+ send subsequent to the specified packet
+ group.
+
+ NONEXISTENT_ENTITY
+ Specified Client entity does not exist.
+
+ ENTITY_MIGRATED The Client entity has migrated and is no
+ longer at the host to which the response
+ was sent.
+
+ RESPONSE_DISCARDED
+
+
+Cheriton [page 99]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ The Response was discarded and no longer
+ of interest to the Client. This may
+ occur if the conditional message
+ delivery bit was set for the Response
+ packet group and the Client was not
+ waiting for it so the Response packet
+ group was discarded.
+
+ VMTP_ERROR The Response packet group was in error
+ relative to the VMTP protocol
+ specification.
+
+0x41000111 -
+ NotifyRemoteVmtpClient(client,ctrl,recSeq,transact,delivery,code->()
+
+ The same as NotifyVmtpClient except the co-resident
+ addressing is not used. This operation is used to
+ update client state that is remote when a Request is
+ forwarded.
+
+Note the use of the CRE bit in the RequestCodes to route the request to
+the correct VMTP management module(s) to handle the request.
+
+
+III.1. Entity Group Management
+
+An entity in a group has a set of permissions associated with its
+membership, controling whether it can add or remove others, whether it
+can remove itself, and whether others can remove it from the group. The
+permissions for entity groups are as follows:
+VMTP_GRP_MANAGER 0x00000001 { Manager of group. }
+VMTP_REM_BY_SELF 0x00000002 { Can be removed self. }
+VMTP_REM_BY_PRIN 0x00000004 { Can be rem'ed by same principal}
+VMTP_REM_BY_OTHE 0x00000008 { Can be removed any others. }
+VMTP_ADD_PRIN 0x00000010 { Can add by same principal. }
+VMTP_ADD_OTHE 0x00000020 { Can add any others. }
+VMTP_REM_PRIN 0x00000040 { Can remove same principal. }
+VMTP_REM_OTHE 0x00000080 { Can remove any others. }
+
+To remove an entity from a restricted group, the invoker must have
+permission to remove that entity and the entity must have permissions
+that allow it to be removed by that entity. With an unrestricted group,
+only the latter condition applies.
+
+With a restricted group, a member can only be added by another entity
+with the permissions to add other entities. The creator of a group is
+given full permissions on a group. A entity adding another entity to a
+
+
+Cheriton [page 100]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+group can only give the entity it adds a subset of its permissions.
+With unrestricted groups, any entity can add itself to the group. It
+can also add other entities to the group providing the entity is not
+marked as immune to such requests. (This is an implementation
+restriction that individual entities can impose.)
+
+
+III.2. VMTP Management Digital Signatures
+
+As mentioned above, the entityId field of the AddToGroup and
+RemoveFromGroup is used to transmit a digital signature indicating the
+permission for the operation has been checked by the sending kernel.
+The digital signature procedures have not yet been defined. This field
+should be set to 0 for now to indicate no signature after the CREntity
+parameter is set to the entity on which the operation is to be
+performed.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 101]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+IV. VMTP Entity Identifier Domains
+
+VMTP allows for several disjoint naming domains for its endpoints. The
+64-bit entity identifier is only unique and meaningful within its
+domain. Each domain can define its own algorithm or mechanism for
+assignment of entity identifiers, although each domain mechanism must
+ensure uniqueness, stability of identifiers and host independence.
+
+
+IV.1. Domain 1
+
+For initial use of VMTP, we define the domain with Domain identifier 1
+as follows:
+
+ +-----------+----------------+------------------------+
+ | TypeFlags | Discriminator | Internet Address |
+ +-----------+----------------+------------------------+
+ 4 bits 28 bits 32 bits
+
+The Internet address is the Internet address of the host on which this
+entity-id is originally allocated. The Discriminator is an arbitrary
+value that is unique relative to this Internet host address. In
+addition, the host must guarantee that this identifier does not get
+reused for a long period of time after it becomes invalid. ("Invalid"
+means that no VMTP module considers in bound to an entity.) One
+technique is to use the lower order bits of a 1 second clock. The clock
+need not represent real-time but must never be set back after a crash.
+In a simple implementation, using the low order bits of a clock as the
+time stamp, the generation of unique identifiers is overall limited to
+no more than 1 per second on average. The type flags were described in
+Section 3.1.
+
+An entity may migrate between hosts. Thus, an implementation can
+heuristically use the embedded Internet address to locate an entity but
+should be prepared to maintain a cache of redirects for migrated
+entities, plus accept Notify operations indicating that migration has
+occurred.
+
+Entity group identifiers in Domain 1 are structured in one of two forms,
+depending on whether they are well-known or dynamically allocated
+identifiers. A well-known entity identifier is structured as:
+
+ +-----------+----------------+------------------------+
+ | TypeFlags | Discriminator |Internet Host Group Addr|
+ +-----------+----------------+------------------------+
+ 4 bits 28 bits 32 bits
+
+
+
+Cheriton [page 102]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+with the second high-order bit (GRP) set to 1. This form of entity
+identifier is mapped to the Internet host group address specified in the
+low-order 32 bits. The Discriminator distinguishes group identifiers
+using the same Internet host group. Well-known entity group identifiers
+should be allocated to correspond to the basic services provided by
+hosts that are members of the group, not specifically because that
+service is provided by VMTP. For example, the well-known entity group
+identifier for the domain name service should contain as its embedded
+Internet host group address the host group for Domain Name servers.
+
+A dynamically allocated entity identifier is structured as:
+
+ +-----------+----------------+------------------------+
+ | TypeFlags | Discriminator | Internet Host Addr |
+ +-----------+----------------+------------------------+
+ 4 bits 28 bits 32 bits
+
+with the second high-order bit (GRP) set to 1. The Internet address in
+the low-order 32 bits is a Internet address assigned to the host that
+dynamically allocates this entity group identifier. A dynamically
+allocated entity group identifier is mapped to Internet host group
+address 232.X.X.X where X.X.X are the low-order 24 bits of the
+Discriminator subfield of the entity group identifier.
+
+We use the following notation for Domain 1 entity identifiers <10> and
+propose it use as a standard convention.
+
+ <flags>-<discriminator>-<Internet address>
+
+where <flags> are [X]{BE,LE,RG,UG}[A]
+
+ X = reserved
+ BE = big-endian entity
+ LE = little-endian entity
+ RG = restricted group
+ UG = unrestricted group
+ A = alias
+
+and <discriminator> is a decimal integer and <Internet address> is in
+standard dotted decimal IP address notation.
+
+Examples:
+
+_______________
+
+<10> This notation was developed by Steve Deering.
+
+
+Cheriton [page 103]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+BE-25593-36.8.0.49 is big-endian entity #25593 created on host
+ 36.8.0.49.
+
+RG-1-224.0.1.0 is the well-known restricted VMTP managers group.
+
+UG-565338-36.8.0.77 is unrestricted entity group #565338 created on host
+ 36.8.0.77.
+
+LEA-7823-36.8.0.77 is a little-endian alias entity #7823 created on host
+ 36.8.0.77.
+
+This notation makes it easy to communicate and understand entity
+identifiers for Domain 1.
+
+The well-known entity identifiers specified to date are:
+
+VMTP_MANAGER_GROUP RG-1-224.0.1.0
+ Managers for VMTP operations.
+
+VMTP_DEFAULT_BECLIENT BE-1-224.0.1.0
+ Client entity identifier to use when a (big-endian) host
+ has not determined or been allocated any client entity
+ identifiers.
+
+VMTP_DEFAULT_LECLIENT LE-1-224.0.1.0
+ Client entity identifier to use when a (little-endian)
+ host has not determined or been allocated any client
+ entity identifiers.
+
+Note that 224.0.1.0 is the host group address assigned to VMTP and to
+which all VMTP hosts belong.
+
+Other well-known entity group identifiers will be specified in
+subsequent extensions to VMTP and in higher-level protocols that use
+VMTP.
+
+
+IV.2. Domain 3
+
+Domain 3 is reserved for embedded systems that are restricted to a
+single network and are independent of IP. Entity identifiers are
+allocated using the decentralized approach described below. The mapping
+of entity group identifiers is specific to the type of network being
+used and not defined here. In general, there should be a simple
+algorithmic mapping from entity group identifier to multicast address,
+similar to that described for Domain 1. Similarly, the values for
+default client identifier are specific to the type of network and not
+
+
+Cheriton [page 104]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+defined here.
+
+
+IV.3. Other Domains
+
+Definition of additional VMTP domains is planned for the future.
+Requests for allocation of VMTP Domains should be addressed to the
+Internet protocol administrator.
+
+
+IV.4. Decentralized Entity Identifier Allocation
+
+The ProbeEntityBlock operation may be used to determine whether a block
+of entity identifiers is in use. ("In use" means valid or reserved by a
+host for allocation.) This mechanism is used to detect collisions in
+allocation of blocks of entity identifiers as part of the implementation
+of decentralized allocation of entity identifiers. (Decentralized
+allocation is used in local domain use of VMTP such as in embedded
+systems- see Domain 3.)
+
+Basically, a group of hosts can form a Domain or sub-Domain, a group of
+hosts managing their own entity identifier space or subspace,
+respectively. As an example of a sub-Domain, a group of hosts in Domain
+1 all identified with a particular host group address can manage the
+sub-Domain corresponding to all entity identifiers that contain that
+host group address. The ProbeEntityBlock operation is used to allocate
+the random bits of these identifiers as follows.
+
+When a host requires a new block of entity identifiers, it selects a new
+block (randomly or by some choice algorithm) and then multicasts a
+ProbeEntityBlock request to the members of the (sub-)Domain some R
+times. If no response is received after R (re)transmissions, the host
+concludes that it is free to use this block of identifiers. Otherwise,
+it picks another block and tries again.
+
+Notes:
+
+ 1. A block of 256 identifiers is specified by an entity
+ identifier with the low-order 8 bits all zero.
+
+ 2. When a host allocates an initial block of entity identifiers
+ (and therefore does not yet have a specified entity
+ identifier to use) it uses VMTP_DEFAULT_BECLIENT (if
+ big-endian, else VMTP_DEFAULT_LECLIENT if little-endian) as
+ its client identifier in the ProbeEntityBlock Request and a
+ transaction identifier of 0. As soon as it has allocated a
+ block of entity identifiers, it should use these identifiers
+
+
+Cheriton [page 105]
+
+
+
+ RFC 1045 VMTP February 1988
+
+
+ for all subsequent communication. The default client
+ identifier values are defined for each Domain.
+
+ 3. The set of hosts using this decentralized allocation must not
+ be subject to network partitioning. That is, the R
+ transmissions must be sufficient to ensure that every host
+ sees the ProbeEntityBlock request and (reliably) sends a
+ response. (A host that detects a collision can retransmit
+ the response multiple times until it sees a new
+ ProbeEntityBlock operation from the same host/Client up to a
+ maximum number of times.) For instance, a set of machines
+ connected by a single local network may able to use this type
+ of allocation.
+
+ 4. To guarantee T-stability, a host must prevent reuse of a
+ block of identifiers if any of the identifiers in the block
+ are currently valid or have been valid less than T seconds
+ previously. To this end, a host must remember recently used
+ identifiers and object to their reuse in response to a
+ ProbeEntityBlock operation.
+
+ 5. Care is required in a VMTP implementation to ensure that
+ Probe operations cannot be discarded due to lack of buffer
+ space or queued or delayed so that a response is not
+ generated quickly. This is required not only to detect
+ collisions but also to provide accurate roundtrip estimates
+ as part of ProbeEntity operations.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 106]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+V. Authentication Domains
+
+A VMTP authentication domain defines the format and interpretation for
+principal identifiers and encryption keys. In particular, an
+authentication domain must specify a means by which principal
+identifiers are allocated and guaranteed unique and stable. The
+currently defined authentication domains are as follows (0 is reserved).
+
+Ideally, all entities within one entity domain are also associated with
+one authentication domain. However, authentication domains are
+orthogonal to entity domains. Entities within one domain may have
+different authentication domains. (In this case, it is generally
+necessary to have some correspondence between principals in the
+different domains.) Also, one entity identifier may be associated with
+multiple authentication domains. Finally, one authentication domain may
+be used across multiple entity domains.
+
+
+V.1. Authentication Domain 1
+
+A principal identifier is structured as follows.
+
+ +---------------------------+------------------------+
+ | Internet Address | Local User Identifier |
+ +---------------------------+------------------------+
+ 32 bits 32 bits
+
+The Internet Address may specify an individual host (such as a UNIX
+machine) or may specify a host group address corresponding to a cluster
+of machines operating under a single adminstration. In both cases,
+there is assumed to be an adminstration associated with the embedded
+Internet address that guarantees the uniqueness and stability of the
+User Identifier relative to the Internet address. In particular, that
+administration is the only one authorized to allocate principal
+identifiers with that Internet address prefix, and it may allocate any
+of these identifiers.
+
+In authentication domain 1, the standard EncryptionQualifiers are:
+
+0 Clear text - no encryption.
+
+1 use 64-bit CBC DES for encryption and decryption.
+
+
+V.2. Other Authentication Domains
+
+Other authentication domains will be defined in the future as needed.
+
+
+
+Cheriton [page 107]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+VI. IP Implementation
+
+VMTP is designed to be implemented on the DoD IP Internet Datagram
+Protocol (although it may also be implemented as a local network
+protocol directly in "raw" network packets.)
+
+VMTP is assigned the protocol number 81.
+
+With a 20 octet IP header and one segment block, a VMTP packet is 600
+octets. By convention, any host implementing VMTP implicitly agrees to
+accept VMTP/IP packets of at least 600 octets.
+
+VMTP multicast facilities are designed to work with, and have been
+implemented using, the multicast extensions to the Internet [8]
+described in RFC 966 and 988. The wide-scale use of full VMTP/IP
+depends on the availability of IP multicast in this form.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 108]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+VII. Implementation Notes
+
+The performance and reliability of a protocol in operation is highly
+dependent on the quality of its implementation, in addition to the
+"intrinsic" quality of the protocol design. One of the design goals of
+the VMTP effort was to produce an efficiently implementable protocol.
+The following notes and suggestions are based on experience with
+implementing VMTP in the V distributed system and the UNIX 4.3 BSD
+kernel. The following is described for a client and server handling
+only one domain. A multi-domain client or server would replicate these
+structures for each domain, although buffer space may be shared.
+
+
+VII.1. Mapping Data Structures
+
+The ClientMap procedure is implemented using a hash table that maps to
+the Client State Record whether this entity is local or remote, as shown
+in Figure VII-1.
+
+ +---+---+--------------------------+
+ ClientMap | | x | |
+ +---+-|-+--------------------------+
+ | +--------------+ +--------------+
+ +-->| LocalClient |--->| LocalClient |
+ +--------------+ +--------------+
+ | RemoteClient | | RemoteClient |-> ...
+ +--------------+ +--------------+
+ | | | |
+ | | | |
+ +--------------+ +--------------+
+
+ Figure VII-1: Mapping Client Identifier to CSR
+
+Local clients are linked through the LocalClientLink, similarly for the
+RemoteClientLink. Once a CSR with the specified Entity Id is found,
+some field or flag indicates whether it is identifying a local or remote
+Entity. Hash collisions are handled with the overflow pointers
+LocalClientLink and RemoteClientLink (not shown) in the CSR for the
+LocalClient and RemoteClient fields, respectively. Note that a CSR
+representing an RPC request has both a local and remote entity
+identifier mapping to the same CSR.
+
+The Server specified in a Request is mapped to a server descriptor using
+the ServerMap (with collisions handled by the overflow pointer.). The
+server descriptor is the root of a queue of CSR's for handling requests
+plus flags that modify the handling of the Request. Flags include:
+
+
+
+Cheriton [page 109]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ +-------+---+-------------------------+
+ ServerMap | | x | |
+ +-------+-|-+-------------------------+
+ | +--------------+
+ | | OverflowLink |
+ | +--------------+
+ +-->| Server |
+ +--------------+
+ | Flags | Lock |
+ +--------------+
+ | Head Pointer |
+ +--------------+
+ | Tail Pointer |
+ +--------------+
+
+ Figure VII-2: Mapping Server Identifiers
+
+THREAD_QUEUE Request is to be invoked directly as a remote procedure
+ invocation, rather than by a server process in the
+ message model.
+
+AUTHENTICATION_REQUIRED
+ Sent a Probe request to determine principal associated
+ with the Client, if not known.
+
+SECURITY_REQUIRED
+ Request must be encrypted or else reject.
+
+REQUESTS_QUEUED Queue contains waiting requests, rather than free CSR's.
+ Queue this request as well.
+
+SERVER_WAITING The server is waiting and available to handle incoming
+ Request immediately, as required by CMD.
+
+Alternatively, the Server identifiers can be mapped to a CSR using the
+MapToClient mechanism with a pointer in the CSR refering to the server
+descriptor, if any. This scheme is attractive if there are client CSR's
+associated with a service to allow it to communicate as a client using
+VMTP with other services.
+
+Finally, a similar structure is used to expand entity group identifiers
+to the local membership, as shown in Figure VII-3. A group identifier
+is hashed to an index in the GroupMap. The list of group descriptors
+rooted at that index in the GroupMap contains a group descriptor for
+each local member of the group. The flags are the group permissions
+defined in Appendix III.
+
+
+
+Cheriton [page 110]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ +-------+---+----------------------------------+
+ GroupMap | | x | |
+ +-------+-|-+----------------------------------+
+ | +--------------+
+ | | OverflowLink |
+ | +--------------+
+ +-->|EntityGroupId |
+ +--------------+
+ | Flags |
+ +--------------+
+ | Member Entity|
+ +--------------+
+
+ Figure VII-3: Mapping Group Identifiers
+
+Note that the same pool of descriptors could be used for the server and
+group descriptors given that they are similar in size.
+
+
+VII.2. Client Data Structures
+
+Each client entity is represented as a client state record. The CSR
+contains a VMTP header as well as other bookkeeping fields, including
+timeout count, retransmission count, as described in Section 4.1. In
+addition, there is a timeout queue, transmission queue and reception
+queue. Finally, there is a ServerHost cache that maps from server
+entity-id records to host address, estimated round trip time,
+interpacket gap, MTU size and (optimally) estimated processing time for
+this server entity.
+
+
+VII.3. Server Data Structures
+
+The server maintains a heap of client state records (CSR), one for each
+(Client, Transaction). (If streams are not supported, there is, at
+worst, a CSR per Client with which the server has communicated with
+recently.) The CSR contains a VMTP header as well as various
+bookkeeping fields including timeout count, retransmission count. The
+server maintains a hash table mapping of Client to CSR as well as the
+transmission, timeout and reception queues. In a VMTP module
+implementing both the client and server functions, the same timeout
+queue and transmission queue are used for both.
+
+
+
+
+
+
+
+Cheriton [page 111]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+VII.4. Packet Group transmission
+
+The procedure SendPacketGroup( csr ) transmits the packet group
+specified by the record CSR. It performs:
+
+ 1. Fragmentation of the segment data, if any, into packets.
+ (Note, segment data flagged by SDA bit.)
+
+ 2. Modifies the VMTP header for each packet as required e.g.
+ changing the delivery mask as appropriate.
+
+ 3. Computes the VMTP checksum.
+
+ 4. Encrypts the appropriate portion of the packet, if required.
+
+ 5. Prepends and appends network-level header and trailer using
+ network address from ServerHost cache, or from the responding
+ CSR.
+
+ 6. Transmits the packet with the interpacket gap specified in
+ the cache. This may involve round-robin scheduling between
+ hosts as well as delaying transmissions slightly.
+
+ 7. Invokes the finish-up procedure specified by the CSR record,
+ completing the processing. Generally, this finish-up
+ procedure adds the record to the timeout queue with the
+ appropriate timeout queue.
+
+The CSR includes a 32-bit transmission mask that indicates the portions
+of the segment to transmit. The SendPacketGroup procedure is assumed to
+handle queuing at the network transmission queue, queuing in priority
+order according to the priority field specified in the CSR record.
+(This priority may be reflected in network transmission behavior for
+networks that support priority.)
+
+The SendPacketGroup procedure only looks at the following fields of a
+CSR
+
+ - Transmission mask
+
+ - FuncCode
+
+ - SDA
+
+ - Client
+
+ - Server
+
+
+Cheriton [page 112]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ - CoResidentEntity
+
+ - Key
+
+It modifies the following fields
+
+ - Length
+
+ - Delivery
+
+ - Checksum
+
+In the case of encrypted transmission, it encrypts the entire packet,
+not including the Client field and the following 32-bits.
+
+If the packet group is a Response, (i.e. lower-order bit of function
+code is 1) the destination network address is determined from the
+Client, otherwise the Server. The HostAddr field is set either from the
+ServerHost cache (if a Request) or from the original Request if a
+Response, before SendPacketGroup is called.
+
+The CSR includes a timeout and TTL fields indicating the maximum time to
+complete the processing and the time-to-live for the packets to be
+transmitted.
+
+SendPacketGroup is viewed as the right functionality to implement for
+transmission in an "intelligent" network interface.
+
+Finally, it appears preferable to be able to assume that all portions of
+the segment remain memory-resident (no page faults) during transmission.
+In a demand-paged systems, some form of locking is required to keep the
+segment data in memory.
+
+
+VII.5. VMTP Management Module
+
+The implementation should implement the management operations as a
+separate module that is invoked from within the VMTP module. When a
+Request is received, either from the local user level or the network,
+for the VMTP management module, the management module is invoked as a
+remote or local procedure call to handle this request and return a
+response (if not a datagram request). By registering as a local server,
+the management module should minimize the special-case code required for
+its invocation. The management module is basically a case statement
+that selects the operation based on the RequestCode and then invokes the
+specified management operation. The procedure implementing the
+management operation, especially operations like NotifyVmtpClient and
+
+
+Cheriton [page 113]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+NotifyVmtpServer, are logically part of the VMTP module because they
+require full access to the basic data structures of the VMTP
+implementation.
+
+The management module should be implemented so that it can respond
+quickly to all requests, particularly since the timing of management
+interactions is used to estimate round trip time. To date, all
+implementations of the management module have been done at the kernel
+level, along with VMTP proper.
+
+
+VII.6. Timeout Handling
+
+The timeout queue is a queue of CSR records, ordered by timeout count,
+as specified in the CSR record. On entry into the timeout queue, the
+CSR record has the timeout field set to the time (preferable in
+milliseconds or similar unit) to remain in the queue plus the finishup
+field set to the procedure to execute on removal on timeout from the
+queue. The timeout field for a CSR in the queue is the time relative to
+the record preceding it in the queue (if any) at which it is to be
+removed. Some system-specific mechanism decrements the time for the
+record at the front of the queue, invoking the finishup procedure when
+the count goes to zero.
+
+Using this scheme, a special CSR is used to timeout and scan CSR's for
+non-recently pinged CSR's. That is, this CSR times out and invokes a
+finishup procedure that scans for non-recently pinged CSR that are
+"AwaitingResponse" and signals the request processing entity and deletes
+the CSR. It then returns to the timeout queue.
+
+The timeout mechanism tends to be specific to an operating system. The
+scheme described may have to be adapted to the operating system in which
+VMTP is to be implemented.
+
+This mechanism handles client request timeout and client response
+timeout. It is not intended to handle interpacket gaps given that these
+times are expected to be under 1 millisecond in general and possibly
+only a few microseconds.
+
+
+VII.7. Timeout Values
+
+Roundtrip timeout values are estimated by matching Responses or
+NotifyVmtpClient Requests to Request transmission, relying on the
+retransmitCount to identify the particular transmission of the Request
+that generated the response. A similar technique can be used with
+Responses and NotifyVmtpServer Requests. The retransmitCount is
+
+
+Cheriton [page 114]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+incremented each time the Response is sent, whether the retransmission
+was caused by timeout or retransmission of the Request.
+
+The ProbeEntity request is recommended as a basic way of getting
+up-to-date information about a Client as well as predictable host
+machine turnaround in processing a request. (VMTP assumes and requires
+an efficient, bounded response time implementation of the ProbeEntity
+operation.)
+
+Using this mechanism for measuring RTT, it is recommended that the
+various estimation and smoothing techniques developed for TCP RTT
+estimation be adapted and used.
+
+
+VII.8. Packet Reception
+
+Logically a network packet containing a VMTP packet is 5 portions:
+
+ - network header, possibly including lower-level headers
+
+ - VMTP header
+
+ - data segment
+
+ - VMTP checksum
+
+ - network trailer, etc.
+
+It may be advantageous to receive a packet fragmented into these
+portions, if supported by the network module. In this case, ideally the
+VMTP header may be received directly into a CSR, the data segment into a
+page that can be mapped, rather than copied, to its final destination,
+with VMTP checksum and network header in a separate area (used to
+extract the network address corresponding to the sender).
+
+Packet reception is described in detail by the pseudo-code in Section
+4.7.
+
+With a response, normally the CSR has an associated segment area
+immediately available so delivery of segment data is immediate.
+Similarly, server entities should be "armed" with CSR's with segment
+areas that provide for immediate delivery of requests. It is reasonable
+to discard segment data that cannot be immediately delivered in this
+way, providing that clients and servers are able to preallocate CSR's
+with segment areas for requests and responses. In particular, a client
+should be able to provide some number of additional CSR's for receiving
+multiple responses to a multicast request.
+
+
+Cheriton [page 115]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+The CSR data structure is intended to be the interface data structure
+for an intelligent network interface. For reception, the interface is
+"armed" with CSR's that may point to segment areas in main memory, into
+which it can deliver a packet group. Ideally, the interface handles all
+the processing of all packets, interacting with the host after receiving
+a complete Request or Response packet group. An implementation should
+use an interface based on SendPacketGroup(CSR) and
+ReceivePacketGroup(CSR) to facilitate the introduction of an intelligent
+network interface.
+
+ReceivePacketGroup(csr) provides the interface with a CSR descriptor and
+zero or more bytes of main memory to receive segment data. The CSR
+describes whether it is to receive responses (and if so, for which
+client) or requests (and if so for which server).
+
+The procedure ReclaimCSR(CSR) reclaims the specified record from the
+interface before it has been returned after receiving the specified
+packet group.
+
+A finishup procedure is set in the CSR to be invoked when the CSR is
+returned to the host by the normal processing sequence in the interface.
+Similarly, the timeout parameter is set to indicate the maximum time the
+host is providing for the routine to perform the specified function.
+The CSR and associated segment memory is returned to the host after the
+timeout period with an indication of progress after the timeout period.
+It is not returned earlier.
+
+
+VII.9. Streaming
+
+The implementation of streaming is optional in both VMTP clients and
+servers. Ideally, all performance-critical servers should implement
+streaming. In addition, clients that have high context switch overhead,
+network access overhead or expect to be communicating over long delay
+links should also implement streaming.
+
+A client stream is implemented by allocating a CSR for each outstanding
+message transaction. A stream of transactions is handled similarly to
+multiple outstanding transactions from separate clients except for the
+interaction between consecutive numbered transactions in a stream.
+
+For the server VMTP module, streamed message transactions to a server
+are queued (if accepted) subordinate to the first unprocessed CSR
+corresponding to this Client. Thus, streamed transactions from a given
+Client are always performed in the order specified by the transaction
+identifiers.
+
+
+
+Cheriton [page 116]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+If a server does not implement streaming, it must refuse streamed
+message transactions using the NotifyVmtpClient operation. Also, all
+client VMTP's that support streaming must support the streamed interface
+to a server that does not support streaming. That is, it must perform
+the message transactions one at a time. Consequently, a program that
+uses the streaming interface to a non-streaming server experiences
+degraded performance, but not failure.
+
+
+VII.10. Implementation Experience
+
+The implementation experience to date includes a partial implementation
+(minus the streaming and full security) in the V kernel plus a similar
+preliminary implementation in the 4.3 BSD Unix kernel. In the V kernel
+implementation, the CSR's are part of the (lightweight) process
+descriptor.
+
+The V kernel implementation is able to perform a VMTP message
+transaction with no data segment between two Sun-3/75's connected by 10
+Mb Ethernet in 2.25 milliseconds. It is also able to transfer data at
+4.7 megabits per second using 16 kilobyte Requests (but null checksums.)
+The UNIX kernel implementation running on Microvax II's achieves a basic
+message transaction time of 9 milliseconds and data rate of 1.9 megabits
+per second using 16 kilobyte Responses. This implementation is using
+the standard VMTP checksum.
+
+We hope to report more extensive implementation experience in future
+revisions of this document.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 117]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+VIII. UNIX 4.3 BSD Kernel Interface for VMTP
+
+UNIX 4.3 BSD includes a socket-based design for program interfaces to a
+variety of protocol families and types of protocols (streams,
+datagrams). In this appendix, we sketch an extension to this design to
+support a transaction-style protocol. (Some familiarity with UNIX 4.2/3
+IPC is assumed.) Several extensions are required to the system
+interface, rather than just adding a protocol, because no provision was
+made for supporting transaction protocols in the original design. These
+extensions include a new "transaction" type of socket plus new system
+calls invoke, getreply, probeentity, recreq, sendreply and forward.
+
+A socket of type transaction bound to the VMTP protocol type
+IPPROTO_VMTP is created by the call
+
+ s = socket(AF_INET, SOCK_TRANSACT, VMTP);
+
+This socket is bound to an entity identifier by
+
+ bind(s, &entityid, sizeof(entityid));
+
+The first address/port bound to a socket is considered its primary name
+and is the one used on packet transmission. A message transaction is
+invoked between the socket named by s and the Server specified by mcb by
+
+ invoke(s, mcb, segptr, seglen, timeout );
+
+The mcb is a message control block whose format was described in Section
+2.4. The message control block specifies the request to send plus the
+destination Server. The response message control block returned by the
+server is stored in mcb when invoke returns. The invoking process is
+blocked until a response is received or the message transaction times
+out unless the request is a datagram request. (Non-blocking versions
+with signals on completion could also be provided, especially with a
+streaming implementation.)
+
+For multicast message transactions (sent to an entity group), the next
+response to the current message transaction (if it arrives in less than
+timeout milliseconds) is returned by
+
+ getreply( s, mcb, segptr, maxseglen, timeout );
+
+The invoke operation sent to an entity group completes as soon as the
+first response is received. A request is retransmitted until the first
+reply is received (assuming the request is not a datagram). Thus, the
+system does not retransmit while getreply is timing out even if no
+replies are available.
+
+
+Cheriton [page 118]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+The state of an entity associated with entityId is probed using
+
+ probeentity( entityId, state );
+
+A UNIX process acting as a VMTP server accepts a Request by the
+operation
+
+ recvreq(s, mcb, segptr, maxseglen );
+
+The request message for the next queued transaction request is returned
+in mcb, plus the segment data of maximum length maxseglen, starting at
+segptr in the address space. On return, the message control block
+contains the values as set in invoke except: (1) the Client field
+indicates the Client that sent the received Request message. (2) the
+Code field indicates the type of request. (3) the MsgDelivery field
+indicates the portions of the segment actually received within the
+specified segment size, if MDM is 1 in the Code field. A segment block
+is marked as missing (i.e. the corresponding bit in the MsgDelivery
+field is 0) unless it is received in its entirety or it is all of the
+data in last segment contained in the segment.
+
+To complete a transaction, the reply specified by mcb is sent to the
+client specified by the MCB using
+
+ sendreply(s, mcb, segptr );
+
+The Client field of the MCB indicates the client to respond to.
+
+Finally, a message transaction specified by mcb is forwarded to
+newserver as though it were sent there by its original invoker using
+
+ forward(s, mcb, segptr, timeout );
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 119]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+Index
+
+ Acknowledgment 14
+ APG 16, 31, 39
+ Authentication domain 20
+
+ Big-endian 9
+
+ Checksum 14, 43
+ Checksum, not set 44
+ Client 7, 10, 38
+ Client timer 16
+ CMD 42, 110
+ CMG 32, 40
+ Co-resident entity 25
+ Code 42
+ CoResidentEntity 42, 43
+ CRE 21, 42
+
+ DGM 42
+ Digital signature, VMTP management 95, 101
+ Diskless workstations 2
+ Domain 9, 38
+ Domain 1 102
+ Domain 3 104
+
+ Entity 7
+ Entity domain 9
+ Entity group 8
+ Entity identifier 37
+ Entity identifier allocation 105
+ Entity identifier, all-zero 38
+ EPG 20, 39
+
+ Features 6
+ ForwardCount 24
+ Forwarding 24
+ FunctionCode 41
+
+ Group 8
+ Group message transaction 10
+ Group timeouts 16
+ GRP 37
+
+ HandleNoCSR 62
+ HandleRequestNoCSR 79
+ HCO 14, 23, 39
+
+
+Cheriton [page 120]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ Host independence 8
+
+ Idempotent 15
+ Interpacket gap 18, 40
+ IP 108
+
+ Key 91
+
+ LEE 32, 37
+ Little-endian 9
+
+ MCB 118
+ MDG 22, 40
+ MDM 30, 42
+ Message control block 118
+ Message size 6
+ Message transaction 7, 10
+ MPG 39
+ MsgDelivery 43
+ MSGTRANS_OVERFLOW 27
+ Multicast 4, 21, 120
+ Multicast, reliable 21
+
+ Naming 6
+ Negative acknowledgment 31
+ NER 25, 31, 39
+ NRT 26, 30, 39
+ NSR 25, 27, 31, 39
+
+ Object-oriented 2
+ Overrun 18
+
+ Packet group 7, 29, 39
+ Packet group run 31
+ PacketDelivery 29, 31, 41
+ PGcount 26, 41
+ PIC 42
+ Principal 11
+ Priority 41
+ Process 11
+ ProcessId 89
+ Protocol number,IP 108
+
+ RAE 37
+ Rate control 18
+ Real-time 2, 4
+ Realtime 22
+
+
+Cheriton [page 121]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ Reliability 12
+ Request message 10
+ RequestAckRetries 30
+ RequestRetries 15
+ Response message 10
+ ResponseAckRetries 31
+ ResponseRetries 15
+ Restricted group 8
+ Retransmission 15
+ RetransmitCount 17
+ Roundtrip time 17
+ RPC 2
+ Run 31, 39
+ Run, message transactions 25
+
+ SDA 42
+ Security 4, 19
+ Segment block 41
+ Segment data 43
+ SegmentSize 42, 43
+ Selective retransmission 18
+ Server 7, 10, 41
+ Server group 8
+ Sockets, VMTP 118
+ STI 26, 40
+ Streaming 25, 55
+ Strictly stable 8
+ Subgroups 21
+
+ T-stable 8
+ TC1(Server) 16
+ TC2(Server) 16
+ TC3(Server) 16
+ TC4 16
+ TCP 2
+ Timeouts 15
+ Transaction 10, 41
+ Transaction identification 10
+ TS1(Client) 17
+ TS2(Client) 17
+ TS3(Client) 17
+ TS4(Client) 17
+ TS5(Client) 17
+ Type flags 8
+
+ UNIX interface 118
+ Unrestricted group 8, 38
+
+
+Cheriton [page 122]
+
+
+
+RFC 1045 VMTP February 1988
+
+
+ NotifyVmtpClient 7, 26, 27, 30
+ NotifyVmtpServer 7, 14, 30
+ User Data 43
+
+ Version 38
+ VMTP Management digital signature 95, 101
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Cheriton [page 123]
+