author | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
---|---|---|
committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
commit | 4bfd864f10b68b71482b35c818559068ef8d5797 (patch) | |
tree | e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc713.txt | |
parent | ea76e11061bda059ae9f9ad130a9895cc85607db (diff) |
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc713.txt')
-rw-r--r-- | doc/rfc/rfc713.txt | 1277 |
1 files changed, 1277 insertions, 0 deletions
diff --git a/doc/rfc/rfc713.txt b/doc/rfc/rfc713.txt
new file mode 100644
index 0000000..1925a30
--- /dev/null
+++ b/doc/rfc/rfc713.txt
@@ -0,0 +1,1277 @@
+Request for Comments: 713                 Jack Haverty (JFH@MIT-DMS)
+NIC #34739                                                  Apr 1976
+
+I. ABSTRACT
+
+A mechanism is defined for use by message servers in
+transferring data between hosts. The mechanism, called the
+MSDTP, is defined in terms of a model of the process as a
+translation between two sets of items, the abstract entities
+such as 'strings' and 'integers', and the formats used to
+represent such data as a byte stream.
+
+A proposed organization of a general data transfer
+mechanism is described, and the manner in which the MSDTP
+would be used in that environment is presented.
+
+II. REFERENCES
+
+Black, Edward H., "The DMS Message Composer", MIT Project
+MAC, Programming Technology Division Document
+SYS.16.02.
+
+Burchfiel, Jerry D., Leavitt, Elsie M., Shapiro, Sonya and
+Strollo, Theodore R., compilers, "Tenex Users' Guide",
+Bolt Beranek and Newman, Cambridge, Mass., May 1971,
+revised January 1975, Descriptive sections on the TENEX
+subsystems: MAILER, p. 116-11; MAILSTAT, p. 118-119;
+READMAIL, p. 137; and SNDMSG, p. 165-170.
+
+Haverty, Jack, "Communications System Overview", MIT Project
+MAC, Programming Technology Division Document
+SYS.16.00.
+
+Haverty, Jack, "Communications System Daemon Manual", MIT
+Project MAC, Programming Technology Division Document
+SYS.16.01.
+
+ISI Information Automation Project, "Military Message
+Processing System Design," Internal Project
+Documentation (Out of Print), Jan. 1975
+
+Message Services Committee, "Interim Report", Jan. 28, 1975
+
+Mooers, Charlotte D., "Mailsys Message System: Manual For
+Users", Bolt Beranek and Newman, Cambridge, Mass., June
+1975 (draft).
+
+Myer, Theodore H., "Notes On The BBN Mail System", Bolt
+Beranek and Newman, November 8, 1974.
+
+Myer, Theodore H., and Henderson, D. Austin, "Message
+Transmission Protocol", Network Working Group RFC 680,
+NIC 32116, April 30, 1975.
+
+Postel, Jon, "The PCPB8 Format", NSW Proposal, June 5, 1975
+
+Tugender, R., and D. R. Oestreicher, "Basic Functional
+Capabilities for a Military Message Processing
+Service," ISI/RR-74-23, May 1975
+
+Vezza, Al, "Message Services Committee Minority Report",
+Jan. 1975
+
+III. OVERVIEW
+
+This document describes a mechanism developed for use
+by message servers communicating over an eight-bit
+byte-oriented network connection to move data structures and
+associated data-typing information. It is presented here in
+the hope that it may be of use to other projects which need
+to transfer data structures between dissimilar hosts.
+
+A set of abstract entities called PRIMITIVE ITEMS is
+enumerated. These are intended to include traditional data
+types of general utility, such as integers, strings, and
+arrays.
+
+A mechanism is defined for augmenting the set of
+abstract data entities handled, to allow the introduction of
+application-specific data, whose format and semantics are
+understood by the application programs involved, but which
+can be transmitted using common coding facilities. An
+example might be a data structure called a 'file
+specification', or a 'date'. Abstract data entities defined
+using this mechanism will be termed SEMANTIC ITEMS, since
+they are typically used to carry data having semantic
+content in the application involved.
+
+Semantic and primitive items are collectively referred
+to simply as ITEMS.
+
+The protocol next involves the definition of the format
+of the byte stream used to convey items from machine to
+machine. These encodings are described in terms of OBJECTS,
+which are the physical byte streams transmitted.
+
+To complete the protocol, the rules for translating
+between objects and items are presented as each object is
+defined.
+
+An item is transmitted by being translated into an
+object which is transmitted over the connection as a stream
+of bytes to the receiver, and reconstructed there as an
+item. The protocol mechanism may thus be viewed as a simple
+translator. It enumerates a set of abstract entities, the
+items, which are known to programmers, a set of entities in
+byte-stream format, the objects, and the translation rules
+for conversion between the sets. A site implementing the
+MSDTP would typically provide a facility to convert between
+objects and the local representation of the various items
+handled. Applications using the MSDTP define their
+interactions using items, without regard to the actual
+formats in which such items are represented at various
+machines. This permits programs to handle higher-level
+concepts such as a character string, without concern for its
+numerous representational formats. Such detail is handled
+by the MSDTP.
+
+Finally, a discussion of a general data transfer
+mechanism for communication between programs is presented,
+and the manner in which the particular byte-oriented
+protocol defined herein would be used in that environment is
+discussed.
+
+Terminology, as introduced, is defined and highlighted
+by capitalizing.
+
+IV. PRIMITIVE DATA ITEMS
+
+The primitive data items include a variety of
+traditional, well-understood types, such as integers and
+strings. Primitive data items will be presented using
+mnemonic names preceded by the character pair "p-", to serve
+as a reminder that the named object is primitive.
+
+These items may be represented in various computer
+systems in whatever fashion their programmers desire.
+
+IV.1 -- Set Of Primitive Items
+
+The set of primitive items defined includes p-INT,
+p-STRING, p-STRUC, p-BITS, p-CHAR, p-BOOL, p-EMPTY, and
+p-XTRA.
+
+Since the protocol was developed primarily for use in
+message services, items such as p-FLOAT are not included
+since they were unnecessary.
+Additional items may be easily
+added as necessary.
+
+A p-INT performs the traditional role of representing
+integer numbers. A p-BITS (BIT Stream) item represents a
+bit stream. The two possible p-BOOL (BOOLean) items are
+used to represent the logical values of *TRUE* and *FALSE*.
+The single p-EMPTY item is used to, for example, indicate
+that a given field of a message is empty. It is provided to
+act as a place-holder, representing 'no data', and appears
+as *EMPTY*.
+
+The p-STRUC (STRUCture) item is used to group together
+a collection of items as a single value, maintaining the
+ordering of the elements, such as a p-STRUC of p-INTs.
+
+A p-CHAR is a single character. The most common
+occurrence of character data, however, will be as p-STRINGs.
+A p-STRING should be considered to be a synonym for a
+p-STRUC containing only p-CHARs. This concept is important
+for generality and consistency, especially when considering
+definitions of permissible operations on structures, such as
+extracting subsequences of elements, etc.
+
+Four p-XTRA items, which can be transmitted in a single
+byte, are made available for higher level protocols to use
+when a frequently used datum is handled which can be
+represented just by its name. An example would be an
+acknowledgment between two servers. Using p-XTRAs to
+represent such data permits them to be handled in a single
+byte. There are four possible p-XTRA items, termed *XTRA0*,
+*XTRA1*, *XTRA2*, and *XTRA3*. These may be assigned
+meanings by user protocols as desired.
+
+IV.2 -- Printing Conventions
+
+The following printing conventions are introduced to
+facilitate discussion of the primitive items.
+
+When a specific instance of a primitive data item is
+presented, it will be shown in a traditional representation
+for that kind of data. For example, p-INTs are shown as
+sequences of digits, e.g. 100, and p-STRINGs as sequences of
+characters enclosed in double-quote characters, for example
+"ABCDEF".
+
+As shown above, the two possible p-BOOL items are shown
+as *TRUE* or *FALSE*. The object p-EMPTY appears as
+*EMPTY*. A bit stream, i.e. p-BITS, appears as a stream of
+1s and 0s enclosed in asterisks, for example *100101001*. A
+p-CHAR will be presented as the character enclosed in single
+quote characters, e.g., 'A'.
+
+p-STRUCs are printed as the representations of their
+elements, enclosed in parentheses, for example (1 2 3 4) or
+("XYZ" "ABC" 1 2) or ((1 2 3) "A" "B"). Note that because
+p-STRINGs are simply a class of p-STRUCs assigned a special
+name and printing format for brevity and convenience, the
+items "ABC" and ('A' 'B' 'C') are identical, and the latter
+format should not be used.
+
+To present a generic p-STRUC, as in specifying formats
+of the contents of something, the items are presented as a
+mnemonic name, optionally followed by a colon and the
+permissible types of values for that datum. When one of
+several items may appear as the value for some component,
+the permissible ones appear separated by vertical-bar
+characters. For example, p-INT|p-STRING represents a single
+item, which may be either a p-INT or a p-STRING.
+
+To represent a succession of items, the Kleene star
+convention is used. The specification p-INT[*] represents
+any number of p-INTs. Similarly, p-INT[3,5] represents from
+3 to 5 p-INTs, while p-INT[*,5] specifies up to 5 and
+p-INT[5,*] specifies at least 5 p-INTs.
+
+For example, a p-STRUC which is used to carry names and
+numbers might be specified as follows.
+
+(name:p-STRING number:p-INT)
+
+In discussing items in general, when a specific data
+value is not intended, the name and types representation may
+be used, e.g., offset:p-INT to discuss an 'offset' which has
+a numeric value.
+
+V. SEMANTIC ITEM MECHANISM
+
+The semantic item mechanism provides a means for
+program designers to use a variety of application-specific
+data items.
+
+This mechanism is implemented using a special tagged
+structure to carry the data type information as well as the
+actual components of the particular semantic item. For
+discussion purposes, such a special p-STRUC will be termed a
+p-EDT (Extended Data Type).
+
+When p-EDTs are transferred, their identity as a p-EDT
+is maintained, so that an applications program receives the
+corresponding semantic item instead of a simple p-STRUC. A
+p-EDT is identical to a p-STRUC in all other respects.
+
+V.1 -- Format of p-EDTs
+
+A prototypical p-EDT follows. It is printed as if it
+were a normal p-STRUC. Since p-EDTs are converted to
+semantic items for presentation to the user, a p-EDT will
+never be used except in this protocol definition.
+
+(type:p-INT|p-STRING version:p-INT com1:any
+com2:any ...)
+
+The first element, the 'type', is generally a p-INT, and
+is used to identify the particular type of semantic item.
+Types are assigned numeric codes in a controlled fashion.
+The type may alternatively be specified by a p-STRING, to
+permit development of new data types for possible later
+assignment of codes. Each type has an equivalent p-STRING
+name. These may be used interchangeably as 'type' elements,
+primarily to maintain upward compatibility.
+
+The second element of a p-EDT is always a p-INT, the
+'version', and specifies the exact format of the particular
+datum. A semantic item may undergo several revisions of its
+internal structure, which would be evident through assigning
+different versions to each revision.
+
+Successive components, the 'com' elements, if any,
+carry the actual data of the semantic item. As each
+semantic item is defined, conventions on permissible values
+and interpretation of these components are presented. Such
+definitions may use any types of items to specify the format
+of the semantic item. Use of lower level concepts, such as
+objects, in these definitions is prohibited.
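
The p-EDT layout described above (a type, a version, then zero or more components) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SemanticItem` class and the `from_edt` helper are hypothetical names that do not appear in the RFC, and a p-EDT is modelled as an ordinary Python sequence.

```python
from dataclasses import dataclass
from typing import Any, Sequence, Tuple, Union


@dataclass(frozen=True)
class SemanticItem:
    """Hypothetical in-memory form of a semantic item (not part of the RFC)."""
    type: Union[int, str]        # numeric code, or the equivalent p-STRING name
    version: int                 # revision of the item's internal format
    components: Tuple[Any, ...]  # the 'com' elements, possibly none


def from_edt(edt: Sequence[Any]) -> SemanticItem:
    """Split a p-EDT, given as a plain sequence, into its three parts."""
    if len(edt) < 2:
        raise ValueError("a p-EDT needs at least a type and a version")
    type_, version = edt[0], edt[1]
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(type_, bool) or not isinstance(type_, (int, str)):
        raise TypeError("type must be a p-INT or a p-STRING")
    if isinstance(version, bool) or not isinstance(version, int):
        raise TypeError("version must be a p-INT")
    return SemanticItem(type_, version, tuple(edt[2:]))
```

For instance, `from_edt(["FILE", 1, 69, "DIRECTORY.NAME-OF-FILE"])` would yield an item of type "FILE", version 1, with two components. Keeping the type and version apart from the components mirrors the rule that only the components carry the item's data.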
+
+Semantic items will be printed as the mnemonic for the
+type involved, preceded by the character pair "s-", to
+signify that the data item is handled by this mechanism.
+
+V.2 -- Printing Conventions
+
+A semantic item is represented as if it were a p-STRUC
+containing only the components, if any, but preceded by the
+semantic type name and a # character. The version number is
+assumed to be 1 if unspecified. For later versions, the
+version number is attached to the type name, as in, for
+example, FILE-2 to represent version 2 of the FILE data
+type.
+
+For example, a semantic item called a 'file
+specification' might be defined, containing two components,
+a host number and pathname. A specific instance of such an
+item might appear as #FILE(69 "DIRECTORY.NAME-OF-FILE"),
+while a generic s-FILE might be presented as the following.
+
+#FILE(host:p-INT|p-STRING pathname:p-STRING)
+
+Here 'host' is the first component of
+the item, which may be either a p-INT or p-STRING, and
+'pathname' is the second component, which must be a
+p-STRING. The full definition would present interpretation
+rules for these components.
+
+VI. ENCODING OBJECTS
+
+This section presents the set of objects which are used
+to represent items as byte streams for inter-server
+transmission. Objects will be presented using mnemonic
+type-names preceded by the character pair "b-", indicating
+their existence only as byte streams.
+
+All servers are required to be capable of decoding the
+entire set of objects. Servers are not required to transmit
+certain objects which are available to improve channel
+efficiency.
+
+The encodings are designed to facilitate programming
+and efficiency of the receiving decoder. In all cases, the
+type and length in bytes of objects is supplied as the first
+information sent. This characteristic is important for ease
+of implementation. The type information permits a decoder to
+be constructed in a modular fashion.
+The most important
+advantage of including size information is that the receiver
+always knows how many bytes it must read to discover what to
+do next, and knows when each object terminates. This
+requirement avoids many potential problems with timing and
+synchronization of processes.
+
+Two varieties of objects are defined. The first will
+be called ATOMIC, and includes objects used to efficiently
+encode the most common data. The second variety is termed
+NON-ATOMIC, and is used to encode larger or less common
+items.
+
+In all cases, a data object begins with a single byte,
+which will be termed the TYPE-BYTE, a field of which
+contains the type code of the object. The following bytes,
+if any, are interpreted according to the type involved.
+
+VI.1 -- Presentation Conventions
+
+In discussing formats of bytes, the following
+conventions will be employed. The individual bits of a byte
+will be referenced by using capital letters from A to H,
+where A signifies the highest order bit, and H the lowest.
+The entire eight bit value, for example, could be referred
+to as ABCDEFGH. Similarly, subfields of the byte will be
+identified by such sequences. The CDEF field specifies the
+middle four bits of a byte.
+
+In referring to values of fields, binary format will be
+used, and small letters near the end of the alphabet will be
+used to identify particular bits for discussion. For
+example, we might say that the BCD field of a byte contains
+a specifier for some type, and define its value to be
+BCD=11z. In discussions of the specifier usage, we could
+refer to the cases where z=1 and where z=0, as shorthand
+notation to identify BCD=111 and BCD=110, respectively.
+
+VI.2 -- Type-Byte Bit Assignment
+
+To assist in understanding the assignment of the
+various type-byte values, the table and graph below are
+included, showing representations of the eight bits.
+ + + + + -8- + +OXXXXXXX -- CHAR7 (CHARacter, 7 bit) +10XXXXXX -- SINTEGER (Small INTEGER) +l10XXXXX -- NON-ATOM (NON-ATOMic objects) +11100XXX -- LINTEGER (Large INTEGER) +11101XXX -- reserved +11110XXX -- SBITSTR (Short BIT STReam) +111110XX -- XTRA (eXTRA single-byte objects) +1111110X -- BOOL (BOOLean) +11111110 -- EMPTY (EMPTY data item) +11111111 -- PADDING (unused byte) + + +In each case, the bits identified by X's are used to +contain information specific to the type involved. These +are explained when each type is defined. + +An equivalent tree representation follows, for those +who prefer it. +start with high order bit + | + | + | + 0-----0-----0-----0-----0-----0-----0-----0-----X + | | | | | | | | PADDING +0| 0| 0| 0| 0| 0| 0| 0| + | | | | | | | | + X | X | X | X X +CHAR7 | NON-ATOM | BITS | BOOL EMPTY + (7) | (5) | (3) | (1) + | 0| | | + SINTEGER | XTRA + (6) | (2) + LINTEGER + (3) + + Type-Byte Bit Assignment Scheme + + + + +This picture is interpreted by entering at the top, and +taking the appropriate branch at each node to correspond to +the next bit of the type-byte, as it is scanned from left to +right. When a type is assigned, the branch terminates with +an "X' and the name of the type of the object, with the +number of remaining bits in parentheses. The individual +object definitions specify how these bits are used for that +particular type. + + +V1.3 -- Atomic Objects + + +Atomic objects are identified by specific patterns in a +type-byte. Receiving servers must be capable of recognizing + + + -9- + +and handling all atomic types, since the size of the object +is not explicitly present in a uniform fashion. + + +================================ +| Atomic Object: B-CHAR7 | +================================ + + +The b-CHAR7 (CHARacter 7 bit) object is introduced to +handle transmission of characters, in 7-bit ASCII format. 
+Since the vast majority of message-related data involves
+such objects, they are designed to be very efficient in
+transmission. Other formats, such as eight bit values, can
+be introduced as non-atomic objects. The format of a b-CHAR7
+follows:
+
+A=0              identifying the b-CHAR7 data type
+BCDEFGH=tuvwxyz  containing the character code
+
+The tuvwxyz bits contain the ASCII code of the
+character. For example, transmission of a 'space' (ASCII
+code 32, 40 octal) would be accomplished by the following
+byte.
+
+00100000
+ABCDEFGH
+
+A=0 to identify this byte as a b-CHAR7. The remaining
+bits contain the 7 bit code, octal 40, for space.
+
+A b-CHAR7 standing alone is presented as a p-CHAR.
+Such occurrences will probably be rare if they are used at
+all. The most common use of b-CHAR7s is as elements of
+b-USTRUCs used to transmit p-STRINGs, as explained later.
+
+=============================
+| Atomic Object: B-SINTEGER |
+=============================
+
+The b-SINTEGER (Small INTEGER) object is used to
+transmit very small non-negative integers, of values up to
+63. It always translates to a p-INT, and any p-INT between 0
+and 63 may be encoded as a b-SINTEGER for transmission. The
+format of a b-SINTEGER follows.
+
+AB=10          identifying the object as a b-SINTEGER
+CDEFGH=uvwxyz  containing the actual number
+
+For example, to transmit the integer 10 (12 octal), the
+following byte would be transmitted:
+
+10001010
+ABCDEFGH
+
+AB=10 to specify a b-SINTEGER. The remaining six bits
+contain the number 10 expressed in binary.
+
+=============================
+| Atomic Object: B-LINTEGER |
+=============================
+
+The b-LINTEGER (Large INTEGER) object is used to
+transmit p-INTs to any precision up to 64 bits. It is
+always translated as a p-INT. Sending servers are permitted
+to choose either b-LINTEGER or b-SINTEGER format for
+transmission of numbers, as appropriate. When possible,
+b-SINTEGERs can be used for better channel efficiency.
+The format of a b-LINTEGER follows:
+
+ABCDE=11100  specifying that this is a b-LINTEGER.
+FGH=xyz      containing a count of number of bytes to follow.
+
+The xyz bits are interpreted as a number of bytes to
+follow which contain the actual binary code of the
+integer in 2's complement format. Since a zero-byte integer
+is disallowed, the pattern xyz=000 is interpreted as 1000,
+specifying that 8 bytes follow. The number is transmitted
+with high-order bits first. This format permits
+transmission of integers as large as 64 bits in magnitude.
+
+For example, if the number 4096 (10000 octal) is to be
+transmitted, the following sequence of bytes would be sent:
+
+11100010 00010000 00000000
+ABCDEFGH ---actual data---
+
+ABCDE=11100, identifying this as a b-LINTEGER, E=0,
+specifying a positive number, and FGH=010, specifying that 2
+bytes follow, containing the actual binary number.
+
+============================
+| Atomic Object: B-SBITSTR |
+============================
+
+The b-SBITSTR (Short BIT STReam) object is used to
+transmit a p-BITS of length 63 or less. For longer bit
+streams, the non-atomic object b-LBITSTR may be used. The
+format of a b-SBITSTR follows.
+
+ABCDE=11110  specifying the type as b-SBITSTR
+FGH=xyz      specifying the number of bytes following.
+
+The xyz value specifies the number of additional bytes
+to be read to obtain the bit stream values. As in the case
+of b-LINTEGER, the value xyz=000 is interpreted as 1000,
+specifying that 8 bytes follow.
+
+To avoid requiring specification of exactly the number
+of bits contained, the following convention is used. The
+first data byte is scanned from left to right until the
+first 1 bit is encountered. The bit stream is defined to
+begin with the immediately following bit, and run through
+the last bit of the last byte read. In other words, the bit
+stream is 'right-adjusted' in the collected bytes, with its
+left end delimited by the first 'on' bit.
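
Both b-LINTEGER and b-SBITSTR carry a three-bit byte count in which 000 stands for 8, so they can share a small helper. The following Python sketch illustrates the two encodings under that reading of the text. The function names are hypothetical, a p-BITS is assumed to be held as a string of '0' and '1' characters, and the encoder always uses the fewest data bytes that fit (the text does not require this).

```python
def _count_field(nbytes: int) -> int:
    """Map a byte count of 1..8 onto the 3-bit xyz field (8 is sent as 000)."""
    if not 1 <= nbytes <= 8:
        raise ValueError("between 1 and 8 data bytes are allowed")
    return nbytes & 0b111


def _count_from_field(xyz: int) -> int:
    """Inverse of _count_field: xyz=000 means that 8 bytes follow."""
    return 8 if xyz == 0 else xyz


def encode_linteger(value: int) -> bytes:
    """Encode a p-INT as a b-LINTEGER (2's complement, high-order byte first)."""
    for nbytes in range(1, 9):
        try:
            data = value.to_bytes(nbytes, "big", signed=True)
        except OverflowError:
            continue  # does not fit yet, try one more byte
        return bytes([0b11100000 | _count_field(nbytes)]) + data
    raise ValueError("value does not fit in 64 bits")


def decode_linteger(buf: bytes) -> int:
    """Decode a b-LINTEGER that starts at buf[0]."""
    if buf[0] & 0b11111000 != 0b11100000:
        raise ValueError("not a b-LINTEGER type-byte")
    n = _count_from_field(buf[0] & 0b111)
    return int.from_bytes(buf[1:1 + n], "big", signed=True)


def encode_sbitstr(bits: str) -> bytes:
    """Encode up to 63 bits as a b-SBITSTR, right-adjusted behind a 1 marker."""
    if len(bits) > 63 or set(bits) - {"0", "1"}:
        raise ValueError("expected at most 63 characters, each '0' or '1'")
    marked = "1" + bits                   # the first 'on' bit delimits the stream
    nbytes = (len(marked) + 7) // 8
    data = int(marked, 2).to_bytes(nbytes, "big")
    return bytes([0b11110000 | _count_field(nbytes)]) + data


def decode_sbitstr(buf: bytes) -> str:
    """Decode a b-SBITSTR that starts at buf[0] into a string of bits."""
    if buf[0] & 0b11111000 != 0b11110000:
        raise ValueError("not a b-SBITSTR type-byte")
    n = _count_from_field(buf[0] & 0b111)
    data = buf[1:1 + n]
    if data[0] == 0:
        raise ValueError("no marker bit in the first data byte")
    # bin() drops the leading zeros, so the result starts with the marker bit.
    return bin(int.from_bytes(data, "big"))[3:]
```

With these definitions, `encode_linteger(4096)` produces the three bytes 11100010 00010000 00000000 given in the b-LINTEGER example.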
+
+For example, to send the bit stream *001010011* (9
+bits), the following bytes are transmitted.
+
+11110010 00000010 01010011
+ABCDEhij klmnopqr stuvwxyz
+
+The hij=010 value specifies that two bytes follow. The
+q bit, which is the first 1 bit encountered, identifies the
+start of the bit stream as being the r bit. The rstuvwxyz
+bits are the bit stream being handled.
+
+=========================
+| Atomic Object: b-BOOL |
+=========================
+
+The b-BOOL (BOOLean) object is used to transmit
+p-BOOLs. The format of b-BOOL objects follows.
+
+ABCDEFG=1111110  specifying the type as b-BOOL
+H=z              specifying the value
+
+The two possible translations of a b-BOOL are *FALSE*
+and *TRUE*.
+
+11111100 represents *FALSE*
+11111101 represents *TRUE*
+ABCDEFGz
+
+If z=0, the value is *FALSE*, otherwise *TRUE*.
+
+==========================
+| Atomic Object: B-EMPTY |
+==========================
+
+The b-EMPTY object type is used to transmit a 'null'
+object, i.e. an *EMPTY*. The format of a b-EMPTY follows.
+
+ABCDEFGH=11111110  specifying *EMPTY*
+
+=========================
+| Atomic Object: B-XTRA |
+=========================
+
+The b-XTRA objects are used to carry the four possible
+p-XTRA items, i.e., *XTRA0*, *XTRA1*, *XTRA2*, and *XTRA3*.
+These four items correspond to the binary coding of the
+remaining two bits after the b-XTRA type code bits. The
+format of a b-XTRA follows.
+
+ABCDEF=111110  to specify the type b-XTRA
+GH=yz          to identify the particular p-XTRA item carried
+
+The GH bits of the byte are decoded to produce a
+particular p-XTRA item, as follows.
+
+GH=00 -- *XTRA0*
+GH=01 -- *XTRA1*
+GH=10 -- *XTRA2*
+GH=11 -- *XTRA3*
+
+The b-XTRA object is included to provide the use of
+several single-byte data items to higher levels. These
+items may be assigned by individual applications to improve
+the efficiency of transmission of several very frequent data
+items.
+For example, the message services protocols will use
+these items to convey positive and negative acknowledgments,
+two very common items in every interaction.
+
+============================
+| Atomic Object: B-PADDING |
+============================
+
+This object is anomalous, since it represents really no
+data at all. Whenever it is encountered in a byte stream in
+a position where a type-byte is expected, it is completely
+ignored, and the succeeding byte examined instead. Its
+purpose is to serve as a filler in byte streams, providing
+servers with an aid in handling internal problems related to
+their specific word lengths, etc. The encoders may freely
+use this object to serve as padding when necessary.
+
+All b-PADDING data objects exist only within an encoded
+byte stream. They never cause any data item whatsoever to
+be presented externally to the coder module. The format of a
+b-PADDING follows.
+
+ABCDEFGH=11111111
+
+Note that this does not imply that all such 'null'
+bytes in a stream are to be ignored, since they could be
+encountered as a byte within some other type, such as
+b-LINTEGER. Only bytes of this format which, by their
+position in the stream, appear as a 'type' byte are to be
+ignored.
+
+VI.4 -- Non-Atomic Objects
+
+Non-atomic objects are always transmitted preceded
+by both a single type byte and some small number of size
+byte(s). The type byte identifies that the data object
+concerned is of a non-atomic type, as well as uniquely
+specifying the particular type involved. All non-atomic
+objects have type byte values of the following form.
+
+ABC=110      specifying that the object is non-atomic
+DEFGH=vwxyz  specifying the particular type of object
+
+The vwxyz value is used to specify one of 31 possible
+non-atomic types. The value vwxyz=00000 is reserved for use
+in future expansion.
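
Taken together, the type-byte table from the Type-Byte Bit Assignment section and the non-atomic form just described amount to a prefix code, which a decoder can test from the shortest prefix to the longest. The Python sketch below shows one way to do so. The function name and the (name, payload) return convention are assumptions made for illustration and are not defined by the RFC.

```python
from typing import Optional, Tuple


def classify_type_byte(b: int) -> Tuple[str, Optional[int]]:
    """Return (type name, value of the X bits) for a single type-byte.

    The payload is None for EMPTY and PADDING, which have no X bits.
    """
    if not 0 <= b <= 0xFF:
        raise ValueError("a type-byte must be in the range 0..255")
    if b & 0b10000000 == 0b00000000:
        return ("CHAR7", b & 0b01111111)     # 7-bit character code
    if b & 0b11000000 == 0b10000000:
        return ("SINTEGER", b & 0b00111111)  # integer 0..63
    if b & 0b11100000 == 0b11000000:
        return ("NON-ATOM", b & 0b00011111)  # DEFGH: non-atomic type code
    if b & 0b11111000 == 0b11100000:
        return ("LINTEGER", b & 0b00000111)  # byte-count field (000 means 8)
    if b & 0b11111000 == 0b11101000:
        return ("reserved", b & 0b00000111)
    if b & 0b11111000 == 0b11110000:
        return ("SBITSTR", b & 0b00000111)   # byte-count field (000 means 8)
    if b & 0b11111100 == 0b11111000:
        return ("XTRA", b & 0b00000011)      # selects *XTRA0* .. *XTRA3*
    if b & 0b11111110 == 0b11111100:
        return ("BOOL", b & 0b00000001)      # 0 is *FALSE*, 1 is *TRUE*
    if b == 0b11111110:
        return ("EMPTY", None)
    return ("PADDING", None)                 # 11111111, skipped by the decoder
```

A decoder built this way would call the function only at positions where a type-byte is expected, which is what allows a 11111111 byte inside, say, a b-LINTEGER to be treated as data rather than padding.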
+
+In all non-atomic data objects, the byte(s) following
+the type-byte specify the number of bytes to follow which
+contain the data object. In all cases, if the number of
+bytes specified are processed, the next byte to be seen
+should be another type-byte, the beginning of the next
+object in the stream.
+
+The number of bytes containing the object size
+information is variable. These bytes will be termed the
+SIZE-BYTES. The first byte encountered has the following
+format.
+
+A=s              specifying the manner in which the size
+                 information is encoded
+BCDEFGH=tuvwxyz  specifying the size, or
+                 number of bytes containing the size
+
+The tuvwxyz values supply a positive binary number. If
+the s value is a one, the tuvwxyz value specifies the number
+of bytes to follow which should be read and concatenated as
+a binary number, which will then specify the size of the
+object. These bytes will appear with high order bits first.
+Thus, if s=1, up to 128 bytes may follow, containing the
+count of the succeeding data bytes, which should certainly
+be sufficient.
+
+Since many non-atomic objects will be fairly short, the
+s=0 condition is used to indicate that the 7 bits contained
+in tuvwxyz specify the actual data byte count. This permits
+objects of sizes up to 128 bytes to be specified using one
+size-information byte. The case tuvwxyz=0000000 is
+interpreted as specifying 128 bytes.
+
+For example, a data object of some non-atomic type
+which requires 100 (144 octal) bytes to be transmitted would
+be sent as follows.
+
+110XXXXX -- identifying a specific
+            non-atomic object
+01100100 -- specifying that 100 bytes follow
+.
+.
+data     -- the 100 data bytes
+.
+.
+
+Note that the size count does not include the
+size-specifier byte(s) themselves, but does include all
+succeeding bytes in the stream used to encode the object.
+
+A data object requiring 20000 (47040 octal) bytes would
+appear in the stream as follows.
+
+110XXXXX -- identifying a specific
+            non-atomic object
+10000010 -- specifying that the next 2 bytes
+            contain the stream length
+01001110 -- first byte of number 20000
+00100000 -- second byte
+.
+.
+data     -- 20,000 bytes
+.
+.
+
+Interpretation of the contents of the 20000 bytes in
+the stream can be performed by a module which knows the
+specific format of the non-atomic type specified by DEFGH in
+the type-byte.
+
+The remainder of this section defines an initial set of
+non-atomic types, the format of their encoding, and the
+semantics of their interpretation.
+
+================================
+| Non-atomic Object: B-LBITSTR |
+================================
+
+The b-LBITSTR (Long BIT Stream) data type is introduced
+to transmit p-BITS which cannot be handled by a b-SBITSTR.
+A b-LBITSTR may be used to transmit short p-BITS as well.
+Its format follows.
+
+11000001 size-bytes data-bytes
+ABCDEFGH
+
+ABC=110 identifies this as a non-atomic object.
+DEFGH=00001 specifies that it is a b-LBITSTR. The standard
+sizing information specifies the number of succeeding bytes.
+Within the data-bytes, the first object encountered must
+decode to a p-INT. This number conveys the length of the
+bit stream to follow. The actual bit stream begins with the
+next byte, and is left-adjusted in the byte stream. For
+example, to encode *101010101010*, the following b-LBITSTR
+could be used, although a b-SBITSTR would be more compact.
+
+11000001 -- identifies a b-LBITSTR
+00000011 -- size = 3
+10001100 -- b-SINTEGER, to specify length
+10101010 -- first 8 data bits
+10100000 -- last 4 data bits
+
+==============================
+| Non-atomic Object: B-STRUC |
+==============================
+
+The b-STRUC (STRUCture) data type is used to transmit
+any p-STRUC. The translation rules for converting a b-STRUC
+into a primitive item are presented following the discussion
+of b-REPEATs. The b-STRUC format appears as follows.
+
+11000010 size-bytes data-bytes
+ABCDEFGH
+
+ABC=110 identifies this as a non-atomic type.
+DEFGH=00010 specifies that the object is a b-STRUC. Within
+the data-bytes stream, objects simply follow in order. This
+implies that the b-STRUC encoder and decoder modules can
+simply make use of recursive calls to a standard
+encoder/decoder for processing each element of the b-STRUC.
+
+Note that any type of object is permitted as an element of a
+b-STRUC, but the size information of the b-STRUC must
+include all bytes used to represent the elements.
+
+Containment of b-STRUCs within other b-STRUCs is
+permitted to any reasonable level. That is, a b-STRUC may
+contain as an element another b-STRUC, which contains
+another b-STRUC, and so on. All servers are required to
+handle such containment to at least a minimum depth of
+three.
+
+Examples of encoded structures appear in a later
+section.
+
+============================
+| Non-atomic Object: B-EDT |
+============================
+
+A b-EDT is the object used as the carrier for p-EDTs in
+transmission of semantic items. It is functionally
+identical to a b-STRUC, but has a different type code to
+permit it to be identified and converted to a semantic item
+instead of a p-STRUC. The format of a b-EDT follows.
+
+11000011 size-bytes data-bytes
+ABCDEFGH
+
+As with all non-atomic types, ABC=110 to identify this
+as such, and DEFGH=00011 to specify a b-EDT. The objects in
+the data-bytes are decoded as for b-STRUCs. However, the
+first object must decode to a p-INT or p-STRING and the
+second to a p-INT, to conform to the format of p-EDTs.
+
+===============================
+| Non-atomic Object: b-REPEAT |
+===============================
+
+The b-REPEAT object is never translated directly into
+an item. It is legal only as a component of an enclosing
+b-STRUC, b-USTRUC, b-EDT, or b-REPEAT.
+A b-REPEAT is used to
+concisely specify a set of elements to be treated as if they
+appeared in the enclosing structure in place of the
+b-REPEAT. This provides a mechanism for encoding a sequence
+of identical data items or patterns efficiently for
+transmission.
+
+A common example of this would be in transmission of
+text, where line images containing long sequences of spaces,
+or pages containing multiple carriage-return, line-feed
+pairs, are often encountered. Such sequences could be
+encoded as an appropriate b-REPEAT to compact the data for
+transmission. The format of a b-REPEAT is as follows.
+
+11000100   -- identifying the object as a
+              b-REPEAT
+size-bytes -- the standard non-atomic object
+              size information
+countspec  -- an object which translates to a p-INT
+.
+.
+data       -- the objects which define the pattern
+.
+.
+
+The 'countspec' object must translate to a p-INT to
+specify the number of times that the following data pattern
+should be repeated in the object enclosing the b-REPEAT.
+
+The remaining objects in the b-REPEAT constitute the
+data pattern which is to be repeated. The decoding of the
+enclosing structure will be continued as if the data pattern
+objects appeared 'countspec' times in place of the b-REPEAT.
+Zero repeat counts are permitted, for generality. They
+cause no objects to be simulated in the enclosing structure.
+
+An encoder does not have to use b-REPEATs at all, if
+simplicity of coding outweighs the benefits of data
+compression. In message services, for example, an encoder
+might limit itself to only compressing long text strings. It
+is important for compatibility, however, to have the ability
+in the decoders to handle b-REPEATs.
+
+===============================
+| Non-atomic Object: B-USTRUC |
+===============================
+
+The b-USTRUC (Uniform Structure) object type is
+provided to enable servers to convey the fact that a p-STRUC
+being transferred contains items of only a single type.
The
+most common example would involve a b-USTRUC which
+translates to a p-STRUC of only p-CHARs, and hence may be
+considered to be a p-STRING. Servers may use this
+information to assist them in decoding objects efficiently.
+No server is required to generate b-USTRUCs.
+
+The internal construction of a b-USTRUC is identical to
+that of a b-STRUC, except for the type-byte. The format of a
+b-USTRUC follows.
+
+11000101 size-bytes data-bytes
+ABCDEFGH
+
+ABC=110 to identify a non-atomic object. DEFGH=00101
+specifies the object as a b-USTRUC.
+
+===============================
+| Non-atomic Object: b-STRING |
+===============================
+
+The b-STRING object is included to permit explicit
+specification of a structure as a p-STRING. This
+information will permit receiving servers to process the
+incoming structure more efficiently. A b-STRING is
+formatted similarly to a b-USTRUC, except that its type-byte
+identifies the object as a b-STRING. The normal sizing
+information is followed by a stream of bytes which are
+interpreted as b-CHAR7s, ignoring the high-order bit. The
+format of a b-STRING follows.
+
+11000110 size-bytes data-bytes
+ABCDEFGH
+
+ABC=110 to identify a non-atomic object. DEFGH=00110
+specifies the object as a b-STRING.
+
+ -18-
+
+VI.5 -- Structure Translation Rules
+
+
+A b-STRUC is translated into a p-STRUC. This is
+performed by translating each object of the b-STRUC into its
+corresponding item, and saving it for inclusion in the
+p-STRUC being generated. A b-USTRUC is handled similarly,
+but the coding programs may utilize the information that the
+resultant p-STRUC will contain items of uniform type. The
+preferred method of coding p-STRINGs is to use b-USTRUCs.
+
+If all of the elements of the resultant p-STRUC are
+p-CHARs, it is presented to the user of the decoder as a
+p-STRING. A p-STRING should be considered to be a synonym
+for a p-STRUC containing only characters.
It need not
+necessarily exist at particular sites, which would instead
+present p-STRUCs of p-CHARs to their application programs.
+
+The object b-REPEAT is handled in a special fashion
+when encountered as an element. When this occurs, the data
+pattern of the b-REPEAT is translated into a sequence of
+items, and that sequence is repeated in the next higher
+level as many times as specified in the b-REPEAT.
+Therefore, b-REPEATs are legal only as elements of a
+surrounding b-STRUC, b-USTRUC, b-EDT, or b-REPEAT.
+
+In encoding a p-STRUC or p-STRING for transmission, a
+translator may use b-REPEATs as desired to effect data
+compression, but their use is not mandatory. Similarly,
+b-STRINGs may be used, but are not mandatory.
+
+A b-EDT is translated into a p-EDT to identify it as a
+carrier for a semantic item. Otherwise, it is treated
+identically to a b-STRUC.
+
+
+VI.6 -- Translation Summary
+
+
+The following table summarizes the possible
+translations between primitive items and objects.
+
+p-INT    <--> b-LINTEGER, b-SINTEGER
+p-STRING <--> b-STRING, b-STRUC, b-USTRUC
+p-STRUC  <--> b-STRING, b-STRUC, b-USTRUC
+p-BITS   <--> b-SBITSTR, b-LBITSTR
+p-CHAR   <--> b-CHAR7
+p-BOOL   <--> b-BOOL
+p-EMPTY  <--> b-EMPTY
+p-XTRA   <--> b-XTRA
+p-EDT    <--> b-EDT (all semantic items)
+-none-   <--> b-PADDING
+-none-   <--> b-REPEAT (only within structure)
+
+Note that all semantic items are represented as p-EDTs
+which always exist as b-EDTs in byte-stream format.
+
+ -19-
+VI.7 -- Structure Coding Examples
+
+
+The following stream transmits a b-STRUC containing 3
+b-SINTEGERs, with values 1, 2, and 3, representing a p-STRUC
+containing three p-INTs, i.e. (1 2 3).
+
+11000010 -- b-STRUC
+00000011 -- size=3
+10000001 -- b-SINTEGER=1
+10000010 -- b-SINTEGER=2
+10000011 -- b-SINTEGER=3
+
+The next example represents a b-STRUC containing the
+characters X and Y, followed by the b-LINTEGER 10,
+representing a p-STRUC of 2 p-CHARs and a p-INT, i.e., ('X'
+'Y' 10).
Note that the p-INT prevents considering this a
+p-STRING.
+
+11000010 -- b-STRUC
+00000100 -- size=4
+01011000 -- b-CHAR7 'X'
+01011001 -- b-CHAR7 'Y'
+11100001 -- b-LINTEGER
+00001010 -- 10
+
+Note that a better way to send this p-STRUC would be to
+represent the integer as a b-SINTEGER, as shown below.
+
+11000010 -- b-STRUC
+00000011 -- size=3
+01011000 -- b-CHAR7 'X'
+01011001 -- b-CHAR7 'Y'
+10001010 -- b-SINTEGER=10
+
+The next example shows a b-STRUC of b-CHAR7s. It is
+the translation of the p-STRING "HELLO".
+
+11000010 -- b-STRUC
+00000101 -- size=5
+01001000 -- b-CHAR7 'H'
+01000101 -- b-CHAR7 'E'
+01001100 -- b-CHAR7 'L'
+01001100 -- b-CHAR7 'L'
+01001111 -- b-CHAR7 'O'
+
+This datum could also be transmitted as a b-STRING.
+Note that the character bytes are not necessarily b-CHAR7s,
+since the high-order bit is ignored.
+
+11000110 -- b-STRING
+00000101 -- size=5
+01001000 -- 'H'
+01000101 -- 'E'
+01001100 -- 'L'
+01001100 -- 'L'
+01001111 -- 'O'
+
+ -20-
+To encode a p-STRING containing 20 carriage-return
+line-feed pairs, the following b-STRUC containing a b-REPEAT
+could be used.
+
+11000010 -- b-STRUC
+00000101 -- size=5
+11000100 -- b-REPEAT
+00000011 -- size=3
+10010100 -- count, b-SINTEGER=20
+00001101 -- b-CHAR7, 'CR'
+00001010 -- b-CHAR7, 'LF'
+
+To encode a p-STRUC of p-INTs, where the sequence
+consists of a single 1 followed by thirty 0's,
+the following b-STRUC could be used.
+
+11000010 -- b-STRUC
+00000101 -- size=5
+10000001 -- b-SINTEGER=1
+11000100 -- b-REPEAT
+00000010 -- size=2
+10011110 -- count, b-SINTEGER=30
+10000000 -- b-SINTEGER=0
+
+
+VII. A GENERAL DATA TRANSFER SCHEME
+
+
+This section considers a possible scheme for extending
+the concept of a data translator into a multi-purpose data
+transfer mechanism.
+
+The proposed environment would provide a set of
+primitive items, including those enumerated herein but
+extended as necessary to accommodate a variety of
+applications.
Communication between processes would be
+defined solely in terms of these items, and would
+specifically avoid any consideration of the actual formats
+in which the data is transferred.
+
+A repertoire of translators would be provided, one of
+which is the MSDTP machinery, for use in converting items to
+any of a number of transmission formats. Borrowing a
+concept from radio terminology, each translator would be
+analogous to a different type of modulation scheme, to be
+used to transfer data through some communications medium.
+Such media could be an eight-bit byte-oriented connection,
+a 36-bit connection, etc., and could conceivably have other
+distinguishing features, such as bandwidth, cost, and delay.
+For each medium which a site supports, it would provide its
+programmers with a module for performing the translations
+required.
+
+
+
+
+ -21-
+
+Certain media or translators might not handle various
+items. For example, the MSDTP does not handle items which
+might be termed p-FLOATs, p-COMPLEXs, p-ARRAYs, and so on. In
+addition, the efficiency of various media for transfer of
+specific items may differ drastically. MSDTP, for example,
+transfers data frequently used in message handling very
+efficiently, but is relatively poor at transfer of very
+large or deep tree structures.
+
+Available at each site as a process or subroutine
+package would be a module responsible for interfacing with
+its counterpart at the other end of the medium. These
+modules would use a protocol, not yet defined, to match
+their capabilities, and choose a particular medium and
+translator, when more than one exists, for transfer of data
+items.
+
+Such a facility could totally insulate applications
+from the need to consider encoding formats, machine
+differences, and so on, as well as eliminate duplication of
+effort in producing such facilities for every new project
+which requires them.
In addition, as new translators or media are
+introduced, they would become immediately available to
+existing users without reprogramming.
+
+Implementation of such a protocol should not be very
+difficult or time-consuming, since it need not be very
+sophisticated in choosing the most appropriate transfer
+mechanism in initial implementations. The system is
+inherently upward-compatible and easily expandable.
+
+
+ -22- |
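The structure translation rules of section VI.5 (recursive decoding of elements, b-REPEAT expansion into the enclosing structure, and presentation of an all-character p-STRUC as a p-STRING) can be illustrated with a small decoder. What follows is only a sketch and not part of the RFC. The names `PChar`, `decode`, `decode_object` and `decode_elements` are hypothetical. It assumes, from the examples in VI.7 alone, that size-bytes is a single byte no larger than 127 and that a byte of the form 100xxxxx is a small non-negative b-SINTEGER. The general size-bytes format, negative integers, b-LINTEGER, b-EDT and the other atomic types are defined outside this section and are deliberately rejected.

```python
# Type-bytes as given in the object formats of section VI.
B_STRUC = 0b11000010
B_REPEAT = 0b11000100
B_USTRUC = 0b11000101
B_STRING = 0b11000110


class PChar(str):
    """A single p-CHAR, kept distinct from a multi-character p-STRING."""


def decode_elements(buf, pos, end):
    """Decode the objects in buf[pos:end] into one flat list of items."""
    items = []
    while pos < end:
        produced, pos = decode_object(buf, pos)
        items.extend(produced)  # a b-REPEAT may contribute 0..n items
    if pos != end:
        raise ValueError("element overruns its enclosing structure")
    return items


def decode_object(buf, pos):
    """Decode one object at buf[pos]; return (list_of_items, next_pos)."""
    tag = buf[pos]
    if (tag & 0b10000000) == 0:           # 0xxxxxxx: b-CHAR7
        return [PChar(chr(tag))], pos + 1
    if (tag & 0b11100000) == 0b10000000:  # ASSUMPTION: 100xxxxx is a small
        return [tag & 0b00011111], pos + 1  # non-negative b-SINTEGER
    if tag not in (B_STRUC, B_USTRUC, B_STRING, B_REPEAT):
        raise ValueError(f"object type {tag:08b} not handled by this sketch")
    if pos + 1 >= len(buf):
        raise ValueError("missing size byte")
    size = buf[pos + 1]                   # ASSUMPTION: one size byte, <= 127
    if size > 127:
        raise ValueError("multi-byte sizes not handled by this sketch")
    start, end = pos + 2, pos + 2 + size
    if end > len(buf):
        raise ValueError("structure is truncated")
    if tag == B_STRING:                   # bytes are characters, high bit ignored
        return ["".join(chr(b & 0x7F) for b in buf[start:end])], end
    elems = decode_elements(buf, start, end)  # recursive call per element
    if tag == B_REPEAT:
        if not elems or not isinstance(elems[0], int):
            raise ValueError("countspec must translate to a p-INT")
        return elems[1:] * elems[0], end  # pattern repeated 'countspec' times
    if elems and all(isinstance(e, PChar) for e in elems):
        return ["".join(elems)], end      # p-STRUC of p-CHARs is a p-STRING
    return [elems], end                   # general p-STRUC as a Python list


def decode(buf):
    """Decode a byte stream holding exactly one top-level object."""
    if buf[0] == B_REPEAT:
        raise ValueError("b-REPEAT is legal only inside a structure")
    items, pos = decode_object(buf, 0)
    if pos != len(buf):
        raise ValueError("trailing bytes after object")
    return items[0]


# The 20 carriage-return line-feed pairs from the coding examples.
crlf = bytes([0b11000010, 0b00000101, 0b11000100, 0b00000011,
              0b10010100, 0b00001101, 0b00001010])
assert decode(crlf) == "\r\n" * 20
```

Returning a list of items from `decode_object` is a design choice of this sketch: it lets a b-REPEAT, including one with a zero count, splice its expansion into the enclosing structure without special handling in the caller.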