author | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
---|---|---|
committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
commit | 4bfd864f10b68b71482b35c818559068ef8d5797 (patch) | |
tree | e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc683.txt | |
parent | ea76e11061bda059ae9f9ad130a9895cc85607db (diff) |
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc683.txt')
-rw-r--r-- | doc/rfc/rfc683.txt | 175 |
1 files changed, 175 insertions, 0 deletions
diff --git a/doc/rfc/rfc683.txt b/doc/rfc/rfc683.txt
new file mode 100644
index 0000000..f327ab5
--- /dev/null
+++ b/doc/rfc/rfc683.txt
@@ -0,0 +1,175 @@
+
+ RFC 683, NIC 32251
+
+ FTPSRV - TENEX FTP EXTENSIONS FOR PAGED FILES
+
+ R. Clements - BBN - 3 April 75
+
+
+
+
+
+1 Introduction
+
+ In response to a long-known need for the ability to transfer TENEX paged
+files over the net via FTP, the TENEX FTP implementation has been extended.
+
+ This implementation is an extension to the "OLD" protocol (RFC 354). It
+was built after useful discussions with Postel, Neigus, et al. I do not mean
+to imply that they agreed that this implementation is correct, nor for that
+matter do I feel it is correct. A "correct" implementation will be negotiated
+and implemented in the "NEW" protocol (RFC 542), if funding ever appears for
+that task.
+
+2 The Problem(s)
+
+ This extension attacks two separate problems: Network reliability and
+TENEX disk file format's incompatibility with FTP. A checksummed and
+block-sequence-numbered transmission mode is seriously needed, in my opinion.
+This mode should also allow data compression.
+
+ It is also necessary to handle paged, holey TENEX files. This latter
+problem, seriously needed for NLS, is the motivation for the current
+extension.
+
+ The former problem requires a new MODE command, if done correctly;
+probably two MODEs, to allow data compression in addition to checksumming.
+Actually, I think that is the tip of an iceberg which grows as 2**N for
+additional sorts of modes, so maybe some mode combination system needs to be
+dreamed up. Cf the AN, AT, AC, EN, ET, EC TYPEs. Also, one should be able to
+use MODE B and MODE C together (NEW protocol) to gain both the compression and
+restart facilities if one wanted.
+
+ The second problem, TENEX files, are probably a new kind of STRUcture.
+However, it should be possible to send a paper tape to a disk file, or vice
+versa, with the transfer looking like a paged file; so perhaps we are dealing
+with a data representation TYPE. This argument is a bit strained, though, so
+a paged STRUcture is quite likely correct. I admit to feeling very unsure
+about what is a MODE, what is a TYPE and what is a STRUcture.
+
+3 The (Incorrect) choices made
+
+ Having decided that new MODEs and STRUctures were needed, I instead
+implemented the whole thing as a single new TYPE. After all, I rationalize,
+checksumming the data on the network (MODE) and representing the data in the
+processing system as a checksummed TYPE are really just a matter of where you
+draw the imaginary line between the net and the data. Also, a single new TYPE
+command reduced the size of the surgery required on the FTP user and server
+programs.
+
+4 Implementation details
+
+ The name of the new TYPE is "XTP". I propose this as a standard for all
+the Key Letter class of FTP commands: the "X" stands for "experimental" --
+agreed on between cooperating sites. The letter after the "X" is signed out
+from the protocol deity by an implementor for a given system. In this case,
+"T" is for TENEX. Subsequent letter(s) distinguish among possibly multiple
+private values of the FTP command. Here "P" is "Paged" type.
+
+ TYPE XTP is only implemented for STRU F, BYTE 36, and MODE S.
+
+ Information of TYPE XTP is transfered in chunks (I intentionally avoid the
+words RECORD and BLOCK) which consist of a header and some data. The data in
+a chunk may be part of the data portion of the file being transferred, or it
+may be the FDB (File Descriptor Block) associated with the file.
+
+5 Diversion: the TENEX Disk File
+
+ For those not familiar with the TENEX file system, a brief dissertation is
+included here to make the rest of the implementation meaningful.
+
+ A TENEX disk file consists of four things: a pathname, a page table, a
+(possibly empty) set of pages, and a set of attributes.
+
+ The pathname is specified in the RETR or STOR verb. It includes the
+directory name, file name, file name extension, and version number.
+
+ The page table contains up to 2**18 entries. Each entry may be EMPTY, or
+may point to a page. If it is not empty, there are also some page-specific
+access bits; not all pages of a file need have the same access protection.
+
+ A page is a contiguous set of 512 words of 36 bits each.
+
+ The attributes of the file, in the FDB, contain such things as creation
+time, write time, read time, writer's byte-size, end of file pointer, count of
+reads and writes, backup system tape numbers, etc.
+
+ NOTE: there is NO requirement that pages in the page table be contiguous.
+There may be empty page table slots between occupied ones. Also, the end of
+file pointer is simply a number. There is no requirement that it in fact
+point at the "last" datum in the file. Ordinary sequential I/O calls in TENEX
+will cause the end of file pointer to be left after the last datum written,
+but other operations may cause it not to be so, if a particular programming
+system so requires.
+
+ In fact both of these special cases, "holey" files and end-of-file
+pointers not at the end of the file, occur with NLS data files. These files
+were the motivation for the new TYPE.
+
+6 Meanwhile, back at the implementation,...
+
+ Each chunk of information has a header. The first byte, which is the
+first word (since TYPE XTP is only implemented for BYTE 36) of the chunk, is a
+small number, currently 6, which is the number of following words which are
+still in the header. Next come those six words, and then come some data
+words.
+
+ The six header words are:
+ Word 1: a checksum.
+ This is a one's complement sum (magnitude and end-around carry) of
+ the six header words and the following data words (but not the
+ leading "6" itself). The sum of all words including the checksum
+ must come out + or - zero.
+ Word 2: A sequence number.
+ The first chunk is number 1, the second is number 2, etc.
+ Word 3: NDW,
+ the number of data words in this chunk, following the header. Thus
+ the total length of the chunk is 1 (the word containing NHEAD) +
+ NHEAD +NDW. The checksum checks all but the first of these.
+ Word 4: Page number.
+ If the data is a disk file page, this is the number of that page in
+ the file's page map. Empty pages (holes) in the file are simply
+ not sent. Note that a hole is NOT the same as a page of zeroes.
+ Word 5: ACCESS.
+ The access bits associated with the page in the file's page map.
+ (This full word quantity is put into AC2 of an SPACS by the program
+ reading from net to disk.)
+ Word 6: TYPE.
+ A code for what type of chunk this is. Currently, only type zero
+ for a data page, and type -3 for an FDB are sent.
+
+ After the header are NDW data words. NDW is currently either 1000 octal
+for a data page or 25 octal for an FDB. Trailing zeroes in a disk file page
+will soon be discarded, making NDW less than 1000 in that case. The receiving
+portions of FTP server and user will accept these shortened pages. The sender
+doesn't happen to send them that way yet.
+
+Verification is performed such that an error is reported if either:
+ The checksum fails,
+ The sequence number is not correct,
+ NDW is unreasonable for the given chunk type, or
+ The network file ends at some point other than immediately following the
+data portion of an FDB chunk.
+
+7 Closing comments
+
+ This FTP server and user are in operation on all the BBN systems and at
+some other sites -- the user being more widely distributed since fewer sites
+have made local modifications to the user process.
+
+ I believe the issues of checksumming and sequencing should be addressed
+for the "NEW" protocol. I hope the dissertation on TENEX files has been
+useful to users of other systems. It may explain my lack of comprehension of
+the "record" concept, for example. A TENEX file is just a bunch of words
+pointed to by a page table. If those words contain CRLF's, fine -- but that
+doesn't mean "record" to TENEX. I think this RFC also points out clearly that
+net data transfers are implemented like the layers of an onion: some
+characters are packaged into a line. Some lines are packaged into a file.
+The file is broken into other managable units for transmission. Those units
+have compression applied to them. The units may be flagged by restart markers
+(has anyone actually done that?). The compressed units may be checksummed,
+sequence numbered, date-and-time stamped, and flagged special delivery. On
+the other end, the process is reversed. Perhaps MODE, TYPE, and STRU don't
+really adequately describe the situation. This RFC was written to allow
+implementors to interface with the new FTP server at TENEX sites which install
+it. It is also really a request for comments on some of these other issues.
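
Note on the chunk format in section 6 of the added file: the header layout and the per-chunk verification rules can be illustrated with a small Python sketch. This is not taken from the TENEX sources. The function and constant names are hypothetical, 36-bit words are modelled as Python integers, and the sketch assumes that type -3 is stored as a two's complement 36-bit value and that any NDW up to 1000 octal counts as "reasonable" for a data page. The end-of-stream rule (the file must end right after an FDB chunk) is left out.

```python
MASK36 = (1 << 36) - 1      # all ones, i.e. "minus zero" in 36-bit one's complement
NHEAD = 6                   # header words after the leading count word
TYPE_DATA = 0               # chunk carries a disk file page
TYPE_FDB = -3 & MASK36      # assumption: -3 encoded as two's complement
PAGE_WORDS = 0o1000         # 512 words per page
FDB_WORDS = 0o25            # FDB length given in the RFC


def ones_sum(words):
    """One's complement sum of 36-bit words with end-around carry."""
    total = 0
    for w in words:
        total += w & MASK36
        total = (total & MASK36) + (total >> 36)  # fold the carry back in
    return total


def build_chunk(seq, page, access, ctype, data):
    """Return [NHEAD, checksum, seq, NDW, page, access, type, data...]."""
    body = [seq, len(data), page, access, ctype] + list(data)
    checksum = ~ones_sum(body) & MASK36  # makes the grand total all ones (-0)
    return [NHEAD, checksum] + body


def verify_chunk(chunk, expected_seq):
    """Apply the per-chunk checks; raise ValueError on the first failure."""
    if chunk[0] != NHEAD:
        raise ValueError("unexpected header length")
    _checksum, seq, ndw, page, access, ctype = chunk[1:1 + NHEAD]
    data = chunk[1 + NHEAD:]
    # Everything except the leading count word must sum to +0 or -0.
    if ones_sum(chunk[1:]) not in (0, MASK36):
        raise ValueError("checksum failed")
    if seq != expected_seq:
        raise ValueError("sequence number is not correct")
    if ctype == TYPE_DATA:
        reasonable = ndw <= PAGE_WORDS   # shortened pages are accepted
    elif ctype == TYPE_FDB:
        reasonable = ndw == FDB_WORDS
    else:
        reasonable = False
    if not reasonable or len(data) != ndw:
        raise ValueError("NDW is unreasonable for this chunk type")
    return ctype, page, access, data
```

A receiver built along these lines would call `verify_chunk` with an expected sequence number starting at 1 and increasing by one per chunk, and would additionally check that the last chunk seen before the connection closes is an FDB chunk.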