doc: Add RFC documents

author: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committer: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit: 4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree: e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc683.txt
parent: ea76e11061bda059ae9f9ad130a9895cc85607db (diff)
1 files changed, 175 insertions, 0 deletions
diff --git a/doc/rfc/rfc683.txt b/doc/rfc/rfc683.txt
new file mode 100644
index 0000000..f327ab5
--- /dev/null
+++ b/doc/rfc/rfc683.txt
@@ -0,0 +1,175 @@
+
+                              RFC 683, NIC 32251
+
+                FTPSRV - TENEX FTP EXTENSIONS FOR PAGED FILES
+
+                        R. Clements - BBN - 3 April 75
+
+
+
+
+
+1   Introduction
+
+    In response to a long-known need for the ability to transfer TENEX paged
+files over the net via FTP, the TENEX FTP implementation has been extended.
+
+    This implementation is an extension to the "OLD" protocol (RFC 354).  It
+was built after useful discussions with Postel, Neigus, et al.  I do not mean
+to imply that they agreed that this implementation is correct, nor for that
+matter do I feel it is correct.  A "correct" implementation will be negotiated
+and implemented in the "NEW" protocol (RFC 542), if funding ever appears for
+that task.
+
+2   The Problem(s)
+
+    This extension attacks two separate problems: Network reliability and
+TENEX disk file format's incompatibility with FTP.  A checksummed and
+block-sequence-numbered transmission mode is seriously needed, in my opinion.
+This mode should also allow data compression.
+
+    It is also necessary to handle paged, holey TENEX files.  This latter
+problem, seriously needed for NLS, is the motivation for the current
+extension.
+
+    The former problem requires a new MODE command, if done correctly;
+probably two MODEs, to allow data compression in addition to checksumming.
+Actually, I think that is the tip of an iceberg which grows as 2**N for
+additional sorts of modes, so maybe some mode combination system needs to be
+dreamed up.  Cf the AN, AT, AC, EN, ET, EC TYPEs.  Also, one should be able to
+use MODE B and MODE C together (NEW protocol) to gain both the compression and
+restart facilities if one wanted.
+
+    The second problem, TENEX files, is probably a new kind of STRUcture.
+However, it should be possible to send a paper tape to a disk file, or vice
+versa, with the transfer looking like a paged file; so perhaps we are dealing
+with a data representation TYPE.  This argument is a bit strained, though, so
+a paged STRUcture is quite likely correct.  I admit to feeling very unsure
+about what is a MODE, what is a TYPE and what is a STRUcture.
+
+3   The (Incorrect) choices made
+
+    Having decided that new MODEs and STRUctures were needed, I instead
+implemented the whole thing as a single new TYPE.  After all, I rationalize,
+checksumming the data on the network (MODE) and representing the data in the
+processing system as a checksummed TYPE are really just a matter of where you
+draw the imaginary line between the net and the data.  Also, a single new TYPE
+command reduced the size of the surgery required on the FTP user and server
+programs.
+
+4   Implementation details
+
+    The name of the new TYPE is "XTP".  I propose this as a standard for all
+the Key Letter class of FTP commands: the "X" stands for "experimental" --
+agreed on between cooperating sites.  The letter after the "X" is signed out
+from the protocol deity by an implementor for a given system.  In this case,
+"T" is for TENEX.  Subsequent letter(s) distinguish among possibly multiple
+private values of the FTP command.  Here "P" is "Paged" type.
+
+   TYPE XTP is only implemented for STRU F, BYTE 36, and MODE S.
+
+    Information of TYPE XTP is transferred in chunks (I intentionally avoid the
+words RECORD and BLOCK) which consist of a header and some data.  The data in
+a chunk may be part of the data portion of the file being transferred, or it
+may be the FDB (File Descriptor Block) associated with the file.  
+
+5   Diversion: the TENEX Disk File
+
+    For those not familiar with the TENEX file system, a brief dissertation is
+included here to make the rest of the implementation meaningful.
+
+    A TENEX disk file consists of four things: a pathname, a page table, a
+(possibly empty) set of pages, and a set of attributes.
+
+    The pathname is specified in the RETR or STOR verb.  It includes the
+directory name, file name, file name extension, and version number.
+
+    The page table contains up to 2**18 entries.  Each entry may be EMPTY, or
+may point to a page.  If it is not empty, there are also some page-specific
+access bits; not all pages of a file need have the same access protection.
+
+    A page is a contiguous set of 512 words of 36 bits each.
+
+    The attributes of the file, in the FDB, contain such things as creation
+time, write time, read time, writer's byte-size, end of file pointer, count of
+reads and writes, backup system tape numbers, etc.
+
+    NOTE: there is NO requirement that pages in the page table be contiguous.
+There may be empty page table slots between occupied ones.  Also, the end of
+file pointer is simply a number.  There is no requirement that it in fact
+point at the "last" datum in the file. Ordinary sequential I/O calls in TENEX
+will cause the end of file pointer to be left after the last datum written,
+but other operations may cause it not to be so, if a particular programming
+system so requires.
+
+    In fact both of these special cases, "holey" files and end-of-file
+pointers not at the end of the file, occur with NLS data files.  These files
+were the motivation for the new TYPE.
+
+6   Meanwhile, back at the implementation,...
+
+    Each chunk of information has a header.  The first byte, which is the
+first word (since TYPE XTP is only implemented for BYTE 36) of the chunk, is a
+small number, currently 6, which is the number of following words which are
+still in the header.  Next come those six words, and then come some data
+words.
+
+   The six header words are:
+      Word 1: a checksum.
+         This is a one's complement sum (magnitude and end-around carry) of 
+         the six header words and the following data words (but not the 
+         leading "6" itself).  The sum of all words including the checksum 
+         must come out + or - zero.
+      Word 2: A sequence number.
+         The first chunk is number 1, the second is number 2, etc.
+      Word 3: NDW,
+         the number of data words in this chunk, following the header.  Thus
+         the total length of the chunk is 1 (the word containing NHEAD) + 
+         NHEAD +NDW.  The checksum checks all but the first of these.
+      Word 4: Page number.
+         If the data is a disk file page, this is the number of that page in
+         the file's page map.  Empty pages (holes) in the file are simply 
+         not sent.  Note that a hole is NOT the same as a page of zeroes.
+      Word 5: ACCESS.
+         The access bits associated with the page in the file's page map.  
+         (This full word quantity is put into AC2 of an SPACS by the program
+         reading from net to disk.)
+      Word 6: TYPE.
+         A code for what type of chunk this is. Currently, only type zero 
+         for a data page, and type -3 for an FDB are sent.
+
+    After the header are NDW data words.  NDW is currently either 1000 octal
+for a data page or 25 octal for an FDB.  Trailing zeroes in a disk file page
+will soon be discarded, making NDW less than 1000 in that case.  The receiving
+portions of FTP server and user will accept these shortened pages.  The sender
+doesn't happen to send them that way yet.
+
+Verification is performed such that an error is reported if either:
+   The checksum fails,
+   The sequence number is not correct,
+   NDW is unreasonable for the given chunk type, or
+   The network file ends at some point other than immediately following the 
+data portion of an FDB chunk.
+
+7   Closing comments
+
+    This FTP server and user are in operation on all the BBN systems and at
+some other sites -- the user being more widely distributed since fewer sites
+have made local modifications to the user process.
+
+    I believe the issues of checksumming and sequencing should be addressed
+for the "NEW" protocol.  I hope the dissertation on TENEX files has been
+useful to users of other systems.  It may explain my lack of comprehension of
+the "record" concept, for example.  A TENEX file is just a bunch of words
+pointed to by a page table.  If those words contain CRLF's, fine -- but that
+doesn't mean "record" to TENEX. I think this RFC also points out clearly that
+net data transfers are implemented like the layers of an onion: some
+characters are packaged into a line.  Some lines are packaged into a file.
+The file is broken into other manageable units for transmission.  Those units
+have compression applied to them.  The units may be flagged by restart markers
+(has anyone actually done that?).  The compressed units may be checksummed,
+sequence numbered, date-and-time stamped, and flagged special delivery.  On
+the other end, the process is reversed.  Perhaps MODE, TYPE, and STRU don't
+really adequately describe the situation. This RFC was written to allow
+implementors to interface with the new FTP server at TENEX sites which install
+it.  It is also really a request for comments on some of these other issues.
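
The chunk layout and the four verification rules described in sections 4 and 6
of the added RFC can be sketched in Python. This is only an illustration under
stated assumptions, not part of rfc683.txt and not the TENEX code: the names
ones_sum, build_chunk, parse_chunk and read_file are hypothetical, words are
modelled as Python integers masked to 36 bits, and the chunk type -3 is assumed
to be held as a 36-bit two's complement value. The RFC only says NDW must be
"reasonable"; the bounds used here (exactly 25 octal for an FDB, at most 1000
octal for a data page) are one plausible reading.

```python
MASK36 = (1 << 36) - 1        # a TENEX word is 36 bits wide
NHEAD = 6                     # header words that follow the leading count word
TYPE_DATA = 0                 # chunk carries a disk file page
TYPE_FDB = (-3) & MASK36      # assumption: -3 held as 36-bit two's complement
PAGE_WORDS = 0o1000           # 512 words per page
FDB_WORDS = 0o25              # size of an FDB chunk's data portion


def ones_sum(words):
    """One's complement sum of 36-bit words, with end-around carry."""
    total = 0
    for w in words:
        total += w & MASK36
        total = (total & MASK36) + (total >> 36)  # fold the carry back in
    return total


def build_chunk(seq, page, access, ctype, data):
    """Return one chunk as a list of words: NHEAD, six header words, data."""
    body = [seq, len(data), page, access, ctype] + list(data)
    checksum = ones_sum(body) ^ MASK36  # complement, so the total is minus zero
    return [NHEAD, checksum] + body


def parse_chunk(words, pos, expected_seq):
    """Verify the chunk starting at words[pos]; return (next_pos, page, access, type, data)."""
    nhead = words[pos]
    if nhead < NHEAD:
        raise ValueError("header too short")
    header = words[pos + 1:pos + 1 + nhead]
    if len(header) != nhead:
        raise ValueError("truncated chunk")
    _checksum, seq, ndw, page, access, ctype = header[:6]
    if ctype == TYPE_FDB:
        reasonable = ndw == FDB_WORDS
    elif ctype == TYPE_DATA:
        reasonable = 0 <= ndw <= PAGE_WORDS  # shortened pages are accepted
    else:
        reasonable = False                   # assumption: reject unknown types
    if not reasonable:
        raise ValueError("unreasonable NDW or chunk type")
    end = pos + 1 + nhead + ndw
    data = words[pos + 1 + nhead:end]
    if len(data) != ndw:
        raise ValueError("truncated chunk")
    # The leading count word is excluded; the result must be plus or minus zero.
    if ones_sum(header + data) not in (0, MASK36):
        raise ValueError("checksum failure")
    if seq != expected_seq:
        raise ValueError("sequence number is not correct")
    return end, page, access, ctype, data


def read_file(words):
    """Read a whole transfer; it must end immediately after an FDB chunk."""
    pages, fdb, pos, seq, last = {}, None, 0, 1, None
    while pos < len(words):
        pos, page, access, ctype, data = parse_chunk(words, pos, seq)
        if ctype == TYPE_FDB:
            fdb = data
        else:
            pages[page] = (access, data)  # holes are never sent, so never stored
        last, seq = ctype, seq + 1
    if last != TYPE_FDB:
        raise ValueError("transfer did not end after an FDB chunk")
    return pages, fdb
```

Keeping the pages in a dictionary keyed by page number mirrors the point made
in the RFC that a hole is not the same as a page of zeroes: a page that was
not sent simply has no entry.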