doc: Add RFC documents

author: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
committer: Thomas Voss <mail@thomasvoss.com> 2024-11-27 20:54:24 +0100
commit: 4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree: e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc373.txt
parent: ea76e11061bda059ae9f9ad130a9895cc85607db (diff)
1 files changed, 218 insertions, 0 deletions
diff --git a/doc/rfc/rfc373.txt b/doc/rfc/rfc373.txt
new file mode 100644
index 0000000..fe75d79
--- /dev/null
+++ b/doc/rfc/rfc373.txt
@@ -0,0 +1,218 @@
+
+
+
+
+
+
+NWG/RFC #373                                       14 July 1972
+NIC 11058                                          SU-AI
+
+
+                        ARBITRARY CHARACTER SETS
+
+                            by John McCarthy
+
+It would be nice to be able to have documents stored in computers that
+could include arbitrary characters and to be able to display them on
+any CRT screen, edit them using any keyboard, and print them on any
+printer.  The object of this memorandum is to suggest how to get there
+from here with special reference to the ARPA network.
+
+Where are we now?
+
+   (1) At present, there is 96 character ASCII, and everyone agrees that
+   it should be included in any larger set.
+
+   (2) Many installations are dependent on 64 character sets which do not
+   even include the lower case latin alphabet.
+
+   (3) At the Stanford Artificial Intelligence Laboratory, we have a 114
+   character set that includes 96 character ASCII and which is
+   implemented in our keyboards, displays, and line printer
+
+   (4) Printers are becoming available that get their character designs
+   out of memory, for example, the Xerox XGP printer, one of which we are
+   getting.
+
+   (5) The IMLAC type display has the character designs in main memory so
+   that changing the displayed set is just a matter of reloading the
+   memory.
+
+   (6) Many display systems share the character generator among many
+   display units.  In some of these, e.g. the Datadisc, arbitrary sets
+   are probably feasible (using kludgery to be described later), but in
+   other systems, e.g. our III's arbitrary sets are not feasible.
+
+One possible approach to communication in expanded character sets is
+to produce an expanded standard set of characters, perhaps using 8 or
+9 bits and expect new equipment to implement this set.  This approach
+has the disadvantage that it will be very hard to get agreement on
+what the next step should be, and even if formal agreement is
+realized, many groups will find it in their interest to ignore the
+standard.
+
+
+
+
+
+                                                                [Page 1]
+
+NWG/RFC# 373                        JMC 14-JUL-72 12:41  11058
+ARBITRARY CHARACTER SETS  by John McCarthy
+
+Therefore, I would like to suggest that the next step be to arbitrary
+character sets.  I suggest implementing this in the following way:
+
+   (1) There be established a registry of characters.  Anyone can
+   register a new character.  Each character has a unique number, 17 bits
+   should be enough even to include Chinese.  Besides this, each
+   character has a name in ASCII usually mnemonic.  Finally, the
+   character has a design which is a picture on a 50 by 50 dot matrix.
+
+   (2) Besides the registry of characters, there is a registry of
+   characters sets, which different groups are using for different
+   classes of documents.  A registered character set has a registry
+   number and a table giving the correspondence between the character
+   codes as bit sequences and the registered character numbers.
+
+   (3) Associated with a document is a statement of the character code
+   used therein.  This may be one of the registered codes or it may
+   contain in addition modifications described by an auxiliary table
+   giving the code correspondence with registered character numbers.  A
+   character code may have an escape character that says that the next
+   character is described by its registry number.  The statement of the
+   character code may be a header on the document or the receiver may
+   have to learn it by some other means, e.g.  because its library
+   catalog entry contains this information.
+
+   (4) Devices such as printers and displays draw characters in different
+   ways and standardization doesn't seem feasible at present. Therefore,
+   it is necessary to provide a way of going from the standard
+   description of a character using a 50 by 50 dot matrix to whatever
+   method the device uses.  This is up to the programmers who are
+   supporting the device.  Some may choose to manually create files
+   describing how registered characters are implemented.  They may find
+   it too much work to provide for all the characters and to update their
+   files when new characters are registered.  Others will provide
+   programs for going from the registered descriptions to descriptions
+   compatible with their implementations.  Perhaps most will hand tailor
+   the characters most used and provide a program for the others.
+
+
+
+
+
+
+
+
+
+
+
+                                                                [Page 2]
+
+NWG/RFC# 373                        JMC 14-JUL-72 12:41  11058
+ARBITRARY CHARACTER SETS  by John McCarthy
+
+   (5) The easiest device to handle is the line printer because it is
+   slow.  At the beginning of the print job, the SPOOL program will look
+   up the character set and load the printer's memory with the character
+   designs used in the particular document.  Sometimes, it may have to go
+   through the network to one of the computers that stores the registry
+   in order to find out what to do.
+
+   (6) Display systems that have a character memory for each display unit
+   can be handled in about the same way.  Users will occasionally
+   experience delays when the display programs are surprised by
+   unfamiliar characters.
+
+   (7) Display systems that share character memories require more
+   complicated treatment.  The object is to keep the memory large enough
+   to keep all the characters that the current set of users is using and
+   to handle the required table lookups from the different character
+   codes in a nice way.  There will be limitations on the diversity of
+   character sets that can be in use simultaneously. Systems like the
+   Datadisc that only look up the character when it is first written can
+   be extended to work with large sets.  Systems that have to look up
+   each character code 30 times per second in order to maintain the
+   display won't work so well.
+
+I have no special ideas about how to make keyboards adaptable to
+arbitrary sets.  Each user may have to fend for himself.
+
+In this memorandum so far, I have ignored typography, i.e. the fact
+that in printed documents the same letter may be printed in many
+fonts.  Perhaps, each character in each font will require a separate
+registered description, but with a constant difference between the
+numbers of the same character in different fonts.  Installations will
+again have to decide what font distinctions they will implement.
+
+Some other issues that might be considered are whether means can be
+provided to adapt texts automatically to the line and page lengths of
+the different devices.
+
+It seems to me most likely that the typographical problems cannot be
+solved at this time, and it would be best to adopt conventions for
+registering character designs at this time, and leave typography for
+later.
+
+
+
+
+
+
+
+                                                                [Page 3]
+
+NWG/RFC# 373                        JMC 14-JUL-72 12:41  11058
+ARBITRARY CHARACTER SETS  by John McCarthy
+
+In my opinion, there is no real obstacle to establishing the registry
+in the ARPA network now, getting the standards organization to work,
+and being able to exchange documents in extended character sets as
+soon as the various installations can acquire the printers and display
+devices.
+
+It is the present policy of the Stanford Artificial Intelligence
+Laboratory to acquire no more devices that are wedded to fixed
+character sets.
+
+
+
+
+
+       [ This RFC was put into machine readable form for entry ]
+       [ into the online RFC archives by BBN Corp. under the   ]
+       [ direction of Alex McKenzie.                      1/97 ]
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+                                                                [Page 4]
+
author	Thomas Voss <mail@thomasvoss.com>	2024-11-27 20:54:24 +0100
committer	Thomas Voss <mail@thomasvoss.com>	2024-11-27 20:54:24 +0100
commit	4bfd864f10b68b71482b35c818559068ef8d5797 (patch)
tree	e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc373.txt
parent	ea76e11061bda059ae9f9ad130a9895cc85607db (diff)