Diffstat (limited to 'doc/rfc/rfc1003.txt')
-rw-r--r-- doc/rfc/rfc1003.txt 411
1 files changed, 411 insertions, 0 deletions
diff --git a/doc/rfc/rfc1003.txt b/doc/rfc/rfc1003.txt
new file mode 100644
index 0000000..0805cf9
--- /dev/null
+++ b/doc/rfc/rfc1003.txt
@@ -0,0 +1,411 @@
+
+Network Working Group Alan Katz
+Request for Comments: 1003 USC/ISI
+ March 1987
+
+
+ Issues in Defining an Equations Representation Standard
+
+
+Status of This Memo
+
+ This memo is intended to identify and explore issues in defining a
+ standard for the exchange of mathematical equations. No attempt is
+ made at a complete definition and more questions are asked than are
+ answered. Questions about the user interface are only addressed to
+ the extent that they affect interchange issues. Comments are
+ welcome. Distribution of this memo is unlimited.
+
+1. Introduction
+
+ Since the early days of the Arpanet, electronic mail has been in
+ wide use and many regard it as an essential tool. Numerous mailing
+ lists and newsgroups have sprung up over the years, allowing large
+ numbers of people all over the world to participate remotely in
+ discussions on a variety of topics. More recently, multimedia mail
+ systems have been developed which allow users to not only send and
+ receive text messages, but also those containing voice, bitmaps,
+ graphics, and other electronic media.
+
+ Most of us in the Internet community take electronic mail for
+ granted, but for the rest of the world, it is a brand new
+ capability. Many are not convinced that electronic mail will be
+ useful for them and may also feel it is just an infinite time sink
+ (as we all know, this is actually true). In particular, most
+ scientists (apart from computer scientists) do not yet use, or are
+ just beginning to use, electronic mail.
+
+ The current NSF supercomputer initiative may change this. Its
+ primary purpose is to provide remote supercomputer access to a much
+ greater number of scientists across the country. However, doing
+ this will involve the interconnection of many university-wide
+ networks to NSF supercomputer sites and therefore to the NSF
+ backbone network. Thus, in the very near future we will have a
+ large number of scientists in the country suddenly able to
+ communicate via electronic mail.
+
+ Generally, text-only mail has sufficed up until now. One can dream
+ of the day (not so far in the future) when everyone will have
+ bitmapped display workstations with multimedia mail systems, but we
+ can get by without it for now. I believe, however, that the new NSF
+ user community will find one other capability almost essential in
+ making electronic mail useful to them, and that is the ability to
+
+
+
+Katz [Page 1]
+
+RFC 1003 March 1987
+
+
+ include equations in messages.
+
+ A glance through any scientific journal will demonstrate the
+ importance of equations in scientific communication. Indeed, papers
+ in some fields seem to contain more mathematics than English. It is
+ hard to imagine that when people in these fields are connected into
+ an electronic mail community they will be satisfied with a mail
+ system which doesn't allow equations. Indeed, with the advent of
+ the NSF's Experimental Research in Electronic Submission (EXPRESS)
+ project, scientists will begin submitting manuscripts and project
+ proposals directly through electronic mail and the ability to handle
+ equations will be essential.
+
+ Currently, there exists no standard for the representation of
+ equations. In fact, there is not even agreement on what it is that
+ ought to be represented. Users of particular equation systems (such
+ as LaTeX or EQN) sometimes advocate just including source files of
+ that system in messages, but this may not be a good long-term
+ solution. With the new NSF community coming on line in the near
+ future, I feel the time is now right to try to define a standard
+ which will meet the present and future needs of the user community.
+
+ Such a standard should allow the interchange of equations via
+ electronic mail as well as be compatible with as many existing
+ systems as possible. It should be as general as possible, but still
+ efficiently represent those aspects of equations which are most
+ commonly used. One point to be kept in mind is that most equations
+ typesetting is currently being done by secretaries and professional
+ typesetters who do not know what the equations mean, only what they
+ look like. Although this is mainly a user interface consideration,
+ any proposed standard must not require the user to understand an
+ equation in order to type it in. We are not interested here in
+ representing mathematics, only displayed equations.
+
+ In this memo, I will try to raise issues that will need to be
+ considered in defining such a standard and to get a handle on what
+ it is that needs to be represented. Hopefully, this will form the
+ basis of a discussion leading eventually to a definition. Before
+ examining what it is that could be or should be represented in the
+ standard, we will first review the characteristics of some existing
+ systems.
+
+2. Existing Systems
+
+ There currently exist many incompatible systems which can handle
+ equations to a certain extent. Most of these are extensions to text
+ formatting systems to allow the inclusion of equations. As such,
+ general representation and standards considerations were not a major
+ concern when these systems were initially designed. We will examine
+ the three main types of systems: Directive systems, Symbolic
+ Language systems, and Full Display systems.
+
+ Some text editing facilities simply allow an expanded font set which
+ includes those symbols typically used in mathematics. I do not
+ consider these systems as truly able to handle equations since much
+ of mathematics cannot be represented. It takes more than the Greek
+ alphabet and an integral and square root symbol to make an equations
+ system.
+
+ Directive systems are those which represent equations and formatting
+ information in terms of directives embedded in the text. LaTeX and
+ EQN are two examples. LaTeX is a more friendly version of Knuth's
+ TeX system, while EQN is a preprocessor for Troff, a document
+ preparation system available under Unix.
+
+ With these Directive systems, it is usually necessary to actually
+ print out the document to see what the equations and formatted text
+ will look like, although there are on-screen previewers which run on
+ workstations such as the Sun. Directive systems have the advantage
+ that the source files are just text and can be edited with standard
+ text editors (such as Emacs) and transferred as text in standard
+ electronic messages (a big advantage considering existing mail
+ interconnectivity of the various user communities). Also, it is
+ relatively easy to make global changes with the help of your
+ favorite text editor (for example, to change all Greek alphas to
+ betas, or all integrals to summation signs, in a document). This
+ is generally impossible with the other types of systems described
+ below.
+
+ The primary disadvantage of these systems is that writing an
+ equation corresponds to writing a portion of a computer program.
+ The equations are sometimes hard to read, generally hard to edit,
+ and one may make syntax errors which are hard to identify. Also,
+ people who are not used to programming, and typesetters who do not
+ actually know what an equation means, only what it should look like,
+ find specifying an equation in this language very difficult and may
+ not be willing to put up with it.
+
+ Full Display Systems are those such as Xerox STAR and VIEWPOINT.
+ The user enters an equation using the keyboard and sees exactly that
+ equation displayed as it is typed. At all times, what is displayed
+ is exactly how things will look when it is printed out.
+ Unfortunately, VIEWPOINT does not allow the user to place any symbol
+ anywhere on the page. There are many things (such as putting dots
+ on indices) which are not possible. For those things which are
+ implemented, it works rather nicely.
+
+ Hockney's Egg is a display system which was developed at the UCLA
+ Physics Department and runs on the IBM PC. It has the advantage of
+ being able to put any character of any font anywhere on the screen,
+ thus allowing not only equations, but things like chemical diagrams.
+
+ Interleaf's Workstation Publishing Software system is not strictly
+ speaking an equations system, but equations may be entered via a cut
+ and paste method. At all times, what one sees is what will be
+ printed out and one may put any symbol anywhere on the page. The
+ problem with this system is that one HAS TO put everything in a
+ certain place. It sometimes takes an enormous amount of work to get
+ things to be positioned correctly and to look nice.
+
+ Generally, Full Display Systems are specific to a particular piece
+ of hardware and the internal representation of the equations is not
+ only hidden from the user, but is in many cases proprietary.
+
+ Symbolic Language systems, such as Macsyma and Reduce, also allow
+ the entry of equations, in the form of program function calls.
+ These systems actually know some mathematics, and one can enter
+ only the particular type of mathematics that the system knows.
+
+ We next will look at what should be represented in an equations
+ system. We will want a representation standard general enough to
+ allow (almost) anything which comes up to be represented, but one
+ which does not require vast amounts of storage.
+
+3. What Could be Represented?
+
+ We will first examine what it is that could be represented. At the
+ most primitive level, one could simply store a bitmap of each
+ printed equation (expensive in terms of storage). At the other end
+ of the spectrum, one could represent the actual mathematical
+ information that the equation itself represents (as in the input to
+ Macsyma). In between, one could represent the mathematical symbols
+ and where they are, or represent a standard set of mathematical
+ notation, as in EQN.
+
+ It is useful to think of an analogy with printed text. Suppose we
+ have text printed in a certain font. How could it be represented?
+ Well, we could store a bitmap of the printed text, store characters
+ and fonts, store words, or at the most abstract, we could store the
+ meaning behind the words.
+
+ What we actually do, of course, is store characters (in ordinary
+ text) and sometimes fonts (in text intended to be printed). We do
+ not attempt to represent the meaning of words, or even represent the
+ notion of a word. We generally only have characters, separated by
+ spaces or carriage returns (which are also characters). Even when
+ we specify fonts, if a slightly different one happened to be printed
+ out it would not matter greatly.
+
+ Equations may be considered an extension of ordinary text, together
+ with particular fonts. However, the choice of font may be extremely
+ important. If the wrong font happens to be printed out, the meaning
+ of the equation may be completely changed. There are also items,
+ such as growing parentheses, fractions, and matrices, which are
+ particular to equations.
+
+ We are not interested in representing the meaning of an equation,
+ even if we knew how to in general, but in representing a picture of
+ the equation. Thus, we will not further consider the types of
+ representations made in the Symbolic Language systems. We still
+ have Directive systems and the Full Display systems. We shall
+ assume that both of these will continue to exist and that the
+ defined standard should be able to deal with existing systems of
+ either type.
+
+ Assuming we do not want to just store a bitmap of the equation
+ (which would not allow any easy editing or interfacing with existing
+ systems), we are now left with the following possibilities:
+
+ 1. Store characters, fonts and positions only. Allow
+ anything to be anywhere (this is what Interleaf does).
+
+ 2. Store characters, fonts, and positions, but only allow
+ discrete positions. This makes it easier to place
+ subscripts and superscripts correctly (this is what
+ Hockney's Egg does).
+
+ 3. Use a language similar to EQN or LaTeX, which has ideas
+ such as subscripts, superscripts, fractions, and growing
+ parentheses. Generally positioning is done automatically
+ when the typesetting occurs, but it is possible to do a
+ sort of relative positioning of symbols with some work.
+
+ 4. Use a language such as Troff or TeX, which is what EQN and
+ LaTeX, respectively, are translated into.
+
+ 5. Some combination of the above.
+
+ In the next section, I will argue for a particular combination of
+ the above as a tentative choice. It may turn out, with more
+ information and experience, that this choice should be modified.
+
+4. What I Think Should be Represented
+
+ Let us now take a stab at what sort of standard we should have.
+ First of all, we would like our standard if at all possible to be
+ compatible with all of the existing systems described previously.
+ If the standard becomes widely accepted, it should be general enough
+ not to constrain severely the design of new user interfaces. Thus,
+ while we should provide for efficiently representing those aspects
+ of equations which are commonly used (subscripts, parentheses, etc.)
+ we would like extensions to be possible which enable the
+ representation of any symbol anywhere.
+
+ We would like standard mathematical symbols, as well as all Greek
+ and Latin letters to be available. We would also like any required
+ typesetting knowledge to be in programs and not required of the
+ user.
+
+ I feel that the exact position of a subscript or superscript should
+ not have to be specified by the user or be represented (unless the
+ user specifically wants it to be). It is nice to be able to place
+ any symbol anywhere (and indeed the standard ought to allow for
+ this), but having to do this for everything is not good. The
+ standard should be able to represent the idea of a subscript,
+ superscript, or growing fraction with no more specification.
+
+ My suggestion, therefore, is for something like EQN, but with
+ extensions to allow positioning of symbols in some kind of absolute
+ coordinates as well as relative positioning (EQN does allow some
+ positioning relative to where the next symbol would normally go).
+ This has the advantages that the representation is ordinary text,
+ which can be sent in messages; that the Directive systems can map
+ almost directly into it; and that it should allow representation
+ for Full Display systems. The ideas of subscripts and superscripts
+ (without having to specify a position), growing parentheses,
+ fractions, matrices, and special fonts are already there.
+
+ Most equations can be specified very compactly within EQN, and if
+ positioning is provided as an extension, exceptions can be handled.
+ (The same could be said for LaTeX; however, I consider the syntax
+ there to be somewhat unreadable and prefer EQN. Essentially, either
+ will do.)
+
+ It should be easy to construct user interfaces which allow one to
+ type in an EQN-style specification and have the equation appear
+ immediately on the screen. For non-specialists, it may be better
+ to use existing Full Display systems whose output is then
+ translated into this EQN-like standard (perhaps using a lot of the
+ absolute positioning facility).
+
+5. Conclusions
+
+ In summary:
+
+
+ 1. A standard for the efficient representation of mathematical
+ equations should be defined as soon as possible in order to
+ allow the interchange of equations in documents and mail
+ messages and the transfer of equations between various
+ existing internal representations.
+
+ 2. Most equations entry is currently done by people who do not
+ know what the equations mean, and are not programmers. It
+ may be that the optimal user interface for these people is
+ different from that for those who do know mathematics and/or are
+ programmers. An equations standard should not preclude
+ this.
+
+ 3. The standard should easily handle those aspects of equations
+ which are common, such as the set of things provided in EQN.
+
+ 4. It should also be possible, however, to place any defined
+ symbol anywhere and the standard should allow this type of
+ specification when needed.
+
+ 5. As many of the existing systems as possible (all of them, if
+ feasible) should be translatable into the standard.
+
+ 6. The standard should not make requirements on the user
+ interface such that the user must have much typesetting
+ knowledge. This knowledge should be in the user interface
+ or printing routines.
+
+ 7. Full Display systems may be best for non-specialists and for
+ non-programmers. Directive systems, perhaps with the
+ ability to preview the final equation on one's screen, may
+ be best for the rest.
+
+ 8. A distinction should be made between the representation of
+ an equation (which we are dealing with here) and the
+ mathematical knowledge it represents.
+
+ I suggest something like EQN as a standard with extensions to allow
+ positioning of symbols in some kind of absolute coordinates as well
+ as relative positioning. This has the advantages that the
+ representation is ordinary text, which can be sent in messages;
+ that the Directive systems can map almost directly into it; and
+ that it should allow representation for Full Display systems. The
+ ideas of subscripts and superscripts (without having to specify a
+ position), growing parentheses, fractions, matrices, and special
+ fonts are already there.
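
The suggestion made in sections 4 and 5 (EQN-like structure, with
explicit positions only where the user asks for them) can be
illustrated with a small sketch. It is written in Python purely for
illustration and is not a proposed standard. All class and function
names are assumptions of this sketch, and the "at X Y { ... }"
directive is hypothetical: EQN itself has no absolute positioning.
Only "sub", "sup", "over" and brace grouping are real EQN notation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple, Union


@dataclass
class Symbol:
    """A single symbol. `at` is the optional absolute-position extension."""
    char: str
    at: Optional[Tuple[float, float]] = None  # hypothetical (x, y) override


@dataclass
class Script:
    """A base with optional subscript/superscript; no positions are stored."""
    base: "Node"
    sub: Optional["Node"] = None
    sup: Optional["Node"] = None


@dataclass
class Fraction:
    """A growing fraction: numerator over denominator."""
    num: "Node"
    den: "Node"


@dataclass
class Row:
    """A left-to-right sequence of nodes."""
    items: List["Node"] = field(default_factory=list)


Node = Union[Symbol, Script, Fraction, Row]


def group(node: Node) -> str:
    """Wrap anything other than a plain symbol in EQN braces."""
    if isinstance(node, Symbol) and node.at is None:
        return to_eqn(node)
    return "{ " + to_eqn(node) + " }"


def to_eqn(node: Node) -> str:
    """Render a node as EQN-style text (plus the hypothetical `at`)."""
    if isinstance(node, Symbol):
        if node.at is None:
            return node.char
        x, y = node.at
        # Hypothetical extension: place this symbol at absolute (x, y).
        return f"at {x} {y} {{ {node.char} }}"
    if isinstance(node, Script):
        text = group(node.base)
        if node.sub is not None:
            text += " sub " + group(node.sub)
        if node.sup is not None:
            text += " sup " + group(node.sup)
        return text
    if isinstance(node, Fraction):
        return "{ " + to_eqn(node.num) + " } over { " + to_eqn(node.den) + " }"
    if isinstance(node, Row):
        return " ".join(to_eqn(item) for item in node.items)
    raise TypeError(f"unsupported node: {node!r}")


# x with subscript i and superscript 2, equal to a over b.
equation = Row([
    Script(Symbol("x"), sub=Symbol("i"), sup=Symbol("2")),
    Symbol("="),
    Fraction(Symbol("a"), Symbol("b")),
])
print(to_eqn(equation))
```

In this sketch a Directive system would map onto the structural
nodes almost directly, while a Full Display system could fall back on
symbols carrying an explicit position when no structural idea fits.
The result stays ordinary text, so it can travel in a mail message.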