author | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
---|---|---|
committer | Thomas Voss <mail@thomasvoss.com> | 2024-11-27 20:54:24 +0100 |
commit | 4bfd864f10b68b71482b35c818559068ef8d5797 (patch) | |
tree | e3989f47a7994642eb325063d46e8f08ffa681dc /doc/rfc/rfc850.txt | |
parent | ea76e11061bda059ae9f9ad130a9895cc85607db (diff) |
doc: Add RFC documents
Diffstat (limited to 'doc/rfc/rfc850.txt')
-rw-r--r-- | doc/rfc/rfc850.txt | 1059 |
1 files changed, 1059 insertions, 0 deletions
diff --git a/doc/rfc/rfc850.txt b/doc/rfc/rfc850.txt new file mode 100644 index 0000000..af4b47d --- /dev/null +++ b/doc/rfc/rfc850.txt @@ -0,0 +1,1059 @@ +RFC 850 June 1983 + + + Standard for Interchange of USENET Messages + + Mark R. Horton + + +[ This memo is distributed as an RFC only to make this +information easily accessible to researchers in the ARPA +community. It does not specify an Internet standard. ] + +1. Introduction + +This document defines the standard format for interchange +of Network News articles among USENET sites. It describes +the format for articles themselves, and gives partial +standards for transmission of news. The news transmission +is not entirely standardized in order to give a good deal +of flexibility to the individual hosts to choose +transmission hardware and software, whether to batch news, +and so on. + +There are five sections to this document. Section two +section defines the format. Section three defines the +valid control messages. Section four specifies some valid +transmission methods. Section five describes the overall +news propagation algorithm. + + +2. Article Format + +The primary consideration in choosing an article format is +that it fit in with existing tools as well as possible. +Existing tools include both implementations of mail and +news. (The notesfiles system from the University of +Illinois is considered a news implementation.) A standard +format for mail messages has existed for many years on the +ARPANET, and this format meets most of the needs of +USENET. Since the ARPANET format is extensible, +extensions to meet the additional needs of USENET are +easily made within the ARPANET standard. Therefore, the +rule is adopted that all USENET news articles must be +formatted as valid ARPANET mail messages, according to the +ARPANET standard RFC 822. This standard is more +restrictive than the ARPANET standard, placing additional +requirements on each article and forbidding use of certain +ARPANET features. 
However, it should always be possible +to use a tool expecting an ARPANET message to process a +news article. In any situation where this standard +conflicts with the ARPANET standard, RFC 822 should be +considered correct and this standard in error. + + + + + + + - 1 - + + +An example message is included to illustrate the fields. + + Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP + Posting-Version: version B 2.10 2/13/83; site eagle.UUCP + Path: cbosgd!mhuxj!mhuxt!eagle!jerry + From: jerry@eagle.uucp (Jerry Schwarz) + Newsgroups: net.general + Subject: Usenet Etiquette -- Please Read + Message-ID: <642@eagle.UUCP> + Date: Friday, 19-Nov-82 16:14:55 EST + Followup-To: net.news + Expires: Saturday, 1-Jan-83 00:00:00 EST + Date-Received: Friday, 19-Nov-82 16:59:30 EST + Organization: Bell Labs, Murray Hill + + The body of the article comes here, after a blank line. + +Here is an example of a message in the old format (before +the existence of this standard). It is recommended that +implementations also accept articles in this format to +ease upward conversion. + + From: cbosgd!mhuxj!mhuxt!eagle!jerry (Jerry Schwarz) + Newsgroups: net.general + Title: Usenet Etiquette -- Please Read + Article-I.D.: eagle.642 + Posted: Fri Nov 19 16:14:55 1982 + Received: Fri Nov 19 16:59:30 1982 + Expires: Mon Jan 1 00:00:00 1990 + + The body of the article comes here, after a blank line. + +Some news systems transmit news in the "A" format, which +looks like this: + + Aeagle.642 + net.general + cbosgd!mhuxj!mhuxt!eagle!jerry + Fri Nov 19 16:14:55 1982 + Usenet Etiquette - Please Read + The body of the article comes here, with no blank line. + +An article consists of several header lines, followed by a +blank line, followed by the body of the message. The +header lines consist of a keyword, a colon, a blank, and +some additional information. This is a subset of the +ARPANET standard, simplified to allow simpler software to +handle it. 
The "from" line may optionally include a +full name, in the format above, or use the ARPANET angle +bracket syntax. To keep the implementations simple, other +formats (for example, with part of the machine address +after the close parenthesis) are not allowed. The ARPANET +convention of continuation header lines (beginning with a +blank or tab) is allowed. + + + - 2 - + + + +Certain headers are required, certain headers are +optional. Any unrecognized headers are allowed, and will +be passed through unchanged. The required headers are +Relay-Version, Posting-Version, From, Date, Newsgroups, +Subject, Message-ID, Path. The optional headers are +Followup-To, Date-Received, Expires, Reply-To, Sender, +References, Control, Distribution, Organization. + +2.1 Required Headers + +2.1.1 Relay-Version This header line shows the version +of the program responsible for the transmission of this +article over the immediate link, that is, the program that +is relaying the article from the next site. For example, +suppose site A sends an article to site B, and site B +forwards the article to site C. The message being +transmitted from A to B would have a Relay-Version header +identifying the program running on A, and the message +transmitted from B to C would identify the program running +on B. This header can be used to interpret older headers +in an upward compatible way. Relay-Version must always be +the first in a message; thus, all articles meeting this +standard will begin with an upper case "R". No other +restrictions are placed on the order of header lines. + +The line contains two fields, separated by semicolons. +The fields are the version and the full domain name of the +site. The version should identify the system program used +(e.g., "B") as well as a version number and version +date. For example, the header line might contain + + Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP + +This header should not be passed on to additional sites. 
+A relay program, when passing an article on, should +include only its own Relay-Version, not the Relay-Version +of some other site. (For upward compatibility with older +software, if a Relay-Version is found in a header which is +not the first line, it should be assumed to be moved by an +older version of news and deleted.) + +2.1.2 Posting-Version This header identifies the +software responsible for entering this message into the +network. It has the same format as Relay-Version. It +will normally identify the same site as the Message-ID, +unless the posting site is serving as a gateway for a +message that already contains a message ID generated by +mail. (While it is permissible for a gateway to use an +externally generated message ID, the message ID should be +checked to ensure it conforms to this standard and to RFC +822.) + + + + + - 3 - + + + +2.1.3 From The From line contains the electronic mailing +address of the person who sent the message, in the ARPA +internet syntax. It may optionally also contain the full +name of the person, in parentheses, after the electronic +address. The electronic address is the same as the entity +responsible for originating the article, unless the Sender +header is present, in which case the From header might not +be verified. Note that in all site and domain names, +upper and lower case are considered the same, thus +mark@cbosgd.UUCP, mark@cbosgd.uucp, and mark@CBosgD.UUcp +are all equivalent. User names may or may not be case +sensitive, for example, Billy@cbosgd.UUCP might be +different from BillY@cbosgd.UUCP. Programs should avoid +changing the case of electronic addresses when forwarding +news or mail. + +RFC 822 specifies that all text in parentheses is to be +interpreted as a comment. It is common in ARPANET mail to +place the full name of the user in a comment at the end of +the From line. This standard specifies a more rigid +syntax. The full name is not considered a comment, but an +optional part of the header line. 
Either the full name is +omitted, or it appears in parentheses after the electronic +address of the person posting the article, or it appears +before an electronic address enclosed in angle brackets. +Thus, the three permissible forms are: + + From: mark@cbosgd.UUCP + From: mark@cbosgd.UUCP (Mark Horton) + From: Mark Horton <mark@cbosgd.UUCP> + +Full names may contain any printing ASCII characters from +space through tilde, with the exceptions that they may not +contain parentheses "(" or ")", or angle brackets +"<" or ">". Additional restrictions may be placed on +full names by the mail standard, in particular, the +characters comma ",", colon ":", and semicolon ";" +are inadvisable in full names. + +2.1.4 Date The Date line (formerly "Posted") is the +date, in a format that must be acceptable both to the +ARPANET and to the getdate routine, that the article was +originally posted to the network. This date remains +unchanged as the article is propagated throughout the +network. One format that is acceptable to both is + + Weekday, DD-Mon-YY HH:MM:SS TIMEZONE + +Several examples of valid dates appear in the sample +article above. Note in particular that ctime format: + + Wdy Mon DD HH:MM:SS YYYY + + + + - 4 - + + + + + +is not acceptable because it is not a valid ARPANET date. +However, since older software still generates this format, +news implementations are encouraged to accept this format +and translate it into an acceptable format. + +The contents of the TIMEZONE field is currently subject to +worldwide time zone abbreviations, including the usual +American zones (PST, PDT, MST, MDT, CST, CDT, EST, EDT), +the other North American zones (Bering through +Newfoundland), European zones, Australian zones, and so +on. Lacking a complete list at present (and unsure if an +unambiguous list exists), authors of software are +encouraged to keep this code flexible, and in particular +not to assume that time zone names are exactly three +letters long. 
Implementations are free to edit this +field, keeping the time the same, but changing the time +zone (with an appropriate adjustment to the local time +shown) to a known time zone. + +2.1.5 Newsgroups The Newsgroups line specifies which +newsgroup or newsgroups the article belongs in. Multiple +newsgroups may be specified, separated by a comma. +Newsgroups specified must all be the names of existing +newsgroups, as no new newsgroups will be created by simply +posting to them. + +Wildcards (e.g., the word "all") are never allowed in a +Newsgroups line. For example, a newsgroup "net.all" is +illegal, although a newsgroup name "net.sport.football" +is permitted. + +If an article is received with a Newsgroups line listing +some valid newsgroups and some invalid newsgroups, a site +should not remove invalid newsgroups from the list. +Instead, the invalid newsgroups should be ignored. For +example, suppose site A subscribes to the classes +"btl.all" and "net.all", and exchanges news articles +with site B, which subscribes to "net.all" but not +"btl.all". Suppose A receives an article with +"Newsgroups: net.micro,btl.general". This article is +passed on to B because B receives net.micro, but B does +not receive btl.general. A must leave the Newsgroup line +unchanged. If it were to remove "btl.general", the +edited header could eventually reenter the "btl.all" +class, resulting in an article that is not shown to users +subscribing to "btl.general". Also, followups from +outside "btl.all" would not be shown to such users. + + + + + + + - 5 - + + + +2.1.6 Subject The Subject line (formerly "Title") +tells what the article is about. It should be suggestive +enough of the contents of the article to enable a reader +to make a decision whether to read the article based on +the subject alone. If the article is submitted in +response to another article (e.g., is a "followup") the +default subject should begin with the four characters +"Re: " and the References line is required. 
(The user +might wish to edit the subject of the followup, but the +default should begin with "Re: ".) + +2.1.7 Message-ID The Message-ID line gives the article a +unique identifier. The same message ID may not be reused +during the lifetime of any article with the same message +ID. (It is recommended that no message ID be reused for +at least two years.) Message ID's have the syntax + + "<" "string not containing blank or >" ">" + +In order to conform to RFC 822, the Message-ID must have +the format + + "<" "unique" "@" "full domain name" ">" + +where "full domain name" is the full name of the host at +which the article entered the network, including a domain +that host is in, and unique is any string of printing +ASCII characters, not including "<", ">", or "@". For +example, the "unique" part could be an integer +representing a sequence number for articles submitted to +the network, or a short string derived from the date and +time the article was created. For example, valid message +ID for an article submitted from site ucbvax in domain +Berkeley.ARPA would be "<4123@ucbvax.Berkeley.ARPA>". +Programmers are urged not to make assumptions about the +content of message ID fields from other hosts, but to +treat them as unknown character strings. It is not safe, +for example, to assume that a message ID will be under 14 +characters, nor that it is unique in the first 14 +characters. + +The angle brackets are considered part of the message ID. +Thus, in references to the message ID, such as the +ihave/sendme and cancel control messages, the angle +brackets are included. White space characters (e.g., +blank and tab) are not allowed in a message ID. All +characters between the angle brackets must be printing +ASCII characters. + +2.1.8 Path This line shows the path the article took to +reach the current system. When a system forwards the +message, it should add its own name to the list of systems +in the Path line. 
The names may be separated by any +punctuation character or characters, thus + + - 6 - + + + +"cbosgd!mhuxj!mhuxt", "cbosgd, mhuxj, mhuxt", and +"@cbosgd.uucp,@mhuxj.uucp,@mhuxt.uucp" and even +"teklabs, zehntel, sri-unix@cca!decvax" are valid +entries. (The latter path indicates a message that passed +through decvax, cca, sri-unix, zehntel, and teklabs, in +that order.) Additional names should be added from the +left, for example, the most recently added name in the +third example was "teklabs". Letters, digits, periods +and hyphens are considered part of site names; other +punctuation, including blanks, are considered separators. + +Normally, the rightmost name will be the name of the +originating system. However, it is also permissible to +include an extra entry on the right, which is the name of +the sender. This is for upward compatibility with older +system. + +The Path line is not used for replies, and should not be +taken as a mailing address. It is intended to show the +route the message travelled to reach the local site. +There are several uses for this information. One is to +monitor USENET routing for performance reasons. Another +is to establish a path to reach new sites. Perhaps the +most important is to cut down on redundant USENET traffic +by failing to forward a message to a site that is known to +have already received it. In particular, when site A +sends an article to site B, the Path line includes "A", +so that site B will not immediately send the article back +to site A. The site name each site uses to identify +itself should be the same as the name by which its +neighbors know it, in order to make this optimization +possible. + +A site adds its own name to the front of a path when it +receives a message from another site. Thus, if a message +with path A!X!Y!Z is passed from site A to site B, B will +add its own name to the path when it receives the message +from A, e.g., B!A!X!Y!Z. 
If B then passes the message on +to C, the message sent to C will contain the path +B!A!X!Y!Z, and when C receives it, C will change it to +C!B!A!X!Y!Z. + +Special upward compatibility note: Since the From, Sender, +and Reply-To lines are in internet format, and since many +USENET sites do not yet have mailers capable of +understanding internet format, it would break the reply +capability to completely sever the connection between the +Path header and the reply function. Thus, sites are +required to continue to keep the Path line in a working +reply format as much as possible, until January 1, 1984. +It is recognized that the path is not always a valid reply +string in older implementations, and no requirement to fix +this problem is placed on implementations. However, the + + + - 7 - + + +existing convention of placing the site name and an "!" +at the front of the path, and of starting the path with +the site name, an "!", and the user name, should be +maintained at least until 1984. + +2.2 Optional Headers + +2.2.1 Reply-To This line has the same format as From. +If present, mailed replies to the author should be sent to +the name given here. Otherwise, replies are mailed to the +name on the From line. (This does not prevent additional +copies from being sent to recipients named by the replier, +or on To or Cc lines.) The full name may be optionally +given, in parentheses, as in the From line. + +2.2.2 Sender This field is present only if the submitter +manually enters a From line. It is intended to record the +entity responsible for submitting the article to the +network, and should be verified by the software at the +submitting site. 
+ +For example, if John Smith is visiting CCA and wishes to +post an article to the network, using friend Sarah Jones +account, the message might read + + From: smith@ucbvax.uucp (John Smith) + Sender: jones@cca.arpa (Sarah Jones) + +If a gateway program enters a mail message into the +network at site sri-unix, the lines might read + + From: John.Doe@CMU-CS-A.ARPA + Sender: network@sri-unix.ARPA + +The primary purpose of this field is to be able to track +down articles to determine how they were entered into the +network. The full name may be optionally given, in +parentheses, as in the From line. + +2.2.3 Followup-To This line has the same format as +Newsgroups. If present, follow-up articles are to be +posted to the newsgroup(s) listed here. If this line is +not present, followups are posted to the newsgroup(s) +listed in the Newsgroups line, except that followups to +"net.general" should instead go to "net.followup". + +2.2.4 Date-Received This line (formerly "Received") is +in a legal USENET date format. It records the date and +time that the article was first received on the local +system. If this line is present in an article being +transmitted from one host to another, the receiving host +should ignore it and replace it with the current date. +Since this field is intended for local use only, no site +is required to support it. However, no site should pass +this field on to another site unchanged. + + - 8 - + + + +2.2.5 Expires This line, if present, is in a legal +USENET date format. It specifies a suggested expiration +date for the article. If not present, the local default +expiration date is used. + +This field is intended to be used to clean up articles +with a limited usefulness, or to keep important articles +around for longer than usual. For example, a message +announcing an upcoming seminar could have an expiration +date the day after the seminar, since the message is not +useful after the seminar is over. 
Since local sites have +local policies for expiration of news (depending on +available disk space, for instance), users are discouraged +from providing expiration dates for articles unless there +is a natural expiration date associated with the topic. +System software should almost never provide a default +Expires line. Leave it out and allow local policies to be +used unless there is a good reason not to. + +2.2.6 References This field lists the message ID's of +any articles prompting the submission of this article. It +is required for all follow-up articles, and forbidden when +a new subject is raised. Implementations should provide a +follow-up command, which allows a user to post a follow-up +article. This command should generate a Subject line +which is the same as the original article, except that if +the original subject does not begin with "Re: " or "re: ", +the four characters "Re: " are inserted before the +subject. If there is no References line on the original +header, the References line should contain the message ID +of the original article (including the angle brackets). +If the original article does have a References line, the +followup article should have a References line containing +the text of the original References line, a blank, and the +message ID of the original article. + +The purpose of the References header is to allow articles +to be grouped into conversations by the user interface +program. This allows conversations within a newsgroup to +be kept together, and potentially users might shut off +entire conversations without unsubscribing to a newsgroup. +User interfaces may not make use of this header, but all +automatically generated followups should generate the +References line for the benefit of systems that do use it, +and manually generated followups (e.g. typed in well after +the original article has been printed by the machine) +should be encouraged to include them as well. 
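The follow-up rules in section 2.2.6 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from any news implementation: the function name `followup_headers` and the use of a plain dict for the article headers are hypothetical, chosen for this example.

```python
def followup_headers(original):
    """Build the default Subject and References lines for a follow-up.

    `original` is a hypothetical dict mapping header names to values
    for the article being followed up (an assumption of this sketch).
    """
    subject = original["Subject"]
    # Section 2.2.6: insert "Re: " unless the subject already begins
    # with "Re: " or "re: ".
    if not subject.startswith(("Re: ", "re: ")):
        subject = "Re: " + subject

    # The message ID is used as-is, angle brackets included.
    message_id = original["Message-ID"]

    # If the original has a References line, append a blank and the
    # original's message ID; otherwise use the message ID alone.
    previous = original.get("References")
    references = f"{previous} {message_id}" if previous else message_id

    return {"Subject": subject, "References": references}
```

Applied to the sample article from section 2, this would give a subject beginning with "Re: " and a References line holding only that article's message ID; a follow-up to that follow-up would carry both message IDs, separated by a blank.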
+ +2.2.7 Control If an article contains a Control line, the +article is a control message. Control messages are used +for communication among USENET host machines, not to be +read by users. Control messages are distributed by the +same newsgroup mechanism as ordinary messages. The body +of the Control header line is the message to the host. + + - 9 - + + + +For upward compatibility, messages that match the +newsgroup pattern "all.all.ctl" should also be +interpreted as control messages. If no Control: header is +present on such messages, the subject is used as the +control message. However, messages on newsgroups matching +this pattern do not conform to this standard. + +2.2.8 Distribution This line is used to alter the +distribution scope of the message. It has the same format +as the Newsgroups line. User subscriptions are still +controlled by Newsgroups, but the message is sent to all +systems subscribing to the newsgroups on the Distribution +line instead of the Newsgroups line. Thus, a car for sale +in New Jersey might have headers including + + Newsgroups: net.auto,net.wanted + Distribution: nj.all + +so that it would only go to persons subscribing to +net.auto or net.wanted within New Jersey. The intent of +this header is to further restrict the distribution of a +newsgroup, not to increase it. A local newsgroup, such as +nj.crazy-eddie, will probably not be propagated by sites +outside New Jersey that do not show such a newsgroup as +valid. Wildcards in newsgroup names in the Distribution +line are allowed. Followup articles should default to the +same Distribution line as the original article, but the +user can change it to a more limited one, or escalate the +distribution if it was originally restricted and a more +widely distributed reply is appropriate. + +2.2.9 Organization The text of this line is a short +phrase describing the organization to which the sender +belongs, or to which the machine belongs. 
The intent of +this line is to help identify the person posting the +message, since site names are often cryptic enough to make +it hard to recognize the organization by the electronic +address. + + +3. Control Messages + +This section lists the control messages currently defined. +The body of the Control header is the control message. +Messages are a sequence of zero or more words, separated +by white space (blanks or tabs). The first word is the +name of the control message, remaining words are +parameters to the message. The remainder of the header +and the body of the message are also potential parameters; +for example, the From line might suggest an address to +which a response is to be mailed. + + + + + - 10 - + + + +Implementors and administrators may choose to allow +control messages to be automatically carried out, or to +queue them for manual processing. However, manually +processed messages should be dealt with promptly. + +3.1 Cancel + + cancel <message ID> + +If an article with the given message ID is present on the +local system, the article is cancelled. This mechanism +allows a user to cancel an article after the article has +been distributed over the network. + +Only the author of the article or the local super user is +allowed to use this message. The verified sender of a +message is the Sender line, or if no Sender line is +present, the From line. The verified sender of the cancel +message must be the same as either the Sender or From +field of the original message. A verified sender in the +cancel message is allowed to match an unverified From in +the original message. + +3.2 Ihave/Sendme + + ihave <message ID list> <remotesys> + sendme <message ID list> <remotesys> + +This message is part of the "ihave/sendme" protocol, +which allows one site (say "A") to tell another site +("B") that a particular message has been received on A. +Suppose that site A receives article "ucbvax.1234", and +wishes to transmit the article to site B. 
A sends the +control message "ihave ucbvax.1234 A" to site B (by +posting it to newsgroup "to.B"). B responds with the +control message "sendme ucbvax.1234 B" (on newsgroup +to.A) if it has not already received the article. Upon +receiving the Sendme message, A sends the article to B. + +This protocol can be used to cut down on redundant traffic +between sites. It is optional and should be used only if +the particular situation makes it worthwhile. Frequently, +the outcome is that, since most original messages are +short, and since there is a high overhead to start sending +a new message with UUCP, it costs as much to send the +Ihave as it would cost to send the article itself. + +One possible solution to this overhead problem is to batch +requests. Several message ID's may be announced or +requested in one message. If no message ID's are listed +in the control message, the body of the message should be +scanned for message ID's, one per line. + + + + - 11 - + + + +3.3 Newgroup + + newgroup <groupname> + +This control message creates a new newsgroup with the name +given. Since no articles may be posted or forwarded until +a newsgroup is created, this message is required before a +newsgroup can be used. The body of the message is +expected to be a short paragraph describing the intended +use of the newsgroup. + +3.4 Rmgroup + + rmgroup <groupname> + +This message removes a newsgroup with the given name. +Since the newsgroup is removed from every site on the +network, this command should be used carefully by a +responsible administrator. + +3.5 Sendsys + + sendsys (no arguments) + +The "sys" file, listing all neighbors and which +newsgroups are sent to each neighbor, will be mailed to +the author of the control message (Reply-to, if present, +otherwise From). 
This information is considered public +information, and it is a requirement of membership in +USENET that this information be provided on request, +either automatically in response to this control message, +or manually, by mailing the requested information to the +author of the message. This information is used to keep +the map of USENET up to date, and to determine where +netnews is sent. + +The format of the file mailed back to the author should be +the same as that of the "sys" file. This format has one +line per neighboring site (plus one line for the local +site), containing four colon separated fields. The first +field has the site name of the neighbor, the second field +has a newsgroup pattern describing the newsgroups sent to +the neighbor. The third and fourth fields are not defined +by this standard. A sample response: + + From cbosgd!mark Sun Mar 27 20:39:37 1983 + Subject: response to your sendsys request + To: mark@cbosgd.UUCP + + + + + + + + - 12 - + + + Responding-System: cbosgd.UUCP + cbosgd:osg,cb,btl,bell,net,fa,to,test + ucbvax:net,fa,to.ucbvax:L: + cbosg:net,fa,bell,btl,cb,osg,to.cbosg:F:/usr/spool/outnews/cbosg + cbosgb:osg,to.cbosgb:F:/usr/spool/outnews/cbosgb + sescent:net,fa,bell,btl,cb,to.sescent:F:/usr/spool/outnews/sescent + npois:net,fa,bell,btl,ug,to.npois:F:/usr/spool/outnews/npois + mhuxi:net,fa,bell,btl,ug,to.mhuxi:F:/usr/spool/outnews/mhuxi + +3.6 Senduuname + + senduuname (no arguments) + +The "uuname" program is run, and the output is mailed to +the author of the control message (Reply-to, if present, +otherwise From). This program lists all uucp neighbors of +the local site. This information is used to make maps of +the UUCP network. The sys file is not the same as the +UUCP L.sys file. The L.sys file should never be +transmitted to another party without the consent of the +sites whose passwords are listed therein. + +It is optional for a site to provide this information. 
+Some reply should be made to the author of the control +message, so that a transmission error won't be blamed. It +is also permissible for a site to run the uuname program +(or in some other way determine the uucp neighbors) and +edit the output, either automatically or manually, before +mailing the reply back to the author. The file should +contain one site per line, beginning with the uucp site +name. Additional information may be included, separated +from the site name by a blank or tab. The phone number or +password for the site should NOT be included, as the reply +is considered to be in the public domain. (The uuname +program will send only the site name and not the entire +contents of the L.sys file, thus, phone numbers and +passwords are not transmitted.) + +The purpose of this message is to generate and maintain +UUCP mail routing maps. Thus, connections over which mail +can be sent using the site!user syntax should be included, +regardless of whether the link is actually a UUCP link at +the physical level. If a mail router should use it, it +should be included. Since all information sent in +response to this message is optional, sites are free to +edit the list, deleting secret or private links they do +not wish to publicise. + +3.7 Version + + version (no arguments) + +The name and version of the software running on the local +system is to be mailed back to the author of the article +(Reply-to if present, otherwise From). + + - 13 - + + + +4. Transmission Methods + +USENET is not a physical network, but rather a logical +network resting on top of several existing physical +networks. These networks include, but are not limited to, +UUCP, the ARPANET, an Ethernet, the BLICN network, an NSC +Hyperchannel, and a Berknet. 
What is important is that +two neighboring systems on USENET have some method to get +a new article, in the format listed here, from one system +to the other, and once on the receiving system, processed +by the netnews software on that system. (On UNIX systems, +this usually means the "rnews" program being run with +the article on the standard input.) + +It is not a requirement that USENET sites have mail +systems capable of understanding the ARPA Internet mail +syntax, but it is strongly recommended. Since From, +Reply-To, and Sender lines use the Internet syntax, +replies will be difficult or impossible without an +internet mailer. A site without an internet mailer can +attempt to use the Path header line for replies, but this +field is not guaranteed to be a working path for replies. +In any event, any site generating or forwarding news +messages must have an internet address that allows them to +receive mail from sites with internet mailers, and they +must include their internet address on their From line. + +4.1 Remote Execution + +Some networks permit direct remote command execution. On +these networks, news may be forwarded by spooling the +rnews command with the article on the standard input. For +example, if the remote system is called "remote", news +would be sent over a UUCP link with the command "uux - +remote!rnews", and on a Berknet, "net -mremote rnews". +It is important that the article be sent via a reliable +mechansim, normally involving the possibility of spooling, +rather than direct real-time remote execution. This is +because, if the remote system is down, a direct execution +command will fail, and the article will never be +delivered. If the article is spooled, it will eventually +be delivered when both systems are up. + +4.2 Transfer by Mail + +On some systems, direct remote spooled execution is not +possible. However, most systems support electronic mail, +and a news article can be sent as mail. 
One approach is +to send a mail message which is identical to the news +message: the mail headers are the news headers, and the +mail body is the news body. By convention, this mail is +sent to the user "newsmail" on the remote machine. + + + + - 14 - + + + +One problem with this method is that it may not be +possible to convince the mail system that the From line of +the message is valid, since the mail message was generated +by a program on a system different from the source of the +news article. Another problem is that error messages +caused by the mail transmission would be sent to the +originator of the news article, who has no control over +news transmission between two cooperating hosts and does +not know who to contact. Transmission error messages +should be directed to a responsible contact person on the +sending machine. + +A solution to this problem is to encapsulate the news +article into a mail message, such that the entire article +(headers and body) is part of the body of the mail +message. The convention here is that such mail is sent to +user "rnews" on the remote system. A mail message body +is generated by prepending the letter "N" to each line +of the news article, and then attaching whatever mail +headers are convenient to generate. The N's are attached +to prevent any special lines in the news article from +interfering with mail transmission, and to prevent any +extra lines inserted by the mailer (headers, blank lines, +etc.) from becoming part of the news article. A program +on the receiving machine receives mail to "rnews", +extracting the article itself and invoking the "rnews" +program.
An example in this format might look like this: + + Date: Monday, 3-Jan-83 08:33:47 MST + From: news@cbosgd.UUCP + Subject: network news article + To: rnews@npois.UUCP + + NRelay-Version: B 2.10 2/13/83 cbosgd.UUCP + NPosting-Version: B 2.9 6/21/82 sask.UUCP + NPath: cbosgd!mhuxj!harpo!utah-cs!sask!derek + NFrom: derek@sask.UUCP (Derek Andrew) + NNewsgroups: net.test + NSubject: necessary test + NMessage-ID: <176@sask.UUCP> + NDate: Monday, 3-Jan-83 00:59:15 MST + N + NThis really is a test. If anyone out there more than 6 + Nhops away would kindly confirm this note I would + Nappreciate it. We suspect that our news postings + Nare not getting out into the world. + N + +Using mail solves the spooling problem, since mail must +always be spooled if the destination host is down. +However, it adds more overhead to the transmission process +(to encapsulate and extract the article) and makes it +harder for software to give different priorities to news +and mail. + + - 15 - + + + +4.3 Batching + +Since news articles are usually short, and since a large +number of messages are often sent between two sites in a +day, it may make sense to batch news articles. Several +articles can be combined into one large article, using +conventions agreed upon in advance by the two sites. One +such batching scheme is described here; its use is still +considered experimental. + +News articles are combined into a script, separated by a +header of the form: + + #! rnews 1234 + +where 1234 is the length, in bytes, of the article. Each +such line is followed by an article containing the given +number of bytes. (The newline at the end of each line of +the article is counted as one byte, for purposes of this +count, even if it is stored as CRLF.) For example, a batch +of articles might look like this: + + #!
rnews 374 + Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP + Posting-Version: version B 2.10 2/13/83; site eagle.UUCP + Path: cbosgd!mhuxj!mhuxt!eagle!jerry + From: jerry@eagle.uucp (Jerry Schwarz) + Newsgroups: net.general + Subject: Usenet Etiquette -- Please Read + Message-ID: <642@eagle.UUCP> + Date: Friday, 19-Nov-82 16:14:55 EST + + Here is an important message about USENET Etiquette. + #! rnews 378 + Relay-Version: version B 2.10 2/13/83; site cbosgd.UUCP + Posting-Version: version B 2.10 2/13/83; site eagle.UUCP + Path: cbosgd!mhuxj!mhuxt!eagle!jerry + From: jerry@eagle.uucp (Jerry Schwarz) + Newsgroups: net.followup + Subject: Notes on Etiquette article + Message-ID: <643@eagle.UUCP> + Date: Friday, 19-Nov-82 17:24:12 EST + + There was something I forgot to mention in the last message. + +Batched news is recognized because the first character in +the message is "#". The message is then passed to the +unbatcher for interpretation. + + + + + + + + - 16 - + + +5. The News Propagation Algorithm + +This section describes the overall scheme of USENET and +the algorithm followed by sites in propagating news to the +entire network. Since all sites are affected by +incorrectly formatted articles and by propagation errors, +it is important for the method to be standardized. + +USENET is a directed graph. Each node in the graph is a +host computer, each arc in the graph is a transmission +path from one host to another host. Each arc is labelled +with a newsgroup pattern, specifying which newsgroup +classes are forwarded along that link. Most arcs are +bidirectional, that is, if site A sends a class of +newsgroups to site B, then site B usually sends the same +class of newsgroups to site A. This bidirectionality is +not, however, required. + +USENET is made up of many subnetworks. Each subnet has a +name, such as "net" or "btl". 
The special subnet +"net" is defined to be USENET, although the union of all +subnets may be a superset of USENET (because of sites that +get local newsgroup classes but do not get net.all). Each +subnet is a connected graph, that is, a path exists from +every node to every other node in the subnet. In +addition, the entire graph is (theoretically) connected. +(In practice, some political considerations have caused +some sites to be unable to post articles reaching the rest +of the network.) + +An article is posted on one machine to a list of +newsgroups. That machine accepts it locally, then +forwards it to all its neighbors that are interested in at +least one of the newsgroups of the message. (Site A deems +site B to be "interested" in a newsgroup if the +newsgroup matches the pattern on the arc from A to B. +This pattern is stored in a file on the A machine.) The +sites receiving the incoming article examine it to make +sure they really want the article, accept it locally, and +then in turn forward the article to all their interested +neighbors. This process continues until the entire +network has seen the article. + +An important part of the algorithm is the prevention of +loops. The above process would cause a message to loop +along a cycle forever. In particular, when site A sends +an article to site B, site B will send it back to site A, +which will send it to site B, and so on. One solution to +this is the history mechanism. Each site keeps track of +all articles it has seen (by their message ID) and +whenever an article comes in that it has already seen, the +incoming article is discarded immediately. This solution +is sufficient to prevent loops, but additional +optimizations can be made to avoid sending articles to +sites that will simply throw them away. + + - 17 - + + + +One optimization is that an article should never be sent +to a machine listed in the Path line of the header.
When +a machine name is in the Path line, the message is known +to have passed through the machine. Another optimization +is that, if the article originated on site A, then site A +has already seen the article. (Origination can be +determined by the Posting-Version line.) + +Thus, if an article is posted to newsgroup "net.misc", +it will match the pattern "net.all" (where "all" is a +metasymbol that matches any string), and will be forwarded +to all sites that subscribe to net.all (as determined by +what their neighbors send them). These sites make up the +"net" subnetwork. An article posted to "btl.general" +will reach all sites receiving "btl.all", but will not +reach sites that do not get "btl.all". In effect, the +article reaches the "btl" subnetwork. An article +posted to newsgroups "net.micro,btl.general" will reach +all sites subscribing to either of the two classes. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + - 18 - |
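The propagation rules of section 5 (forward along arcs whose pattern matches, discard articles already in the history, and never send to a site named in the Path line) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the B news implementation: the names Article, Site, group_matches and propagate are hypothetical, the "all" metasymbol is approximated with a regular expression, and real transmission is replaced by an in-memory queue.

```python
import re
from collections import deque
from dataclasses import dataclass, field, replace


def group_matches(pattern: str, newsgroup: str) -> bool:
    """Assumed matching rule: the component "all" matches any string."""
    regex = r"\.".join(
        ".*" if part == "all" else re.escape(part) for part in pattern.split(".")
    )
    return re.fullmatch(regex, newsgroup) is not None


@dataclass(frozen=True)
class Article:
    message_id: str
    newsgroups: tuple   # e.g. ("net.micro", "btl.general")
    path: str           # "site!site!user", most recent site first


@dataclass
class Site:
    name: str
    # Neighbor name -> newsgroup patterns on the arc from this site to it.
    arcs: dict = field(default_factory=dict)
    # History mechanism: message IDs this site has already seen.
    history: set = field(default_factory=set)

    def accept(self, article: Article) -> list:
        """Record the article and return the neighbors to forward it to."""
        self.history.add(article.message_id)
        # Every Path entry except the trailing user name is a site.
        seen_sites = set(article.path.split("!")[:-1])
        targets = []
        for neighbor, patterns in self.arcs.items():
            if neighbor in seen_sites:
                continue  # optimization: it has already passed through there
            if any(group_matches(p, g)
                   for p in patterns for g in article.newsgroups):
                targets.append(neighbor)
        return targets


def propagate(sites: dict, origin: str, article: Article) -> list:
    """Flood the article from origin; return sites in order of acceptance."""
    accepted = []
    queue = deque([(origin, article)])
    while queue:
        name, art = queue.popleft()
        site = sites[name]
        if art.message_id in site.history:
            continue  # duplicate: discarded immediately, which breaks loops
        targets = site.accept(art)
        accepted.append(name)
        # Each forwarding site prepends its own name to the Path line.
        forwarded = replace(art, path=f"{name}!{art.path}")
        for target in targets:
            queue.append((target, forwarded))
    return accepted
```

With three sites joined in a cycle by "net.all" arcs, each site accepts a "net.misc" article exactly once: the Path check suppresses the obvious send-backs, and the history check discards the copy that arrives by the second route.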