diff options
Diffstat (limited to 'doc/rfc/rfc1267.txt')
| -rw-r--r-- | doc/rfc/rfc1267.txt | 1963 | 
1 files changed, 1963 insertions, 0 deletions
diff --git a/doc/rfc/rfc1267.txt b/doc/rfc/rfc1267.txt new file mode 100644 index 0000000..70b6704 --- /dev/null +++ b/doc/rfc/rfc1267.txt @@ -0,0 +1,1963 @@ + + + + + + +Network Working Group                                        K. Lougheed +Request for Comments: 1267                                 cisco Systems +Obsoletes RFCs: 1105, 1163                                    Y. Rekhter +                                  T.J. Watson Research Center, IBM Corp. +                                                            October 1991 + + +                  A Border Gateway Protocol 3 (BGP-3) + +Status of this Memo + +   This memo, together with its companion document, "Application of the +   Border Gateway Protocol in the Internet", define an inter-autonomous +   system routing protocol for the Internet.  This RFC specifies an IAB +   standards track protocol for the Internet community, and requests +   discussion and suggestions for improvements.  Please refer to the +   current edition of the "IAB Official Protocol Standards" for the +   standardization state and status of this protocol.  Distribution of +   this memo is unlimited. + +1.  Acknowledgements + +   We would like to express our thanks to Guy Almes (Rice University), +   Len Bosack (cisco Systems), Jeffrey C. Honig (Cornell Theory Center) +   and all members of the Interconnectivity Working Group of the +   Internet Engineering Task Force, chaired by Guy Almes, for their +   contributions to this document. + +   We like to explicitly thank Bob Braden (ISI) for the review of this +   document as well as his constructive and valuable comments. + +   We would also like to thank Bob Hinden, Director for Routing of the +   Internet Engineering Steering Group, and the team of reviewers he +   assembled to review earlier versions of this document.  This team, +   consisting of Deborah Estrin, Milo Medin, John Moy, Radia Perlman, +   Martha Steenstrup, Mike St. Johns, and Paul Tsuchiya, acted with a +   strong combination of toughness, professionalism, and courtesy. + +2.  Introduction + +   The Border Gateway Protocol (BGP) is an inter-Autonomous System +   routing protocol.  It is built on experience gained with EGP as +   defined in RFC 904 [1] and EGP usage in the NSFNET Backbone as +   described in RFC 1092 [2] and RFC 1093 [3]. + +   The primary function of a BGP speaking system is to exchange network +   reachability information with other BGP systems.  This network +   reachability information includes information on the full path of + + + +Lougheed & Rekhter                                              [Page 1] + +RFC 1267                         BGP-3                      October 1991 + + +   Autonomous Systems (ASs) that traffic must transit to reach these +   networks.  This information is sufficient to construct a graph of AS +   connectivity from which routing loops may be pruned and some policy +   decisions at the AS level may be enforced. + +   To characterize the set of policy decisions that can be enforced +   using BGP, one must focus on the rule that an AS advertize to its +   neighbor ASs only those routes that it itself uses.  This rule +   reflects the "hop-by-hop" routing paradigm generally used throughout +   the current Internet.  Note that some policies cannot be supported by +   the "hop-by-hop" routing paradigm and thus require techniques such as +   source routing to enforce.  For example, BGP does not enable one AS +   to send traffic to a neighbor AS intending that that traffic take a +   different route from that taken by traffic originating in the +   neighbor AS.  On the other hand, BGP can support any policy +   conforming to the "hop-by-hop" routing paradigm.  Since the current +   Internet uses only the "hop-by-hop" routing paradigm and since BGP +   can support any policy that conforms to that paradigm, BGP is highly +   applicable as an inter-AS routing protocol for the current Internet. + +   A more complete discussion of what policies can and cannot be +   enforced with BGP is outside the scope of this document (but refer to +   the companion document discussing BGP usage [5]). + +   BGP runs over a reliable transport protocol.  This eliminates the +   need to implement explicit update fragmentation, retransmission, +   acknowledgement, and sequencing.  Any authentication scheme used by +   the transport protocol may be used in addition to BGP's own +   authentication mechanisms.  The error notification mechanism used in +   BGP assumes that the transport protocol supports a "graceful" close, +   i.e., that all outstanding data will be delivered before the +   connection is closed. + +   BGP uses TCP [4] as its transport protocol.  TCP meets BGP's +   transport requirements and is present in virtually all commercial +   routers and hosts.  In the following descriptions the phrase +   "transport protocol connection" can be understood to refer to a TCP +   connection.  BGP uses TCP port 179 for establishing its connections. + +   This memo uses the term `Autonomous System' (AS) throughout.  The +   classic definition of an Autonomous System is a set of routers under +   a single technical administration, using an interior gateway protocol +   and common metrics to route packets within the AS, and using an +   exterior gateway protocol to route packets to other ASs.  Since this +   classic definition was developed, it has become common for a single +   AS to use several interior gateway protocols and sometimes several +   sets of metrics within an AS.  The use of the term Autonomous System +   here stresses the fact that, even when multiple IGPs and metrics are + + + +Lougheed & Rekhter                                              [Page 2] + +RFC 1267                         BGP-3                      October 1991 + + +   used, the administration of an AS appears to other ASs to have a +   single coherent interior routing plan and presents a consistent +   picture of what networks are reachable through it.  From the +   standpoint of exterior routing, an AS can be viewed as monolithic: +   reachability to networks directly connected to the AS must be +   equivalent from all border gateways of the AS. + +   The planned use of BGP in the Internet environment, including such +   issues as topology, the interaction between BGP and IGPs, and the +   enforcement of routing policy rules is presented in a companion +   document [5].  This document is the first of a series of documents +   planned to explore various aspects of BGP application. + +   Please send comments to the BGP mailing list (iwg@rice.edu). + +3.  Summary of Operation + +   Two systems form a transport protocol connection between one another. +   They exchange messages to open and confirm the connection parameters. +   The initial data flow is the entire BGP routing table.  Incremental +   updates are sent as the routing tables change.  BGP does not require +   periodic refresh of the entire BGP routing table.  Therefore, a BGP +   speaker must retain the current version of the entire BGP routing +   tables of all of its peers for the duration of the connection. +   KeepAlive messages are sent periodically to ensure the liveness of +   the connection.  Notification messages are sent in response to errors +   or special conditions.  If a connection encounters an error +   condition, a notification message is sent and the connection is +   closed. + +   The hosts executing the Border Gateway Protocol need not be routers. +   A non-routing host could exchange routing information with routers +   via EGP or even an interior routing protocol.  That non-routing host +   could then use BGP to exchange routing information with a border +   router in another Autonomous System.  The implications and +   applications of this architecture are for further study. + +   If a particular AS has multiple BGP speakers and is providing transit +   service for other ASs, then care must be taken to ensure a consistent +   view of routing within the AS.  A consistent view of the interior +   routes of the AS is provided by the interior routing protocol.  A +   consistent view of the routes exterior to the AS can be provided by +   having all BGP speakers within the AS maintain direct BGP connections +   with each other.  Using a common set of policies, the BGP speakers +   arrive at an agreement as to which border routers will serve as +   exit/entry points for particular networks outside the AS.  This +   information is communicated to the AS's internal routers, possibly +   via the interior routing protocol.  Care must be taken to ensure that + + + +Lougheed & Rekhter                                              [Page 3] + +RFC 1267                         BGP-3                      October 1991 + + +   the interior routers have all been updated with transit information +   before the BGP speakers announce to other ASs that transit service is +   being provided. + +   Connections between BGP speakers of different ASs are referred to as +   "external" links.  BGP connections between BGP speakers within the +   same AS are referred to as "internal" links. + +4.  Message Formats + +   This section describes message formats used by BGP. + +   Messages are sent over a reliable transport protocol connection.  A +   message is processed only after it is entirely received.  The maximum +   message size is 4096 octets.  All implementations are required to +   support this maximum message size.  The smallest message that may be +   sent consists of a BGP header without a data portion, or 19 octets. + +   4.1 Message Header Format + +   Each message has a fixed-size header.  There may or may not be a data +   portion following the header, depending on the message type.  The +   layout of these fields is shown below: + +    0                   1                   2                   3 +    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +   |                                                               | +   +                                                               + +   |                                                               | +   +                                                               + +   |                           Marker                              | +   +                                                               + +   |                                                               | +   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +   |          Length               |      Type     | +   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +   Marker: + +      This 16-octet field contains a value that the receiver of the +      message can predict.  If the Type of the message is OPEN, or if +      the Authentication Code used in the OPEN message of the connection +      is zero, then the Marker must be all ones.  Otherwise, the value +      of the marker can be predicted by some a computation specified as +      part of the authentication mechanism used.  The Marker can be used +      to detect loss of synchronization between a pair of BGP peers, and +      to authenticate incoming BGP messages. + + + +Lougheed & Rekhter                                              [Page 4] + +RFC 1267                         BGP-3                      October 1991 + + +   Length: + +      This 2-octet unsigned integer indicates the total length of the +      message, including the header, in octets.  Thus, e.g., it allows +      one to locate in the transport-level stream the (Marker field of +      the) next message.  The value of the Length field must always be +      at least 19 and no greater than 4096, and may be further +      constrained, depending on the message type.  No "padding" of extra +      data after the message is allowed, so the Length field must have +      the smallest value required given the rest of the message. + +   Type: + +      This 1-octet unsigned integer indicates the type code of the +      message.  The following type codes are defined: + +                           1 - OPEN +                           2 - UPDATE +                           3 - NOTIFICATION +                           4 - KEEPALIVE + +4.2 OPEN Message Format + +   After a transport protocol connection is established, the first +   message sent by each side is an OPEN message.  If the OPEN message is +   acceptable, a KEEPALIVE message confirming the OPEN is sent back. +   Once the OPEN is confirmed, UPDATE, KEEPALIVE, and NOTIFICATION +   messages may be exchanged. + +   In addition to the fixed-size BGP header, the OPEN message contains +   the following fields: + + + + + + + + + + + + + + + + + + + + +Lougheed & Rekhter                                              [Page 5] + +RFC 1267                         BGP-3                      October 1991 + + +     0                   1                   2                   3 +    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +    +-+-+-+-+-+-+-+-+ +    |    Version    | +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +    |     My Autonomous System      | +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +    |           Hold Time           | +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +    |                         BGP Identifier                        | +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +    |  Auth. Code   | +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +    |                                                               | +    |                       Authentication Data                     | +    |                                                               | +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +   Version: + +      This 1-octet unsigned integer indicates the protocol version +      number of the message.  The current BGP version number is 3. + +   My Autonomous System: + +      This 2-octet unsigned integer indicates the Autonomous System +      number of the sender. + +   Hold Time: + +      This 2-octet unsigned integer indicates the maximum number of +      seconds that may elapse between the receipt of successive +      KEEPALIVE and/or UPDATE and/or NOTIFICATION messages. + + +   BGP Identifier: +      This 4-octet unsigned integer indicates the BGP Identifier of +      the sender. A given BGP speaker sets the value of its BGP +      Identifier to the IP address of one of its interfaces. +      The value of the BGP Identifier is determined on startup +      and is the same for every local interface and every BGP peer. + +   Authentication Code: + +      This 1-octet unsigned integer indicates the authentication +      mechanism being used.  Whenever an authentication mechanism is +      specified for use within BGP, three things must be included in the +      specification: + + + +Lougheed & Rekhter                                              [Page 6] + +RFC 1267                         BGP-3                      October 1991 + + +         - the value of the Authentication Code which indicates use of +         the mechanism, +         - the form and meaning of the Authentication Data, and +         - the algorithm for computing values of Marker fields. +      Only one authentication mechanism is specified as part of this +      memo: +         - its Authentication Code is zero, +         - its Authentication Data must be empty (of zero length), and +         - the Marker fields of all messages must be all ones. +      The semantics of non-zero Authentication Codes lies outside the +      scope of this memo. + +      Note that a separate authentication mechanism may be used in +      establishing the transport level connection. + +   Authentication Data: + +      The form and meaning of this field is a variable-length field +      depend on the Authentication Code.  If the value of Authentication +      Code field is zero, the Authentication Data field must have zero +      length.  The semantics of the non-zero length Authentication Data +      field is outside the scope of this memo. + +      Note that the length of the Authentication Data field can be +      determined from the message Length field by the formula: + +         Message Length = 29 + Authentication Data Length + +      The minimum length of the OPEN message is 29 octets (including +      message header). + +4.3 UPDATE Message Format + +   UPDATE messages are used to transfer routing information between BGP +   peers.  The information in the UPDATE packet can be used to construct +   a graph describing the relationships of the various Autonomous +   Systems.  By applying rules to be discussed, routing information +   loops and some other anomalies may be detected and removed from +   inter-AS routing. + +   In addition to the fixed-size BGP header, the UPDATE message contains +   the following fields (note that all fields may have arbitrary +   alignment): + + + + + + + + +Lougheed & Rekhter                                              [Page 7] + +RFC 1267                         BGP-3                      October 1991 + + +     0                   1                   2                   3 +     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +    |  Total Path Attributes Length | +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +    |                                                               | +    /                      Path Attributes                          / +    /                                                               / +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +    |                       Network 1                               | +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +    /                                                               / +    /                                                               / +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +    |                       Network n                               | +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +   Total Path Attribute Length: + +      This 2-octet unsigned integer indicates the total length of the +      Path Attributes field in octets.  Its value must allow the (non- +      negative integer) number of Network fields to be determined as +      specified below. + +   Path Attributes: + +      A variable length sequence of path attributes is present in every +      UPDATE.  Each path attribute is a triple <attribute type, +      attribute length, attribute value> of variable length. + +      Attribute Type is a two-octet field that consists of the Attribute +      Flags octet followed by the Attribute Type Code octet. + +       0                   1 +       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 +      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +      |  Attr. Flags  |Attr. Type Code| +      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +      The high-order bit (bit 0) of the Attribute Flags octet is the +      Optional bit.  It defines whether the attribute is optional (if +      set to 1) or well-known (if set to 0). + +      The second high-order bit (bit 1) of the Attribute Flags octet is +      the Transitive bit.  It defines whether an optional attribute is +      transitive (if set to 1) or non-transitive (if set to 0).  For +      well-known attributes, the Transitive bit must be set to 1.  (See +      Section 5 for a discussion of transitive attributes.) + + + +Lougheed & Rekhter                                              [Page 8] + +RFC 1267                         BGP-3                      October 1991 + + +      The third high-order bit (bit 2) of the Attribute Flags octet is +      the Partial bit.  It defines whether the information contained in +      the optional transitive attribute is partial (if set to 1) or +      complete (if set to 0).  For well-known attributes and for +      optional non-transitive attributes the Partial bit must be set to +      0. + +      The fourth high-order bit (bit 3) of the Attribute Flags octet is +      the Extended Length bit.  It defines whether the Attribute Length +      is one octet (if set to 0) or two octets (if set to 1).  Extended +      Length may be used only if the length of the attribute value is +      greater than 255 octets. + +      The lower-order four bits of the Attribute Flags octet are unused. +      They must be zero (and must be ignored when received). + +      The Attribute Type Code octet contains the Attribute Type Code. +      Currently defined Attribute Type Codes are discussed in Section 5. + +      If the Extended Length bit of the Attribute Flags octet is set to +      0, the third octet of the Path Attribute contains the length of +      the attribute data in octets. + +      If the Extended Length bit of the Attribute Flags octet is set to +      1, then the third and the fourth octets of the path attribute +      contain the length of the attribute data in octets. + +      The remaining octets of the Path Attribute represent the attribute +      value and are interpreted according to the Attribute Flags and the +      Attribute Type Code. + +      The meaning and handling of Path Attributes is discussed in +      Section 5. + +   Network: + +      Each 4-octet Internet network number indicates one network whose +      Inter-Autonomous System routing is described by the Path +      Attributes.  Subnets and host addresses are specifically not +      allowed.  The total number of Network fields in the UPDATE message +      can be determined by the formula: + +         Message Length = 19 + Total Path Attribute Length + 4 * #Nets + +      The message Length field of the message header and the Path +      Attributes Length field of the UPDATE message must be such that +      the formula results in a non-negative integer number of Network +      fields. + + + +Lougheed & Rekhter                                              [Page 9] + +RFC 1267                         BGP-3                      October 1991 + + +   The minimum length of the UPDATE message is 37 octets (including +   message header). + +4.4 KEEPALIVE Message Format + +   BGP does not use any transport protocol-based keep-alive mechanism to +   determine if peers are reachable.  Instead, KEEPALIVE messages are +   exchanged between peers often enough as not to cause the hold time +   (as advertised in the OPEN message) to expire.  A reasonable maximum +   time between KEEPALIVE messages would be one third of the Hold Time +   interval. + +   KEEPALIVE message consists of only message header and has a length of +   19 octets. + +4.5 NOTIFICATION Message Format + +   A NOTIFICATION message is sent when an error condition is detected. +   The BGP connection is closed immediately after sending it. + +   In addition to the fixed-size BGP header, the NOTIFICATION message +   contains the following fields: + +     0                   1                   2                   3 +     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +    | Error code    | Error subcode |           Data                | +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               + +    |                                                               | +    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + +   Error Code: + +      This 1-octet unsigned integer indicates the type of NOTIFICATION. +      The following Error Codes have been defined: + +           Error Code       Symbolic Name               Reference + +             1         Message Header Error             Section 6.1 +             2         OPEN Message Error               Section 6.2 +             3         UPDATE Message Error             Section 6.3 +             4         Hold Timer Expired               Section 6.5 +             5         Finite State Machine Error       Section 6.6 +             6         Cease                            Section 6.7 + + + + + + +Lougheed & Rekhter                                             [Page 10] + +RFC 1267                         BGP-3                      October 1991 + + +   Error subcode: + +      This 1-octet unsigned integer provides more specific information +      about the nature of the reported error.  Each Error Code may have +      one or more Error Subcodes associated with it.  If no appropriate +      Error Subcode is defined, then a zero (Unspecific) value is used +      for the Error Subcode field. + +      Message Header Error subcodes: + +                      1  - Connection Not Synchronized. +                      2  - Bad Message Length. +                      3  - Bad Message Type. + +      OPEN Message Error subcodes: + +                      1  - Unsupported Version Number. +                      2  - Bad Peer AS. +                      3  - Bad BGP Identifier. +                      4  - Unsupported Authentication Code. +                      5  - Authentication Failure. + +      UPDATE Message Error subcodes: + +                      1 - Malformed Attribute List. +                      2 - Unrecognized Well-known Attribute. +                      3 - Missing Well-known Attribute. +                      4 - Attribute Flags Error. +                      5 - Attribute Length Error. +                      6 - Invalid ORIGIN Attribute +                      7 - AS Routing Loop. +                      8 - Invalid NEXT_HOP Attribute. +                      9 - Optional Attribute Error. +                     10 - Invalid Network Field. + + +   Data: + +      This variable-length field is used to diagnose the reason for the +      NOTIFICATION.  The contents of the Data field depend upon the +      Error Code and Error Subcode.  See Section 6 below for more +      details. + +      Note that the length of the Data field can be determined from the +      message Length field by the formula: + +         Message Length = 21 + Data Length + + + + +Lougheed & Rekhter                                             [Page 11] + +RFC 1267                         BGP-3                      October 1991 + + +   The minimum length of the NOTIFICATION message is 21 octets +   (including message header). + +5.  Path Attributes + +   This section discusses the path attributes of the UPDATE message. + +   Path attributes fall into four separate categories: + +            1. Well-known mandatory. +            2. Well-known discretionary. +            3. Optional transitive. +            4. Optional non-transitive. + +   Well-known attributes must be recognized by all BGP implementations. +   Some of these attributes are mandatory and must be included in every +   UPDATE message.  Others are discretionary and may or may not be sent +   in a particular UPDATE message.  Which well-known attributes are +   mandatory or discretionary is noted in the table below. + +   All well-known attributes must be passed along (after proper +   updating, if necessary) to other BGP peers. + +   In addition to well-known attributes, each path may contain one or +   more optional attributes.  It is not required or expected that all +   BGP implementations support all optional attributes.  The handling of +   an unrecognized optional attribute is determined by the setting of +   the Transitive bit in the attribute flags octet.  Paths with +   unrecognized transitive optional attributes should be accepted. If a +   path with unrecognized transitive optional attribute is accepted and +   passed along to other BGP peers, then the unrecognized transitive +   optional attribute of that path must be passed along with the path to +   other BGP peers with the Partial bit in the Attribute Flags octet set +   to 1. If a path with recognized transitive optional attribute is +   accepted and passed along to other BGP peers and the Partial bit in +   the Attribute Flags octet is set to 1 by some previous AS, it is not +   set back to 0 by the current AS. Unrecognized non-transitive optional +   attributes must be quietly ignored and not passed along to other BGP +   peers. + +   New transitive optional attributes may be attached to the path by the +   originator or by any other AS in the path.  If they are not attached +   by the originator, the Partial bit in the Attribute Flags octet is +   set to 1.  The rules for attaching new non-transitive optional +   attributes will depend on the nature of the specific attribute.  The +   documentation of each new non-transitive optional attribute will be +   expected to include such rules.  (The description of the INTER-AS +   METRIC attribute gives an example.)  All optional attributes (both + + + +Lougheed & Rekhter                                             [Page 12] + +RFC 1267                         BGP-3                      October 1991 + + +   transitive and non-transitive) may be updated (if appropriate) by ASs +   in the path. + +   The sender of an UPDATE message should order path attributes within +   the UPDATE message in ascending order of attribute type.  The +   receiver of an UPDATE message must be prepared to handle path +   attributes within the UPDATE message that are out of order. + +   The same attribute cannot appear more than once within the Path +   Attributes field of a particular UPDATE message. + +   Following table specifies attribute type code, attribute length, and +   attribute category for path attributes defined in this document: + +   Attribute Name     Type Code    Length     Attribute category +      ORIGIN              1          1        well-known, mandatory +      AS_PATH             2       variable    well-known, mandatory +      NEXT_HOP            3          4        well-known, mandatory +      UNREACHABLE         4          0        well-known, discretionary +      INTER-AS METRIC     5          2        optional, non-transitive + +   ORIGIN: + +      The ORIGIN path attribute defines the origin of the path +      information.  The data octet can assume the following values: + +         Value    Meaning +           0       IGP - network(s) are interior to the originating AS +           1       EGP - network(s) learned via EGP +           2       INCOMPLETE - network(s) learned by some other means + +   AS_PATH: + +      The AS_PATH attribute enumerates the ASs that must be traversed to +      reach the networks listed in the UPDATE message.  Since an AS +      identifier is 2 octets, the length of an AS_PATH attribute is +      twice the number of ASs in the path.  Rules for constructing an +      AS_PATH attribute are discussed in Section 9. + +      If a previously advertised route has become unreachable, then +      the AS_PATH path attribute of the unreachable route may be +      truncated when passed in the UPDATE message. Truncation is +      achieved by constructing the AS_PATH path attribute that consists +      of only the autonomous system of the sender of the UPDATE message. +      To make the truncated AS_PATH semantically correct, the sender +      also sends the ORIGIN path attribute with the value INCOMPLETE. +      Note that truncation may be done only over external BGP links. + + + + +Lougheed & Rekhter                                             [Page 13] + +RFC 1267                         BGP-3                      October 1991 + + +   NEXT_HOP: + +      The NEXT_HOP path attribute defines the IP address of the border +      router that should be used as the next hop to the networks listed +      in the UPDATE message.  If this border router belongs to the same +      AS as the BGP peer that advertises it, it is called an internal +      border router. If this border router belongs to a different AS +      than the one that the BGP peer that advertises it, it is called an +      external border router. A BGP speaker can advertise any internal +      border router as the next hop provided that the interface +      associated with the IP address of this border router (as +      specified in the NEXT_HOP path attribute) shares a common subnet +      with both the local and remote BGP speakers. A BGP speaker can +      advertise any external border router as the next hop, provided +      that the IP address of this border router was learned from one +      of the BGP speaker's peers, and the interface associated with +      the IP address of this border router (as specified in the +      NEXT_HOP path attribute) shares a common subnet with the local +      and remote BGP speakers.  A BGP speaker needs to be able to +      support disabling advertisement of external border routers. + +      The NEXT_HOP path attribute has meaning only on external BGP +      links.  However, presence of the NEXT_HOP path attribute in the +      UPDATE message received via an internal BGP link does not +      constitute an error. + +   UNREACHABLE: + +      The UNREACHABLE attribute is used to notify a BGP peer that some +      of the previously advertised routes have become unreachable. + +   INTER-AS METRIC: + +      The INTER-AS METRIC attribute may be used on external (inter-AS) +      links to discriminate between multiple exit or entry points to the +      same neighboring AS.  The value of the INTER-AS METRIC attribute +      is a 2-octet unsigned number which is called a metric.  All other +      factors being equal, the exit or entry point with lower metric +      should be preferred.  If received over external links, the INTER- +      AS METRIC attribute may be propagated over internal links to other +      BGP speaker within the same AS.  The INTER-AS METRIC attribute is +      never propagated to other BGP speakers in neighboring AS's. + +      If a previously advertised route has become unreachable, then +      the INTER-AS METRIC path attribute may be omitted from the UPDATE +      message. + + + + + +Lougheed & Rekhter                                             [Page 14] + +RFC 1267                         BGP-3                      October 1991 + + +6.  BGP Error Handling. + +   This section describes actions to be taken when errors are detected +   while processing BGP messages. + +   When any of the conditions described here are detected, a +   NOTIFICATION message with the indicated Error Code, Error Subcode, +   and Data fields is sent, and the BGP connection is closed.  If no +   Error Subcode is specified, then a zero must be used. + +   The phrase "the BGP connection is closed" means that the transport +   protocol connection has been closed and that all resources for that +   BGP connection have been deallocated.  Routing table entries +   associated with the remote peer are marked as invalid.  The fact that +   the routes have become invalid is passed to other BGP peers before +   the routes are deleted from the system. + +   Unless specified explicitly, the Data field of the NOTIFICATION +   message that is sent to indicate an error is empty. + +6.1 Message Header error handling. + +   All errors detected while processing the Message Header are indicated +   by sending the NOTIFICATION message with Error Code Message Header +   Error.  The Error Subcode elaborates on the specific nature of the +   error. + +   The expected value of the Marker field of the message header is all +   ones if the message type is OPEN.  The expected value of the Marker +   field for all other types of BGP messages determined based on the +   Authentication Code in the BGP OPEN message and the actual +   authentication mechanism (if the Authentication Code in the BGP OPEN +   message is non-zero). If the Marker field of the message header is +   not the expected one, then a synchronization error has occurred and +   the Error Subcode is set to Connection Not Synchronized. + +   If the Length field of the message header is less than 19 or greater +   than 4096, or if the Length field of an OPEN message is less  than +   the minimum length of the OPEN message, or if the Length field of an +   UPDATE message is less than the minimum length of the UPDATE message, +   or if the Length field of a KEEPALIVE message is not equal to 19, or +   if the Length field of a NOTIFICATION message is less than the +   minimum length of the NOTIFICATION message, then the Error Subcode is +   set to Bad Message Length.  The Data field contains the erroneous +   Length field. + +   If the Type field of the message header is not recognized, then the +   Error Subcode is set to Bad Message Type.  The Data field contains + + + +Lougheed & Rekhter                                             [Page 15] + +RFC 1267                         BGP-3                      October 1991 + + +   the erroneous Type field. + +6.2 OPEN message error handling. + +   All errors detected while processing the OPEN message are indicated +   by sending the NOTIFICATION message with Error Code OPEN Message +   Error.  The Error Subcode elaborates on the specific nature of the +   error. + +   If the version number contained in the Version field of the received +   OPEN message is not supported, then the Error Subcode is set to +   Unsupported Version Number.  The Data field is a 2-octet unsigned +   integer, which indicates the largest locally supported version number +   less than the version the remote BGP peer bid (as indicated in the +   received OPEN message). + +   If the Autonomous System field of the OPEN message is unacceptable, +   then the Error Subcode is set to Bad Peer AS.  The determination of +   acceptable Autonomous System numbers is outside the scope of this +   protocol. + +   If the BGP Identifier field of the OPEN message is syntactically +   incorrect, then the Error Subcode is set to Bad BGP Identifier. +   Syntactic correctness means that the BGP Identifier field represents +   a valid IP host address. + +   If the Authentication Code of the OPEN message is not recognized, +   then the Error Subcode is set to Unsupported Authentication Code.  If +   the Authentication Code is zero, then the Authentication Data must be +   of zero length.  Otherwise, the Error Subcode is set to +   Authentication Failure. + +   If the Authentication Code is non-zero, then the corresponding +   authentication procedure is invoked.  If the authentication procedure +   (based on Authentication Code and Authentication Data) fails, then +   the Error Subcode is set to Authentication Failure. + +6.3 UPDATE message error handling. + +   All errors detected while processing the UPDATE message are indicated +   by sending the NOTIFICATION message with Error Code UPDATE Message +   Error.  The error subcode elaborates on the specific nature of the +   error. + +   Error checking of an UPDATE message begins by examining the path +   attributes.  If the Total Attribute Length is too large (i.e., if +   Total Attribute Length + 21 exceeds the message Length), or if the +   (non-negative integer) Number of Network fields cannot be computed as + + + +Lougheed & Rekhter                                             [Page 16] + +RFC 1267                         BGP-3                      October 1991 + + +   in Section 4.3, then the Error Subcode is set to Malformed Attribute +   List. + +   If any recognized attribute has Attribute Flags that conflict with +   the Attribute Type Code, then the Error Subcode is set to Attribute +   Flags Error.  The Data field contains the erroneous attribute (type, +   length and value). + +   If any recognized attribute has Attribute Length that conflicts with +   the expected length (based on the attribute type code), then the +   Error Subcode is set to Attribute Length Error.  The Data field +   contains the erroneous attribute (type, length and value). + +   If any of the mandatory well-known attributes are not present, then +   the Error Subcode is set to Missing Well-known Attribute.  The Data +   field contains the Attribute Type Code of the missing well-known +   attribute. + +   If any of the mandatory well-known attributes are not recognized, +   then the Error Subcode is set to Unrecognized Well-known Attribute. +   The Data field contains the unrecognized attribute (type, length and +   value). + +   If the ORIGIN attribute has an undefined value, then the Error +   Subcode is set to Invalid Origin Attribute.  The Data field contains +   the unrecognized attribute (type, length and value). + +   If the NEXT_HOP attribute field is syntactically or semantically +   incorrect, then the Error Subcode is set to Invalid NEXT_HOP +   Attribute. + +   The Data field contains the incorrect attribute (type, length and +   value).  Syntactic correctness means that the NEXT_HOP attribute +   represents a valid IP host address.  Semantic correctness applies +   only to the external BGP links. It means that the interface +   associated with the IP address, as specified in the NEXT_HOP +   attribute, shares a common subnet with the receiving BGP speaker. + +   The AS route specified by the AS_PATH attribute is checked for AS +   loops.  AS loop detection is done by scanning the full AS route (as +   specified in the AS_PATH attribute) and checking that each AS occurs +   at most once.  If a loop is detected, then the Error Subcode is set +   to AS Routing Loop.  The Data field contains the incorrect attribute +   (type, length and value). + +   If an optional attribute is recognized, then the value of this +   attribute is checked.  If an error is detected, the attribute is +   discarded, and the Error Subcode is set to Optional Attribute Error. + + + +Lougheed & Rekhter                                             [Page 17] + +RFC 1267                         BGP-3                      October 1991 + + +   The Data field contains the attribute (type, length and value). + +   If any attribute appears more than once in the UPDATE message, then +   the Error Subcode is set to Malformed Attribute List. + +   Each Network field in the UPDATE message is checked for syntactic +   validity.  If the Network field is syntactically incorrect, or +   contains a subnet or a host address, then the Error Subcode is set to +   Invalid Network Field. + +6.4 NOTIFICATION message error handling. + +   If a peer sends a NOTIFICATION message, and there is an error in that +   message, there is unfortunately no means of reporting this error via +   a subsequent NOTIFICATION message.  Any such error, such as an +   unrecognized Error Code or Error Subcode, should be noticed, logged +   locally, and brought to the attention of the administration of the +   peer.  The means to do this, however, lies outside the scope of this +   document. + +6.5 Hold Timer Expired error handling. + +   If a system does not receive successive KEEPALIVE and/or UPDATE +   and/or NOTIFICATION messages within the period specified in the Hold +   Time field of the OPEN message, then the NOTIFICATION message with +   Hold Timer Expired Error Code must be sent and the BGP connection +   closed. + +6.6 Finite State Machine error handling. + +   Any error detected by the BGP Finite State Machine (e.g., receipt of +   an unexpected event) is indicated by sending the NOTIFICATION message +   with Error Code Finite State Machine Error. + +6.7 Cease. + +   In absence of any fatal errors (that are indicated in this section), +   a BGP peer may choose at any given time to close its BGP connection +   by sending the NOTIFICATION message with Error Code Cease.  However, +   the Cease NOTIFICATION message must not be used when a fatal error +   indicated by this section does exist. + +6.8 Connection collision detection. + +   If a pair of BGP speakers try simultaneously to establish a TCP +   connection to each other, then two parallel connections between this +   pair of speakers might well be formed.  We refer to this situation as +   connection collision.  Clearly, one of these connections must be + + + +Lougheed & Rekhter                                             [Page 18] + +RFC 1267                         BGP-3                      October 1991 + + +   closed. + +   Based on the value of the BGP Identifier a convention is established +   for detecting which BGP connection is to be preserved when a +   collision does occur. The convention is to compare the BGP +   Identifiers of the peers involved in the collision and to retain only +   the connection initiated by the BGP speaker with the higher-valued +   BGP Identifier. + +   Upon receipt of an OPEN message, the local system must examine all of +   its connections that are in the OpenSent state.  If among them there +   is a connection to a remote BGP speaker whose BGP Identifier equals +   the one in the OPEN message, then the local system performs the +   following collision resolution procedure: + +          1. The BGP Identifier of the local system is compared to the +          BGP Identifier of the remote system (as specified in the +          OPEN message). + +          2. If the value of the local BGP Identifier is less than the +          remote one, the local system closes BGP connection that +          already exists (the one that is already in the OpenSent +          state), and accepts BGP connection initiated by the remote +          system. + +          3. Otherwise, the local system closes newly created BGP +          connection (the one associated with the newly received OPEN +          message), and continues to use the existing one (the one +          that is already in the OpenSent state). + +          Comparing BGP Identifiers is done by treating them as +          (4-octet long) unsigned integers. + +          A connection collision with existing BGP connections that +          are either in OpenConfirm or Established states causes +          unconditional closing of the newly created connection.  Note +          that a connection collision cannot be detected with +          connections that are in Idle, or Connect, or Active states. + +          Closing the BGP connection (that results from the collision +          resolution procedure) is accomplished by sending the +          NOTIFICATION message with the Error Code Cease. + +7.  BGP Version Negotiation. + +   BGP speakers may negotiate the version of the protocol by making +   multiple attempts to open a BGP connection, starting with the highest +   version number each supports.  If an open attempt fails with an Error + + + +Lougheed & Rekhter                                             [Page 19] + +RFC 1267                         BGP-3                      October 1991 + + +   Code OPEN Message Error, and an Error Subcode Unsupported Version +   Number, then the BGP speaker has available the version number it +   tried, the version number its peer tried, the version number passed +   by its peer in the NOTIFICATION message, and the version numbers that +   it supports.  If the two peers do support one or more common +   versions, then this will allow them to rapidly determine the highest +   common version. In order to support BGP version negotiation, future +   versions of BGP must retain the format of the OPEN and NOTIFICATION +   messages. + +8.  BGP Finite State machine. + +   This section specifies BGP operation in terms of a Finite State +   Machine (FSM).  Following is a brief summary and overview of BGP +   operations by state as determined by this FSM.  A condensed version +   of the BGP FSM is found in Appendix 1. + +   Initially BGP is in the Idle state. + +      Idle state: + +         In this state BGP refuses all incoming BGP connections.  No +         resources are allocated to the BGP neighbor.  In response to +         the Start event (initiated by either system or operator) the +         local system initializes all BGP resources, starts the +         ConnectRetry timer, initiates a transport connection to other +         BGP peer, while listening for connection that may be initiated +         by the remote BGP peer, and changes its state to Connect. +         The exact value of the ConnectRetry timer is a local matter, +         but should be sufficiently large to allow TCP initialization. + +         Any other event received in the Idle state is ignored. + +      Connect state: + +         In this state BGP is waiting for the transport protocol +         connection to be completed. + +         If the transport protocol connection succeeds, the local system +         clears the ConnectRetry timer, completes initialization, sends +         an OPEN message to its peer, and changes its state to OpenSent. + +         If the transport protocol connect fails (e.g., retransmission +         timeout), the local system restarts the ConnectRetry timer, +         continues to listen for a connection that may be initiated by +         the remote BGP peer, and changes its state to Active state. + +         In response to the ConnectRetry timer expired event, the local + + + +Lougheed & Rekhter                                             [Page 20] + +RFC 1267                         BGP-3                      October 1991 + + +         system restarts the ConnectRetry timer, initiates a transport +         connection to other BGP peer, continues to listen for a +         connection that may be initiated by the remote BGP peer, and +         stays in the Connect state. + +         Start event is ignored in the Active state. + +         In response to any other event (initiated by either system or +         operator), the local system releases all BGP resources +         associated with this connection and changes its state to Idle. + +      Active state: + +         In this state BGP is trying to acquire a BGP neighbor by +         initiating a transport protocol connection. + +         If the transport protocol connection succeeds, the local system +         clears the ConnectRetry timer, completes initialization, sends +         an OPEN message to its peer, sets its hold timer to a large +         value, and changes its state to OpenSent. + +         In response to the ConnectRetry timer expired event, the local +         system restarts the ConnectRetry timer, initiates a transport +         connection to other BGP peer, continues to listen for a +         connection that may be be initiated by the remote BGP peer, and +         changes its state to Connect. + +         If the local system detects that a remote peer is trying to +         establish BGP connection to it, and the IP address of the +         remote peer is not an expected one, the local system restarts +         the ConnectRetry timer, rejects the attempted connection, +         continues to listen for a connection that may be initiated by +         the remote BGP peer, and stays in the Active state. + +         Start event is ignored in the Active state. + +         In response to any other event (initiated by either system or +         operator), the local system releases all BGP resources +         associated with this connection and changes its state to Idle. + +      OpenSent state: + +         In this state BGP waits for an OPEN message from its peer. +         When an OPEN message is received, all fields are checked for +         correctness.  If the BGP message header checking or OPEN +         message checking detects an error (see Section 6.2), or +         a connection collision (see Section 6.8) the local +         system sends a NOTIFICATION message and changes its state to + + + +Lougheed & Rekhter                                             [Page 21] + +RFC 1267                         BGP-3                      October 1991 + + +         Idle. + +         If there are no errors in the OPEN message, BGP sends a +         KEEPALIVE message and sets a KeepAlive timer.  The hold timer, +         which was originally set to an arbitrary large value (see +         above), is replaced with the value indicated in the OPEN +         message.  If the value of the Autonomous System field is the +         same as our own, then the connection is "internal" connection; +         otherwise, it is "external".  (This will effect UPDATE +         processing as described below.)  Finally, the state is changed +         to OpenConfirm. + +         If a disconnect notification is received from the underlying +         transport protocol, the local system closes the BGP connection, +         restarts the ConnectRetry timer, while continue listening for +         connection that may be initiated by the remote BGP peer, and +         goes into the Active state. + +         If the hold time expires, the local system sends NOTIFICATION +         message with error code Hold Timer Expired and changes its +         state to Idle. + +         In response to the Stop event (initiated by either system or +         operator) the local system sends NOTIFICATION message with +         Error Code Cease and changes its state to Idle. + +         Start event is ignored in the OpenSent state. + +         In response to any other event the local system sends +         NOTIFICATION message with Error Code Finite State Machine Error +         and changes its state to Idle. + +         Whenever BGP changes its state from OpenSent to Idle, it closes +         the BGP (and transport-level) connection and releases all +         resources associated with that connection. + +      OpenConfirm state: + +         In this state BGP waits for a KEEPALIVE or NOTIFICATION +         message. + +         If the local system receives a KEEPALIVE message, it changes +         its state to Established. + +         If the hold timer expires before a KEEPALIVE message is +         received, the local system sends NOTIFICATION message with +         error code Hold Timer expired and changes its state to Idle. + + + + +Lougheed & Rekhter                                             [Page 22] + +RFC 1267                         BGP-3                      October 1991 + + +         If the local system receives a NOTIFICATION message, it changes +         its state to Idle. + +         If the KeepAlive timer expires, the local system sends a +         KEEPALIVE message and restarts its KeepAlive timer. + +         If a disconnect notification is received from the underlying +         transport protocol, the local system changes its state to Idle. + +         In response to the Stop event (initiated by either system or +         operator) the local system sends NOTIFICATION message with +         Error Code Cease and changes its state to Idle. + +         Start event is ignored in the OpenConfirm state. + +         In response to any other event the local system sends +         NOTIFICATION message with Error Code Finite State Machine Error +         and changes its state to Idle. + +         Whenever BGP changes its state from OpenConfirm to Idle, it +         closes the BGP (and transport-level) connection and releases +         all resources associated with that connection. + +      Established state: + +         In the Established state BGP can exchange UPDATE, NOTIFICATION, +         and KEEPALIVE messages with its peer. + +         If the local system receives an UPDATE or KEEPALIVE message, it +         restarts its Holdtime timer. + +         If the local system receives a NOTIFICATION message, it changes +         its state to Idle. + +         If the local system receives an UPDATE message and the UPDATE +         message error handling procedure (see Section 6.3) detects an +         error, the local system sends a NOTIFICATION message and +         changes its state to Idle. + +         If a disconnect notification is received from the underlying +         transport protocol, the local system  changes its state to +         Idle. + +         If the Holdtime timer expires, the local system sends a +         NOTIFICATION message with Error Code Hold Timer Expired and +         changes its state to Idle. + +         If the KeepAlive timer expires, the local system sends a + + + +Lougheed & Rekhter                                             [Page 23] + +RFC 1267                         BGP-3                      October 1991 + + +         KEEPALIVE message and restarts its KeepAlive timer. + +         Each time the local system sends a KEEPALIVE or UPDATE message, +         it restarts its KeepAlive timer. + +         In response to the Stop event (initiated by either system or +         operator), the local system sends a NOTIFICATION message with +         Error Code Cease and changes its state to Idle. + +         Start event is ignored in the Established state. + +         In response to any other event, the local system sends +         NOTIFICATION message with Error Code Finite State Machine Error +         and changes its state to Idle. + +         Whenever BGP changes its state from Established to Idle, it +         closes the BGP (and transport-level) connection, releases all +         resources associated with that connection, and deletes all +         routes derived from that connection. + +9.  UPDATE Message Handling + +   An UPDATE message may be received only in the Established state. +   When an UPDATE message is received, each field is checked for +   validity as specified in Section 6.3. + +   If an optional non-transitive attribute is unrecognized, it is +   quietly ignored.  If an optional transitive attribute is +   unrecognized, the Partial bit (the third high-order bit) in the +   attribute flags octet is set to 1, and the attribute is retained for +   propagation to other BGP speakers. + +   If an optional attribute is recognized, and has a valid value, then, +   depending on the type of the optional attribute, it is processed +   locally, retained, and updated, if necessary, for possible +   propagation to other BGP speakers. + +   If the network and the path attributes associated with a route to +   that network are correct, then the route is compared with other +   routes to the same network. + +   When a BGP speaker receives a new route from a peer over external BGP +   link, it shall advertise that route to other BGP speakers in its +   autonomous system by means of an UPDATE message if either of the +   following conditions occur: + +      a) the newly received route is considered to be better +         than the other routes to the same network (as listed + + + +Lougheed & Rekhter                                             [Page 24] + +RFC 1267                         BGP-3                      October 1991 + + +         in the UPDATE message) that have been received over +         external BGP links, or + +      b) there are no other acceptable routes to the network +         (as listed in the UPDATE message) that have been +         received over external BGP links. + +   When a BGP speaker receives an unreachable route from a BGP peer over +   external BGP link, it shall advertise that route to all other BGP +   speakers in its autonomous system, indicating that it has become +   unreachable, if the following condition occur: + +      a) a corresponding acceptable route to the same destination +         was considered to be the best one among all routes to that +         destination that have been received over external BGP links +         (that is the local system has been advertising the +         route to all other BGP speakers in its autonomous system +         before it received the UPDATE message that reported it +         as unreachable). + +   Whenever a BGP speaker selects a new route (among all the routes +   received from external and internal BGP peers), or determines that +   the reachable destinations within its own autonomous system have +   changed, it shall generate an UPDATE message and forward it to each +   of its external peers (peers connected via external BGP links). + +   If a route in the UPDATE was received over an internal link, it is +   not propagated over any other internal link.  This restriction is due +   to the fact that all BGP speakers within a single AS form a +   completely connected graph (see above). + +   If the UPDATE message is propagated over an external link, then the +   local AS number is prepended to the AS_PATH attribute, and the +   NEXT_HOP attribute is updated with an IP address of the router that +   should be used as a next hop to the network.  If the UPDATE message +   is propagated over an internal link, then the AS_PATH attribute and +   the NEXT_HOP attribute are passed unmodified. + +   Generally speaking, the rules for comparing routes among several +   alternatives are outside the scope of this document.  There are two +   exceptions: + +      - If the local AS appears in the AS path of the new route being +        considered, then that new route cannot be viewed as better than +        any other route.  If such a route were ever used, a routing loop +        would result. + +      - In order to achieve successful distributed operation, only routes + + + +Lougheed & Rekhter                                             [Page 25] + +RFC 1267                         BGP-3                      October 1991 + + +        with a likelihood of stability can be chosen.  Thus, an AS must +        avoid using unstable routes, and it must not make rapid +        spontaneous changes to its choice of route.  Quantifying the terms +        "unstable" and "rapid" in the previous sentence will require +        experience, but the principle is clear. + +10. Detection of Inter-AS Policy Contradictions + +   Since BGP requires no central authority for coordinating routing +   policies among ASs, and since routing policies are not exchanged via +   the protocol itself, it is possible for a group of ASs to have a set +   of routing policies that cannot simultaneously be satisfied.  This +   may cause an indefinite oscillation of the routes in this group of +   ASs. + +   To help detect such a situation, all BGP speakers must observe the +   following rule.  If a route to a destination that is currently used +   by the local system is determined to be unreachable (e.g., as a +   result of receiving an UPDATE message for this route with the +   UNREACHABLE attribute), then, before switching to another route, this +   local system must advertize this route as unreachable to all the BGP +   neighbors to which it previously advertized this route. + +   This rule will allow other ASs to distinguish between two different +   situations: + +      - The local system has chosen to use a new route because the old +        route become unreachable. + +      - The local system has chosen to use a new route because it +        preferred it over the old route.  The old route is still +        viable. + +   In the former case, an UPDATE message with the UNREACHABLE attribute +   will be received for the old route.  In the latter case it will not. + +   In some cases, this may allow a BGP speaker to detect the fact that +   its policies, taken together with the policies of some other AS, +   cannot simultaneously be satisfied.  For example, consider the +   following situation involving AS A and its neighbor AS B.  B +   advertises a route with a path of the form <B,...>, where A is not +   present in the path.  A then decides to use this path, and advertises +   <A,B,...> to all its neighbors.  B later advertises <B,...,A,...> +   back to A, without ever declaring its previous path <B,...> to be +   unreachable.  Evidently, A prefers routes via B and B prefers routes +   via A.  The combined policies of A and B, taken together, cannot be +   satisfied.  Such an event should be noticed, logged locally, and +   brought to the attention of AS A's administration.  The means to do + + + +Lougheed & Rekhter                                             [Page 26] + +RFC 1267                         BGP-3                      October 1991 + + +   this, however, lies outside the scope of this document.  Also outside +   the document is a more complete procedure for detecting such +   contradictions of policy. + +   While the above rules provide a mechanism to detect a set of routing +   policies that cannot be satisfied simultaneously, the protocol itself +   does not provide any mechanism for suppressing the route oscillation +   that may result from these unsatisfiable policies.  The reason for +   doing this is that routing policies are viewed as external to the +   protocol and as determined by the local AS administrator. + +Appendix 1.  BGP FSM State Transitions and Actions. + +   This Appendix discusses the transitions between states in the BGP FSM +   in response to BGP events.  The following is the list of these states +   and events. + +    BGP States: + +             1 - Idle +             2 - Connect +             3 - Active +             4 - OpenSent +             5 - OpenConfirm +             6 - Established + + +    BGP Events: + +             1 - BGP Start +             2 - BGP Stop +             3 - BGP Transport connection open +             4 - BGP Transport connection closed +             5 - BGP Transport connection open failed +             6 - BGP Transport fatal error +             7 - ConnectRetry timer expired +             8 - Holdtime timer expired +             9 - KeepAlive timer expired +            10 - Receive OPEN message +            11 - Receive KEEPALIVE message +            12 - Receive UPDATE messages +            13 - Receive NOTIFICATION message + +   The following table describes the state transitions of the BGP FSM +   and the actions triggered by these transitions. + + + + + + +Lougheed & Rekhter                                             [Page 27] + +RFC 1267                         BGP-3                      October 1991 + + +    Event                Actions               Message Sent   Next State +    -------------------------------------------------------------------- +    Idle (1) +     1            Initialize resources            none             2 +                  Start ConnectRetry timer +                  Initiate a transport connection +     others               none                    none             1 + +    Connect(2) +     1                    none                    none             2 +     3            Complete initialization         OPEN             4 +                  Clear ConnectRetry timer +     5            Restart ConnectRetry timer      none             3 +     7            Restart ConnectRetry timer      none             2 +                  Initiate a transport connection +     others       Release resources               none             1 + +    Active (3) +     1                    none                    none             3 +     3            Complete initialization         OPEN             4 +                  Clear ConnectRetry timer +     5            Close connection                                 3 +                  Restart ConnectRetry timer +     7            Restart ConnectRetry timer      none             2 +                  Initiate a transport connection +     others       Release resources               none             1 + +    OpenSent(4) +     1                    none                    none             4 +     4            Close transport connection      none             3 +                  Restart ConnectRetry timer +     6            Release resources               none             1 +    10            Process OPEN is OK            KEEPALIVE          5 +                  Process OPEN failed           NOTIFICATION       1 +    others        Close transport connection    NOTIFICATION       1 +                  Release resources + +    OpenConfirm (5) +     1                   none                     none             5 +     4            Release resources               none             1 +     6            Release resources               none             1 +     9            Restart KeepAlive timer       KEEPALIVE          5 +    11            Complete initialization         none             6 +                  Restart Holdtime timer +    13            Close transport connection                       1 +                  Release resources +    others        Close transport connection    NOTIFICATION       1 +                  Release resources + + + +Lougheed & Rekhter                                             [Page 28] + +RFC 1267                         BGP-3                      October 1991 + + +    Established (6) +     1                   none                     none             6 +     4            Release resources               none             1 +     6            Release resources               none             1 +     9            Restart KeepAlive timer       KEEPALIVE          6 +    11            Restart Holdtime timer        KEEPALIVE          6 +    12            Process UPDATE is OK          UPDATE             6 +                  Process UPDATE failed         NOTIFICATION       1 +    13            Close transport connection                       1 +                  Release resources +    others        Close transport connection    NOTIFICATION       1 +                  Release resources +   --------------------------------------------------------------------- + +   The following is a condensed version of the above state transition +   table. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Lougheed & Rekhter                                             [Page 29] + +RFC 1267                         BGP-3                      October 1991 + + +Events| Idle | Active | Connect | OpenSent | OpenConfirm | Estab +      | (1)  |   (2)  |  (3)    |    (4)   |     (5)     |   (6) +      |-------------------------------------------------------------- + 1    |  2   |    2   |   3     |     4    |      5      |    6 +      |      |        |         |          |             | + 2    |  1   |    1   |   1     |     1    |      1      |    1 +      |      |        |         |          |             | + 3    |  1   |    4   |   4     |     1    |      1      |    1 +      |      |        |         |          |             | + 4    |  1   |    1   |   1     |     3    |      1      |    1 +      |      |        |         |          |             | + 5    |  1   |    3   |   3     |     1    |      1      |    1 +      |      |        |         |          |             | + 6    |  1   |    1   |   1     |     1    |      1      |    1 +      |      |        |         |          |             | + 7    |  1   |    2   |   2     |     1    |      1      |    1 +      |      |        |         |          |             | + 8    |  1   |    1   |   1     |     1    |      1      |    1 +      |      |        |         |          |             | + 9    |  1   |    1   |   1     |     1    |      5      |    6 +      |      |        |         |          |             | +10    |  1   |    1   |   1     |  1 or 5  |      1      |    1 +      |      |        |         |          |             | +11    |  1   |    1   |   1     |     1    |      6      |    6 +      |      |        |         |          |             | +12    |  1   |    1   |   1     |     1    |      1      | 1 or 6 +      |      |        |         |          |             | +13    |  1   |    1   |   1     |     1    |      1      |    1 +      |      |        |         |          |             | +      --------------------------------------------------------------- + +Appendix 2.  Comparison with RFC 1163 + +   To detect and recover from BGP connection collision, a new field (BGP +   Identifier) has been added to the OPEN message. New text (Section +   6.8) has been added to specify the procedure for detecting and +   recovering from collision. + +   The new document no longer restricts the border router that is passed +   in the NEXT_HOP path attribute to be part of the same Autonomous +   System as the BGP Speaker. + +   New document optimizes and simplifies the exchange of the information +   about previously reachable routes. + +Appendix 3.  Comparison with RFC 1105 + +   All of the changes listed in Appendix 2, plus the following. + + + +Lougheed & Rekhter                                             [Page 30] + +RFC 1267                         BGP-3                      October 1991 + + +   Minor changes to the RFC1105 Finite State Machine were necessary to +   accommodate the TCP user interface provided by 4.3 BSD. + +   The notion of Up/Down/Horizontal relations present in RFC1105 has +   been removed from the protocol. + +   The changes in the message format from RFC1105 are as follows: + +      1.  The Hold Time field has been removed from the BGP header and +          added to the OPEN message. + +      2.  The version field has been removed from the BGP header and +          added to the OPEN message. + +      3.  The Link Type field has been removed from the OPEN message. + +      4.  The OPEN CONFIRM message has been eliminated and replaced +          with implicit confirmation provided by the KEEPALIVE message. + +      5.  The format of the UPDATE message has been changed +          significantly.  New fields were added to the UPDATE message +          to support multiple path attributes. + +      6.  The Marker field has been expanded and its role broadened to +          support authentication. + +   Note that quite often BGP, as specified in RFC 1105, is referred to +   as BGP-1, BGP, as specified in RFC 1163, is referred to as BGP-2, and +   BGP, as specified in this document is referred to as BGP-3. + +Appendix 4.  TCP options that may be used with BGP + +   If a local system TCP user interface supports TCP PUSH function, then +   each BGP message should be transmitted with PUSH flag set.  Setting +   PUSH flag forces BGP messages to be transmitted promptly to the +   receiver. + +   If a local system TCP user interface supports setting precedence for +   TCP connection, then the BGP transport connection should be opened +   with precedence set to Internetwork Control (110) value (see also +   [6]). + + + + + + + + + + +Lougheed & Rekhter                                             [Page 31] + +RFC 1267                         BGP-3                      October 1991 + + +Appendix 5.  Implementation Recommendations + +   This section presents some implementation recommendations. + +5.1 Multiple Networks Per Message + +   The BGP protocol allows for multiple networks with the same AS path +   and next-hop gateway to be specified in one message. Making use of +   this capability is highly recommended. With one network per message +   there is a substantial increase in overhead in the receiver. Not only +   does the system overhead increase due to the reception of multiple +   messages, but the overhead of scanning the routing table for flash +   updates to BGP peers and other routing protocols (and sending the +   associated messages) is incurred multiple times as well. One method +   of building messages containing many networks per AS path and gateway +   from a routing table that is not organized per AS path is to build +   many messages as the routing table is scanned. As each network is +   processed, a message for the associated AS path and gateway is +   allocated, if it does not exist, and the new network is added to it. +   If such a message exists, the new network is just appended to it. If +   the message lacks the space to hold the new network, it is +   transmitted, a new message is allocated, and the new network is +   inserted into the new message. When the entire routing table has been +   scanned, all allocated messages are sent and their resources +   released.  Maximum compression is achieved when all networks share a +   gateway and common path attributes, making it possible to send many +   networks in one 4096-byte message. + +   When peering with a BGP implementation that does not compress +   multiple networks into one message, it may be necessary to take steps +   to reduce the overhead from the flood of data received when a peer is +   acquired or a significant network topology change occurs. One method +   of doing this is to limit the rate of flash updates. This will +   eliminate the redundant scanning of the routing table to provide +   flash updates for BGP peers and other routing protocols. A +   disadvantage of this approach is that it increases the propagation +   latency of routing information.  By choosing a minimum flash update +   interval that is not much greater than the time it takes to process +   the multiple messages this latency should be minimized. A better +   method would be to read all received messages before sending updates. + +5.2  Processing Messages on a Stream Protocol + +   BGP uses TCP as a transport mechanism.  Due to the stream nature of +   TCP, all the data for received messages does not necessarily arrive +   at the same time. This can make it difficult to process the data as +   messages, especially on systems such as BSD Unix where it is not +   possible to determine how much data has been received but not yet + + + +Lougheed & Rekhter                                             [Page 32] + +RFC 1267                         BGP-3                      October 1991 + + +   processed. + +   One method that can be used in this situation is to first try to read +   just the message header. For the KEEPALIVE message type, this is a +   complete message; for other message types, the header should first be +   verified, in particular the total length. If all checks are +   successful, the specified length, minus the size of the message +   header is the amount of data left to read. An implementation that +   would "hang" the routing information process while trying to read +   from a peer could set up a message buffer (4096 bytes) per peer and +   fill it with data as available until a complete message has been +   received. + +5.3 Processing Update Messages + +   In BGP, all UPDATE messages are incremental. Once a particular +   network is listed in an Update message as being reachable through an +   AS path and gateway, that piece of information is expected to be +   retained indefinitely. + +   In order for a route to a network to be removed, it must be +   explicitly listed in an Update message as being unreachable or with +   new routing information to replace the old. Note that a BGP peer will +   only advertise one route to a given network, so any announcement of +   that network by a particular peer replaces any previous information +   about that network received from the same peer. + +   One useful optimization is that unreachable networks need not be +   advertised with their original attributes.  Instead, all unreachable +   networks could be sent in a single message, perhaps with an AS path +   consisting of the local AS only and with an origin set to INCOMPLETE. + +   This approach has the obvious advantage of low overhead; if all +   routes are stable, only KEEPALIVE messages will be sent. There is no +   periodic flood of route information. + +   However, this means that a consistent view of routing information +   between BGP peers is only possible over the course of a single +   transport connection, since there is no mechanism for a complete +   update. This requirement is accommodated by specifying that BGP peers +   must transition to the Idle state upon the failure of a transport +   connection. + +5.4 BGP Timers + +      BGP employs three timers: ConnectRetry, Holdtime, and KeepAlive. +      Suggested value for the ConnectRetry timer is 120 seconds. +      Suggested value for the Holdtime timer is 90 seconds. + + + +Lougheed & Rekhter                                             [Page 33] + +RFC 1267                         BGP-3                      October 1991 + + +      Suggested value for the KeepAlive timer is 30 seconds. +      An implementation of BGP shall allow any of these timers to be +      configurable. + +5.5 Frequency of Route Selection + +   An implementation of BGP shall allow a border router to set up the +   minimum amount of time that must elapse between selection and +   subsequent advertisement of better routes received by a given BGP +   speaker from BGP speakers located in adjacent ASs. + +   Since fast convergence is needed within an AS, deferring selection +   does not apply to selection of better routes chosen as a result of +   UPDATEs from BGP speakers located in the advertising speaker's own +   AS.  To avoid long-lived black holes, it does not apply to +   advertisement of previously selected routes which have become +   unreachable. In both of these situations, the local BGP speaker must +   select and advertise such routes immediately. + +   If a BGP speaker received better routes from BGP speakers in adjacent +   ASs, but have not yet advertised them because the time has not yet +   elapsed, the reception of any routes from other BGP speakers in its +   own AS shall trigger a new route selection process that will be based +   on both updates from BGP speakers in the same AS and in adjacent ASs. + +References + +   [1] Mills, D., "Exterior Gateway Protocol Formal Specification", RFC +       904, BBN, April 1984. + +   [2] Rekhter, Y., "EGP and Policy Based Routing in the New NSFNET +       Backbone", RFC 1092, T.J. Watson Research Center, February 1989. + +   [3] Braun, H-W., "The NSFNET Routing Architecture", RFC 1093, +       MERIT/NSFNET Project, February 1989. + +   [4] Postel, J., "Transmission Control Protocol - DARPA Internet +       Program Protocol Specification", RFC 793, DARPA, September 1981. + +   [5] Rekhter, Y., and P. Gross, "Application of the Border Gateway +       Protocol in the Internet", RFC 1268, T.J. Watson Research Center, +       IBM Corp., ANS, October 1991. + +   [6] Postel, J., "Internet Protocol - DARPA Internet Program Protocol +       Specification", RFC 791, DARPA, September 1981. + + + + + + +Lougheed & Rekhter                                             [Page 34] + +RFC 1267                         BGP-3                      October 1991 + + +Security Considerations + +   Security issues are not discussed in this memo. + +Authors' Addresses + +   Kirk Lougheed +   cisco Systems, Inc. +   1525 O'Brien Drive +   Menlo Park, CA 94025 + +   Phone:  (415) 326-1941 +   Email:  LOUGHEED@CISCO.COM + + +   Yakov Rekhter +   T.J. Watson Research Center IBM Corporation +   P.O. Box 218 +   Yorktown Heights, NY 10598 + +   Phone:  (914) 945-3896 +   Email:  YAKOV@WATSON.IBM.COM + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Lougheed & Rekhter                                             [Page 35] +
\ No newline at end of file  |