Diffstat (limited to 'doc/rfc/rfc1267.txt')
-rw-r--r-- | doc/rfc/rfc1267.txt | 1963 |
1 files changed, 1963 insertions, 0 deletions
diff --git a/doc/rfc/rfc1267.txt b/doc/rfc/rfc1267.txt new file mode 100644 index 0000000..70b6704 --- /dev/null +++ b/doc/rfc/rfc1267.txt @@ -0,0 +1,1963 @@ + + + + + + +Network Working Group K. Lougheed +Request for Comments: 1267 cisco Systems +Obsoletes RFCs: 1105, 1163 Y. Rekhter + T.J. Watson Research Center, IBM Corp. + October 1991 + + + A Border Gateway Protocol 3 (BGP-3) + +Status of this Memo + + This memo, together with its companion document, "Application of the + Border Gateway Protocol in the Internet", define an inter-autonomous + system routing protocol for the Internet. This RFC specifies an IAB + standards track protocol for the Internet community, and requests + discussion and suggestions for improvements. Please refer to the + current edition of the "IAB Official Protocol Standards" for the + standardization state and status of this protocol. Distribution of + this memo is unlimited. + +1. Acknowledgements + + We would like to express our thanks to Guy Almes (Rice University), + Len Bosack (cisco Systems), Jeffrey C. Honig (Cornell Theory Center) + and all members of the Interconnectivity Working Group of the + Internet Engineering Task Force, chaired by Guy Almes, for their + contributions to this document. + + We like to explicitly thank Bob Braden (ISI) for the review of this + document as well as his constructive and valuable comments. + + We would also like to thank Bob Hinden, Director for Routing of the + Internet Engineering Steering Group, and the team of reviewers he + assembled to review earlier versions of this document. This team, + consisting of Deborah Estrin, Milo Medin, John Moy, Radia Perlman, + Martha Steenstrup, Mike St. Johns, and Paul Tsuchiya, acted with a + strong combination of toughness, professionalism, and courtesy. + +2. Introduction + + The Border Gateway Protocol (BGP) is an inter-Autonomous System + routing protocol. 
It is built on experience gained with EGP as + defined in RFC 904 [1] and EGP usage in the NSFNET Backbone as + described in RFC 1092 [2] and RFC 1093 [3]. + + The primary function of a BGP speaking system is to exchange network + reachability information with other BGP systems. This network + reachability information includes information on the full path of + + + +Lougheed & Rekhter [Page 1] + +RFC 1267 BGP-3 October 1991 + + + Autonomous Systems (ASs) that traffic must transit to reach these + networks. This information is sufficient to construct a graph of AS + connectivity from which routing loops may be pruned and some policy + decisions at the AS level may be enforced. + + To characterize the set of policy decisions that can be enforced + using BGP, one must focus on the rule that an AS advertise to its + neighbor ASs only those routes that it itself uses. This rule + reflects the "hop-by-hop" routing paradigm generally used throughout + the current Internet. Note that some policies cannot be supported by + the "hop-by-hop" routing paradigm and thus require techniques such as + source routing to enforce. For example, BGP does not enable one AS + to send traffic to a neighbor AS intending that that traffic take a + different route from that taken by traffic originating in the + neighbor AS. On the other hand, BGP can support any policy + conforming to the "hop-by-hop" routing paradigm. Since the current + Internet uses only the "hop-by-hop" routing paradigm and since BGP + can support any policy that conforms to that paradigm, BGP is highly + applicable as an inter-AS routing protocol for the current Internet. + + A more complete discussion of what policies can and cannot be + enforced with BGP is outside the scope of this document (but refer to + the companion document discussing BGP usage [5]). + + BGP runs over a reliable transport protocol.
This eliminates the + need to implement explicit update fragmentation, retransmission, + acknowledgement, and sequencing. Any authentication scheme used by + the transport protocol may be used in addition to BGP's own + authentication mechanisms. The error notification mechanism used in + BGP assumes that the transport protocol supports a "graceful" close, + i.e., that all outstanding data will be delivered before the + connection is closed. + + BGP uses TCP [4] as its transport protocol. TCP meets BGP's + transport requirements and is present in virtually all commercial + routers and hosts. In the following descriptions the phrase + "transport protocol connection" can be understood to refer to a TCP + connection. BGP uses TCP port 179 for establishing its connections. + + This memo uses the term `Autonomous System' (AS) throughout. The + classic definition of an Autonomous System is a set of routers under + a single technical administration, using an interior gateway protocol + and common metrics to route packets within the AS, and using an + exterior gateway protocol to route packets to other ASs. Since this + classic definition was developed, it has become common for a single + AS to use several interior gateway protocols and sometimes several + sets of metrics within an AS. The use of the term Autonomous System + here stresses the fact that, even when multiple IGPs and metrics are + + + +Lougheed & Rekhter [Page 2] + +RFC 1267 BGP-3 October 1991 + + + used, the administration of an AS appears to other ASs to have a + single coherent interior routing plan and presents a consistent + picture of what networks are reachable through it. From the + standpoint of exterior routing, an AS can be viewed as monolithic: + reachability to networks directly connected to the AS must be + equivalent from all border gateways of the AS. 
+ + The planned use of BGP in the Internet environment, including such + issues as topology, the interaction between BGP and IGPs, and the + enforcement of routing policy rules is presented in a companion + document [5]. This document is the first of a series of documents + planned to explore various aspects of BGP application. + + Please send comments to the BGP mailing list (iwg@rice.edu). + +3. Summary of Operation + + Two systems form a transport protocol connection between one another. + They exchange messages to open and confirm the connection parameters. + The initial data flow is the entire BGP routing table. Incremental + updates are sent as the routing tables change. BGP does not require + periodic refresh of the entire BGP routing table. Therefore, a BGP + speaker must retain the current version of the entire BGP routing + tables of all of its peers for the duration of the connection. + KeepAlive messages are sent periodically to ensure the liveness of + the connection. Notification messages are sent in response to errors + or special conditions. If a connection encounters an error + condition, a notification message is sent and the connection is + closed. + + The hosts executing the Border Gateway Protocol need not be routers. + A non-routing host could exchange routing information with routers + via EGP or even an interior routing protocol. That non-routing host + could then use BGP to exchange routing information with a border + router in another Autonomous System. The implications and + applications of this architecture are for further study. + + If a particular AS has multiple BGP speakers and is providing transit + service for other ASs, then care must be taken to ensure a consistent + view of routing within the AS. A consistent view of the interior + routes of the AS is provided by the interior routing protocol. 
A + consistent view of the routes exterior to the AS can be provided by + having all BGP speakers within the AS maintain direct BGP connections + with each other. Using a common set of policies, the BGP speakers + arrive at an agreement as to which border routers will serve as + exit/entry points for particular networks outside the AS. This + information is communicated to the AS's internal routers, possibly + via the interior routing protocol. Care must be taken to ensure that + + + +Lougheed & Rekhter [Page 3] + +RFC 1267 BGP-3 October 1991 + + + the interior routers have all been updated with transit information + before the BGP speakers announce to other ASs that transit service is + being provided. + + Connections between BGP speakers of different ASs are referred to as + "external" links. BGP connections between BGP speakers within the + same AS are referred to as "internal" links. + +4. Message Formats + + This section describes message formats used by BGP. + + Messages are sent over a reliable transport protocol connection. A + message is processed only after it is entirely received. The maximum + message size is 4096 octets. All implementations are required to + support this maximum message size. The smallest message that may be + sent consists of a BGP header without a data portion, or 19 octets. + + 4.1 Message Header Format + + Each message has a fixed-size header. There may or may not be a data + portion following the header, depending on the message type. The + layout of these fields is shown below: + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | | + + + + | | + + + + | Marker | + + + + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Length | Type | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Marker: + + This 16-octet field contains a value that the receiver of the + message can predict. 
If the Type of the message is OPEN, or if + the Authentication Code used in the OPEN message of the connection + is zero, then the Marker must be all ones. Otherwise, the value + of the marker can be predicted by a computation specified as + part of the authentication mechanism used. The Marker can be used + to detect loss of synchronization between a pair of BGP peers, and + to authenticate incoming BGP messages. + + + +Lougheed & Rekhter [Page 4] + +RFC 1267 BGP-3 October 1991 + + + Length: + + This 2-octet unsigned integer indicates the total length of the + message, including the header, in octets. Thus, e.g., it allows + one to locate in the transport-level stream the (Marker field of + the) next message. The value of the Length field must always be + at least 19 and no greater than 4096, and may be further + constrained, depending on the message type. No "padding" of extra + data after the message is allowed, so the Length field must have + the smallest value required given the rest of the message. + + Type: + + This 1-octet unsigned integer indicates the type code of the + message. The following type codes are defined: + + 1 - OPEN + 2 - UPDATE + 3 - NOTIFICATION + 4 - KEEPALIVE + +4.2 OPEN Message Format + + After a transport protocol connection is established, the first + message sent by each side is an OPEN message. If the OPEN message is + acceptable, a KEEPALIVE message confirming the OPEN is sent back. + Once the OPEN is confirmed, UPDATE, KEEPALIVE, and NOTIFICATION + messages may be exchanged.
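As an illustration of the fixed header described in Section 4.1, the following Python sketch unpacks the 16-octet Marker, the 2-octet Length, and the 1-octet Type, and applies the 19 to 4096 octet bound on Length. It is a minimal sketch and not part of the memo: the names parse_header, HEADER_LEN, MAX_MESSAGE_LEN, and MESSAGE_TYPES are hypothetical, and Marker verification is left out because it depends on the authentication mechanism in use.

```python
import struct

# Hypothetical names chosen for this sketch; the values come from Section 4.
HEADER_LEN = 19          # 16-octet Marker + 2-octet Length + 1-octet Type
MAX_MESSAGE_LEN = 4096   # maximum message size in octets
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def parse_header(data: bytes):
    """Return (marker, length, type_name) for the message at the start of data."""
    if len(data) < HEADER_LEN:
        raise ValueError("need at least 19 octets for a header")
    # "!" selects network byte order with no padding between fields.
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if not HEADER_LEN <= length <= MAX_MESSAGE_LEN:
        raise ValueError("Bad Message Length")
    if msg_type not in MESSAGE_TYPES:
        raise ValueError("Bad Message Type")
    return marker, length, MESSAGE_TYPES[msg_type]


# A KEEPALIVE is a bare header: all-ones Marker, Length 19, Type 4.
keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
marker, length, type_name = parse_header(keepalive)
```

Because Length covers the header itself, a receiver reading from the transport stream can use it to find where the next message begins.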
+ + In addition to the fixed-size BGP header, the OPEN message contains + the following fields: + + + + + + + + + + + + + + + + + + + + +Lougheed & Rekhter [Page 5] + +RFC 1267 BGP-3 October 1991 + + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+ + | Version | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | My Autonomous System | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hold Time | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | BGP Identifier | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Auth. Code | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | | + | Authentication Data | + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Version: + + This 1-octet unsigned integer indicates the protocol version + number of the message. The current BGP version number is 3. + + My Autonomous System: + + This 2-octet unsigned integer indicates the Autonomous System + number of the sender. + + Hold Time: + + This 2-octet unsigned integer indicates the maximum number of + seconds that may elapse between the receipt of successive + KEEPALIVE and/or UPDATE and/or NOTIFICATION messages. + + + BGP Identifier: + This 4-octet unsigned integer indicates the BGP Identifier of + the sender. A given BGP speaker sets the value of its BGP + Identifier to the IP address of one of its interfaces. + The value of the BGP Identifier is determined on startup + and is the same for every local interface and every BGP peer. + + Authentication Code: + + This 1-octet unsigned integer indicates the authentication + mechanism being used. 
Whenever an authentication mechanism is + specified for use within BGP, three things must be included in the + specification: + + + +Lougheed & Rekhter [Page 6] + +RFC 1267 BGP-3 October 1991 + + + - the value of the Authentication Code which indicates use of + the mechanism, + - the form and meaning of the Authentication Data, and + - the algorithm for computing values of Marker fields. + Only one authentication mechanism is specified as part of this + memo: + - its Authentication Code is zero, + - its Authentication Data must be empty (of zero length), and + - the Marker fields of all messages must be all ones. + The semantics of non-zero Authentication Codes lie outside the + scope of this memo. + + Note that a separate authentication mechanism may be used in + establishing the transport level connection. + + Authentication Data: + + The form and meaning of this variable-length field depend on + the Authentication Code. If the value of Authentication + Code field is zero, the Authentication Data field must have zero + length. The semantics of the non-zero length Authentication Data + field is outside the scope of this memo. + + Note that the length of the Authentication Data field can be + determined from the message Length field by the formula: + + Message Length = 29 + Authentication Data Length + + The minimum length of the OPEN message is 29 octets (including + message header). + +4.3 UPDATE Message Format + + UPDATE messages are used to transfer routing information between BGP + peers. The information in the UPDATE packet can be used to construct + a graph describing the relationships of the various Autonomous + Systems. By applying rules to be discussed, routing information + loops and some other anomalies may be detected and removed from + inter-AS routing.
+ + In addition to the fixed-size BGP header, the UPDATE message contains + the following fields (note that all fields may have arbitrary + alignment): + + + + + + + + +Lougheed & Rekhter [Page 7] + +RFC 1267 BGP-3 October 1991 + + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Total Path Attributes Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | | + / Path Attributes / + / / + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Network 1 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + / / + / / + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Network n | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Total Path Attribute Length: + + This 2-octet unsigned integer indicates the total length of the + Path Attributes field in octets. Its value must allow the (non- + negative integer) number of Network fields to be determined as + specified below. + + Path Attributes: + + A variable length sequence of path attributes is present in every + UPDATE. Each path attribute is a triple <attribute type, + attribute length, attribute value> of variable length. + + Attribute Type is a two-octet field that consists of the Attribute + Flags octet followed by the Attribute Type Code octet. + + 0 1 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Attr. Flags |Attr. Type Code| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + The high-order bit (bit 0) of the Attribute Flags octet is the + Optional bit. It defines whether the attribute is optional (if + set to 1) or well-known (if set to 0). + + The second high-order bit (bit 1) of the Attribute Flags octet is + the Transitive bit. It defines whether an optional attribute is + transitive (if set to 1) or non-transitive (if set to 0). For + well-known attributes, the Transitive bit must be set to 1. 
(See + Section 5 for a discussion of transitive attributes.) + + + +Lougheed & Rekhter [Page 8] + +RFC 1267 BGP-3 October 1991 + + + The third high-order bit (bit 2) of the Attribute Flags octet is + the Partial bit. It defines whether the information contained in + the optional transitive attribute is partial (if set to 1) or + complete (if set to 0). For well-known attributes and for + optional non-transitive attributes the Partial bit must be set to + 0. + + The fourth high-order bit (bit 3) of the Attribute Flags octet is + the Extended Length bit. It defines whether the Attribute Length + is one octet (if set to 0) or two octets (if set to 1). Extended + Length may be used only if the length of the attribute value is + greater than 255 octets. + + The lower-order four bits of the Attribute Flags octet are unused. + They must be zero (and must be ignored when received). + + The Attribute Type Code octet contains the Attribute Type Code. + Currently defined Attribute Type Codes are discussed in Section 5. + + If the Extended Length bit of the Attribute Flags octet is set to + 0, the third octet of the Path Attribute contains the length of + the attribute data in octets. + + If the Extended Length bit of the Attribute Flags octet is set to + 1, then the third and the fourth octets of the path attribute + contain the length of the attribute data in octets. + + The remaining octets of the Path Attribute represent the attribute + value and are interpreted according to the Attribute Flags and the + Attribute Type Code. + + The meaning and handling of Path Attributes is discussed in + Section 5. + + Network: + + Each 4-octet Internet network number indicates one network whose + Inter-Autonomous System routing is described by the Path + Attributes. Subnets and host addresses are specifically not + allowed. 
The total number of Network fields in the UPDATE message + can be determined by the formula: + + Message Length = 21 + Total Path Attribute Length + 4 * #Nets + + The message Length field of the message header and the Path + Attributes Length field of the UPDATE message must be such that + the formula results in a non-negative integer number of Network + fields. + + + +Lougheed & Rekhter [Page 9] + +RFC 1267 BGP-3 October 1991 + + + The minimum length of the UPDATE message is 37 octets (including + message header). + +4.4 KEEPALIVE Message Format + + BGP does not use any transport protocol-based keep-alive mechanism to + determine if peers are reachable. Instead, KEEPALIVE messages are + exchanged between peers often enough as not to cause the hold time + (as advertised in the OPEN message) to expire. A reasonable maximum + time between KEEPALIVE messages would be one third of the Hold Time + interval. + + A KEEPALIVE message consists of only the message header and has a + length of 19 octets. + +4.5 NOTIFICATION Message Format + + A NOTIFICATION message is sent when an error condition is detected. + The BGP connection is closed immediately after sending it. + + In addition to the fixed-size BGP header, the NOTIFICATION message + contains the following fields: + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Error code | Error subcode | Data | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + + Error Code: + + This 1-octet unsigned integer indicates the type of NOTIFICATION.
+ The following Error Codes have been defined: + + Error Code Symbolic Name Reference + + 1 Message Header Error Section 6.1 + 2 OPEN Message Error Section 6.2 + 3 UPDATE Message Error Section 6.3 + 4 Hold Timer Expired Section 6.5 + 5 Finite State Machine Error Section 6.6 + 6 Cease Section 6.7 + + + + + + +Lougheed & Rekhter [Page 10] + +RFC 1267 BGP-3 October 1991 + + + Error subcode: + + This 1-octet unsigned integer provides more specific information + about the nature of the reported error. Each Error Code may have + one or more Error Subcodes associated with it. If no appropriate + Error Subcode is defined, then a zero (Unspecific) value is used + for the Error Subcode field. + + Message Header Error subcodes: + + 1 - Connection Not Synchronized. + 2 - Bad Message Length. + 3 - Bad Message Type. + + OPEN Message Error subcodes: + + 1 - Unsupported Version Number. + 2 - Bad Peer AS. + 3 - Bad BGP Identifier. + 4 - Unsupported Authentication Code. + 5 - Authentication Failure. + + UPDATE Message Error subcodes: + + 1 - Malformed Attribute List. + 2 - Unrecognized Well-known Attribute. + 3 - Missing Well-known Attribute. + 4 - Attribute Flags Error. + 5 - Attribute Length Error. + 6 - Invalid ORIGIN Attribute + 7 - AS Routing Loop. + 8 - Invalid NEXT_HOP Attribute. + 9 - Optional Attribute Error. + 10 - Invalid Network Field. + + + Data: + + This variable-length field is used to diagnose the reason for the + NOTIFICATION. The contents of the Data field depend upon the + Error Code and Error Subcode. See Section 6 below for more + details. + + Note that the length of the Data field can be determined from the + message Length field by the formula: + + Message Length = 21 + Data Length + + + + +Lougheed & Rekhter [Page 11] + +RFC 1267 BGP-3 October 1991 + + + The minimum length of the NOTIFICATION message is 21 octets + (including message header). + +5. Path Attributes + + This section discusses the path attributes of the UPDATE message. 
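Before turning to the attribute categories, the path attribute encoding from Section 4.3 (Attribute Flags octet, Attribute Type Code octet, then a one- or two-octet length selected by the Extended Length bit) can be sketched in Python as follows. This is an illustrative sketch only: parse_attribute and the dictionary keys are hypothetical names, and the bit masks assume that "bit 0" is the most significant bit of the flags octet, as the memo states.

```python
def parse_attribute(data: bytes):
    """Decode one path attribute at the start of data.

    Returns (flags, type_code, value, octets_consumed).
    """
    if len(data) < 3:
        raise ValueError("attribute is truncated")
    flag_octet, type_code = data[0], data[1]
    flags = {
        "optional": bool(flag_octet & 0x80),         # bit 0 (high-order)
        "transitive": bool(flag_octet & 0x40),       # bit 1
        "partial": bool(flag_octet & 0x20),          # bit 2
        "extended_length": bool(flag_octet & 0x10),  # bit 3
    }
    if flags["extended_length"]:
        # Third and fourth octets hold the length.
        length = int.from_bytes(data[2:4], "big")
        offset = 4
    else:
        # Third octet alone holds the length.
        length = data[2]
        offset = 3
    value = data[offset:offset + length]
    if len(value) != length:
        raise ValueError("attribute value is truncated")
    return flags, type_code, value, offset + length


# ORIGIN: well-known (Optional = 0), Transitive = 1, type code 1, value 0 (IGP).
origin = bytes([0x40, 1, 1, 0])
flags, type_code, value, used = parse_attribute(origin)
```

Returning the number of octets consumed lets a caller step through the Path Attributes field one attribute at a time.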
+ + Path attributes fall into four separate categories: + + 1. Well-known mandatory. + 2. Well-known discretionary. + 3. Optional transitive. + 4. Optional non-transitive. + + Well-known attributes must be recognized by all BGP implementations. + Some of these attributes are mandatory and must be included in every + UPDATE message. Others are discretionary and may or may not be sent + in a particular UPDATE message. Which well-known attributes are + mandatory or discretionary is noted in the table below. + + All well-known attributes must be passed along (after proper + updating, if necessary) to other BGP peers. + + In addition to well-known attributes, each path may contain one or + more optional attributes. It is not required or expected that all + BGP implementations support all optional attributes. The handling of + an unrecognized optional attribute is determined by the setting of + the Transitive bit in the attribute flags octet. Paths with + unrecognized transitive optional attributes should be accepted. If a + path with unrecognized transitive optional attribute is accepted and + passed along to other BGP peers, then the unrecognized transitive + optional attribute of that path must be passed along with the path to + other BGP peers with the Partial bit in the Attribute Flags octet set + to 1. If a path with recognized transitive optional attribute is + accepted and passed along to other BGP peers and the Partial bit in + the Attribute Flags octet is set to 1 by some previous AS, it is not + set back to 0 by the current AS. Unrecognized non-transitive optional + attributes must be quietly ignored and not passed along to other BGP + peers. + + New transitive optional attributes may be attached to the path by the + originator or by any other AS in the path. If they are not attached + by the originator, the Partial bit in the Attribute Flags octet is + set to 1. 
The rules for attaching new non-transitive optional + attributes will depend on the nature of the specific attribute. The + documentation of each new non-transitive optional attribute will be + expected to include such rules. (The description of the INTER-AS + METRIC attribute gives an example.) All optional attributes (both + + + +Lougheed & Rekhter [Page 12] + +RFC 1267 BGP-3 October 1991 + + + transitive and non-transitive) may be updated (if appropriate) by ASs + in the path. + + The sender of an UPDATE message should order path attributes within + the UPDATE message in ascending order of attribute type. The + receiver of an UPDATE message must be prepared to handle path + attributes within the UPDATE message that are out of order. + + The same attribute cannot appear more than once within the Path + Attributes field of a particular UPDATE message. + + Following table specifies attribute type code, attribute length, and + attribute category for path attributes defined in this document: + + Attribute Name Type Code Length Attribute category + ORIGIN 1 1 well-known, mandatory + AS_PATH 2 variable well-known, mandatory + NEXT_HOP 3 4 well-known, mandatory + UNREACHABLE 4 0 well-known, discretionary + INTER-AS METRIC 5 2 optional, non-transitive + + ORIGIN: + + The ORIGIN path attribute defines the origin of the path + information. The data octet can assume the following values: + + Value Meaning + 0 IGP - network(s) are interior to the originating AS + 1 EGP - network(s) learned via EGP + 2 INCOMPLETE - network(s) learned by some other means + + AS_PATH: + + The AS_PATH attribute enumerates the ASs that must be traversed to + reach the networks listed in the UPDATE message. Since an AS + identifier is 2 octets, the length of an AS_PATH attribute is + twice the number of ASs in the path. Rules for constructing an + AS_PATH attribute are discussed in Section 9. 
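Since each AS identifier occupies 2 octets, an AS_PATH value can be decoded by reading it two octets at a time. The Python sketch below does that and also shows the loop check described in Section 6.3 (each AS may occur at most once). It is a minimal sketch: parse_as_path and has_as_loop are hypothetical names, and the AS numbers in the example are made up.

```python
def parse_as_path(value: bytes):
    """Split an AS_PATH attribute value into a list of 16-bit AS numbers."""
    if len(value) % 2:
        raise ValueError("AS_PATH length must be twice the number of ASs")
    return [int.from_bytes(value[i:i + 2], "big") for i in range(0, len(value), 2)]


def has_as_loop(as_path):
    """Return True if any AS occurs more than once in the path."""
    return len(set(as_path)) != len(as_path)


# Three made-up AS numbers: 10, 20, 256.
path = parse_as_path(bytes([0, 10, 0, 20, 1, 0]))
```

A 6-octet value therefore describes a path of three ASs.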
+ + If a previously advertised route has become unreachable, then + the AS_PATH path attribute of the unreachable route may be + truncated when passed in the UPDATE message. Truncation is + achieved by constructing the AS_PATH path attribute that consists + of only the autonomous system of the sender of the UPDATE message. + To make the truncated AS_PATH semantically correct, the sender + also sends the ORIGIN path attribute with the value INCOMPLETE. + Note that truncation may be done only over external BGP links. + + + + +Lougheed & Rekhter [Page 13] + +RFC 1267 BGP-3 October 1991 + + + NEXT_HOP: + + The NEXT_HOP path attribute defines the IP address of the border + router that should be used as the next hop to the networks listed + in the UPDATE message. If this border router belongs to the same + AS as the BGP peer that advertises it, it is called an internal + border router. If this border router belongs to a different AS + than the one that the BGP peer that advertises it, it is called an + external border router. A BGP speaker can advertise any internal + border router as the next hop provided that the interface + associated with the IP address of this border router (as + specified in the NEXT_HOP path attribute) shares a common subnet + with both the local and remote BGP speakers. A BGP speaker can + advertise any external border router as the next hop, provided + that the IP address of this border router was learned from one + of the BGP speaker's peers, and the interface associated with + the IP address of this border router (as specified in the + NEXT_HOP path attribute) shares a common subnet with the local + and remote BGP speakers. A BGP speaker needs to be able to + support disabling advertisement of external border routers. + + The NEXT_HOP path attribute has meaning only on external BGP + links. However, presence of the NEXT_HOP path attribute in the + UPDATE message received via an internal BGP link does not + constitute an error. 
+ + UNREACHABLE: + + The UNREACHABLE attribute is used to notify a BGP peer that some + of the previously advertised routes have become unreachable. + + INTER-AS METRIC: + + The INTER-AS METRIC attribute may be used on external (inter-AS) + links to discriminate between multiple exit or entry points to the + same neighboring AS. The value of the INTER-AS METRIC attribute + is a 2-octet unsigned number which is called a metric. All other + factors being equal, the exit or entry point with lower metric + should be preferred. If received over external links, the INTER- + AS METRIC attribute may be propagated over internal links to other + BGP speakers within the same AS. The INTER-AS METRIC attribute is + never propagated to other BGP speakers in neighboring ASs. + + If a previously advertised route has become unreachable, then + the INTER-AS METRIC path attribute may be omitted from the UPDATE + message. + + + + + +Lougheed & Rekhter [Page 14] + +RFC 1267 BGP-3 October 1991 + + +6. BGP Error Handling. + + This section describes actions to be taken when errors are detected + while processing BGP messages. + + When any of the conditions described here are detected, a + NOTIFICATION message with the indicated Error Code, Error Subcode, + and Data fields is sent, and the BGP connection is closed. If no + Error Subcode is specified, then a zero must be used. + + The phrase "the BGP connection is closed" means that the transport + protocol connection has been closed and that all resources for that + BGP connection have been deallocated. Routing table entries + associated with the remote peer are marked as invalid. The fact that + the routes have become invalid is passed to other BGP peers before + the routes are deleted from the system. + + Unless specified explicitly, the Data field of the NOTIFICATION + message that is sent to indicate an error is empty. + +6.1 Message Header error handling.
+ + All errors detected while processing the Message Header are indicated + by sending the NOTIFICATION message with Error Code Message Header + Error. The Error Subcode elaborates on the specific nature of the + error. + + The expected value of the Marker field of the message header is all + ones if the message type is OPEN. The expected value of the Marker + field for all other types of BGP messages is determined based on the + Authentication Code in the BGP OPEN message and the actual + authentication mechanism (if the Authentication Code in the BGP OPEN + message is non-zero). If the Marker field of the message header is + not the expected one, then a synchronization error has occurred and + the Error Subcode is set to Connection Not Synchronized. + + If the Length field of the message header is less than 19 or greater + than 4096, or if the Length field of an OPEN message is less than + the minimum length of the OPEN message, or if the Length field of an + UPDATE message is less than the minimum length of the UPDATE message, + or if the Length field of a KEEPALIVE message is not equal to 19, or + if the Length field of a NOTIFICATION message is less than the + minimum length of the NOTIFICATION message, then the Error Subcode is + set to Bad Message Length. The Data field contains the erroneous + Length field. + + If the Type field of the message header is not recognized, then the + Error Subcode is set to Bad Message Type. The Data field contains + + + +Lougheed & Rekhter [Page 15] + +RFC 1267 BGP-3 October 1991 + + + the erroneous Type field. + +6.2 OPEN message error handling. + + All errors detected while processing the OPEN message are indicated + by sending the NOTIFICATION message with Error Code OPEN Message + Error. The Error Subcode elaborates on the specific nature of the + error. + + If the version number contained in the Version field of the received + OPEN message is not supported, then the Error Subcode is set to + Unsupported Version Number.
The Data field is a 2-octet unsigned + integer, which indicates the largest locally supported version number + less than the version the remote BGP peer bid (as indicated in the + received OPEN message). + + If the Autonomous System field of the OPEN message is unacceptable, + then the Error Subcode is set to Bad Peer AS. The determination of + acceptable Autonomous System numbers is outside the scope of this + protocol. + + If the BGP Identifier field of the OPEN message is syntactically + incorrect, then the Error Subcode is set to Bad BGP Identifier. + Syntactic correctness means that the BGP Identifier field represents + a valid IP host address. + + If the Authentication Code of the OPEN message is not recognized, + then the Error Subcode is set to Unsupported Authentication Code. If + the Authentication Code is zero, then the Authentication Data must be + of zero length. Otherwise, the Error Subcode is set to + Authentication Failure. + + If the Authentication Code is non-zero, then the corresponding + authentication procedure is invoked. If the authentication procedure + (based on Authentication Code and Authentication Data) fails, then + the Error Subcode is set to Authentication Failure. + +6.3 UPDATE message error handling. + + All errors detected while processing the UPDATE message are indicated + by sending the NOTIFICATION message with Error Code UPDATE Message + Error. The error subcode elaborates on the specific nature of the + error. + + Error checking of an UPDATE message begins by examining the path + attributes. If the Total Attribute Length is too large (i.e., if + Total Attribute Length + 21 exceeds the message Length), or if the + (non-negative integer) Number of Network fields cannot be computed as + + + +Lougheed & Rekhter [Page 16] + +RFC 1267 BGP-3 October 1991 + + + in Section 4.3, then the Error Subcode is set to Malformed Attribute + List. 
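The length check just described can be sketched in Python as follows. The sketch takes the fixed part of an UPDATE to be 21 octets, as in the "Total Attribute Length + 21" test above; this is assumed to be the 19-octet header plus the 2-octet Total Path Attributes Length field. The names count_networks, HEADER_LEN, and FIXED_UPDATE_LEN are hypothetical.

```python
HEADER_LEN = 19                    # fixed BGP message header
FIXED_UPDATE_LEN = HEADER_LEN + 2  # plus the 2-octet Total Path Attributes Length


def count_networks(message_length: int, total_attr_length: int) -> int:
    """Return the number of 4-octet Network fields in an UPDATE.

    Raises ValueError when the lengths are inconsistent, which corresponds
    to the Malformed Attribute List subcode.
    """
    remaining = message_length - FIXED_UPDATE_LEN - total_attr_length
    if remaining < 0 or remaining % 4:
        raise ValueError("Malformed Attribute List")
    return remaining // 4


# A 45-octet UPDATE carrying 16 octets of attributes has two Network fields.
networks = count_networks(45, 16)
```

With 16 octets of path attributes and no Network fields the message is 37 octets long, which matches the minimum UPDATE length given in Section 4.3.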
+ + If any recognized attribute has Attribute Flags that conflict with + the Attribute Type Code, then the Error Subcode is set to Attribute + Flags Error. The Data field contains the erroneous attribute (type, + length and value). + + If any recognized attribute has Attribute Length that conflicts with + the expected length (based on the attribute type code), then the + Error Subcode is set to Attribute Length Error. The Data field + contains the erroneous attribute (type, length and value). + + If any of the mandatory well-known attributes are not present, then + the Error Subcode is set to Missing Well-known Attribute. The Data + field contains the Attribute Type Code of the missing well-known + attribute. + + If any of the mandatory well-known attributes are not recognized, + then the Error Subcode is set to Unrecognized Well-known Attribute. + The Data field contains the unrecognized attribute (type, length and + value). + + If the ORIGIN attribute has an undefined value, then the Error + Subcode is set to Invalid Origin Attribute. The Data field contains + the unrecognized attribute (type, length and value). + + If the NEXT_HOP attribute field is syntactically or semantically + incorrect, then the Error Subcode is set to Invalid NEXT_HOP + Attribute. + + The Data field contains the incorrect attribute (type, length and + value). Syntactic correctness means that the NEXT_HOP attribute + represents a valid IP host address. Semantic correctness applies + only to the external BGP links. It means that the interface + associated with the IP address, as specified in the NEXT_HOP + attribute, shares a common subnet with the receiving BGP speaker. + + The AS route specified by the AS_PATH attribute is checked for AS + loops. AS loop detection is done by scanning the full AS route (as + specified in the AS_PATH attribute) and checking that each AS occurs + at most once. If a loop is detected, then the Error Subcode is set + to AS Routing Loop. 
The Data field contains the incorrect attribute + (type, length and value). + + If an optional attribute is recognized, then the value of this + attribute is checked. If an error is detected, the attribute is + discarded, and the Error Subcode is set to Optional Attribute Error. + + + +Lougheed & Rekhter [Page 17] + +RFC 1267 BGP-3 October 1991 + + + The Data field contains the attribute (type, length and value). + + If any attribute appears more than once in the UPDATE message, then + the Error Subcode is set to Malformed Attribute List. + + Each Network field in the UPDATE message is checked for syntactic + validity. If the Network field is syntactically incorrect, or + contains a subnet or a host address, then the Error Subcode is set to + Invalid Network Field. + +6.4 NOTIFICATION message error handling. + + If a peer sends a NOTIFICATION message, and there is an error in that + message, there is unfortunately no means of reporting this error via + a subsequent NOTIFICATION message. Any such error, such as an + unrecognized Error Code or Error Subcode, should be noticed, logged + locally, and brought to the attention of the administration of the + peer. The means to do this, however, lies outside the scope of this + document. + +6.5 Hold Timer Expired error handling. + + If a system does not receive successive KEEPALIVE and/or UPDATE + and/or NOTIFICATION messages within the period specified in the Hold + Time field of the OPEN message, then the NOTIFICATION message with + Hold Timer Expired Error Code must be sent and the BGP connection + closed. + +6.6 Finite State Machine error handling. + + Any error detected by the BGP Finite State Machine (e.g., receipt of + an unexpected event) is indicated by sending the NOTIFICATION message + with Error Code Finite State Machine Error. + +6.7 Cease. 
+ + In absence of any fatal errors (that are indicated in this section), + a BGP peer may choose at any given time to close its BGP connection + by sending the NOTIFICATION message with Error Code Cease. However, + the Cease NOTIFICATION message must not be used when a fatal error + indicated by this section does exist. + +6.8 Connection collision detection. + + If a pair of BGP speakers try simultaneously to establish a TCP + connection to each other, then two parallel connections between this + pair of speakers might well be formed. We refer to this situation as + connection collision. Clearly, one of these connections must be + + + +Lougheed & Rekhter [Page 18] + +RFC 1267 BGP-3 October 1991 + + + closed. + + Based on the value of the BGP Identifier a convention is established + for detecting which BGP connection is to be preserved when a + collision does occur. The convention is to compare the BGP + Identifiers of the peers involved in the collision and to retain only + the connection initiated by the BGP speaker with the higher-valued + BGP Identifier. + + Upon receipt of an OPEN message, the local system must examine all of + its connections that are in the OpenSent state. If among them there + is a connection to a remote BGP speaker whose BGP Identifier equals + the one in the OPEN message, then the local system performs the + following collision resolution procedure: + + 1. The BGP Identifier of the local system is compared to the + BGP Identifier of the remote system (as specified in the + OPEN message). + + 2. If the value of the local BGP Identifier is less than the + remote one, the local system closes BGP connection that + already exists (the one that is already in the OpenSent + state), and accepts BGP connection initiated by the remote + system. + + 3. 
Otherwise, the local system closes newly created BGP + connection (the one associated with the newly received OPEN + message), and continues to use the existing one (the one + that is already in the OpenSent state). + + Comparing BGP Identifiers is done by treating them as + (4-octet long) unsigned integers. + + A connection collision with existing BGP connections that + are either in OpenConfirm or Established states causes + unconditional closing of the newly created connection. Note + that a connection collision cannot be detected with + connections that are in Idle, or Connect, or Active states. + + Closing the BGP connection (that results from the collision + resolution procedure) is accomplished by sending the + NOTIFICATION message with the Error Code Cease. + +7. BGP Version Negotiation. + + BGP speakers may negotiate the version of the protocol by making + multiple attempts to open a BGP connection, starting with the highest + version number each supports. If an open attempt fails with an Error + + + +Lougheed & Rekhter [Page 19] + +RFC 1267 BGP-3 October 1991 + + + Code OPEN Message Error, and an Error Subcode Unsupported Version + Number, then the BGP speaker has available the version number it + tried, the version number its peer tried, the version number passed + by its peer in the NOTIFICATION message, and the version numbers that + it supports. If the two peers do support one or more common + versions, then this will allow them to rapidly determine the highest + common version. In order to support BGP version negotiation, future + versions of BGP must retain the format of the OPEN and NOTIFICATION + messages. + +8. BGP Finite State machine. + + This section specifies BGP operation in terms of a Finite State + Machine (FSM). Following is a brief summary and overview of BGP + operations by state as determined by this FSM. A condensed version + of the BGP FSM is found in Appendix 1. + + Initially BGP is in the Idle state. 
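As a rough illustration, the next-state behavior tabulated in Appendix 1 can be written as a lookup table. The Python sketch below covers state changes only (no timers, messages, or resource handling). It assumes that any state and event pair not listed falls back to Idle, as the "others" rows of the table indicate, and it models the outcome of OPEN or UPDATE processing with a hypothetical message_ok flag. All identifiers are illustrative.

```python
from enum import IntEnum


class State(IntEnum):
    IDLE = 1
    CONNECT = 2
    ACTIVE = 3
    OPEN_SENT = 4
    OPEN_CONFIRM = 5
    ESTABLISHED = 6


class Event(IntEnum):
    START = 1
    STOP = 2
    TRANSPORT_OPEN = 3
    TRANSPORT_CLOSED = 4
    TRANSPORT_OPEN_FAILED = 5
    TRANSPORT_FATAL_ERROR = 6
    CONNECT_RETRY_EXPIRED = 7
    HOLDTIME_EXPIRED = 8
    KEEPALIVE_EXPIRED = 9
    RECV_OPEN = 10
    RECV_KEEPALIVE = 11
    RECV_UPDATE = 12
    RECV_NOTIFICATION = 13


# Only transitions that do not lead to Idle are listed; every other
# (state, event) pair is assumed to return the connection to Idle.
TRANSITIONS = {
    (State.IDLE, Event.START): State.CONNECT,
    (State.CONNECT, Event.START): State.CONNECT,
    (State.CONNECT, Event.TRANSPORT_OPEN): State.OPEN_SENT,
    (State.CONNECT, Event.TRANSPORT_OPEN_FAILED): State.ACTIVE,
    (State.CONNECT, Event.CONNECT_RETRY_EXPIRED): State.CONNECT,
    (State.ACTIVE, Event.START): State.ACTIVE,
    (State.ACTIVE, Event.TRANSPORT_OPEN): State.OPEN_SENT,
    (State.ACTIVE, Event.TRANSPORT_OPEN_FAILED): State.ACTIVE,
    (State.ACTIVE, Event.CONNECT_RETRY_EXPIRED): State.CONNECT,
    (State.OPEN_SENT, Event.START): State.OPEN_SENT,
    (State.OPEN_SENT, Event.TRANSPORT_CLOSED): State.ACTIVE,
    (State.OPEN_SENT, Event.RECV_OPEN): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, Event.START): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, Event.KEEPALIVE_EXPIRED): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, Event.RECV_KEEPALIVE): State.ESTABLISHED,
    (State.ESTABLISHED, Event.START): State.ESTABLISHED,
    (State.ESTABLISHED, Event.KEEPALIVE_EXPIRED): State.ESTABLISHED,
    (State.ESTABLISHED, Event.RECV_KEEPALIVE): State.ESTABLISHED,
    (State.ESTABLISHED, Event.RECV_UPDATE): State.ESTABLISHED,
}


def next_state(state, event, message_ok=True):
    """Return the next FSM state; message_ok=False models a failed OPEN/UPDATE check."""
    if event in (Event.RECV_OPEN, Event.RECV_UPDATE) and not message_ok:
        return State.IDLE
    return TRANSITIONS.get((state, event), State.IDLE)
```

Keeping the exceptions in a table and defaulting to Idle mirrors the way the specification describes each state: a few named events, then "any other event" returning to Idle.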
+ + Idle state: + + In this state BGP refuses all incoming BGP connections. No + resources are allocated to the BGP neighbor. In response to + the Start event (initiated by either system or operator) the + local system initializes all BGP resources, starts the + ConnectRetry timer, initiates a transport connection to the other + BGP peer, while listening for a connection that may be initiated + by the remote BGP peer, and changes its state to Connect. + The exact value of the ConnectRetry timer is a local matter, + but should be sufficiently large to allow TCP initialization. + + Any other event received in the Idle state is ignored. + + Connect state: + + In this state BGP is waiting for the transport protocol + connection to be completed. + + If the transport protocol connection succeeds, the local system + clears the ConnectRetry timer, completes initialization, sends + an OPEN message to its peer, and changes its state to OpenSent. + + If the transport protocol connection fails (e.g., retransmission + timeout), the local system restarts the ConnectRetry timer, + continues to listen for a connection that may be initiated by + the remote BGP peer, and changes its state to Active. + + In response to the ConnectRetry timer expired event, the local + system restarts the ConnectRetry timer, initiates a transport + connection to the other BGP peer, continues to listen for a + connection that may be initiated by the remote BGP peer, and + stays in the Connect state. + + Start event is ignored in the Connect state. + + In response to any other event (initiated by either system or + operator), the local system releases all BGP resources + associated with this connection and changes its state to Idle. + + Active state: + + In this state BGP is trying to acquire a BGP neighbor by + initiating a transport protocol connection. 
+ + If the transport protocol connection succeeds, the local system + clears the ConnectRetry timer, completes initialization, sends + an OPEN message to its peer, sets its hold timer to a large + value, and changes its state to OpenSent. + + In response to the ConnectRetry timer expired event, the local + system restarts the ConnectRetry timer, initiates a transport + connection to the other BGP peer, continues to listen for a + connection that may be initiated by the remote BGP peer, and + changes its state to Connect. + + If the local system detects that a remote peer is trying to + establish a BGP connection to it, and the IP address of the + remote peer is not an expected one, the local system restarts + the ConnectRetry timer, rejects the attempted connection, + continues to listen for a connection that may be initiated by + the remote BGP peer, and stays in the Active state. + + Start event is ignored in the Active state. + + In response to any other event (initiated by either system or + operator), the local system releases all BGP resources + associated with this connection and changes its state to Idle. + + OpenSent state: + + In this state BGP waits for an OPEN message from its peer. + When an OPEN message is received, all fields are checked for + correctness. If the BGP message header checking or OPEN + message checking detects an error (see Section 6.2), or + a connection collision (see Section 6.8) the local + system sends a NOTIFICATION message and changes its state to + Idle. + + If there are no errors in the OPEN message, BGP sends a + KEEPALIVE message and sets a KeepAlive timer. The hold timer, + which was originally set to an arbitrarily large value (see + above), is replaced with the value indicated in the OPEN + message. If the value of the Autonomous System field is the + same as our own, then the connection is an "internal" connection; + otherwise, it is "external". 
(This will affect UPDATE + processing as described below.) Finally, the state is changed + to OpenConfirm. + + If a disconnect notification is received from the underlying + transport protocol, the local system closes the BGP connection, + restarts the ConnectRetry timer, continues to listen for a + connection that may be initiated by the remote BGP peer, and + goes into the Active state. + + If the hold timer expires, the local system sends a NOTIFICATION + message with Error Code Hold Timer Expired and changes its + state to Idle. + + In response to the Stop event (initiated by either system or + operator) the local system sends a NOTIFICATION message with + Error Code Cease and changes its state to Idle. + + Start event is ignored in the OpenSent state. + + In response to any other event the local system sends a + NOTIFICATION message with Error Code Finite State Machine Error + and changes its state to Idle. + + Whenever BGP changes its state from OpenSent to Idle, it closes + the BGP (and transport-level) connection and releases all + resources associated with that connection. + + OpenConfirm state: + + In this state BGP waits for a KEEPALIVE or NOTIFICATION + message. + + If the local system receives a KEEPALIVE message, it changes + its state to Established. + + If the hold timer expires before a KEEPALIVE message is + received, the local system sends a NOTIFICATION message with + Error Code Hold Timer Expired and changes its state to Idle. + + If the local system receives a NOTIFICATION message, it changes + its state to Idle. + + If the KeepAlive timer expires, the local system sends a + KEEPALIVE message and restarts its KeepAlive timer. + + If a disconnect notification is received from the underlying + transport protocol, the local system changes its state to Idle. 
+ + In response to the Stop event (initiated by either system or + operator) the local system sends NOTIFICATION message with + Error Code Cease and changes its state to Idle. + + Start event is ignored in the OpenConfirm state. + + In response to any other event the local system sends + NOTIFICATION message with Error Code Finite State Machine Error + and changes its state to Idle. + + Whenever BGP changes its state from OpenConfirm to Idle, it + closes the BGP (and transport-level) connection and releases + all resources associated with that connection. + + Established state: + + In the Established state BGP can exchange UPDATE, NOTIFICATION, + and KEEPALIVE messages with its peer. + + If the local system receives an UPDATE or KEEPALIVE message, it + restarts its Holdtime timer. + + If the local system receives a NOTIFICATION message, it changes + its state to Idle. + + If the local system receives an UPDATE message and the UPDATE + message error handling procedure (see Section 6.3) detects an + error, the local system sends a NOTIFICATION message and + changes its state to Idle. + + If a disconnect notification is received from the underlying + transport protocol, the local system changes its state to + Idle. + + If the Holdtime timer expires, the local system sends a + NOTIFICATION message with Error Code Hold Timer Expired and + changes its state to Idle. + + If the KeepAlive timer expires, the local system sends a + + + +Lougheed & Rekhter [Page 23] + +RFC 1267 BGP-3 October 1991 + + + KEEPALIVE message and restarts its KeepAlive timer. + + Each time the local system sends a KEEPALIVE or UPDATE message, + it restarts its KeepAlive timer. + + In response to the Stop event (initiated by either system or + operator), the local system sends a NOTIFICATION message with + Error Code Cease and changes its state to Idle. + + Start event is ignored in the Established state. 
+ + In response to any other event, the local system sends + NOTIFICATION message with Error Code Finite State Machine Error + and changes its state to Idle. + + Whenever BGP changes its state from Established to Idle, it + closes the BGP (and transport-level) connection, releases all + resources associated with that connection, and deletes all + routes derived from that connection. + +9. UPDATE Message Handling + + An UPDATE message may be received only in the Established state. + When an UPDATE message is received, each field is checked for + validity as specified in Section 6.3. + + If an optional non-transitive attribute is unrecognized, it is + quietly ignored. If an optional transitive attribute is + unrecognized, the Partial bit (the third high-order bit) in the + attribute flags octet is set to 1, and the attribute is retained for + propagation to other BGP speakers. + + If an optional attribute is recognized, and has a valid value, then, + depending on the type of the optional attribute, it is processed + locally, retained, and updated, if necessary, for possible + propagation to other BGP speakers. + + If the network and the path attributes associated with a route to + that network are correct, then the route is compared with other + routes to the same network. + + When a BGP speaker receives a new route from a peer over external BGP + link, it shall advertise that route to other BGP speakers in its + autonomous system by means of an UPDATE message if either of the + following conditions occur: + + a) the newly received route is considered to be better + than the other routes to the same network (as listed + + + +Lougheed & Rekhter [Page 24] + +RFC 1267 BGP-3 October 1991 + + + in the UPDATE message) that have been received over + external BGP links, or + + b) there are no other acceptable routes to the network + (as listed in the UPDATE message) that have been + received over external BGP links. 
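Conditions (a) and (b) above can be sketched as a small predicate. This is only an illustration in Python: the route comparison itself is a matter of local policy, so it is passed in as a hypothetical is_better callable, and the function and parameter names are assumptions rather than protocol elements.

```python
def should_advertise_internally(new_route, other_external_routes, is_better):
    """Decide whether a route learned over an external link is sent to internal peers.

    other_external_routes: acceptable routes to the same network that were
    previously received over external BGP links.
    is_better(a, b): policy-defined comparison, True if route a beats route b.
    """
    if not other_external_routes:
        # Condition (b): no other acceptable externally learned route exists.
        return True
    # Condition (a): the new route beats every other externally learned route.
    return all(is_better(new_route, other) for other in other_external_routes)
```

For example, a policy that prefers shorter AS paths could pass `lambda a, b: len(a) < len(b)` with routes represented as AS-path tuples.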
+ + When a BGP speaker receives an unreachable route from a BGP peer over + an external BGP link, it shall advertise that route to all other BGP + speakers in its autonomous system, indicating that it has become + unreachable, if the following condition occurs: + + a) a corresponding acceptable route to the same destination + was considered to be the best one among all routes to that + destination that have been received over external BGP links + (that is, the local system has been advertising the + route to all other BGP speakers in its autonomous system + before it received the UPDATE message that reported it + as unreachable). + + Whenever a BGP speaker selects a new route (among all the routes + received from external and internal BGP peers), or determines that + the reachable destinations within its own autonomous system have + changed, it shall generate an UPDATE message and forward it to each + of its external peers (peers connected via external BGP links). + + If a route in the UPDATE was received over an internal link, it is + not propagated over any other internal link. This restriction is due + to the fact that all BGP speakers within a single AS form a + completely connected graph (see above). + + If the UPDATE message is propagated over an external link, then the + local AS number is prepended to the AS_PATH attribute, and the + NEXT_HOP attribute is updated with an IP address of the router that + should be used as a next hop to the network. If the UPDATE message + is propagated over an internal link, then the AS_PATH attribute and + the NEXT_HOP attribute are passed unmodified. + + Generally speaking, the rules for comparing routes among several + alternatives are outside the scope of this document. There are two + exceptions: + + - If the local AS appears in the AS path of the new route being + considered, then that new route cannot be viewed as better than + any other route. If such a route were ever used, a routing loop + would result. 
+ + - In order to achieve successful distributed operation, only routes + with a likelihood of stability can be chosen. Thus, an AS must + avoid using unstable routes, and it must not make rapid + spontaneous changes to its choice of route. Quantifying the terms + "unstable" and "rapid" in the previous sentence will require + experience, but the principle is clear. + +10. Detection of Inter-AS Policy Contradictions + + Since BGP requires no central authority for coordinating routing + policies among ASs, and since routing policies are not exchanged via + the protocol itself, it is possible for a group of ASs to have a set + of routing policies that cannot simultaneously be satisfied. This + may cause an indefinite oscillation of the routes in this group of + ASs. + + To help detect such a situation, all BGP speakers must observe the + following rule. If a route to a destination that is currently used + by the local system is determined to be unreachable (e.g., as a + result of receiving an UPDATE message for this route with the + UNREACHABLE attribute), then, before switching to another route, this + local system must advertise this route as unreachable to all the BGP + neighbors to which it previously advertised this route. + + This rule will allow other ASs to distinguish between two different + situations: + + - The local system has chosen to use a new route because the old + route became unreachable. + + - The local system has chosen to use a new route because it + preferred it over the old route. The old route is still + viable. + + In the former case, an UPDATE message with the UNREACHABLE attribute + will be received for the old route. In the latter case it will not. + + In some cases, this may allow a BGP speaker to detect the fact that + its policies, taken together with the policies of some other AS, + cannot simultaneously be satisfied. 
For example, consider the + following situation involving AS A and its neighbor AS B. B + advertises a route with a path of the form <B,...>, where A is not + present in the path. A then decides to use this path, and advertises + <A,B,...> to all its neighbors. B later advertises <B,...,A,...> + back to A, without ever declaring its previous path <B,...> to be + unreachable. Evidently, A prefers routes via B and B prefers routes + via A. The combined policies of A and B, taken together, cannot be + satisfied. Such an event should be noticed, logged locally, and + brought to the attention of AS A's administration. The means to do + + + +Lougheed & Rekhter [Page 26] + +RFC 1267 BGP-3 October 1991 + + + this, however, lies outside the scope of this document. Also outside + the document is a more complete procedure for detecting such + contradictions of policy. + + While the above rules provide a mechanism to detect a set of routing + policies that cannot be satisfied simultaneously, the protocol itself + does not provide any mechanism for suppressing the route oscillation + that may result from these unsatisfiable policies. The reason for + doing this is that routing policies are viewed as external to the + protocol and as determined by the local AS administrator. + +Appendix 1. BGP FSM State Transitions and Actions. + + This Appendix discusses the transitions between states in the BGP FSM + in response to BGP events. The following is the list of these states + and events. 
+ + BGP States: + + 1 - Idle + 2 - Connect + 3 - Active + 4 - OpenSent + 5 - OpenConfirm + 6 - Established + + + BGP Events: + + 1 - BGP Start + 2 - BGP Stop + 3 - BGP Transport connection open + 4 - BGP Transport connection closed + 5 - BGP Transport connection open failed + 6 - BGP Transport fatal error + 7 - ConnectRetry timer expired + 8 - Holdtime timer expired + 9 - KeepAlive timer expired + 10 - Receive OPEN message + 11 - Receive KEEPALIVE message + 12 - Receive UPDATE messages + 13 - Receive NOTIFICATION message + + The following table describes the state transitions of the BGP FSM + and the actions triggered by these transitions. + + + + + + +Lougheed & Rekhter [Page 27] + +RFC 1267 BGP-3 October 1991 + + + Event Actions Message Sent Next State + -------------------------------------------------------------------- + Idle (1) + 1 Initialize resources none 2 + Start ConnectRetry timer + Initiate a transport connection + others none none 1 + + Connect(2) + 1 none none 2 + 3 Complete initialization OPEN 4 + Clear ConnectRetry timer + 5 Restart ConnectRetry timer none 3 + 7 Restart ConnectRetry timer none 2 + Initiate a transport connection + others Release resources none 1 + + Active (3) + 1 none none 3 + 3 Complete initialization OPEN 4 + Clear ConnectRetry timer + 5 Close connection 3 + Restart ConnectRetry timer + 7 Restart ConnectRetry timer none 2 + Initiate a transport connection + others Release resources none 1 + + OpenSent(4) + 1 none none 4 + 4 Close transport connection none 3 + Restart ConnectRetry timer + 6 Release resources none 1 + 10 Process OPEN is OK KEEPALIVE 5 + Process OPEN failed NOTIFICATION 1 + others Close transport connection NOTIFICATION 1 + Release resources + + OpenConfirm (5) + 1 none none 5 + 4 Release resources none 1 + 6 Release resources none 1 + 9 Restart KeepAlive timer KEEPALIVE 5 + 11 Complete initialization none 6 + Restart Holdtime timer + 13 Close transport connection 1 + Release resources + others Close transport 
connection NOTIFICATION 1 + Release resources + + Established (6) + 1 none none 6 + 4 Release resources none 1 + 6 Release resources none 1 + 9 Restart KeepAlive timer KEEPALIVE 6 + 11 Restart Holdtime timer KEEPALIVE 6 + 12 Process UPDATE is OK UPDATE 6 + Process UPDATE failed NOTIFICATION 1 + 13 Close transport connection 1 + Release resources + others Close transport connection NOTIFICATION 1 + Release resources + --------------------------------------------------------------------- + + The following is a condensed version of the above state transition + table. + +Events| Idle | Connect | Active | OpenSent | OpenConfirm | Estab + | (1) | (2) | (3) | (4) | (5) | (6) + |-------------------------------------------------------------- + 1 | 2 | 2 | 3 | 4 | 5 | 6 + | | | | | | + 2 | 1 | 1 | 1 | 1 | 1 | 1 + | | | | | | + 3 | 1 | 4 | 4 | 1 | 1 | 1 + | | | | | | + 4 | 1 | 1 | 1 | 3 | 1 | 1 + | | | | | | + 5 | 1 | 3 | 3 | 1 | 1 | 1 + | | | | | | + 6 | 1 | 1 | 1 | 1 | 1 | 1 + | | | | | | + 7 | 1 | 2 | 2 | 1 | 1 | 1 + | | | | | | + 8 | 1 | 1 | 1 | 1 | 1 | 1 + | | | | | | + 9 | 1 | 1 | 1 | 1 | 5 | 6 + | | | | | | +10 | 1 | 1 | 1 | 1 or 5 | 1 | 1 + | | | | | | +11 | 1 | 1 | 1 | 1 | 6 | 6 + | | | | | | +12 | 1 | 1 | 1 | 1 | 1 | 1 or 6 + | | | | | | +13 | 1 | 1 | 1 | 1 | 1 | 1 + | | | | | | + --------------------------------------------------------------- + +Appendix 2. Comparison with RFC 1163 + + To detect and recover from BGP connection collision, a new field (BGP + Identifier) has been added to the OPEN message. New text (Section + 6.8) has been added to specify the procedure for detecting and + recovering from collision. 
+ + The new document no longer restricts the border router that is passed + in the NEXT_HOP path attribute to be part of the same Autonomous + System as the BGP Speaker. + + New document optimizes and simplifies the exchange of the information + about previously reachable routes. + +Appendix 3. Comparison with RFC 1105 + + All of the changes listed in Appendix 2, plus the following. + + + +Lougheed & Rekhter [Page 30] + +RFC 1267 BGP-3 October 1991 + + + Minor changes to the RFC1105 Finite State Machine were necessary to + accommodate the TCP user interface provided by 4.3 BSD. + + The notion of Up/Down/Horizontal relations present in RFC1105 has + been removed from the protocol. + + The changes in the message format from RFC1105 are as follows: + + 1. The Hold Time field has been removed from the BGP header and + added to the OPEN message. + + 2. The version field has been removed from the BGP header and + added to the OPEN message. + + 3. The Link Type field has been removed from the OPEN message. + + 4. The OPEN CONFIRM message has been eliminated and replaced + with implicit confirmation provided by the KEEPALIVE message. + + 5. The format of the UPDATE message has been changed + significantly. New fields were added to the UPDATE message + to support multiple path attributes. + + 6. The Marker field has been expanded and its role broadened to + support authentication. + + Note that quite often BGP, as specified in RFC 1105, is referred to + as BGP-1, BGP, as specified in RFC 1163, is referred to as BGP-2, and + BGP, as specified in this document is referred to as BGP-3. + +Appendix 4. TCP options that may be used with BGP + + If a local system TCP user interface supports TCP PUSH function, then + each BGP message should be transmitted with PUSH flag set. Setting + PUSH flag forces BGP messages to be transmitted promptly to the + receiver. 
+ + If a local system TCP user interface supports setting precedence for + TCP connection, then the BGP transport connection should be opened + with precedence set to Internetwork Control (110) value (see also + [6]). + + + + + + + + + + +Lougheed & Rekhter [Page 31] + +RFC 1267 BGP-3 October 1991 + + +Appendix 5. Implementation Recommendations + + This section presents some implementation recommendations. + +5.1 Multiple Networks Per Message + + The BGP protocol allows for multiple networks with the same AS path + and next-hop gateway to be specified in one message. Making use of + this capability is highly recommended. With one network per message + there is a substantial increase in overhead in the receiver. Not only + does the system overhead increase due to the reception of multiple + messages, but the overhead of scanning the routing table for flash + updates to BGP peers and other routing protocols (and sending the + associated messages) is incurred multiple times as well. One method + of building messages containing many networks per AS path and gateway + from a routing table that is not organized per AS path is to build + many messages as the routing table is scanned. As each network is + processed, a message for the associated AS path and gateway is + allocated, if it does not exist, and the new network is added to it. + If such a message exists, the new network is just appended to it. If + the message lacks the space to hold the new network, it is + transmitted, a new message is allocated, and the new network is + inserted into the new message. When the entire routing table has been + scanned, all allocated messages are sent and their resources + released. Maximum compression is achieved when all networks share a + gateway and common path attributes, making it possible to send many + networks in one 4096-byte message. 
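The message-building method described above can be sketched as follows. This Python illustration makes several simplifying assumptions: a routing table is an iterable of (network, AS path, gateway) tuples, message capacity is expressed as a maximum number of networks rather than computed from the 4096-octet limit, and "transmitting" a message just appends it to a list. The names build_updates and max_networks_per_message are hypothetical.

```python
def build_updates(routing_table, max_networks_per_message):
    """Group networks that share an AS path and gateway into as few messages as possible.

    Returns a list of (as_path, gateway, networks) tuples in transmission order.
    """
    pending = {}   # (as_path, gateway) -> networks collected for the open message
    sent = []      # messages already "transmitted"

    for network, as_path, gateway in routing_table:
        key = (tuple(as_path), gateway)
        batch = pending.setdefault(key, [])    # allocate a message if none exists
        if len(batch) >= max_networks_per_message:
            # The message is full: transmit it and start a new one.
            sent.append((key[0], key[1], batch))
            batch = pending[key] = []
        batch.append(network)

    # The whole table has been scanned: send every message still allocated.
    for (as_path, gateway), batch in pending.items():
        sent.append((as_path, gateway, batch))
    return sent
```

A real implementation would derive the capacity from the encoded size of the path attributes and would release per-message resources after sending; both are omitted here.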
+ + When peering with a BGP implementation that does not compress + multiple networks into one message, it may be necessary to take steps + to reduce the overhead from the flood of data received when a peer is + acquired or a significant network topology change occurs. One method + of doing this is to limit the rate of flash updates. This will + eliminate the redundant scanning of the routing table to provide + flash updates for BGP peers and other routing protocols. A + disadvantage of this approach is that it increases the propagation + latency of routing information. By choosing a minimum flash update + interval that is not much greater than the time it takes to process + the multiple messages this latency should be minimized. A better + method would be to read all received messages before sending updates. + +5.2 Processing Messages on a Stream Protocol + + BGP uses TCP as a transport mechanism. Due to the stream nature of + TCP, all the data for received messages does not necessarily arrive + at the same time. This can make it difficult to process the data as + messages, especially on systems such as BSD Unix where it is not + possible to determine how much data has been received but not yet + + + +Lougheed & Rekhter [Page 32] + +RFC 1267 BGP-3 October 1991 + + + processed. + + One method that can be used in this situation is to first try to read + just the message header. For the KEEPALIVE message type, this is a + complete message; for other message types, the header should first be + verified, in particular the total length. If all checks are + successful, the specified length, minus the size of the message + header is the amount of data left to read. An implementation that + would "hang" the routing information process while trying to read + from a peer could set up a message buffer (4096 bytes) per peer and + fill it with data as available until a complete message has been + received. 
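The per-peer buffer approach described above might look like the following Python sketch. It assumes the header layout of Section 4.1 (a 16-octet Marker, a 2-octet Length, and a 1-octet Type, 19 octets in total) and checks only the overall length bounds; Marker and Type validation are left out. The class name MessageBuffer and its feed method are hypothetical.

```python
import struct

HEADER_LEN = 19          # Marker (16) + Length (2) + Type (1)
MAX_MESSAGE_LEN = 4096   # largest BGP message


class MessageBuffer:
    """Accumulates bytes from one peer and yields complete BGP messages."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data):
        """Append newly received bytes; return a list of (type, body) tuples."""
        self._buf.extend(data)
        messages = []
        while len(self._buf) >= HEADER_LEN:
            # Length and Type follow the 16-octet Marker, in network byte order.
            length, msg_type = struct.unpack_from("!HB", self._buf, 16)
            if length < HEADER_LEN or length > MAX_MESSAGE_LEN:
                raise ValueError("Bad Message Length: %d" % length)
            if len(self._buf) < length:
                break  # the rest of this message has not arrived yet
            body = bytes(self._buf[HEADER_LEN:length])
            del self._buf[:length]
            messages.append((msg_type, body))
        return messages
```

Because feed never blocks, the routing process can call it with whatever the transport delivers and act only on the messages it returns.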
+
+5.3 Processing Update Messages
+
+   In BGP, all UPDATE messages are incremental. Once a particular
+   network is listed in an UPDATE message as being reachable through an
+   AS path and gateway, that piece of information is expected to be
+   retained indefinitely.
+
+   In order for a route to a network to be removed, it must be
+   explicitly listed in an UPDATE message as being unreachable or with
+   new routing information to replace the old. Note that a BGP peer will
+   only advertise one route to a given network, so any announcement of
+   that network by a particular peer replaces any previous information
+   about that network received from the same peer.
+
+   One useful optimization is that unreachable networks need not be
+   advertised with their original attributes. Instead, all unreachable
+   networks could be sent in a single message, perhaps with an AS path
+   consisting of the local AS only and with an origin set to INCOMPLETE.
+
+   This approach has the obvious advantage of low overhead; if all
+   routes are stable, only KEEPALIVE messages will be sent. There is no
+   periodic flood of route information.
+
+   However, this means that a consistent view of routing information
+   between BGP peers is only possible over the course of a single
+   transport connection, since there is no mechanism for a complete
+   update. This requirement is accommodated by specifying that BGP peers
+   must transition to the Idle state upon the failure of a transport
+   connection.
+
+5.4 BGP Timers
+
+   BGP employs three timers: ConnectRetry, Holdtime, and KeepAlive.
+   The suggested value for the ConnectRetry timer is 120 seconds.
+   The suggested value for the Holdtime timer is 90 seconds.
+   The suggested value for the KeepAlive timer is 30 seconds.
+   An implementation of BGP shall allow any of these timers to be
+   configurable.
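One way to make the three timers configurable while defaulting to the suggested values is sketched below. The class `BgpTimers` and its methods are hypothetical names, and the check that KeepAlive must be shorter than Holdtime is an assumption of this sketch, not a rule stated in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BgpTimers:
    """Per-peer timer settings in seconds, defaulting to the suggested values."""
    connect_retry: float = 120.0
    hold_time: float = 90.0
    keepalive: float = 30.0

    def __post_init__(self):
        for name in ("connect_retry", "hold_time", "keepalive"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
        # Assumption of this sketch: a keepalive interval that is not
        # shorter than the hold time would let the peer's timer expire.
        if self.keepalive >= self.hold_time:
            raise ValueError("keepalive must be shorter than hold_time")

    def hold_expired(self, last_received, now):
        """True if nothing has arrived from the peer within hold_time."""
        return now - last_received >= self.hold_time

    def keepalive_due(self, last_sent, now):
        """True if it is time to send another KEEPALIVE."""
        return now - last_sent >= self.keepalive
```

For example, `BgpTimers()` uses the suggested values, while `BgpTimers(hold_time=30, keepalive=10)` overrides two of them for a peer that needs faster failure detection.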
+
+5.5 Frequency of Route Selection
+
+   An implementation of BGP shall allow a border router to set the
+   minimum amount of time that must elapse between selection and
+   subsequent advertisement of better routes received by a given BGP
+   speaker from BGP speakers located in adjacent ASs.
+
+   Since fast convergence is needed within an AS, deferring selection
+   does not apply to selection of better routes chosen as a result of
+   UPDATEs from BGP speakers located in the advertising speaker's own
+   AS. To avoid long-lived black holes, it does not apply to
+   advertisement of previously selected routes which have become
+   unreachable. In both of these situations, the local BGP speaker must
+   select and advertise such routes immediately.
+
+   If a BGP speaker has received better routes from BGP speakers in
+   adjacent ASs, but has not yet advertised them because the time has
+   not yet elapsed, the reception of any routes from other BGP speakers
+   in its own AS shall trigger a new route selection process that will
+   be based on updates from BGP speakers both in the same AS and in
+   adjacent ASs.
+
+References
+
+   [1] Mills, D., "Exterior Gateway Protocol Formal Specification", RFC
+       904, BBN, April 1984.
+
+   [2] Rekhter, Y., "EGP and Policy Based Routing in the New NSFNET
+       Backbone", RFC 1092, T.J. Watson Research Center, February 1989.
+
+   [3] Braun, H-W., "The NSFNET Routing Architecture", RFC 1093,
+       MERIT/NSFNET Project, February 1989.
+
+   [4] Postel, J., "Transmission Control Protocol - DARPA Internet
+       Program Protocol Specification", RFC 793, DARPA, September 1981.
+
+   [5] Rekhter, Y., and P. Gross, "Application of the Border Gateway
+       Protocol in the Internet", RFC 1268, T.J. Watson Research Center,
+       IBM Corp., ANS, October 1991.
+
+   [6] Postel, J., "Internet Protocol - DARPA Internet Program Protocol
+       Specification", RFC 791, DARPA, September 1981.
+
+Security Considerations
+
+   Security issues are not discussed in this memo.
+
+Authors' Addresses
+
+   Kirk Lougheed
+   cisco Systems, Inc.
+   1525 O'Brien Drive
+   Menlo Park, CA 94025
+
+   Phone: (415) 326-1941
+   Email: LOUGHEED@CISCO.COM
+
+
+   Yakov Rekhter
+   T.J. Watson Research Center IBM Corporation
+   P.O. Box 218
+   Yorktown Heights, NY 10598
+
+   Phone: (914) 945-3896
+   Email: YAKOV@WATSON.IBM.COM
\ No newline at end of file