BGP Operating Modes
BGP operates on a CX device in either of the following modes, as shown in Figure 1:
- Internal BGP (IBGP)
- External BGP (EBGP)
BGP is called IBGP when it runs within an AS; it is called EBGP when it runs between ASs.
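As a minimal illustration of this distinction, a session can be classified by comparing the AS numbers of the two ends. The function name and the AS numbers below are hypothetical, and the sketch ignores confederations.

```python
def session_type(local_as: int, peer_as: int) -> str:
    """Classify a BGP session by comparing AS numbers (confederations ignored)."""
    return "IBGP" if local_as == peer_as else "EBGP"


# Hypothetical private AS numbers, used only for illustration.
print(session_type(65001, 65001))
print(session_type(65001, 65002))
```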
Roles in Transmitting BGP Messages
- Speaker: The CX device that sends BGP messages is called a BGP speaker. The speaker receives or generates new routing information, and then advertises the routing information to other BGP speakers. When receiving a route from another AS, a BGP speaker compares the route with its current route. If the received route takes precedence over the existing route, or the route is new, the speaker advertises this route to all other BGP speakers except the BGP speaker that sent this route.
- Peer: The BGP speakers that exchange messages with each other are called peers. Multiple peers compose a peer group.
BGP runs by sending messages. There are five types of BGP messages, namely, the Open message, Update message, Notification message, Keepalive message, and Route-refresh message.
- Open message: It is the first message that is sent after a TCP connection is set up, and is used to set up BGP peer relationships. After the peer receives an Open message and the peer negotiation succeeds, the peer sends a Keepalive message to confirm and maintain the peer relationship. Then, peers can exchange Update, Notification, Keepalive, and Route-refresh messages.
- Update message: It is used to exchange routes between BGP peers. The Update message can be used to advertise multiple reachable routes with the same attributes, or to withdraw multiple unreachable routes.
- An Update message can be used to advertise multiple reachable routes with the same attributes. These routes can share a group of route attributes. The route attributes contained in an Update message are applicable to all destination addresses (expressed by IP prefixes) contained in the Network Layer Reachability Information (NLRI) field of the Update message.
- An Update message can be used to withdraw multiple unreachable routes. Each route is identified by its destination address, which identifies the routes previously advertised between BGP speakers.
- An Update message can be used solely to withdraw routes. In this case, it does not need to carry path attributes or NLRI. Conversely, an Update message can be used solely to advertise reachable routes, in which case it does not need to carry information about withdrawn routes.
- Notification message: When BGP detects an error, it sends a Notification message to its peer. The BGP connection is then torn down immediately.
- Keepalive message: BGP periodically sends a Keepalive message to the peer to maintain the peer relationship.
- Route-refresh message: It is used to request that a peer resend its routing information. It can be used only when the Route-refresh capability is enabled on the peers.
If the Route-refresh capability is enabled on all BGP CX devices, the local BGP CX device sends Route-refresh messages to its peers when the BGP import routing policy changes. After receiving the message, the peers resend their routing information to the local BGP CX device. In this manner, the BGP routing table can be dynamically refreshed and the new routing policy applied without tearing down BGP connections.
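The message types above can be told apart by the type code in the fixed message header. The following Python sketch assumes the wire format defined in RFC 4271 (a 16-byte marker, a 2-byte length, and a 1-byte type) and the Route-refresh type code defined in RFC 2918. It also splits an Update message body into its three sections to show that a withdraw-only Update carries neither path attributes nor NLRI. The function names are hypothetical, and the sketch is an illustration rather than the implementation used on a CX device.

```python
import struct
from typing import Tuple

# Type codes per RFC 4271 (1 to 4) and RFC 2918 (5).
MESSAGE_TYPES = {1: "Open", 2: "Update", 3: "Notification",
                 4: "Keepalive", 5: "Route-refresh"}
HEADER_LEN = 19  # 16-byte marker + 2-byte length + 1-byte type


def parse_header(data: bytes) -> Tuple[str, int]:
    """Return (message type name, total message length) from a BGP header."""
    if len(data) < HEADER_LEN:
        raise ValueError("truncated BGP header")
    _marker, length, type_code = struct.unpack("!16sHB", data[:HEADER_LEN])
    if type_code not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type {type_code}")
    return MESSAGE_TYPES[type_code], length


def split_update(body: bytes) -> Tuple[bytes, bytes, bytes]:
    """Split an Update body into (withdrawn routes, path attributes, NLRI)."""
    (withdrawn_len,) = struct.unpack_from("!H", body, 0)
    withdrawn = body[2:2 + withdrawn_len]
    (attrs_len,) = struct.unpack_from("!H", body, 2 + withdrawn_len)
    attrs_start = 4 + withdrawn_len
    attrs = body[attrs_start:attrs_start + attrs_len]
    nlri = body[attrs_start + attrs_len:]
    return withdrawn, attrs, nlri


# A Keepalive message consists of the header only.
keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
print(parse_header(keepalive))

# A withdraw-only Update body: one withdrawn prefix (10.0.0.0/8),
# zero-length path attributes, and no NLRI.
withdraw_only = struct.pack("!H", 2) + b"\x08\x0a" + struct.pack("!H", 0)
print(split_update(withdraw_only))
```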
BGP Finite State Machine
The BGP Finite State Machine (FSM) has six states, namely, Idle, Connect, Active, OpenSent, OpenConfirm, and Established.
During the establishment of BGP peer relationships, BGP is usually in the Idle, Active, or Established state.
- In the Idle state, BGP denies all connection requests. This is the initial status of BGP.
- In the Active state, BGP attempts to set up a TCP connection. This is the intermediate status of BGP.
- In the Established state, BGP peers can exchange Update messages, Route-Refresh messages, Keepalive messages, and Notification messages.
The BGP peer relationship can be established only when both the BGP peers are in the Established state. The two peers send Update messages to exchange routes.
- BGP adopts TCP as its transport layer protocol. Therefore, before the BGP peer relationship is set up, a TCP connection must be set up between the peers. Then, BGP peers negotiate related parameters by exchanging Open messages, and finally establish the BGP peer relationship.
- After the peer relationship is set up, BGP peers exchange BGP routing tables. BGP does not periodically update the routing table. When BGP routes change, however, BGP updates the BGP routing table incrementally through Update messages.
- BGP sends Keepalive messages to maintain the BGP connection between peers. When BGP detects an error on a network, for example, when it receives an erroneous packet or a packet indicating an unsupported negotiation capability, it sends a Notification message to report the error, and the BGP connection is torn down accordingly.
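The state transitions described above can be sketched as a small lookup table. This is a heavily simplified model: the event names are hypothetical, timers and connection collision handling are omitted, and any event that is not listed (for example, receipt of a Notification message) is assumed to return the session to Idle.

```python
# (current state, event) -> next state. Event names are illustrative only.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_established"): "OpenSent",      # local end sends Open
    ("Connect", "tcp_failed"): "Active",
    ("Active", "tcp_established"): "OpenSent",
    ("Active", "retry_timer_expired"): "Connect",
    ("OpenSent", "open_received"): "OpenConfirm",    # local end sends Keepalive
    ("OpenConfirm", "keepalive_received"): "Established",
    ("Established", "keepalive_received"): "Established",
    ("Established", "update_received"): "Established",
}


def next_state(state: str, event: str) -> str:
    """Return the next state; unlisted events are treated as errors."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events, state: str = "Idle") -> str:
    """Feed a sequence of events to the state machine and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


handshake = ["start", "tcp_established", "open_received", "keepalive_received"]
print(run(handshake))
print(run(handshake + ["notification_received"]))
```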
BGP route attributes are a set of parameters that further describe routes. Using route attributes, BGP can filter and select routes. BGP route attributes are classified into the following types:
- Well-known mandatory: It can be identified by all BGP CX devices. This type of attribute is mandatory and must be carried in Update messages. Without this attribute, errors occur in the routing information.
- Well-known discretionary: It can be identified by all BGP CX devices. The attribute is discretionary and is not necessarily carried in Update messages.
- Optional transitive: It is an attribute that can be transmitted between ASs. A BGP CX device may not recognize an attribute of this type, but it still accepts the attribute and advertises it to other peers.
- Optional non-transitive: If a BGP CX device does not recognize an attribute of this type, it ignores the attribute and does not advertise it to other peers.
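The four types above can be derived from the flag bits and type code that accompany each attribute in an Update message. The sketch below assumes the RFC 4271 encoding, in which bit 0x80 marks an attribute as optional and bit 0x40 marks it as transitive, and in which ORIGIN (type 1), AS_PATH (type 2), and NEXT_HOP (type 3) are the well-known mandatory attributes. The function names are hypothetical.

```python
OPTIONAL = 0x80    # attribute flag: optional (cleared for well-known attributes)
TRANSITIVE = 0x40  # attribute flag: transitive

# Well-known mandatory attribute type codes per RFC 4271.
MANDATORY_TYPES = {1: "ORIGIN", 2: "AS_PATH", 3: "NEXT_HOP"}


def attribute_category(flags: int, type_code: int) -> str:
    """Map attribute flags and type code to one of the four attribute types."""
    if not flags & OPTIONAL:
        if type_code in MANDATORY_TYPES:
            return "well-known mandatory"
        return "well-known discretionary"
    if flags & TRANSITIVE:
        return "optional transitive"
    return "optional non-transitive"


def advertise_unrecognized(flags: int) -> bool:
    """Whether an unrecognized optional attribute is passed on to other peers."""
    return bool(flags & OPTIONAL and flags & TRANSITIVE)


print(attribute_category(0x40, 2))   # AS_PATH
print(attribute_category(0x40, 5))   # LOCAL_PREF
print(attribute_category(0xC0, 8))   # COMMUNITY
print(attribute_category(0x80, 4))   # MED
```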
The following part describes the common BGP route attributes:
The Origin attribute defines the origin of a route. It indicates how the route became a BGP route. The Origin attribute is classified into the following types:
- Interior Gateway Protocol (IGP): It is of the highest priority. For the routing information obtained through an IGP of the AS that originates the route, the Origin attribute is IGP. For example, for the routes imported to the BGP routing table through the network command, the Origin attribute is IGP.
- Exterior Gateway Protocol (EGP): It is of the second highest priority. The Origin attribute of the routes obtained through EGP is EGP.
- Incomplete: It is of the lowest priority. The Origin attribute of the routes learned by other means is Incomplete. For example, for the routes imported through the import-route command by BGP, the Origin attribute is Incomplete.
The AS_Path is used to record all ASs that a route passes through from the local end to the destination in the distance-vector (DV) order.
Assume that the BGP speaker advertises a local route:
- When advertising the route to other ASs, the BGP speaker adds the local AS number in the AS_Path list, and advertises it to the neighboring CX devices through Update messages.
- When advertising the route to the local AS, the BGP speaker creates an empty AS_Path list in an Update message.
Assume that the BGP speaker advertises the routes learned from the Update messages of other BGP speakers:
- When advertising the route to other ASs, the BGP speaker adds the local AS number to the leftmost position of the AS_Path list. From the AS_Path attribute, the BGP CX device that receives the route learns the ASs through which the route passes to reach the destination. The number of the AS nearest to the local AS is placed first in the list, and the other AS numbers follow in sequence.
- When the BGP speaker advertises the route to the local AS, it does not change the AS_Path.
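The AS_Path handling described above reduces to one rule: the local AS number is prepended only when the route is advertised to another AS. The following sketch models the AS_Path as a plain list of AS numbers. The function name and AS numbers are hypothetical, and AS_SET segments and confederations are not modeled.

```python
from typing import List


def as_path_on_advertise(as_path: List[int], local_as: int,
                         to_other_as: bool) -> List[int]:
    """Return the AS_Path carried in the Update message sent to a peer.

    For a locally originated route, pass an empty as_path.
    """
    if to_other_as:
        # Prepend the local AS number at the leftmost position.
        return [local_as] + list(as_path)
    # Within the local AS, the AS_Path is left unchanged.
    return list(as_path)


learned = [65002, 65003]
print(as_path_on_advertise([], 65001, to_other_as=True))
print(as_path_on_advertise([], 65001, to_other_as=False))
print(as_path_on_advertise(learned, 65001, to_other_as=True))
print(as_path_on_advertise(learned, 65001, to_other_as=False))
```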
The Next_Hop attribute of BGP is different from that of IGP. It is not necessarily the IP address of a neighboring CX device. Generally, the Next_Hop attribute complies with the following principles:
- When advertising a route to an EBGP peer, the BGP speaker sets the next hop of the route to be the address of the local interface through which the BGP peer relationship is set up.
- When advertising a locally generated route to an IBGP peer, the BGP speaker sets the next hop of the route to be the address of the local interface through which the BGP peer relationship is set up.
- When advertising a route learned from an EBGP peer to an IBGP peer, the BGP speaker does not change the next hop of the route.
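The three Next_Hop principles above can be expressed as a single function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the addresses are documentation examples, and configuration that overrides the default behavior is not modeled.

```python
def next_hop_on_advertise(current_next_hop: str, local_interface_addr: str,
                          to_ebgp_peer: bool, locally_originated: bool) -> str:
    """Return the Next_Hop placed in the Update message sent to a peer."""
    if to_ebgp_peer:
        # Principle 1: toward an EBGP peer, use the local interface address.
        return local_interface_addr
    if locally_originated:
        # Principle 2: a locally generated route sent to an IBGP peer.
        return local_interface_addr
    # Principle 3: a route learned from an EBGP peer keeps its next hop
    # when it is sent to an IBGP peer.
    return current_next_hop


print(next_hop_on_advertise("192.0.2.1", "10.1.1.1",
                            to_ebgp_peer=False, locally_originated=False))
```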
The Multi-Exit-Discriminator (MED) is exchanged only between two neighboring ASs. The AS that receives the MED does not advertise it to any other ASs.
The MED is similar to the metric used by an IGP. It is used to determine the optimal route when traffic enters an AS. When a BGP CX device obtains multiple routes to the same destination address but with different next hops through EBGP peers, the route with the smallest MED value is selected as the optimal route.
The Local_Pref attribute is exchanged only between IBGP peers and is not advertised to other ASs. It indicates the preference of a BGP CX device for a route.
The Local_Pref attribute is used to determine the optimal route when traffic leaves an AS. When a BGP CX device obtains multiple routes to the same destination address but with different next hops through IBGP peers, the route with the largest Local_Pref value is selected.
BGP Route Selection Rules
When there are multiple routes to the same destination, BGP selects routes according to the following policies:
- Prefers the route with the largest PrefVal.
PrefVal is a Huawei-specific parameter. It is valid only on the device where it is configured.
- Prefers the route with the highest Local_Pref.
If a route does not carry Local_Pref, it is treated as having the value set by using the default local-preference command, or a value of 100 if the command is not configured.
- Prefers a locally originated route. A locally originated route takes precedence over a route learned from a peer.
Locally originated routes include routes imported by using the network command or the import-route command, manually aggregated routes, and automatically summarized routes.
- A summarized route is preferred. A summarized route takes precedence over a non-summarized route.
- A route obtained by using the aggregate command is preferred over a route obtained by using the summary automatic command.
- A route imported by using the network command is preferred over a route imported by using the import-route command.
- Prefers the route with the shortest AS_Path.
- The AS_CONFED_SEQUENCE and AS_CONFED_SET are not included in the AS_Path length.
- An AS_SET counts as 1, no matter how many ASs are in the set.
- After the bestroute as-path-neglect command is run, the AS_Path attributes of routes are not compared in the route selection process.
- Prefers the route with the highest Origin type. IGP is higher than EGP, and EGP is higher than Incomplete.
- Prefers the route with the lowest Multi Exit Discriminator (MED).
- MEDs are compared only between routes from the same AS, excluding confederation sub-ASs. That is, the MEDs of two routes are compared only when the first AS number in the AS_SEQUENCE (excluding AS_CONFED_SEQUENCE) is the same for the two routes.
- A route without any MED is assigned a MED of 0, unless the bestroute med-none-as-maximum command is run. If the bestroute med-none-as-maximum command is run, the route is assigned the highest MED of 4294967295.
- After the compare-different-as-med command is run, the MEDs of routes sent from peers in different ASs are compared. Do not use this command unless it is confirmed that the different ASs use the same IGP and route selection mode. Otherwise, a loop may occur.
- If the bestroute med-confederation command is run, MEDs are compared for routes that consist only of AS_CONFED_SEQUENCE. The first AS number in the AS_CONFED_SEQUENCE must be the same for the routes.
- After the deterministic-med command is run, route selection no longer depends on the sequence in which routes are received.
- Prefers EBGP routes over IBGP routes.
EBGP is higher than IBGP, IBGP is higher than LocalCross, and LocalCross is higher than RemoteCross.
If the ERT of a VPNv4 route in the routing table of a VPN instance on a PE matches the IRT of another VPN instance on the PE, the VPNv4 route will be added to the routing table of the second VPN instance. This is called LocalCross. If the ERT of a VPNv4 route from a remote PE is learned by the local PE and matches the IRT of a VPN instance on the local PE, the VPNv4 route will be added to the routing table of that VPN instance. This is called RemoteCross.
- Prefers the route with the lowest IGP metric to the BGP next hop.
Assume that load balancing is configured. If the preceding attributes of the routes are the same and there are multiple external routes with the same AS_Path, traffic is load balanced among these routes, up to the configured number of routes.
- Prefers the route with the shortest Cluster_List.
- Prefers the route advertised by the CX device with the smallest router ID.
If routes carry the Originator_ID, the originator ID is substituted for the router ID during route selection. The route with the smallest Originator_ID is preferred.
- Prefers the route learned from the peer with the smallest address if the IP addresses of peers are compared in the route selection process.
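The main selection rules above can be sketched as an ordered comparison. This is a simplified model, not the CX device's implementation: the Route class and its field names are hypothetical, a PrefVal default of 0 is assumed, and the sub-rules for summarized routes, confederations, AS_SET, LocalCross and RemoteCross, load balancing, and the bestroute and deterministic-med commands are omitted. Routes are compared pairwise so that MEDs are considered only when both routes come from the same neighboring AS.

```python
from dataclasses import dataclass, field
from functools import reduce
from ipaddress import ip_address
from typing import List, Optional

ORIGIN_RANK = {"IGP": 0, "EGP": 1, "Incomplete": 2}


@dataclass
class Route:
    peer_address: str
    router_id: str
    pref_val: int = 0                  # assumed default
    local_pref: Optional[int] = None   # None means the attribute is absent
    local_origin: bool = False
    as_path: List[int] = field(default_factory=list)
    origin: str = "IGP"
    med: Optional[int] = None          # None means the attribute is absent
    ebgp: bool = True
    igp_metric: int = 0
    cluster_list: List[str] = field(default_factory=list)
    originator_id: Optional[str] = None


def _key(r: Route, compare_med: bool) -> tuple:
    """Build a tuple in which a smaller value means a more preferred route."""
    local_pref = 100 if r.local_pref is None else r.local_pref
    med = (r.med or 0) if compare_med else 0   # a missing MED counts as 0
    rid = r.originator_id or r.router_id       # Originator_ID replaces router ID
    return (-r.pref_val, -local_pref, not r.local_origin, len(r.as_path),
            ORIGIN_RANK[r.origin], med, not r.ebgp, r.igp_metric,
            len(r.cluster_list), int(ip_address(rid)),
            int(ip_address(r.peer_address)))


def better(a: Route, b: Route) -> Route:
    """Return the preferred route of the two."""
    same_neighbor_as = a.as_path[:1] == b.as_path[:1]
    if _key(a, same_neighbor_as) <= _key(b, same_neighbor_as):
        return a
    return b


def best_route(routes: List[Route]) -> Route:
    """Select the best route by folding the pairwise comparison over the list."""
    return reduce(better, routes)


r1 = Route("10.0.0.1", "1.1.1.1", local_pref=200, as_path=[65002, 65003, 65004])
r2 = Route("10.0.0.2", "2.2.2.2", as_path=[65002])
print(best_route([r1, r2]).peer_address)
```

In this usage example, r1 is expected to win because Local_Pref is evaluated before the AS_Path length. Because the comparison is pairwise, the result can depend on the order of the input list when routes from different neighboring ASs carry different MEDs.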
Policies for BGP Route Advertisement
BGP adopts the following policies to advertise routes:
- The BGP speaker advertises only the optimal route to its peer when there are multiple valid routes.
- The BGP speaker advertises the routes learned from EBGP CX devices to all BGP peers, including EBGP peers and IBGP peers.
- The BGP speaker does not advertise the routes learned from IBGP CX devices to its IBGP peers.
- The BGP speaker advertises the routes learned from IBGP CX devices to its EBGP peers.
- The BGP speaker advertises all BGP routes to the new peers when the peer relationship is established.
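The advertisement policies above can be summarized as a single predicate. The sketch below is a simplification with hypothetical names: it covers only routes learned from peers, and it does not model route reflectors or export routing policies.

```python
def should_advertise(learned_from: str, to_peer: str, is_optimal: bool) -> bool:
    """Decide whether a learned route is advertised to a peer.

    learned_from and to_peer are each "EBGP" or "IBGP".
    """
    if not is_optimal:
        # Only the optimal route is advertised.
        return False
    if learned_from == "IBGP" and to_peer == "IBGP":
        # Routes learned from IBGP peers are not advertised to IBGP peers.
        return False
    return True


for source in ("EBGP", "IBGP"):
    for target in ("EBGP", "IBGP"):
        print(source, "->", target, should_advertise(source, target, True))
```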