1. Field of Invention
The present invention relates generally to network systems. More particularly, the present invention relates to enabling a substantially internal trace process to be used to identify nodes within a provider network which are not able to forward external prefixes.
2. Description of the Related Art
The demand for data communication services is growing at an explosive rate. Much of the increased demand is due to the fact that more residential and business computer users are becoming connected to the Internet. Within a network, as for example an optical network, different provider networks, or autonomous systems, may be in communication with one another. For example, an overall network may include multiple autonomous systems which each include various nodes such as routers and servers. For a first customer to communicate with a second customer, the first customer generally initiates the transmission of a packet which may pass through one or more autonomous systems en route to the second customer. FIG. 1 is a diagrammatic representation of an overall network which includes a plurality of autonomous systems. A first autonomous system 102 or provider network may be in communication with a second autonomous system 106 and a third autonomous system 110. Typically, autonomous system 102 is a network associated with one provider while autonomous systems 106, 110 may be networks associated with other providers.
First autonomous system 102 includes edge nodes, e.g., routers or servers, 114 and core nodes, e.g., routers or servers, 118. Similarly, second autonomous system 106 includes edge nodes 124 and core nodes 128. As will be appreciated by those skilled in the art, a border gateway protocol (BGP) effectively enables first autonomous system 102 to learn about routes to second autonomous system 106. Edge routers 114 of first autonomous system 102 may be communicably coupled to edge routers in other autonomous systems. By way of example, edge router 114g may be in communication with edge router 124a of second autonomous system 106, while edge router 114e may be in communication with an edge router 134 of third autonomous system 110.
Edge router 114a is in communication with a customer edge 140 that wishes to access or communicate with a node 144. As shown, node 144 is not a part of either first autonomous system 102 or second autonomous system 106, although customer 140 may communicate with node 144 using routers 114, 118 included in autonomous system 102 and routers 124, 128 included in autonomous system 106.
When customer 140, or an overall source, wishes to communicate with node 144, or a destination address, customer 140 will forward a packet or a message which specifies a destination as node 144. The packet that is forwarded will pass through any number of domains, e.g., first autonomous system 102 and second autonomous system 106, en route to node 144. Prefixes associated with the packet which pertain to the destination address, as well as available routes, are generally advertised by autonomous systems 102, 106 to their customers such as customer 140 through a standard exterior routing protocol such as the Border Gateway Protocol (BGP). When there are no failures within first autonomous system 102 or second autonomous system 106, then a packet forwarded to node 144 by customer 140 will successfully reach node 144. However, when there is at least one failure at an intermediate point, i.e., a failure of a node 114, 118 within first autonomous system 102 or a failure of a node 124, 128 within second autonomous system 106, the packet may not successfully reach node 144. A failure of an intermediate point may be the result of an intermediate point being off line, or of forwarding entries not being properly set up at an intermediate point such that a packet or, in general, traffic passing through that intermediate point is effectively dropped.
Failures of intermediate points along a path between customer 140 and node 144 are considered to be silent failures, since an operator of an autonomous system such as first autonomous system 102 or second autonomous system 106 is generally not aware of a problem within his or her autonomous system unless one of the end users, i.e., either customer 140 or node 144, notices the problem. If customer 140 notices that a packet that was sent to node 144 was not acknowledged by node 144, customer 140 may send a ping towards node 144 which effectively causes nodes 114, 118 within first autonomous system 102 and nodes 124, 128 within second autonomous system 106 to be pinged. Until a ping is sent, an operator of first autonomous system 102 or second autonomous system 106 would not be aware of any potential failures within either first autonomous system 102 or second autonomous system 106, respectively.
FIG. 2a is a block diagram representation of a customer edge which forwards a message through a provider network to a destination address. A provider network 206, which generally includes at least one provider edge node and a provider core node, may advertise prefixes and routes to customer edge 202. When customer edge 202 wishes to communicate with a destination address 210, customer edge 202 may forward a message 220 through provider network 206 to destination address 210, as discussed above.
When a traffic drop occurs as a result of a silent failure within provider network 206, e.g., when forwarding entries within provider network 206 are not properly set up, a message that is sent from customer edge 202 and intended for destination address 210 may not reach destination address 210. As shown in FIG. 2b, when a message 224 fails to reach destination address 210 because of a failure within provider network 206, customer edge 202 may send a ping 250 through provider network 206 towards destination address 210. Ping 250 is generally arranged to enable a determination to be made that there is a failure associated with provider network 206, and to enable customer edge 202 to notify provider network 206 that there is a failure within provider network 206. Diagnostic processes may then be performed on provider network 206 to ascertain where within provider network 206 a failure has occurred.
FIG. 3 is a diagrammatic representation of an autonomous system in which a core node of a provider network has failed. When a customer edge 340 attempts to send a message 336 to a destination address 344 through provider network 302 which includes provider edge nodes 314 and core nodes 318, message 336 may be sent through a best path as determined by a best path algorithm. Message 336 passes through provider edge node 314a and core node 318c. However, since a failure is associated with core node 318d, message 336 may not be correctly forwarded by core node 318d, and is effectively dropped.
If customer edge 340 expects a response to message 336 and one is not received, customer edge 340 may send a ping to destination address 344 through provider network 302. The ping may enable a determination to be made that a failure has occurred with a node 314, 318 associated with provider network 302. Typically, a customer associated with customer edge 340 may inform a provider that provider network 302 has effectively caused the customer a loss in connectivity. Hence, once the provider is aware that there is a failed node 314, 318, procedures may be performed on provider network 302 to identify node 318d as failing to forward message 336 and steps may be taken to substantially remedy the failure of node 318d, as will be appreciated by those skilled in the art.
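By way of illustration only, the silent drop described with respect to FIG. 3 may be modeled with a short Python sketch. The sketch is a simplification under stated assumptions and is not a description of any actual router implementation: the node labels merely echo the reference numerals of FIG. 3, and the forwarding tables, the topology, and the functions `forward` and `find_drop_point` are hypothetical names introduced solely for this example.

```python
from typing import Dict, List, Optional

# Hypothetical forwarding tables: node -> {destination prefix -> next hop}.
# Labels echo the reference numerals of FIG. 3; the topology is assumed.
FORWARDING_TABLES: Dict[str, Dict[str, str]] = {
    "PE314a": {"dest344": "P318c"},
    "P318c": {"dest344": "P318d"},
    "P318d": {},  # silent failure: no forwarding entry for the external prefix
    "PE314b": {"dest344": "dest344"},
}


def forward(ingress: str, prefix: str, max_hops: int = 16) -> List[str]:
    """Return the nodes a packet visits until it is delivered or dropped."""
    path = [ingress]
    node = ingress
    for _ in range(max_hops):
        if node == prefix:
            return path  # packet reached the destination
        next_hop = FORWARDING_TABLES.get(node, {}).get(prefix)
        if next_hop is None:
            return path  # packet is dropped at the current node
        path.append(next_hop)
        node = next_hop
    return path  # hop limit reached without delivery


def find_drop_point(ingress: str, prefix: str) -> Optional[str]:
    """Return the node where the packet stopped, or None if it was delivered."""
    path = forward(ingress, prefix)
    return None if path[-1] == prefix else path[-1]


if __name__ == "__main__":
    print(forward("PE314a", "dest344"))
    print(find_drop_point("PE314a", "dest344"))
```

In this sketch, a packet entering at the hypothetical node "PE314a" stops at "P318d" because that node holds no forwarding entry for the prefix, and nothing in the model reports the drop to the operator unless the path is explicitly traced.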
Although pings which are sent by customer edges, i.e., customer edge nodes, are useful in enabling node failures within a provider network to be identified, customer edges may not necessarily initiate pings to determine why a forwarded message may not have reached an intended destination address. As a result, a provider may not be aware of a failure within its network. A ping generally may not be sent by a node of a provider network to a destination within the provider network to determine if there is a failure within the provider network, as it is often quite difficult to collocate additional equipment at each point in the network to enable such a ping to be sent. As will be appreciated by those skilled in the art, for non-shared media connectivity through a core, a separate provisioned layer 2 path from the edge node to the core node is generally required. Such a requirement again defeats the purpose of checking connectivity starting right from the edge. Hence, a provider may not readily determine that an intermediate point within a provider network is not set up to properly forward entries and is potentially dropping traffic. Therefore, unless a customer initiates a ping and notifies a provider of a potential failure within the provider network, the provider is generally unaware that a failure may exist at one of the nodes within the provider network. If the provider is unaware of a failure, the failure may not be corrected, and customers who use the provider network may be dissatisfied with the performance of the provider network.
Therefore, what is desired is a method and an apparatus for enabling a provider to readily determine whether there is a failure of an intermediate point within a network or autonomous system associated with the provider. More specifically, what is needed is a system which enables a provider edge node to effectively initiate a ping-like message which allows it to be determined whether there is a failure within an autonomous system which includes the provider edge node.