Packet switching networks typically incorporate a parameter termed Quality of Service (QoS) to provide end-users with a guarantee of a certain degree of end-to-end connectivity coupled with data transport reliability. Data transport reliability encompasses various parameters such as a minimum guaranteed bandwidth, a maximum guaranteed transmission latency, and a maximum guaranteed error rate. Unfortunately, a QoS guarantee provided to a customer often turns out to be merely an estimate that fails to accurately predict the performance actually delivered. This failure occurs, in large part, because the volume of data packets carried over a packet network is unpredictable and the network is vulnerable to congestion.
While equipment failure is one cause of congestion, a second cause arises out of the cost-driven emphasis on over-subscription. Over-subscription exists when a service provider offers service to more customers than the network can handle, such service being offered under the assumption that not all customers will concurrently access the network.
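The effect of over-subscription can be illustrated with a small arithmetic sketch. The function names, the per-customer rates, and the link capacity below are hypothetical values chosen only for illustration; they are not taken from any particular network.

```python
def oversubscription_ratio(subscribed_mbps, capacity_mbps):
    """Return total subscribed bandwidth divided by link capacity.

    A ratio above 1.0 means more bandwidth has been sold than the
    link can carry if every customer transmits at once.
    """
    if capacity_mbps <= 0:
        raise ValueError("capacity_mbps must be positive")
    return sum(subscribed_mbps) / capacity_mbps


def is_congested(active_demand_mbps, capacity_mbps):
    """Return True when the concurrently active demand exceeds capacity."""
    return sum(active_demand_mbps) > capacity_mbps


# Hypothetical example: 20 customers at 100 Mb/s each on a 1000 Mb/s link.
subscriptions = [100] * 20
ratio = oversubscription_ratio(subscriptions, 1000)

# The link copes while only 8 customers are active, but not with 12.
light_load = is_congested([100] * 8, 1000)
heavy_load = is_congested([100] * 12, 1000)
```

In this sketch the link is over-subscribed by a factor of two, yet congestion appears only when the assumption that not all customers access the network concurrently stops holding.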
When a first communication device such as, for example, a switch, a router, a server, or a computer, is connected to a second communication device, which may also be a switch, a router, a server, or a computer, to form one or more connections of a communications network, the two communication devices are referred to as network elements of the communications network. When a network element (NE) suffers congestion, it is labeled a point-of-congestion (POC) in the network.
While eliminating such a POC would significantly improve network transmission efficiency, identifying one or more NEs that are POCs proves to be a challenging task due to the lack of effective prior-art network-oriented fault isolation systems. Existing fault isolation systems are largely NE-oriented rather than network-oriented, and consequently provide fault isolation information at an NE level rather than at a network level.
Existing NE-oriented fault isolation systems and congestion control mechanisms are incorporated into various architectures such as, for example, simple network management protocol (SNMP), connection admission control (CAC), asynchronous transfer mode (ATM), and transmission control protocol (TCP).
NE-oriented fault isolation systems typically process data packets inside an NE to detect and measure parameters that indicate faults and/or congestion. These parameters include, for example, an excessive bit error rate (BER), an erroneous cyclic redundancy check (CRC), and the number of missing data packets. An ATM switch, for example, may process multiple incoming data packet streams by analyzing the header contents of ATM cells to identify one or more streams having excessive errors. Under this congestion analysis, the ATM switch may also be designed to identify which particular data packet stream is contributing to congestion or excessive errors. The ability to identify the offending data packet stream, however, does not lead to identifying the network element causing the congestion, because the ATM switch generally does not have, or does not use, information about the network architecture that would be needed to identify that network element.
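The per-stream measurement described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing performed by any actual ATM switch: the `StreamStats` record, the `record_packet` and `offending_streams` helpers, the use of a sequence number to infer missing packets, the CRC-32 check from Python's `zlib` module, and the 1% threshold are all hypothetical choices made for the example.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StreamStats:
    """Error counters kept by one NE for one incoming packet stream."""
    packets_seen: int = 0
    crc_failures: int = 0
    missing_packets: int = 0
    last_seq: Optional[int] = None


def record_packet(stats: StreamStats, seq: int, payload: bytes,
                  received_crc: int) -> None:
    """Update the counters for one received packet."""
    stats.packets_seen += 1
    # A mismatch between the computed and the received CRC marks an error.
    if zlib.crc32(payload) != received_crc:
        stats.crc_failures += 1
    # A gap in the (assumed) sequence numbers counts as missing packets.
    if stats.last_seq is not None and seq > stats.last_seq + 1:
        stats.missing_packets += seq - stats.last_seq - 1
    if stats.last_seq is None or seq > stats.last_seq:
        stats.last_seq = seq


def offending_streams(stats_by_stream: Dict[str, StreamStats],
                      max_error_fraction: float = 0.01) -> List[str]:
    """Return the streams whose error fraction exceeds the threshold."""
    offenders = []
    for stream_id, s in stats_by_stream.items():
        expected = s.packets_seen + s.missing_packets
        if expected == 0:
            continue
        bad = s.crc_failures + s.missing_packets
        if bad / expected > max_error_fraction:
            offenders.append(stream_id)
    return offenders


# Hypothetical traffic: stream "a" is clean, stream "b" has a gap and a bad CRC.
stats = {"a": StreamStats(), "b": StreamStats()}
good_crc = zlib.crc32(b"x")
for seq in range(10):
    record_packet(stats["a"], seq, b"x", good_crc)
record_packet(stats["b"], 0, b"x", good_crc)
record_packet(stats["b"], 3, b"x", good_crc ^ 1)  # corrupted CRC, seq 1-2 lost
flagged = offending_streams(stats)
```

Note that the result names only a stream identifier. Nothing in these counters indicates which upstream network element introduced the errors, which is the limitation of NE-level analysis noted above.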
Existing network-oriented fault isolation systems generally utilize monitoring and detection software that is installed in multiple NEs across the network. For example, SNMP uses SNMP manager software and a Management Information Base (MIB), both located in a manager device, to interact with SNMP agent software installed in one or more managed objects. Typically, an SNMP manager device processes a high-level transmission protocol, such as internet protocol (IP), to establish performance statistics of a managed device. The manager device does not generally collect performance statistics that involve processing other protocols that may be used on data packets carried over this high-level transmission protocol.
To illustrate, if a user were using a voice-over-IP (VoIP) protocol, an SNMP system would be unable to provide error information that is specifically related to VoIP. In this scenario, a service provider offering VoIP services to this user will have difficulty in guaranteeing a specific QoS. Furthermore, the service provider may not own the NEs in the network used to provide this service, and consequently, may not be authorized to query such NEs and obtain performance parameters that will enable the service provider to establish a level of QoS that can be offered.
It is therefore desirable to provide a network-oriented fault isolation system that can be used to isolate a fault-contributing NE in a packet switching communication network. It is also preferable that such a system utilize existing transmission formats and fault isolation parameters without requiring installation of customized software in multiple NEs of the network.