1. Field of the Invention
The invention relates to a real time fault tolerant transaction processing system and particularly one suited for use in a service control point.
2. Description of the Prior Art
Currently, many computer controlled systems, particularly those used in transaction processing, must operate at extremely high levels of reliability. Unfortunately, a computer which controls such a system, due to the complexity of the computer hardware and attendant software, is oftentimes the least reliable component in the system and is frequently a main cause of system failure. Therefore, in an effort to provide increased reliability, the art has turned to so-called fault tolerant computers for use in transaction processing systems. However, computers of this type possess serious drawbacks which, as discussed in detail below, severely limit their utility. As a result, such computers cannot be used to impart increased reliability to a variety of computer based transaction processing systems.
One specific transaction processing system that must operate at an extremely high level of reliability is a service control point ("SCP") that controls routing of telephone calls, within the telephone network, that require special handling, such as 800 and calling ("credit") card calls. In particular, whenever a telephone subscriber dials such a call, this call is first routed to an equal access switch, located either at a local office or elsewhere, which has service switching point capability (such a switch will hereinafter be referred to as an "SSP"). Primarily, the SSP processes calls that require remote data base translation. Now, whenever the SSP recognizes an incoming call as one that requires special handling, the SSP suspends normal call processing, launches a message over a common channel signalling ("CCS") network to an SCP to determine how the call is to be routed, and finally, upon receipt of a return message from the SCP, routes the call in a manner specified by the return message.
Specifically, whenever a subscriber places a call to an 800 number, the local switch routes the call to an SSP. The SSP fabricates a query in the form of a packet. This packet contains the 800 called number and a request for a destination routing number associated with the 800 number and, additionally, identification of a particular long distance (inter-exchange) carrier over which the call is to be routed. This packet is then routed over a common channel signalling line in a CCS network to a particular one of several geographically separated signalling transfer points (STPs). Specifically, the CCS network typically consists of a multi-level hierarchy of STPs, wherein each STP is primarily a packet switch. The first STP to receive the packet, i.e. the "originating" STP, then routes the packet, over an available link, to an SCP for processing or, via a specified link, to another STP for eventual routing to an SCP. To perform this routing correctly, each STP that routes this packet examines a routing field located in the packet and, in response, determines where the packet should be routed. In particular, for an 800 call, the first six digits of the dialed 800 number specify the appropriate SCP that is to receive a corresponding packet from an SSP.
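By way of illustration only, the prefix-based routing decision described above can be sketched as follows. The table contents, the SCP identifiers, the packet layout and the function names are all hypothetical and are not taken from any actual CCS implementation; the sketch merely assumes that the routing field carries the first six digits of the dialed 800 number.

```python
# Hypothetical routing table: six-digit prefix of the dialed 800 number -> SCP identifier.
ROUTING_TABLE = {
    "800555": "SCP-A",
    "800222": "SCP-B",
}


def build_query(dialed_number: str) -> dict:
    """Build a query packet roughly as an SSP might (assumed layout)."""
    digits = "".join(ch for ch in dialed_number if ch.isdigit())
    if len(digits) != 10 or not digits.startswith("800"):
        raise ValueError("expected a ten-digit 800 number")
    # The routing field is assumed to hold the first six dialed digits.
    return {"called_number": digits, "routing_field": digits[:6]}


def route_packet(packet: dict, table: dict = ROUTING_TABLE) -> str:
    """Return the SCP that should receive the packet, based on its routing field."""
    prefix = packet["routing_field"]
    if prefix not in table:
        raise LookupError("no SCP configured for prefix " + prefix)
    return table[prefix]
```

For example, under these assumed table entries, route_packet(build_query("800-555-1234")) selects "SCP-A".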
The SCP itself is a fault tolerant transaction processing system that contains various databases that collectively provide desired call routing information. These databases contain a "customer record" which specifies how each 800 call is to be routed. This record frequently contains one or more destination routing numbers associated with a particular 800 number and specifies the manner in which one of these destination routing numbers is to be selected, e.g. in accordance with the time of day, day of month, originating numbering plan area of the caller or other pre-defined method. The SCP is a transaction processor in which a transaction involves the receipt of a packet and the generation of a corresponding response. In particular, whenever a query is received, via an incoming packet, the SCP performs associated database access operations to obtain a necessary destination routing number and an inter-exchange carrier identification for the corresponding 800 call. The resulting information is then transmitted, as a packet, over the CCS network by the SCP, via one or more STPs, back to an "originating" SSP which generated the corresponding query. Once a packet containing the destination routing number and inter-exchange carrier selection is received, the originating SSP appropriately routes the 800 call to the destination routing number. If the destination routing number is within the local access and transport area (LATA) served by the local telephone company, then the 800 call is routed by the SSP, via switching and transport facilities provided by that telephone company, directly to the destination routing number. Alternatively, if the destination routing number is outside the LATA served by the local telephone company, then the SSP routes the 800 call to the identified inter-exchange carrier with specific instructions (e.g. the destination routing number) for routing the call to its final destination.
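The selection of a destination routing number from a customer record may be pictured with the following minimal sketch. The record layout, the field names, the telephone numbers and the carrier code are assumptions made for illustration; the sketch shows only selection by time of day and originating numbering plan area, two of the criteria named above.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Optional, Tuple


@dataclass
class Rule:
    destination: str                    # destination routing number (hypothetical)
    start: time = time(0, 0)            # start of the time-of-day window
    end: time = time(23, 59, 59)        # end of the time-of-day window
    npas: Optional[frozenset] = None    # originating NPAs; None matches any NPA


@dataclass
class CustomerRecord:
    number_800: str
    carrier: str                        # inter-exchange carrier identification
    rules: List[Rule]


def select_destination(record: CustomerRecord, call_time: time,
                       caller_npa: str) -> Tuple[str, str]:
    """Return (destination routing number, carrier) from the first matching rule."""
    for rule in record.rules:
        in_window = rule.start <= call_time <= rule.end
        npa_ok = rule.npas is None or caller_npa in rule.npas
        if in_window and npa_ok:
            return rule.destination, record.carrier
    raise LookupError("no rule matches this call")


# Hypothetical record: business-hours calls from NPA 201 go to one number,
# all other calls fall through to a default number.
RECORD = CustomerRecord(
    number_800="8005551234",
    carrier="IXC-1",
    rules=[
        Rule("2015550100", time(9, 0), time(17, 0), frozenset({"201"})),
        Rule("3125550199"),
    ],
)
```

Rules are tried in order, so the last rule in this assumed record acts as a default for any call that no earlier rule matches.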
To ensure that 800 service is reliably provided all the time, an SCP must operate with minimal down time, preferably less than three minutes per year. Two techniques are often used in the art to impart a high degree of fault tolerance to the SCP: use of redundant link connections from the CCS network to an SCP and use of fault tolerant processors within the SCP.
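To put the three-minute figure in perspective, the corresponding availability can be computed directly. The short sketch below assumes a 365-day year and ignores leap years; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def availability(downtime_minutes_per_year: float) -> float:
    """Fraction of the year during which the system is in service."""
    return 1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR


# Three minutes of down time per year corresponds to an availability
# of roughly 0.999994, i.e. about 99.9994 percent.
three_minute_availability = availability(3)
```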
First, each SSP is connected, by separate links, to two different STPs, each of which, in turn, is connected by separate links to two different SCPs. In the event one of these links fails, then traffic destined to a particular SCP is re-routed thereto via another link or a different STP.
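The effect of this link redundancy can be sketched as a simple path search. The topology, the node names and the function below are hypothetical; the sketch assumes one SSP, two STPs and two SCPs, connected as described above, and merely looks for any path that avoids failed links.

```python
# Hypothetical topology: the SSP has links to two STPs, and each STP to two SCPs.
LINKS = {
    ("SSP1", "STP1"), ("SSP1", "STP2"),
    ("STP1", "SCP1"), ("STP1", "SCP2"),
    ("STP2", "SCP1"), ("STP2", "SCP2"),
}


def find_path(ssp, scp, failed_links=frozenset(), stps=("STP1", "STP2")):
    """Return the first SSP -> STP -> SCP path whose links are all in service."""
    for stp in stps:
        hops = [(ssp, stp), (stp, scp)]
        if all(hop in LINKS and hop not in failed_links for hop in hops):
            return [ssp, stp, scp]
    return None  # no surviving path to the requested SCP
```

With no failures the sketch returns the path through STP1; if the link from STP1 to SCP1 is marked as failed, traffic for SCP1 is instead carried through STP2.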
Second, the art teaches that each processor used within the SCP must be fault tolerant. Generally, SCPs known in the art contain two front end processors, two back end processors and a fault tolerant mass memory device, such as a dual ported disk drive. Each front end processor is connected to both incoming links that terminate at the SCP. Each back end processor is connected to both front end processors. Also, each back end processor is connected to the disk drive through one of its dual ports. In operation, one of the front end processors formulates a database query in response to each packet appearing on either link. One of the back end processors receives a query from the front end processor and, in response, performs various database look-up operations, through one of the ports of the disk drive, to obtain the desired information. The front and back end processors are both fault tolerant. Specifically, the fault tolerant processors, as is typically taught in the art, operate in a hot/standby configuration in which one processor (the "hot" processor) is actively processing packets or queries while the other (the "standby" processor) remains in a standby condition ready to take over processing in the event the hot processor fails. A typical example of such a processor is shown in U.S. Pat. No. 4,484,275 (issued Nov. 20, 1984 to J. Katzman et al and hereinafter referred to as the '275 patent).
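The hot/standby arrangement can be illustrated with the following simplified sketch. It is not the mechanism of the '275 patent; the class names are hypothetical, failure detection is reduced to an exception, and a single retry on the standby processor stands in for the far more elaborate fault detection and recovery used in practice.

```python
class Processor:
    """A stand-in for one processor of a hot/standby pair (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, query):
        if not self.healthy:
            raise RuntimeError(self.name + " has failed")
        return self.name + " answered " + query


class HotStandbyPair:
    """Only the hot processor does work; the standby idles until a failure."""

    def __init__(self, hot, standby):
        self.hot = hot
        self.standby = standby

    def handle(self, query):
        try:
            return self.hot.handle(query)
        except RuntimeError:
            # Fail over: the standby becomes the hot processor and retries once.
            self.hot, self.standby = self.standby, self.hot
            return self.hot.handle(query)
```

Note that, in this sketch as in the arrangement it illustrates, the standby processor contributes nothing to throughput while the hot processor is in service.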
Fault tolerant processors known in the art possess serious drawbacks which substantially limit their utility in an SCP.
First, fault tolerant processors are very expensive. While a non-fault tolerant minicomputer having sufficient throughput for use in an SCP may cost upwards of $20,000, a fault tolerant processor of the same capacity may cost in the range of $100,000 or more. The additional cost is incurred as the direct result of highly specialized hardware and very sophisticated software, particularly for fault detection and recovery, needed in these processors to provide an acceptable degree of fault tolerance. Specifically, the software used in fault tolerant processors is often of such complexity that it accounts for upwards of 65% of the total cost of the processor. Thus far, the art has failed to provide any "off the shelf" software that can be purchased and installed in two relatively low cost commercially available stand alone processors to provide an acceptable degree of fault tolerant operation therebetween.
Second, in a traditional fault tolerant processor, such as that shown in the '275 patent, the hot processor handles the entire processing load while the standby processor remains idle. Consequently, in an application, such as the SCP, where queries may occur at a rate in excess of 200/second, the hot processor itself must be capable of handling all these queries. This mandates that each processor must have a sufficiently large throughput, which further increases the cost of these processors. Now, in an attempt to reduce cost, the art teaches, as shown by the architecture described in U.S. Pat. No. 4,356,546 (issued Oct. 26, 1982 to A. Whiteside et al), that different sets of tasks can each be assigned to and executed by a different processor. In this manner, each processor is not required to handle the entire processing load but rather a proportionate share thereof. Consequently, processors having a reduced capacity can be used to implement a fault tolerant architecture. Unfortunately, complex hardware and sophisticated and costly software are needed to monitor the execution of all tasks and, in the event a processor fails, re-schedule tasks among all active remaining processors. As a result, the cost savings that may accrue through the use of smaller processors are more than offset by the added expense of task monitoring and re-scheduling hardware and its attendant software.
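The re-scheduling step referred to above can be pictured, in greatly reduced form, by the sketch below. It is not the method of the Whiteside et al patent; the function name, the processor and task labels, and the policy of handing each orphaned task to the least loaded surviving processor are assumptions chosen only to make the idea concrete.

```python
def reschedule(assignments: dict, failed: str) -> dict:
    """Spread the failed processor's tasks over the surviving processors.

    `assignments` maps a processor name to its list of tasks. Each orphaned
    task goes to whichever survivor currently holds the fewest tasks
    (an assumed policy, not one taken from the cited art).
    """
    survivors = sorted(p for p in assignments if p != failed)
    if not survivors:
        raise RuntimeError("no active processors remain")
    new_assignments = {p: list(assignments[p]) for p in survivors}
    for task in assignments.get(failed, []):
        target = min(survivors, key=lambda p: len(new_assignments[p]))
        new_assignments[target].append(task)
    return new_assignments
```

Even this toy version presupposes that some component knows every task assignment and reliably learns of each failure, which suggests why the monitoring hardware and software in a real system are costly.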
Third, traditional fault tolerant processors, such as that described in the above-noted '275 patent, utilize separate processors that are tightly coupled together through multiple busses. Unfortunately, inter-connecting processors in this fashion forces each processor to be somewhat dependent upon the other processors. Specifically, should one such processor fail, then depending upon the severity of the failure, e.g. whether a processor inter-connection bus is taken down as a result, the failed processor may well disrupt the operation of one or more of the remaining processors and potentially halt all further processing occurring throughout the fault tolerant system. Other illustrative fault tolerant architectures that rely on relatively tight inter-processor coupling are shown, for example, in U.S. Pat. Nos.: 4,484,273 (issued Nov. 20, 1984 to J. Stiffler et al); 4,421,955 (issued Dec. 20, 1983 to H. Mori et al); and 4,412,281 (issued Oct. 25, 1983 to G. Works). A similar dependency problem arises in those fault tolerant processors known in the art which rely on distributing tasks through a common switch to any one of several inter-connected processors. Such an arrangement is illustratively shown in U.S. Pat. No. 4,392,199 (issued July 5, 1983 to E. Schmitter et al). Here, the overall reliability of the fault tolerant processor is limited by the reliability of the common switch. Should the switch fail, all processing will stop.
Fourth, other fault tolerant processors known in the art rely on simultaneously processing tasks through paired redundant hardware and comparing the results to isolate faults. For example, in the architecture described in U.S. Pat. No. 4,453,215 (issued June 5, 1984 to R. Reid), identical tasks are simultaneously processed, on a synchronous basis, through two identical dual processors. The results obtained from both processors within a dual processor are compared against each other to detect any errors (i.e. any differences) occurring within that dual processor and, should an error be detected, to immediately take that one of the two dual processors out of service. As a result, the faulty dual processor is taken out of service before it has a chance to transfer any possibly erroneous information onward to a bus. Although this arrangement may, in certain instances, provide an acceptable level of fault tolerance, substantial expense is incurred for quadruplicating processing elements.
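The compare-and-remove behavior just described can be sketched as follows. This is an illustration under simplifying assumptions rather than the architecture of the Reid patent: each processing element is modeled as an ordinary function, the two dual processors are run one after the other instead of synchronously, and all names are hypothetical.

```python
class DualProcessor:
    """Two processing elements whose results are compared on every task."""

    def __init__(self, name, unit_a, unit_b):
        self.name = name
        self.unit_a = unit_a
        self.unit_b = unit_b
        self.in_service = True

    def run(self, task_input):
        """Return (ok, result); on a mismatch, leave service and withhold the result."""
        a = self.unit_a(task_input)
        b = self.unit_b(task_input)
        if a != b:
            self.in_service = False  # taken out of service before any output is passed on
            return False, None
        return True, a


def process(duals, task_input):
    """Run the task on every in-service dual processor; return an agreed result."""
    outcomes = [d.run(task_input) for d in duals if d.in_service]
    good = [result for ok, result in outcomes if ok]
    if not good:
        raise RuntimeError("no dual processor produced an agreed result")
    return good[0]
```

The sketch uses four processing elements to deliver the work of one, which is the quadruplication expense noted above.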
Thus, a need exists in the art for a transaction processing system, particularly one suitable for use in a service control point, that provides a very high degree of fault tolerance without complex hardware, sophisticated software and the attendant high cost associated with traditional fault tolerant processing systems.