Radio frequency (RF) trunked communications systems are well-known in the art. Such systems use one or more system controllers to allocate communication resources (e.g., channels) among subscribers throughout the system. Accordingly, a reliable, computer-based system architecture is required to maintain system performance and provide real-time fault tolerance.
Fault tolerance can be achieved using one or more fundamental hardware architectures. Such architectures include, but are not limited to, systems employing: i) hot-standbys with voting, and ii) dynamic redundancy. Additionally, software fault tolerance techniques for supporting these architectures include: i) N-version programming, and ii) check-pointing (i.e., through use of recovery blocks). Unfortunately, each of the foregoing methodologies is inadequate for meeting the rigorous requirements of today's radio communication systems. These shortcomings are illustrated in the following discussion of each.
A hot-standby system with voting typically utilizes multiple processors (e.g., three microprocessors or the like) and an arbitrator. Each processor, while processing identical system inputs in parallel with the others, provides input to the arbitrator. The arbitrator might then elect the proper output based on the inputs provided (e.g., by comparing the respective outputs of the three microprocessors, and selecting that output which is identical to at least one other output). The problem with the foregoing approach is the requirement for additional hardware (i.e., two extra processors, in addition to the arbitrator hardware). Further, voting schemes typically do not isolate the location of a real-time fault, as any one of the processor outputs may be invalid at a given time. That is, the occurrence of an intermittent fault may go undetected until the individual outputs are sampled for validity. The extra step of sampling the outputs represents an inefficient method of obtaining fault tolerance, particularly where system up-time is critical, as in a radio communication system that might be providing emergency service communication.
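The arbitration step described above can be illustrated with a minimal sketch. It assumes each processor's output can be represented as a comparable value (here, a hypothetical channel label), and the function name `vote` is chosen for illustration only; it is not drawn from any particular system.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def vote(outputs: Sequence[Hashable]) -> Optional[Hashable]:
    """Elect the output that is identical to at least one other output.

    Returns None when no two processors agree (or when no outputs are given).
    """
    if not outputs:
        return None
    # most_common(1) yields the single most frequent output and its count.
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= 2 else None


# Hypothetical example: three processors propose a channel assignment.
elected = vote(["channel-5", "channel-5", "channel-7"])
```

Note that the sketch only elects an output; determining which processor produced the invalid value would require the additional sampling step discussed above.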
Dynamic redundancy systems typically include a dual microprocessor arrangement, where both processors are processing inputs, or stimuli, while only one processor (i.e., the so-called active processor) generates an output, or response. This arrangement, while an improvement over a single, non-redundant microprocessor scheme, still has significant limitations that must be overcome to make it suitable for use in a real-time communication system. In particular, problems of synchronizing information between the two processors, as well as the time required to detect failure of the active processor, are but a few of the notable shortcomings of such a system. Of course, the potential loss of information, and an undesirable time delay associated with switch-over after a fault is detected, make this approach impractical to use in a radio communication system.
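A dual-processor arrangement of this kind might be sketched as follows. The class names `Processor` and `DualProcessorController`, the `healthy` flag, and the response format are all hypothetical. The sketch also idealizes fault detection as immediate, whereas the detection delay and state synchronization of a real system are precisely the shortcomings noted above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Processor:
    name: str
    healthy: bool = True                      # hypothetical fault indicator
    state: List[str] = field(default_factory=list)

    def process(self, stimulus: str) -> Optional[str]:
        """Record the stimulus and return a response, or None if faulted."""
        if not self.healthy:
            return None
        self.state.append(stimulus)
        return f"{self.name}:grant:{stimulus}"


class DualProcessorController:
    def __init__(self, active: Processor, standby: Processor) -> None:
        self.active = active
        self.standby = standby

    def handle(self, stimulus: str) -> Optional[str]:
        # Both processors process every stimulus.
        active_response = self.active.process(stimulus)
        standby_response = self.standby.process(stimulus)
        if active_response is None:
            # Fault detected on the active processor: switch over.
            self.active, self.standby = self.standby, self.active
            return standby_response
        # Normal operation: only the active processor's response is emitted.
        return active_response
```

In this idealized form the standby has already seen every stimulus, so no information is lost at switch-over; a practical system would have to keep the two processors synchronized to obtain the same property.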
As with any computer-based system, the hardware components perform tasks in response to software instructions. It should be noted that the foregoing hardware architectures are typically supported by one of two software (i.e., programming) methods: 1) N-version programming, or 2) check-point programming.
An N-version programming method can be defined as N independently programmed, but functionally equivalent, programs operating concurrently. For example, in a two-processor arrangement, there exist two separate operating systems, each providing directives to one of the processors. This approach, however, has a disadvantage in that the software development required is increased by a factor of N. Of course, as N increases, the software development costs increase, thereby making this approach an even less desirable alternative.
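As a rough illustration of N-version programming with N = 2, the sketch below shows two independently written, functionally equivalent routines that select an idle channel, together with a comparison of their results. The function names and the channel-selection task are hypothetical, and the versions are run one after another here for simplicity rather than concurrently.

```python
from typing import Callable, List, Sequence


def channel_v1(busy: Sequence[bool]) -> int:
    """Version 1: scan for the first idle channel; -1 if none is idle."""
    for index, is_busy in enumerate(busy):
        if not is_busy:
            return index
    return -1


def channel_v2(busy: Sequence[bool]) -> int:
    """Version 2: written independently, same specification as version 1."""
    idle = [index for index, is_busy in enumerate(busy) if not is_busy]
    return min(idle) if idle else -1


def run_n_versions(versions: List[Callable[[Sequence[bool]], int]],
                   busy: Sequence[bool]) -> int:
    """Run every version on the same input and require identical results."""
    results = [version(busy) for version in versions]
    if len(set(results)) != 1:
        raise RuntimeError(f"versions disagree: {results}")
    return results[0]
```

Even in this toy case the same specification had to be implemented twice, which reflects the factor-of-N development cost described above.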
By contrast, check-point programming involves a technique under which a primary task is divided into blocks, the end of each of which constitutes a so-called check point. During normal system operation, these blocks are executed and the process state is saved at each check point. In the event of a task failure, the failed task can be re-executed from the last check point. That is, by retrieving the recorded process state data from the last check point, the system is able to service the fault and continue processing. While check pointing provides a marginal improvement over the N-version programming approach, it still does not provide adequate fault recovery for a real-time system. In particular, the effectiveness of the system's fault recovery is directly tied to the frequency of check point operations. That is, for a check point system to be truly fault tolerant (i.e., where faults are virtually transparent to the user, and time delays minimal) there would have to be a large number of check points. Of course, storing process data consumes otherwise available processor time. For this reason, such a system could not be efficiently employed as a radio communication system controller.
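The check-point technique can be sketched as follows, under the assumptions that the process state fits in a dictionary, that a block signals failure by raising `RuntimeError`, and that a failed block is simply retried from the last check point. The function name `run_with_checkpoints` and the `max_retries` parameter are hypothetical.

```python
import copy
from typing import Callable, Dict, List

State = Dict[str, int]
Block = Callable[[State], None]


def run_with_checkpoints(blocks: List[Block], state: State,
                         max_retries: int = 3) -> State:
    """Execute blocks in order, saving the state at the end of each block.

    If a block fails, the state is restored from the last check point and
    the block is re-executed, up to max_retries additional attempts.
    """
    state = copy.deepcopy(state)        # work on a private copy
    checkpoint = copy.deepcopy(state)   # initial check point
    for block in blocks:
        for _attempt in range(max_retries + 1):
            try:
                block(state)
                break
            except RuntimeError:
                # Roll back any partial work to the last check point.
                state = copy.deepcopy(checkpoint)
        else:
            raise RuntimeError("block failed after all retries")
        # End of block reached: record a new check point.
        checkpoint = copy.deepcopy(state)
    return state
```

Each check point here costs a full copy of the state, which illustrates why increasing the number of check points consumes processor time that would otherwise be available for the primary task.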
Accordingly, there exists a need for a radio communications system controller which, through limited hardware and software redundancy, provides a continuous, real-time output. This output should be reliable throughout normal system operation, and should, through fault detection logic, maintain a smooth transition between the primary and auxiliary processing units.