There are many applications for software based digital processing systems that need to be particularly reliable. The need for high reliability may be for the provision of safety, as in the case of software controlling the flight surfaces of an otherwise inherently unstable aircraft, or for the control of a potentially hazardous industrial process. High reliability is also required in systems handling financial transactions. The reliability issue can be divided into software reliability and hardware reliability.
Regarding software reliability, there is a difficulty in proving correct operation since “conventional” programming languages are based upon the destructive assignment statement at the very heart of the von Neumann paradigm of computation. The level of indirection introduced as a consequence of “location addressing”, i.e. that the contents of a given storage location have no relation to its address, results in intractable problems which, arguably, become manifest throughout an entire computer system. This essentially means that entities such as formal system proofs and meaningful system metrics cannot be attained whilst using this class of language. Also, conventional processors tend to have instruction sets which are completely defined. As a consequence, if there is an error in code being executed, the resulting output may be unpredictable, even if the error is well defined.
At the same time, it is well known that purely declarative programs are very amenable to formal proof of conformance to a given system specification, and that, with the assistance of various formal toolsets, such proofs can also highlight any inconsistencies or ambiguities contained within that system specification.
The inventors have recognised that if the software component of a system is provably correct at the outset, it will remain correct in the future. On the other hand, even if the hardware component of the system is provably correct at the outset, it will eventually fail. This leads to the aim of producing a hardware design which is provably correct at the outset, and whose operational correctness is checked repeatedly throughout its life.
The VIPER 1 and VIPER 2 projects were a very serious attempt to realise a formally verified processor, made some years ago, and were used in a railway signalling application. They were criticised as being too slow and restrictive, and did not gain widespread adoption. VIPER 1 and VIPER 2 were processors based upon the “conventional” location addressing paradigm. In effect, the degree of proof which they could attain was the consequence of an engineering trade-off where a severely restricted set of machine code instructions impeded the usefulness of the reduced instruction set computer.
Specialised processors more suited to some declarative languages have been developed. For example, dataflow architectures have been developed over several decades. One is shown in an article in Communications of the ACM, January 1985, vol. 28, no. 1, “The Manchester Prototype Dataflow Computer” by Gurd et al. This involves a pipelined ring structure including a token queue, for storing data and instructions to be processed, a combiner (also called a matching unit), for combining instructions and associated data, an instruction store containing machine code for each instruction, and a number of execution units coupled in parallel for carrying out particular functions. The ring also contains a switch for switching the output of the execution units either to a system output, or back to the start of the ring, the token queue, for further processing.
The reliability and provability of correct operation of this type of hardware architecture, or of its component parts, still present problems. Conventional ways of improving hardware reliability include specifying high reliability components for mission critical parts, carrying out burn-in of parts, and providing redundancy at component and/or system level. A drawback with redundancy is the additional cost, and the risk of failure in the hardware which detects failure and selects which of the redundant systems or components to use in the event of a fault. Such additional complexity makes the task of verifying correct operation, or of being certain of detecting faulty operation, much more difficult. Another conventional way of handling both hardware and software faults, such as radiation induced errors in stored values, is to include a checking mechanism where, for example, a parity check is the simplest method of detecting the occurrence of a single error. This is used in some random access memories (RAM), which store a parity check bit for each byte of data, then verify that the parity bit is correct for that byte when the byte is read out.
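By way of illustration only, the per-byte parity check mentioned above can be sketched in software as follows. The function names and the choice of even parity are assumptions made for this sketch and are not taken from any particular memory device. The sketch also shows the known limitation that a double-bit error leaves the parity unchanged.

```python
def parity_bit(byte: int) -> int:
    """Even-parity bit: 1 when the byte holds an odd number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2


def store(byte: int) -> tuple:
    """Model a RAM cell that keeps one parity bit alongside each byte."""
    return (byte & 0xFF, parity_bit(byte))


def read(cell: tuple) -> int:
    """Re-compute the parity on read-out and reject the byte on a mismatch."""
    byte, stored_parity = cell
    if parity_bit(byte) != stored_parity:
        raise ValueError("parity error")
    return byte


cell = store(0b10110010)
single_flip = (cell[0] ^ 0b00000100, cell[1])  # one bit corrupted in storage
double_flip = (cell[0] ^ 0b00000110, cell[1])  # two bits corrupted in storage
```

Reading `single_flip` raises an error, whereas reading `double_flip` returns a wrong byte without complaint, which is why a single parity bit is described as detecting only a single error.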
It is an object of the invention to provide improved arrangements which address the above mentioned problems.
According to the invention there is provided a processor for executing instructions, comprising a data store, an instruction store, a combiner for combining instructions and data associated with a respective one of the instructions, processing elements for carrying out the instructions and outputting results, wherein the processing elements and the combiner comprise trusted circuitry, the trusted circuitry comprising circuitry whose design has been proven to operate correctly, and comprising self checking circuitry for checking that it has not operated incorrectly, the processor further comprising circuitry for checking for errors in data and instructions input to the processing elements and to the combiner.
An advantage of the combination of error checking at the inputs to the processing elements and the combiner, and having self-checking circuitry for these parts, is that the amount of circuitry which needs to be trusted (i.e. of proven design and verified operation) can be limited. This enables other parts of the processor to use circuitry which is not necessarily rigorously verified, and which can therefore be constructed more simply and operate faster.
An advantage of checking the combined data and instruction, rather than e.g. conventional parity checking of individual bytes read out of RAM, is that a wider range of errors, such as addressing errors, and multi-bit errors, can be detected before execution.
Furthermore, an advantage of having error checking at the input to the processing elements is that it enables different types of data to be segregated. Different parts of the circuitry can be allocated to handle different types of data, and finer granularity of checking and error confinement can be used to check that the right type or form of data is being input. This can provide a greater guarantee of correct segregation of data, and thus further guarantee integrity of the system. It is a simpler and more direct method than existing software based segregation of different types of data.
Preferably the processor comprises circuitry for detecting an error in data output by the processor.
An advantage of this is that it enables untrusted (or possibly flawed) circuitry such as data storage elements, to be used before the data is output, yet still maintain reassurance that correct data is being output.
Preferably circuitry for checking the data comprises circuitry for adding error detection information to the data before the data is passed to untrusted circuitry, and circuitry for using the error detecting information to detect errors in the data after it has passed through the untrusted circuitry. This is an efficient way of verifying that the data has not been corrupted by the untrusted parts of the circuitry, with little reduction in bandwidth or additional cost in processing time or hardware.
Preferably the circuitry for detecting an error and adding the error detection information comprises trusted circuitry. An advantage of this is that otherwise any errors in the detecting or adding of this information might not be captured, and the error detection cannot be trusted completely. This helps ensure that every possible error will be captured and contained.
Preferably the processor is arranged to handle data of different types, and comprises circuitry for detecting which type a given piece of data is and checking that the type is a valid type for whatever operation is to be carried out on the data.
This is a preferred way of enabling segregation of different types of data, such as highly critical data or partially processed data, to ensure that such types are processed or output at the correct time and, for example, by the correct piece of hardware or to the correct destination.
Preferably the type of data is indicated in a label attached to the given piece of data.
Again, this is an efficient way of enabling the different types of data to be certainly and assuredly segregated and processed accordingly.
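A minimal software sketch of such label-based type checking is given here for illustration. The label names, the operation names and the table of permitted combinations are all hypothetical and are chosen only to show the principle of rejecting data whose label is not valid for the requested operation.

```python
from enum import Enum


class Label(Enum):
    """Hypothetical data types carried in the label attached to each datum."""
    CRITICAL = "critical"
    PARTIAL = "partial"
    RESULT = "result"


# Hypothetical table: which labels each operation is permitted to accept.
ALLOWED = {
    "accumulate": {Label.PARTIAL, Label.CRITICAL},
    "output": {Label.RESULT},
}


def type_is_valid(operation: str, label: Label) -> bool:
    """Return True only if data carrying this label may enter the operation."""
    return label in ALLOWED.get(operation, set())
```

In this sketch, partially processed data presented to the `output` operation is rejected, so it cannot leave the system before processing is complete.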
Preferably the error detection information relates to a bound data packet comprising the label and the associated data.
An advantage of the error detection being at this level is that it can catch incorrect labels or data, and also catch an incorrect association of otherwise correct label and data.
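The effect of computing the error detection information over the bound packet, rather than over the label and the data separately, can be sketched as follows. The use of CRC-32 (via Python's standard `zlib.crc32`) and the one-byte label and four-byte data widths are assumptions made for this sketch; the text above does not specify a particular code or packet layout.

```python
import zlib
from typing import NamedTuple


class Packet(NamedTuple):
    label: int   # data type label (assumed to fit in one byte)
    data: int    # payload (assumed to fit in four bytes)
    check: int   # error detection information over label AND data


def _check(label: int, data: int) -> int:
    """Compute the check value over the bound label-plus-data packet."""
    payload = label.to_bytes(1, "big") + data.to_bytes(4, "big")
    return zlib.crc32(payload)


def seal(label: int, data: int) -> Packet:
    """Add error detection information before passing to untrusted circuitry."""
    return Packet(label, data, _check(label, data))


def verify(packet: Packet) -> bool:
    """Check the packet after it has passed through untrusted circuitry."""
    return packet.check == _check(packet.label, packet.data)


p1 = seal(1, 100)
p2 = seal(2, 200)
# Each field is individually plausible, but label and data are wrongly paired.
mismatched = Packet(p1.label, p2.data, p1.check)
```

Here `verify(mismatched)` fails even though the label and the data each came from a valid packet, which corresponds to catching an incorrect association of an otherwise correct label and data.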
Preferably the self-checking circuitry comprises a series of state machines, comprising at least a first and a second state machine, the first state machine being arranged to receive one or more data inputs to be checked, and provided with data outputs for reflecting the one or more data inputs, and an alarm output for indicating that the data inputs are incorrect, the second state machine being coupled to the data outputs and the alarm output of the first state machine, and being arranged to verify that the data output and the alarm output of the first state machine are correct.
An advantage of such cascaded state machines for checking is that because the outputs mirror the inputs, it is possible to use identical or near identical state machines throughout the series. The more finite state machines there are in the series, the higher is the assurance that any error in the inputs or in the self-checking circuitry will be detected. Thus once the circuit design is proved for one state machine, others can be added easily to give any desired degree of assurance, without increasing the burden of proving the design. In particular this gives reliable detection of multiple errors in the state machines. In contrast, in a parallel redundant scheme, it is possible for some multiple errors to go undetected.
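The behaviour of such a series of identical checking stages can be modelled in software as a rough sketch. The validity rule (even parity over the word), the class and function names, and the fault-injection switches are hypothetical and serve only to show how a later stage catches a fault in an earlier one.

```python
def valid(word: int) -> bool:
    """Hypothetical validity rule: the word must contain an even number of 1s."""
    return bin(word).count("1") % 2 == 0


class Stage:
    """One checking stage: reflects its data input and drives an alarm output."""

    def __init__(self, corrupt_mask: int = 0, alarm_stuck_low: bool = False):
        self.corrupt_mask = corrupt_mask        # injected fault on the data output
        self.alarm_stuck_low = alarm_stuck_low  # injected fault on the alarm output

    def step(self, word: int, alarm_in: bool = False) -> tuple:
        alarm = alarm_in or not valid(word)
        if self.alarm_stuck_low:
            alarm = False
        return word ^ self.corrupt_mask, alarm


def run(stages: list, word: int) -> tuple:
    """Pass a word through the series; return the final data and alarm outputs."""
    alarm = False
    for stage in stages:
        word, alarm = stage.step(word, alarm)
    return word, alarm
```

With a single stage whose alarm output is stuck low, an invalid input passes unnoticed. Adding a second identical stage raises the alarm, because the second stage re-checks the data reflected by the first.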
Preferably the first state machine comprises a processing function and is arranged to output one or more processed data outputs, and all the subsequent state machines in the series are arranged to receive the processed data outputs from a respective preceding one of the state machines, check if any are incorrect and output them to a respective succeeding one of the state machines.
An advantage of integrating the processing function is that greater assurance of correct operation can be obtained than if the processing function is separate, and only its outputs are checked.
Preferably a data output of the last in the series of state machines is fed back into any of the state machines. An advantage arising from the feedback is that the verification can now include not only the operation of intermediate state machines, but the operation of the final state machine which drives the output data signal or signals. This is useful to cover this gap in the trusted circuitry. It may be warranted if for example the data output triggers expensive remedial action, rather than merely flagging a warning light for example.
Preferably the circuitry for checking the operation of the processing elements further comprises two or more state machines coupled in series, and circuitry for carrying out a sequence which causes toggling of each output of each state machine to verify the operation of each output of the state machines.
This enables both the logical operation of the state machines, and circuitry between the state machines for example, to be verified.
According to another aspect of the invention, there is provided a circuit arrangement comprising a series of state machines, comprising at least a first and a second state machine, the first state machine being arranged to have a data output, and an alarm output for indicating incorrect internal operation, the second state machine being coupled to the data output and the alarm output of the first state machine, and being arranged to verify that the data output and the alarm output of the first state machine are correct, a last state machine in the series being arranged to output an indication of correct operation, and a data output, the data output being fed back as an input into one of the series of state machines.
An advantage of such state machines is that their internal operation can be dynamically verified during operation. An additional advantage arising from the feedback is that the verification can now include not only the logical operation, but also the correctness of operation of the circuitry which drives the output data signal or signals.
According to another aspect of the invention, there is provided an arrangement of two or more redundant processing systems, each outputting processed data, and a selector for selecting one of the processed data outputs, the arrangement further comprising circuitry for checking the correct operation of the respective processing system, the circuitry for checking the processed outputs, and the circuitry for selecting between the processed outputs comprising trusted circuitry, the trusted circuitry comprising circuitry whose design has been proved to be correct, and comprising self-checking circuitry for checking if it has operated incorrectly.
An advantage of this arrangement is that it is no longer necessary to provide an odd number of redundant systems as is employed in conventional “voting” techniques. Instead, the trusted circuitry is sufficient to know which of an even number of systems is working incorrectly. Thus fewer redundant systems will be needed to assure a given level of reliability.
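The contrast with conventional voting can be illustrated by the following sketch. The function names are hypothetical, and the `healthy` flags are assumed to be supplied by the trusted self-checking circuitry associated with each redundant system.

```python
from collections import Counter


def majority_vote(outputs: list):
    """Conventional voting: return the value held by a strict majority, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) / 2 else None


def checked_select(outputs: list, healthy: list):
    """Select the first output whose system is reported healthy by its checker."""
    for value, ok in zip(outputs, healthy):
        if ok:
            return value
    return None


outputs = [7, 9]           # two redundant systems disagree
healthy = [False, True]    # the trusted checker flags the first system as faulty
```

With two disagreeing systems, `majority_vote(outputs)` cannot decide, whereas `checked_select(outputs, healthy)` returns the output of the system known to be operating correctly.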
Preferably the circuitry for checking the correct operation comprises the above circuit arrangement having a series of state machines.
A further aim is to identify and provide a single basic building block from which we can construct the hardware platform upon which to support that formal language. This is achieved by identifying and building a basic universal computing functional component that is expressly amenable to be designed to possess assuredly correct operation. Because such a design is difficult and expensive to achieve, the approach set out below adopts the notion of having a single design of hardware building block. However, the building block is capable of being reconfigured so as to provide a set of assured computational functions, together with a set of functions that assist in the self-checking of each functional device.
This approach uses a Boolean function which can be described as being “reversible”. This means that the Boolean function is its own inverse. Many such reversible Boolean functions exist. One has been chosen to exemplify how a reversible function assists in the checking of operational correctness. Thus, we show how a set of assured universal computational functions is obtained.
By appropriately combining a number of simple reversible functions in order to obtain a function of higher order, that higher order function can itself be trusted. The checking described can be used recursively and nested at various levels in the design. The approach described is independent of an implementation technology. It will apply to any digital processing system. The bistate devices could be implemented in, for example, an optical or electronic technology where switching is performed by bistate elementary devices.
According to this aspect of the invention, there is provided circuitry which forms a reversible gate, the circuitry comprising three or more inputs, denoted A, B, C, and the same number of outputs, a first of the outputs taking the same value as input A, a second of the outputs taking the value of input B, and a third of the outputs being arranged to have a logic value which is a reversible Boolean function of the three inputs.
An advantage of such an elementary gate is that it enables assured checking of a given logic function since it is reversible. Simply by taking the three specified outputs and applying them to inputs A, B, C of a second identical circuit, the outputs of the second circuit should regenerate the original inputs to the first circuit, A, B and C. The basic circuit can be used in serial combinations to perform a given computational function, and that overall function will also possess reversibility. The reversibility lends itself to the provision of assured checking. This checking can be achieved by comparing the outputs of the reversed function with the original input signals. Thus by combining multiple such blocks, any complex Boolean expression can be implemented efficiently. Also, since each of the blocks is trusted, the circuitry required for checking the correctness of operation of the complex Boolean expression can be provided easily and proved with a minimum of effort. Such types of logic also lend themselves to implementation in optical circuitry or any other type of digital technology, for appropriate applications.
Preferably the reversible Boolean function comprises the function (A AND B) XOR C. This type of reversible Boolean logic is particularly useful as it can be used to form logical AND, XOR, NOT and COPY functions, simply by tying one of the three inputs as described below in the detailed description section.
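The reversibility of this function, and one possible way of deriving the AND, XOR, NOT and COPY functions from it, can be checked with the short sketch below. The particular choice of which inputs are tied to constants is an assumption made for illustration and may differ from the arrangement set out in the detailed description.

```python
from itertools import product


def gate(a: int, b: int, c: int) -> tuple:
    """Reversible gate: outputs (A, B, (A AND B) XOR C)."""
    return a, b, (a & b) ^ c


# Applying the gate to its own outputs regenerates the original inputs.
IS_SELF_INVERSE = all(
    gate(*gate(a, b, c)) == (a, b, c) for a, b, c in product((0, 1), repeat=3)
)


# Derived functions obtained by tying inputs to constants (assumed ties).
def and_(a: int, b: int) -> int:
    return gate(a, b, 0)[2]      # C tied to 0: (A AND B) XOR 0


def xor(b: int, c: int) -> int:
    return gate(1, b, c)[2]      # A tied to 1: B XOR C


def not_(c: int) -> int:
    return gate(1, 1, c)[2]      # A and B tied to 1: 1 XOR C


def copy(a: int) -> tuple:
    out = gate(a, 1, 0)          # B tied to 1, C tied to 0
    return out[0], out[2]        # first and third outputs both carry A
```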
According to a further aspect of the invention, there is provided an arrangement comprising a first and a second reversible logic block coupled in series, and a comparator arranged to verify the operation of the first reversible logic block by comparing an input of the first logic block with an output of the second reversible logic block, the first and second logic blocks having the same internal operation.
An advantage of such an arrangement is that since the logic block is reversible, the same block can be used for verification, as for implementing the function. Thus once the internal design of the block is proved to be correct, to implement the desired function, the circuitry for verification that the block is operating correctly, can be added with little or no extra effort required to prove that the verification circuitry has been designed correctly.
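A software model of this series arrangement is sketched below, taking (A AND B) XOR C as the reversible function. The stuck-at-0 fault model and the function names are hypothetical; they are included only to show the comparator flagging a faulty first block.

```python
def gate(a: int, b: int, c: int) -> tuple:
    """Correctly operating reversible block: (A, B, (A AND B) XOR C)."""
    return a, b, (a & b) ^ c


def faulty_gate(a: int, b: int, c: int) -> tuple:
    """Hypothetical fault: the third output of the block is stuck at 0."""
    return a, b, 0


def checked(first_block, inputs: tuple) -> tuple:
    """Run the first block, reverse it with an identical second block, compare."""
    forward = first_block(*inputs)
    regenerated = gate(*forward)               # second block, same internal design
    return forward, regenerated == tuple(inputs)   # comparator output


good_result, good_ok = checked(gate, (1, 1, 0))
bad_result, bad_ok = checked(faulty_gate, (1, 1, 0))
```

For the inputs (1, 1, 0) the correct third output is 1. The faulty block outputs 0, the second block then regenerates (1, 1, 1) rather than (1, 1, 0), and the comparator reports the mismatch.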
Preferably, the first reversible logic block comprises two or more of the above mentioned reversible gates coupled so as to implement a more complex Boolean logic function.
Another aspect of the invention provides a processor for executing instructions, comprising a data store, an instruction store, a combiner for combining instructions and data associated with a respective one of the instructions, and processing elements for carrying out the instructions and outputting results, the data store having an arrangement to turn off a portion of the data store found to be faulty during operation of the data store. This enables reliability to be improved easily and cost effectively. A convenient way of achieving this is to use a content addressable memory.
Another aspect of the invention provides a processor for executing instructions, comprising a data store, an instruction store, a combiner for combining instructions and data associated with a respective one of the instructions, processing elements for carrying out the instructions and outputting results, one or more external interfaces and a selector for selectively coupling the external interfaces to the processing elements. This brings two advantages. Firstly, redundancy can be provided, to avoid a failed processor element blocking an interface. Secondly, it can enable a single interface to be coupled to multiple processors in parallel for faster operation, or multiple interfaces to be coupled in parallel, as appropriate.
A further aspect of the invention provides a memory arrangement having storage elements, addressable by a content addressing arrangement, and an arrangement for turning off storage elements found to be faulty during operation, while maintaining availability of the remaining storage elements.
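The following sketch models such a memory in software. The class name, the method names and the allocation policy (first free enabled element) are assumptions made for illustration. Because look-up is by content rather than by location, turning off one element does not disturb access to the others.

```python
class ContentAddressableStore:
    """Toy content-addressed memory whose faulty elements can be turned off."""

    def __init__(self, size: int):
        self.cells = [None] * size      # each cell holds (tag, value) or None
        self.enabled = [True] * size    # False once an element is turned off

    def write(self, tag, value) -> int:
        """Overwrite a matching tag, else use the first free enabled element."""
        free = None
        for i, cell in enumerate(self.cells):
            if not self.enabled[i]:
                continue
            if cell is not None and cell[0] == tag:
                self.cells[i] = (tag, value)
                return i
            if cell is None and free is None:
                free = i
        if free is None:
            raise MemoryError("no enabled storage element available")
        self.cells[free] = (tag, value)
        return free

    def read(self, tag):
        """Return the value whose tag matches, searching enabled elements only."""
        for i, cell in enumerate(self.cells):
            if self.enabled[i] and cell is not None and cell[0] == tag:
                return cell[1]
        return None

    def disable(self, index: int) -> None:
        """Turn off an element found to be faulty; the rest remain available."""
        self.enabled[index] = False
        self.cells[index] = None
```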
A further aspect of the invention provides a processor for executing instructions, comprising a data store, an instruction store, a combiner for combining instructions and data associated with a respective one of the instructions, processing elements for carrying out the instructions and outputting results, the processor having a fault detector for indicating whether an instruction has been carried out successfully, the processor being arranged to store an instruction until it has been carried out successfully, and to repeat an instruction in response to an indication that the instruction has not been carried out successfully. Such recovery from faults again enables the reliability to be improved in a cost effective manner.
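This store-and-repeat behaviour can be sketched as follows. The function names, the `(ok, value)` return convention standing in for the fault detector, and the bound on the number of attempts are assumptions made for this sketch; the text above does not specify a retry limit.

```python
from collections import deque


def run_until_done(instructions, execute, max_attempts: int = 3) -> list:
    """Keep each instruction stored until the fault detector reports success."""
    pending = deque(instructions)
    results = []
    while pending:
        instruction = pending[0]            # remains stored while it is retried
        for _ in range(max_attempts):
            ok, value = execute(instruction)
            if ok:
                results.append(value)
                pending.popleft()           # discard only after success
                break
        else:
            raise RuntimeError(f"instruction {instruction!r} kept failing")
    return results


def make_flaky_executor():
    """Hypothetical processing element that fails once per instruction."""
    seen = set()

    def execute(instruction):
        if instruction not in seen:
            seen.add(instruction)
            return False, None              # fault detector reports a failure
        return True, instruction * 2

    return execute
```

In this sketch each instruction fails on its first attempt and succeeds when repeated, so `run_until_done([1, 2, 3], make_flaky_executor())` still yields a complete set of results.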
Further embodiments of the invention provide methods of operation of the above hardware, methods of using the above hardware to produce output signals, and systems for running software written in a declarative language on the above hardware.
The preferred features may be combined in any manner, or combined with any of the aspects of the invention, as would be apparent to a person skilled in the art. Other advantages than those mentioned above will be apparent to a person skilled in the art, particularly in relation to prior art other than that discussed above.