Owing largely to the history of microprocessor development, the von Neumann processor, with a single arithmetic-logic unit through which all data must pass, is a common reference against which other processing models are compared. The computational model incorporating the von Neumann processor typically envisions a sequential processor, a randomly addressable memory, a single arithmetic-logic unit (ALU), and a control unit. The memory stores information and instructions, and the ALU transforms bit patterns. The control unit reads data and instructions from memory, routes data through the ALU, and writes the results back into memory. This computational model is deeply embedded in programming languages such as C and MATLAB. For example, when the computer executes a function, such as sin(x), the main flow of execution stops; the sin(x) function is executed, typically to termination, and the main program flow resumes where it left off. Sequential processors execute alternative computations by switching the program flow through conditional branching. Program agility is thus achieved by changing the flow of execution. Time efficiency is not intrinsic in a sequential model of operation. According to different control inputs, different programs or sub-programs are granted run-priority. In one case, the processor executes one sequence of instructions. In another case, the processor executes a different sequence of instructions.
FIG. 1 illustrates one embodiment of a von Neumann type processor. An input 101 is received by the microprocessor 105. A memory module 103 coupled to the microprocessor 105 contains various processes and algorithms for processing incoming data, and is capable of downloading these programs into the processor 105. An output module 107 is coupled with the processor output, and is capable of receiving the processor output.
FIG. 2 is an exemplary looping process commonly occurring in conjunction with the architecture of the von Neumann processor illustrated in FIG. 1. According to the step 110, the processor 105 receives input data “D” from the input module 101. According to the step 112, a counter value “n” is set to one. According to the step 114, the process n within the memory area 103 is loaded into the microprocessor 105. According to the step 116, the data D is processed with algorithm n. According to the step 118, the output module 107 evaluates whether the processed data falls within a pre-determined range. According to the step 120, if the processed data falls within the pre-determined range, the processed data is sent to the output module 107. If, in the step 118, the processed data falls outside the pre-determined range, the value n is incremented by one in the step 122, and the process returns to the step 114, loading the process n into the microprocessor. According to the process illustrated in FIG. 2, the “looping” is recurrent until a desired data outcome is derived. The number of loops may be determined by control signals which are themselves generated by output data. Alternatively, the number of loops may depend upon the execution of a predetermined sequence of operations. The process illustrated in FIG. 2 is exemplary of one form of a “looping” program, wherein successive outputs are discarded if they are not within a specified range. Alternative looping programs are possible, such as accumulating successive outputs of processed data which have been processed by various algorithms successively loaded into the processor. The essential point illustrated by FIG. 2, however, is that looping programs which require multiple iterations become time-consuming, each iteration adding to the total processing time. The same phenomenon occurs with branching programs wherein a branch “dead ends” and must be recalculated according to a different algorithm.
Thus, such architectures are not time optimized.
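By way of illustration only, the looping process of FIG. 2 may be sketched in Python as follows. The function name run_fig2_loop, the parameter names, the three example algorithms and the range limits are hypothetical and do not appear in the figures. The sketch also assumes that the loop ends when the stored algorithms are exhausted, a condition that FIG. 2 does not specify.

```python
from typing import Callable, Optional, Sequence, Tuple


def run_fig2_loop(
    data: float,
    algorithms: Sequence[Callable[[float], float]],
    low: float,
    high: float,
) -> Optional[Tuple[int, float]]:
    """Apply stored algorithms one at a time until an output is in range.

    Returns (n, result) for the first algorithm n whose output lies in
    [low, high], or None if no algorithm qualifies (an assumption; FIG. 2
    does not describe what happens when the algorithms run out).
    """
    # enumerate(..., start=1) plays the role of the counter n (steps 112, 122).
    for n, algorithm in enumerate(algorithms, start=1):
        result = algorithm(data)        # load and run algorithm n (steps 114, 116)
        if low <= result <= high:       # range check (step 118)
            return n, result            # send to output (step 120)
        # Out of range: the result is discarded and the next algorithm is tried.
    return None


# Hypothetical algorithms standing in for the contents of memory module 103.
example_algorithms = [
    lambda d: d * 10.0,
    lambda d: d + 100.0,
    lambda d: d / 2.0,
]

outcome = run_fig2_loop(8.0, example_algorithms, low=0.0, high=5.0)
print(outcome)
```

In this sketch every rejected iteration is pure overhead: the work of the first two algorithms is discarded before the third produces an acceptable result, which is the time cost described above.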
A second limitation of serial processing techniques generally associated with RISC (Reduced Instruction Set Computer), DSP (Digital Signal Processor) and von Neumann type serial processors arises from the inability of serial processing techniques to take full advantage of ultra low power (“ULP”) technology. In spacecraft applications, the need to conserve power is critical. This makes ULP technology particularly attractive in spacecraft applications. The limitation of serial processing techniques in ULP technology can be illustrated by understanding the sources of power consumption in a CMOS circuit. Dynamic power consumption occurs when a transistor switches state, and is proportional to the square of the voltage. From this, it is easily understood that, when power voltage levels are reduced from approximately five volts to approximately one-half volt, dynamic power consumption may be reduced by approximately two orders of magnitude. Static, or parasitic, power consumption, on the other hand, is generally proportional to the source and drain area, and therefore increases with the number of transistors in the circuit. Static power dissipation generally occurs due to leakage in parasitic source and drain diodes. In conventional CMOS circuits in the 5 volt range, the dynamic power consumption is typically the dominant source of energy consumption. For this reason, there is little parallelism in most serial type processing models. However, if the same fundamental schematic used in a traditional 5-volt CMOS circuit were used for a ULP circuit, the proportion of total power lost through static or parasitic power consumption would increase. Static power consumption occurs regardless of processing.
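The voltage scaling described above may be checked with a short calculation, sketched here in Python. The function names and the power figures of 100 units and 1 unit are hypothetical values chosen only for illustration. The sketch further assumes, as a simplification, that static power is unchanged when the supply voltage is lowered.

```python
def dynamic_power_ratio(v_high: float, v_low: float) -> float:
    """Factor by which dynamic power falls, given power proportional to V squared."""
    return (v_high / v_low) ** 2


def static_fraction(dynamic_power: float, static_power: float) -> float:
    """Share of total power that is static (parasitic) consumption."""
    return static_power / (dynamic_power + static_power)


# Supply voltage lowered from about five volts to about one-half volt.
ratio = dynamic_power_ratio(5.0, 0.5)

# Hypothetical 5-volt circuit: 100 units of dynamic power, 1 unit of static power.
dynamic_5v = 100.0
static_any = 1.0  # assumed unchanged by the voltage reduction (simplification)

share_at_5v = static_fraction(dynamic_5v, static_any)
share_at_ulp = static_fraction(dynamic_5v / ratio, static_any)

print(f"dynamic power reduction factor: {ratio:.0f}")
print(f"static share at 5 V:   {share_at_5v:.3f}")
print(f"static share at 0.5 V: {share_at_ulp:.3f}")
```

Under these assumed figures, the tenfold voltage reduction cuts dynamic power by a factor of one hundred, while the static share of total power rises from about one percent to one half.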
Additionally, resistance to radiation is particularly vital in spacecraft applications. Without the shielding of the earth's atmosphere, a circuit in outer space is bombarded with a higher level of background radiation than earthbound circuits. However, traditional CMOS processors are not easily radiation hardened without significant performance degradation. Without radiation hardening, single event upsets, single event latchup, total ionizing dose and other radiation effects due to cosmic bombardment dramatically increase the likelihood of onboard failure in spacecraft applications.
The single processing path concept inherent in the von Neumann processor is often referred to as exhibiting “minimum granularity.” As illustrated in FIG. 3, the von Neumann processor is at one end of the granularity spectrum. RISC and DSP processors are more granular than von Neumann processors. At the other end of the spectrum are Field Programmable Gate Arrays (FPGAs). FPGAs have maximum granularity, and are programmable down to the gate level. Fine-grained reconfigurable granularity offers great flexibility, and enables the architecture of the processor to be modified to closely match the architecture of the computation problem, offering the possibility of very high performance. However, fine-grained reconfigurability exacts a high price in area. It is estimated that only 1% of the area of a typical FPGA is available for usable logic; the rest is consumed in interconnect and configuration memory. Within the spectrum illustrated in FIG. 3, complex programmable logic devices (CPLDs) are slightly less granular than FPGAs, while digital signal processors (DSPs) and super-scalar CPUs are more granular than simple von Neumann-type microprocessors. Additionally, FPGAs are not typically radiation-hardened, making them particularly failure-prone in spacecraft applications where cosmic rays are unfiltered by the earth's atmosphere. Manufacture of radiation-tolerant FPGAs exacts a large price in that the currently-available radiation-tolerant FPGAs have two orders of magnitude fewer equivalent gates than non-hardened FPGAs. Moreover, complex models synthesized from existing gates in FPGAs cannot take advantage of the circuit-level and layout-level optimizations which are attainable when these models are designed by hand.
What is needed, therefore, is a processor design configuration method that can be used advantageously in ULP applications. Additionally, the need exists for a processor that can be easily manufactured to exhibit a high degree of radiation tolerance. The need also exists for a processor which can reduce the amount of wasted CMOS circuitry associated with Field Programmable Gate Array (FPGA) devices. There is a further need for a processing device that is user-configurable to maximize efficiency. There is a further need for a processing device that reduces or eliminates conditional branching, looping, retracing and re-calculating of data, as well as other programming procedures that slow processing throughput.