This invention relates to data processors containing separate multiply/divide logic, such as those contained in calculators, microcomputers, minicomputers, and large scale computers generally, and more particularly to high speed data processors with high speed multiply and divide circuits. This invention is related to applications Ser. No. 520,880, filed 8/5/83, and Ser. No. 538,634, filed 10/25/83.
The central theme for a high speed processor is speed. In the past, implementing the multiply and divide operations often required a large amount of logic, which was prohibitive for a single-chip implementation. Therefore, the processing units used to implement these functions were both large and costly.
Processing units, sometimes referred to as data processors, are very effective devices for handling menial and unimaginative tasks. These tasks typically require the monitoring of physical phenomena or the brute force manipulation of data in either arithmetic or logical operations. The speed of a processing unit determines its applicability and usefulness. Processors normally referred to as "slow" are generally infeasible for certain tasks because of the time and/or cost constraints that result from their slow operation. Other processing units, normally referred to as "fast", have traditionally been so expensive that they are impractical for certain applications.
Data processors are generally classified according to their size and ability. At the lowest end of the capability spectrum are hand-held calculators, which perform simple or routine operations; microcomputers, with their moderate cost and somewhat slow computing speed, are used in consumer and small business environments; minicomputers have a larger memory and capability and are used in industrial, laboratory, or medium business settings; and large scale computers range in size depending on their specified task and typically handle large data bases and multiple users.
The devices at the opposing ends of the classification spectrum discussed above, the hand-held calculators and the giant computers, do not meet the demands or constraints of the average user. A balancing of costs points the average user to the microcomputer or minicomputer, whose low cost makes the user willing to accept slow computing speeds.
The ability of a data processor to operate in a real time environment or mode is very advantageous. Real time operation allows task execution to be prioritized, so that a higher priority task may be performed by interrupting a lower priority task. To perform the higher priority task, the criteria and state of the lower priority task are temporarily placed in memory so that the data processing unit may, at a later time, retrieve this data and continue the task where it was prematurely terminated. The amount of time that the data processor takes to dump the material associated with a lower priority task into a section of memory reserved for this purpose, and then to retrieve it, determines whether the use of interrupts and the change to a higher priority task is practical. If a disproportionate amount of time is devoted to dumping or retrieving data, the efficiency of the processor suffers dramatically.
The architecture of a processing unit pertains to the various component parts of the processor and the interconnections between them. A data processor typically uses a Central Processing Unit (CPU) as the control means for the other component parts. The CPU is generally interfaced to or includes an Arithmetic Logic Unit (ALU).
The ALU is a device which accepts data and performs certain operations thereon. These operations are generally grouped as either arithmetic or logical in nature. The CPU controls the data delivered to the ALU and selects the operation the ALU performs. One interface of a CPU to an ALU is illustrated in U.S. Pat. No. 3,761,698 issued Sept. 25, 1973 to Stephenson.
The Arithmetic Logic Unit (ALU) operates on the actual bit structure of the data so as to implement the desired function. Data may be accepted sequentially by bit, byte, or data word, or by a multiple or submultiple thereof, via a data bus. The data is stored within the CPU or, alternatively, in memory in the form of data words. The length of the data word is used to describe the data processor, since the length is directly related to the precision of the data processor. A 16-bit data word processor is capable of defining a number with much more precision than a four-bit data word processor.
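By way of illustration only, the relationship between word length and precision may be sketched in Python as follows. The function names and the unit full-scale range are assumptions introduced for this example; the sketch simply counts the distinct bit patterns available to a word of a given length.

```python
def distinct_values(word_bits: int) -> int:
    """Number of distinct bit patterns a word of the given length can hold."""
    return 2 ** word_bits


def resolution(word_bits: int, full_scale: float = 1.0) -> float:
    """Smallest step when full_scale is divided evenly among all patterns.

    full_scale is a hypothetical range chosen only for illustration.
    """
    return full_scale / distinct_values(word_bits)


if __name__ == "__main__":
    for bits in (4, 16):
        print(bits, distinct_values(bits), resolution(bits))
```

Under these assumptions a four-bit word distinguishes 16 values, whereas a 16-bit word distinguishes 65,536, so the same range is resolved 4,096 times more finely.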
The processor accepts data, manipulates it using an Arithmetic Logic Unit, and places it in an inactive state, such as by retaining it in a memory, until the data is later needed. A communication channel electrically connects the CPU and the memory. Examples of memory include such devices as a Random Access Memory (RAM), a Read Only Memory (ROM), or a magnetic storage device such as a magnetic tape or disk. An example of the interconnection between a processor and ROM or RAM is illustrated in U.S. Pat. No. 4,064,554 issued Sept. 20, 1977 to Tubbs.
The CPU responds to instructions stored as machine language. Machine language consists of instructions, each coded into a word of the same length as the data word. The instructions are stored in the memory and are retrieved by the CPU according to a location code, which may designate sequential locations that are sequentially addressed by the CPU.
Since the memory contains both data and instructions for the processor, some flag or signal is used to keep the processor from confusing the two as it receives them from the memory. A Von Neumann architecture provides for flagging of the data and instructions stored in memory. This arrangement allows the processor to perform tasks according to prioritization. When a high priority task interrupts a lower priority task, the lower priority task is halted, and the data in the processor and the status information relating to the lower priority task are stored in a memory until the higher priority task is completed. Once the higher priority task is completed, the processor is restored to the state at which the lower priority task was interrupted.
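By way of illustration only, the interrupt sequence described above may be sketched in Python as follows. The Processor class, its four registers, and the list standing in for the reserved section of memory are hypothetical constructs introduced for this example and are not drawn from any of the referenced patents.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    pc: int = 0  # program counter of the task currently running
    registers: list = field(default_factory=lambda: [0] * 4)
    # Hypothetical stand-in for the section of memory reserved for saved state.
    saved: list = field(default_factory=list)

    def interrupt(self, handler_pc: int) -> None:
        """Dump the running task's state to memory, then start the handler."""
        self.saved.append((self.pc, list(self.registers)))
        self.pc = handler_pc
        self.registers = [0] * 4

    def resume(self) -> None:
        """Restore the most recently interrupted task from memory."""
        self.pc, self.registers = self.saved.pop()


if __name__ == "__main__":
    cpu = Processor(pc=100, registers=[1, 2, 3, 4])
    cpu.interrupt(handler_pc=500)  # higher priority task takes over
    cpu.registers[0] = 99          # handler freely uses the registers
    cpu.resume()                   # lower priority task continues
    print(cpu.pc, cpu.registers)
```

Because the saved states are kept last-in, first-out, a still higher priority task may interrupt the handler itself, and each task resumes in the reverse order of interruption.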
The ability to dump data into memory and then to retrieve it at a later time is an important advantage for the data processor since multiple terminals or tasks are thereby serviced in line with their priority.
Structuring a memory into words, pages and chapters gives the processor more flexibility in its operation, in that data may be easily stored or retrieved through a word, page and chapter address.
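By way of illustration only, a word, page and chapter address may be sketched in Python as follows. The field widths (an 8-bit word field and a 4-bit page field, with the remaining high-order bits selecting the chapter) are assumptions chosen for this example; no particular widths are specified here.

```python
WORD_BITS = 8  # assumed width of the word-within-page field
PAGE_BITS = 4  # assumed width of the page-within-chapter field


def split_address(addr: int) -> tuple:
    """Split a flat address into (chapter, page, word) fields."""
    word = addr & ((1 << WORD_BITS) - 1)
    page = (addr >> WORD_BITS) & ((1 << PAGE_BITS) - 1)
    chapter = addr >> (WORD_BITS + PAGE_BITS)
    return chapter, page, word


def join_address(chapter: int, page: int, word: int) -> int:
    """Combine (chapter, page, word) fields back into a flat address."""
    return (chapter << (WORD_BITS + PAGE_BITS)) | (page << WORD_BITS) | word


if __name__ == "__main__":
    chapter, page, word = split_address(0x2A7F)
    print(chapter, page, word)
    print(hex(join_address(chapter, page, word)))
```

With these assumed widths, address 0x2A7F falls in chapter 2, page 10, word 127.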
Data processors generally act in conjunction with other data processors and exchange data and information to accomplish a particular goal. An example of such an application is U.S. Pat. No. 3,700,866 issued Oct. 24, 1972 to Taylor et al. In the Taylor patent, cascading processors are used to achieve a minimum of uncertainty in the output signal. Typically, a system of processors is arranged hierarchically so that data shifts through the lower levels to the higher level processors.
It should be noted that the inventions of Stephenson, Tubbs and Taylor, as referenced above, do not achieve a speed commensurate with the modern demands on a processing unit. Their basic handling of instructions, architectural structure and manipulation of data prevent them from achieving a high speed.
The time used by the processor to complete a single instruction, a single clock cycle, or the time from the rising edge of one clock pulse to the rising edge of the following clock pulse, is referred to as instruction time or cycle time. Cycle times vary from device to device, and a particularly involved operation may take more than one cycle to perform.
In order to streamline or improve the cycle time of a processor, a method known as "look ahead" or "prefetch" has emerged. In a prefetch operation, the next sequential instruction is obtained and decoded so that when the current instruction is completed, the next instruction is ready for operation. Since each instruction may be decoded prior to use, this technique eliminates the idle time the processor would otherwise spend waiting for that instruction to be decoded. One such "look ahead" or "prefetch" method is described in U.S. Pat. No. 3,573,853 issued Apr. 6, 1971 to Watson et al.
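By way of illustration only, the time saved by prefetching may be estimated with the following simplified Python model. It assumes a straight-line sequence of instructions with no branches, a fixed fetch-and-decode time and a fixed execute time per instruction, and complete overlap of the next fetch with the current execution; these assumptions and the function names are introduced for this example and do not describe the method of the Watson patent.

```python
def cycles_without_prefetch(n: int, fetch: int, execute: int) -> int:
    """Each instruction is fetched and decoded, then executed, strictly in turn."""
    return n * (fetch + execute)


def cycles_with_prefetch(n: int, fetch: int, execute: int) -> int:
    """The next instruction is fetched and decoded while the current one executes.

    Only the first fetch and the last execute are not overlapped; every step
    in between takes as long as the slower of the two activities.
    """
    if n == 0:
        return 0
    return fetch + (n - 1) * max(fetch, execute) + execute


if __name__ == "__main__":
    print(cycles_without_prefetch(10, 1, 1))
    print(cycles_with_prefetch(10, 1, 1))
```

In this model, ten instructions with one cycle of fetch and one cycle of execution take 20 cycles without prefetching and 11 cycles with it. A single instruction gains nothing, since there is no following instruction to overlap.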
To achieve prefetching, the appropriate timing of the instructions and data along the common bus must be maintained so as not to confuse the processor. Once the processor becomes confused as to whether it is receiving data or instructions, all further operations will result only in unintelligible data being generated.
Prefetching alone, though, will not transform a "slow" processor into a "fast" processor, which is defined as a processor with a cycle time of approximately 200 nanoseconds. Economical speed is the key. A low-cost, high-speed processor does not yet exist in the art; such a device will open new areas of application and permit its utilization in areas which are presently economically prohibitive and/or require great speed.