Fuzzy logic, neural networks, and other parallel, array oriented applications are becoming very popular and important in data processing. Most digital data processing systems today have not been designed with fuzzy logic, neural networks, and other parallel, array oriented applications specifically in mind. Thus there are considerable performance and cost benefits to be gained in designing digital data processing systems which are especially adapted and designed to meet the requirements of fuzzy logic, neural networks, and other parallel, array oriented applications.
Saturation Protection
Certain arithmetic operations, such as addition and subtraction, may result in overflow in either the positive or negative direction, "Overflow" refers to a situation in which the resulting value from the arithmetic operation exceeds the maximum value which the destination register can store (e.g. attempting to store a result of % 100000001 in an 8-bit register), "Saturation" or "saturation protection" refers to a method of handling overflow situations in which the value in the register is replaced with an upper or lower boundary value, for example $FF for an 8-bit unsigned upper boundary value, In general, there are two common ways to handle overflow. First, the result may be allowed to roll over, i.e. $01 may be stored in the destination register (non-saturating approach). Second, the result value may be replaced by either an upper bound value or a lower bound value (saturating approach),
A common problem in data processors is the need to perform arithmetic computations on data values which are wider, i.e. have more bits, than can be accommodated by the available registers and by the available Arithmetic Logic Unit (ALU) circuitry. For example, it is not uncommon for a data processor to be required to add two 32-bit data values using a 16-bit ALU. An approach was needed which would efficiently support saturation protection for extended length operations.
Communications Between Data Processors
It is desirable for fuzzy logic, neural networks, and other parallel, array oriented applications to utilize a multi-dimensional array of integrated circuits. Thus, the communications between integrated circuits in fuzzy logic, neural networks, and other parallel, array oriented applications is often quite important.
In some prior art data processing systems, such as, for example the transputer, the communications between integrated circuits is controlled interactively by the execution of instructions within the integrated circuits. Thus one or more instructions are required to transfer data to other integrated circuits, and one or more instructions are required to receive data from other integrated circuits. In yet other prior art data processing systems, such as telephone switching networks and certain computer networks, the data itself which is being transferred contains routing information regarding which integrated circuits are the intended recipients of the data.
The goal for fuzzy logic, neural networks, and other parallel, array oriented applications is to develop an integrated circuit communications technique and an integrated circuit pin architecture which will allow versatile data passing capabilities between integrated circuits, yet which: (1) will not require a significant amount of circuitry external to the array of integrated circuits; (2) will not require significant software overhead for data passing capabilities; and (3) which will require as few dedicated integrated circuit pins as possible.
Extended Length Operations in a Data Processor
A common problem in data processors is the need to perform arithmetic computations on data values which are wider, i.e. have more bits, than can be accommodated by the available Arithmetic Logic Unit (ALU) circuitry in one ALU cycle. For example, it is not uncommon for a data processor to be required to add two 32-bit data values using a 16-bit ALU. Prior art data processors typically support such extended arithmetic by providing a single "carry" or "extension" bit and by providing two versions of computation instructions in order to specify whether or not the carry bit is used as an input to the instruction (e.g., "add" and "add with carry", "subtract" and "subtract with borrow", "shift right" and "shift right with extension", etc.). This traditional approach is adequate for a limited repertoire of operations, but it does not efficiently support other extended length operations. An approach was needed which would efficiently support an expanded repertoire of extended length operations.
Data Movement Operations in a Data Processor
A common problem in data processors using vectors is the need to calculate the sum, or total, of the elements of a vector. In some applications, only a scalar result (i.e. the total of all vector elements) is required. In other applications, a vector of cumulative sums must be calculated. The need for combining vector elements into a single overall aggregate value or into a vector of cumulative partial aggregates is not limited to addition. Other aggregation operations, such as minimum and maximum, are also required for some applications. A more effective technique and mechanism for combining vector elements into a single overall aggregate value is required.
Multi-Level Conditional Execution of Instructions
Conditional execution of instructions is a very useful feature in all types of data processors. In many data processors, conditional branch instructions have been used to implement conditional execution of instructions. However, in SIMD (Single Instruction Multiple Data) processors, enable or mask bits alone are not suitable for complex derision trees which require the next state of the enable or mask bits to be calculated using a series of complex logical operations. A solution is needed which will allow the conditional execution of instructions to be implemented in a more straightforward manner.
Data Processor Architecture
SISD (Single Instruction Single Data) processors are most useful for performing certain types of data processing tasks. SIMD (Single Instruction Multiple Data) processors are most useful for performing other types of data processing tasks. Some applications, such as fuzzy logic, neural networks, and other parallel, array oriented applications tend to utilize some data processing tasks that are best performed by SISD processors, as well as some data processing tasks that are best performed by SIMD processors.
Loading Incoming Data into a Data Processor
It is desirable for fuzzy logic, neural networks, and other parallel, array oriented applications to utilize a multi-dimensional array of integrated circuits which require the transfer of considerable amounts of data. Thus the technique used by integrated circuits to select and store incoming data is of considerable importance in fuzzy logic, neural networks, and other parallel, array oriented applications. The technique used by integrated circuits to select and store incoming data must be flexible in order to allow incoming data to be selected and stored in a variety of patterns, depending upon the particular requirements of the data processing system.
In the related prior art, DMA (Direct Memory Access) is a technique whereby an input/output device is given direct access to memory across an address and data bus; the input/output device therefore does not have to access memory by means of a processor. Also in the related prior art, processors of various types internally generate addresses in response to instructions which utilize various addressing modes.
Stalling Technique and Mechanism for a Data Processor
An integrated circuit used in fuzzy logic, neural networks, and other parallel, array oriented applications may be executing instructions at the same time that the integrated circuit is receiving data from an external source. The problem that arises is data coherency. The integrated circuit must have a mechanism to determine the validity of the data which is to be used during the execution of an instruction. The use of invalid data is generally a catastrophic problem, and is thus unacceptable in most data processing systems.
In the related prior art, many techniques are used to ensure data coherency. There are many software data passing or synchronization techniques, such as for example, semaphores. In addition, there are many hardware data passing techniques, such as status bits at data interfaces. Unfortunately, with hardware status bits, a polling or interrupt software routine may be required, or alternately a queuing scheme may be required.
For fuzzy logic, neural networks, and other parallel, array oriented applications, a data coherency technique and mechanism is needed which ensures data coherency for both vector and scalar instructions, which requires minimal software overhead, and which can be implemented using minimal circuitry.
Maximum and Minimum Determinations
A common operation required by fuzzy logic, neural networks, and other parallel, array oriented applications is a comparison operation to determine which data value or data values in a group of two or more data values equal the maximum value. Likewise, a common operation required by fuzzy logic, neural networks, and other parallel, array oriented applications is a comparison operation to determine which data value or data values in a group of two or more data values equal the minimum value.
It is desirable to support both signed (2's complement) and unsigned numbers. Also, it is desirable to support extended length (multi-byte) operands. Because it is desirable for fuzzy logic, neural networks, and other parallel, array oriented applications to utilize a multi-dimensional array of integrated circuits, it is additionally desirable to be able to perform such maximum and minimum comparisons across the boundaries of integrated circuits.
A software routine which performs a maximum determination or a minimum determination could alternatively be implemented using prior art software instructions. However, such a software routine would involve a long sequence of instructions and it would take a long time to execute. In addition, it would be difficult to extend a software implementation across the boundaries of integrated circuits running different software programs.