This invention relates to the loading of program code, data, and control information into a processing engine. More particularly, this invention relates to loading program code, data, and control information into a processing engine through a single data path.
A processing engine processes input data to generate output data. A processing engine typically includes the following: a memory, a program counter, control logic, an execution pipeline, and a register file. The memory holds a stored program, which is a set of instructions stored in a block of memory. The program counter typically contains the address of an instruction in memory. After an instruction is read, the program counter is reloaded with the address of the next instruction to be read. The control logic decodes the current instruction, thereby controlling the execution pipeline. An instruction typically contains the following: one or more operands (e.g., contents of source registers or memory addresses, or constants), an operation code (e.g., add, subtract, multiply, shift, load, store, branch, etc.), and a destination register or memory address for a resulting value. The execution pipeline computes output data from input data by performing arithmetic and logic operations defined by the decoded instruction. Each operation is preferably processed in multiple stages (e.g., a fetch stage, a decode stage, an execute stage, and a write-back stage) such that, where possible, the different stages can be overlapped to increase throughput (i.e., the rate at which temporary values or partial results are computed by the execution pipeline). The register file typically holds temporary values or partial results computed by the execution pipeline. In addition, the register file can hold constants that are initialized before the program executes. The temporary values and constants can be changed from time to time during program execution.
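As an illustration only, the components described above can be sketched in a few lines of Python. The names ToyEngine and Instruction, the three opcodes, and the eight-entry register file are assumptions made for this sketch and do not come from any particular engine. The sketch is not pipelined: each call to step() performs fetch, decode, execute, and write-back in sequence.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instruction:
    """Hypothetical instruction format: an opcode, two sources, one destination."""
    opcode: str   # "add", "sub", or "mul"
    src_a: int    # index of the first source register
    src_b: int    # index of the second source register
    dest: int     # index of the destination register


@dataclass
class ToyEngine:
    """Hypothetical, non-pipelined model of memory, program counter, and register file."""
    memory: List[Instruction]                                       # stored program
    registers: List[int] = field(default_factory=lambda: [0] * 8)   # register file
    pc: int = 0                                                     # program counter

    def step(self) -> None:
        instr = self.memory[self.pc]        # fetch the instruction addressed by the PC
        self.pc += 1                        # reload the PC with the next address
        a = self.registers[instr.src_a]     # decode: read the source operands
        b = self.registers[instr.src_b]
        if instr.opcode == "add":           # execute the decoded operation
            result = a + b
        elif instr.opcode == "sub":
            result = a - b
        elif instr.opcode == "mul":
            result = a * b
        else:
            raise ValueError(f"unknown opcode: {instr.opcode}")
        self.registers[instr.dest] = result  # write back to the destination register

    def run(self) -> None:
        while self.pc < len(self.memory):
            self.step()


# Usage: compute r2 = r0 + r1, then r3 = r2 * r2.
program = [Instruction("add", 0, 1, 2), Instruction("mul", 2, 2, 3)]
engine = ToyEngine(memory=program)
engine.registers[0] = 2   # constants initialized before the program executes
engine.registers[1] = 3
engine.run()
```

A real execution pipeline would overlap these stages across successive instructions; the sketch keeps them sequential so that the role of each component stays visible.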
Processing engines of a known class typically do not have the ability to perform random access of their input data. Instead, those engines can only access (or read) the next piece of data from the input data stream. If no data is available, such processing engines stall until data becomes available. Logically, the input data stream can be considered the output of a first-in-first-out (FIFO) system: the data positioned first in the input data stream is processed first. Successive sequences of input data may need to be processed by different programs or by using different sets of constants in the register file. However, the memory and register file are often too small to accommodate all possible programs and constants simultaneously. As a result, the memory, the register file, or both will need to be reloaded in preparation for each sequence of input data.
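The FIFO behavior and the stall condition can be sketched as follows. This is a minimal illustration under stated assumptions: InputStream, try_read, and run_cycles are hypothetical names, and doubling each value is a stand-in for whatever processing the stored program performs.

```python
from collections import deque


class InputStream:
    """Hypothetical FIFO input: only the next piece of data can be read."""

    def __init__(self):
        self._fifo = deque()

    def push(self, item):
        self._fifo.append(item)

    def try_read(self):
        """Return (True, item) for the next piece of data, or (False, None) if empty."""
        if not self._fifo:
            return (False, None)
        return (True, self._fifo.popleft())


def run_cycles(stream, cycles):
    """Read at most one piece of data per cycle and count the cycles spent stalled."""
    processed, stalls = [], 0
    for _ in range(cycles):
        ok, item = stream.try_read()
        if ok:
            processed.append(item * 2)   # placeholder for the real processing
        else:
            stalls += 1                  # no data available: the engine stalls
    return processed, stalls


# Usage: three pieces of data over five cycles leave two stalled cycles.
stream = InputStream()
for value in (1, 2, 3):
    stream.push(value)
processed, stalls = run_cycles(stream, 5)
```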
A typical sequence of operations required by known processing engines is shown in Table (1).
TABLE (1)
1    Load a program into Memory
2    Load constants into the Register File
3    Initialize the Program Counter
4    Supply an arbitrary number of pieces of input data
5    Wait for all of the input data to be processed
6    Load new constants into the Register File
7    Initialize the Program Counter
8    Supply an arbitrary number of pieces of input data
9    Wait for all of the input data to be processed
10   Load a new program into Memory
11   Initialize the Program Counter
. . .   . . .
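The sequence in Table (1) can be sketched from the point of view of the external control logic. Everything here is a hypothetical model: the class KnownEngine, its method names, and the representation of a program as a list of Python callables are assumptions chosen for brevity. The guard in each setup method stands in for the synchronization that the external control logic must provide.

```python
class KnownEngine:
    """Hypothetical model with separate setup paths and a separate data path."""

    def __init__(self):
        self.memory = []      # stored program: a list of callables (an assumption)
        self.constants = {}   # constants held in the register file
        self.pc = 0
        self.pending = []     # input data not yet processed
        self.outputs = []

    def _require_idle(self):
        # Reloading while data is still in flight would corrupt the results.
        if self.pending:
            raise RuntimeError("engine is still processing input data")

    def load_program(self, program):      # setup path
        self._require_idle()
        self.memory = list(program)

    def load_constants(self, constants):  # setup path
        self._require_idle()
        self.constants = dict(constants)

    def init_pc(self, address=0):         # setup path
        self.pc = address

    def supply(self, data):               # data path
        self.pending.extend(data)

    def wait_until_idle(self):
        # Models "wait for all of the input data to be processed".
        while self.pending:
            value = self.pending.pop(0)
            for op in self.memory[self.pc:]:
                value = op(value, self.constants)
            self.outputs.append(value)


# External control logic stepping through Table (1).
engine = KnownEngine()
engine.load_program([lambda x, c: x * c["k"]])   # step 1
engine.load_constants({"k": 2})                  # step 2
engine.init_pc()                                 # step 3
engine.supply([1, 2])                            # step 4
engine.wait_until_idle()                         # step 5
engine.load_constants({"k": 3})                  # step 6
engine.init_pc()                                 # step 7
engine.supply([1, 2])                            # step 8
engine.wait_until_idle()                         # step 9
engine.load_program([lambda x, c: x + c["k"]])   # step 10
engine.init_pc()                                 # step 11
engine.supply([1])
engine.wait_until_idle()
```

Skipping a wait step before a reload raises an error in this model, which reflects why the controller has to track when the engine has drained its input.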
Different data paths are used for initializing the system and executing a stored program. To ensure that a data stream has been processed to completion before loading a new program or a new set of constants, initialization must be synchronized by external control logic.
Known processing engines have several disadvantages. First, the use of multiple input data paths increases hardware complexity because additional wiring and logic circuitry are needed. For example, multiplexers are needed to select whether initialization data or program data is sent to a register file. Also, the different operating modes (e.g., run mode and setup mode) do not implement the same functions in the same way. For example, the run mode (i.e., program mode) function of writing to a register file is typically different from the setup mode function of writing to a register file. In setup mode, an address is sent to the register file directly from an initialization path through a multiplexer. In run mode, an address is sent to the register file from control logic through the multiplexer. Second, the use of external control logic increases the amount of hardware and creates a dependency on an external source. External control logic is needed to synchronize the loading and execution of data through the multiple input data paths such that data is processed in the correct order at the appropriate time. The processing engine must therefore function in the same timing domain as the external control logic: it cannot operate in a separate timing domain, nor can data be supplied to it at a sporadic rate.
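The register-file multiplexer described above can be sketched as a two-input selector. This is an assumed model for illustration only: MuxedRegisterFile, the mode strings, and the (address, value) port tuples are hypothetical.

```python
class MuxedRegisterFile:
    """Hypothetical register file whose write port is fed through a 2:1 multiplexer."""

    def __init__(self, size=4):
        self.registers = [0] * size

    def write(self, mode, setup_port, run_port):
        """Each port is an (address, value) pair; the mode selects which one is used."""
        if mode == "setup":
            address, value = setup_port   # driven by the initialization path
        elif mode == "run":
            address, value = run_port     # driven by the control logic
        else:
            raise ValueError(f"unknown mode: {mode}")
        self.registers[address] = value


# Usage: both ports are driven on every write, but only the selected one takes effect.
rf = MuxedRegisterFile()
rf.write("setup", setup_port=(0, 10), run_port=(1, 99))
rf.write("run", setup_port=(0, 55), run_port=(1, 7))
```

The selector and the second port are the extra wiring and logic that the multiple-path arrangement requires.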
In view of the foregoing, it would be desirable to provide a processing engine with an efficient mechanism for loading program code, data, and control information.
It would also be desirable to provide a processing engine that allows a program to be processed with little or no external control logic.