The prior art offers several attempts to maximize the efficiency of logic circuitry for a particular purpose while enabling robust, full-purpose general computing by means of the same data processing or computational engine. These prior art efforts include methods and systems (a.) based on the Von Neumann architecture; (b.) that apply field programmable gate array devices and/or programmable logic devices; (c.) that include application-specific integrated circuit devices; (d.) based on Side-by-Side processing; (e.) that apply Very Long Instruction Word concepts; and/or (f.) that instantiate a Cell Processor.
Despite all of these efforts in improved data processing and accelerated computational processing, most conventional electronic logic processors still include a dedicated arithmetic logic unit (hereinafter, “ALU”) that is tasked with computation. The ALU may be fed by a register file; that is, the computations performed in the ALU are performed on data stored in this register file. Information or words of data travel from the register file into the ALU, and the results then travel from the ALU back into the register file.
The prior art teaches that the mechanism by which data stored in memory may be moved, read, acted upon by the ALU and then stored away is a sequence of operation codes (hereinafter, “op-codes”). Each individual op-code is formed by a unique set of instructions which in combination direct the processor to perform a small incremental operation, i.e., an operation the size of one computation, such as one add, one multiply, or one load from a location in memory into a register in the register file.
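The conventional data path described above can be illustrated with a minimal sketch of an op-code driven processor. The op-code names (`LOAD`, `ADD`, `MUL`, `STORE`), the tuple encoding, and the function name `run` are hypothetical and are chosen only for illustration; they do not correspond to any particular prior art instruction set.

```python
# Toy model of a conventional op-code processor (illustrative only).
# Op-code names and the tuple encoding are hypothetical assumptions.

def run(program, memory, num_registers=4):
    """Execute op-codes one at a time and return the final register file."""
    regs = [0] * num_registers
    for instr in program:
        op = instr[0]
        if op == "LOAD":        # memory -> register file
            _, rd, addr = instr
            regs[rd] = memory[addr]
        elif op == "ADD":       # register file -> ALU -> register file
            _, rd, ra, rb = instr
            regs[rd] = regs[ra] + regs[rb]
        elif op == "MUL":       # register file -> ALU -> register file
            _, rd, ra, rb = instr
            regs[rd] = regs[ra] * regs[rb]
        elif op == "STORE":     # register file -> memory
            _, rs, addr = instr
            memory[addr] = regs[rs]
        else:
            raise ValueError(f"unknown op-code: {op}")
    return regs


memory = [3, 4, 0]
program = [
    ("LOAD", 0, 0),     # r0 <- memory[0]
    ("LOAD", 1, 1),     # r1 <- memory[1]
    ("ADD", 2, 0, 1),   # r2 <- r0 + r1
    ("STORE", 2, 2),    # memory[2] <- r2
]
regs = run(program, memory)
print(memory)  # [3, 4, 7]
```

In this sketch a single addition of two words held in memory requires four op-codes, each performing one small incremental operation, with every word passing through the register file on its way to and from the ALU.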
Similarly, words in memory, e.g., stored in an external memory or cache, in prior art structures generally go from memory into the register file, and then from the register file back into memory. Early in the computer era, memory could be directly connected to the ALU, but such connectivity is substantially impractical in most modern architectures, and the prior art teaches away from reducing mediation of interactivity between a memory and (a.) a data processing array; and/or (b.) a computational engine of a computer.
Yet significant processing delays are introduced by forcing the heterogeneous circuit elements of a data processing system, to include general-purpose computers, to transfer instructions through circuitous steps in the process of organizing resources to apply configuration and operating data in order to create a desired output.
Certain prior art processors include a register called a program counter that may contain the address of a next instruction to be executed by the processor. Prior art branch operations, to include fetch and many control functions, are often executed by modifying this program counter register to sequentially point to differing and appropriate addresses within a system memory and/or within other memory accessible to the processor where executable instructions are stored.
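The role of the program counter in conventional branching can be sketched as follows. This is a simplified illustration under stated assumptions: the op-codes `ADDI`, `BNEZ`, and `HALT`, the function name `run_with_pc`, and the use of list indices as instruction addresses are all hypothetical and are not drawn from any specific prior art processor.

```python
# Toy program-counter loop (illustrative only; op-codes are hypothetical).

def run_with_pc(program, regs):
    """Run until HALT or the end of the program; return instructions executed."""
    pc = 0           # program counter: index of the next instruction
    executed = 0
    while pc < len(program):
        instr = program[pc]
        executed += 1
        pc += 1      # default: advance to the next sequential instruction
        op = instr[0]
        if op == "ADDI":      # add an immediate value to a register
            _, rd, imm = instr
            regs[rd] += imm
        elif op == "BNEZ":    # branch by overwriting pc if register is non-zero
            _, rs, target = instr
            if regs[rs] != 0:
                pc = target
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown op-code: {op}")
    return executed


regs = [3, 0]                 # r0 = loop counter, r1 = accumulator
loop_program = [
    ("ADDI", 1, 5),           # r1 += 5
    ("ADDI", 0, -1),          # r0 -= 1
    ("BNEZ", 0, 0),           # if r0 != 0, set pc back to address 0
    ("HALT",),
]
executed = run_with_pc(loop_program, regs)
print(regs, executed)  # [0, 15] 10
```

Here ten instructions are fetched and executed to accumulate three additions, and the branch is effected solely by overwriting the program counter.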
In one example, prior art op-code based processors typically perform only a small amount of computational work per instruction, wherein the prior art processor might, for example, first require receiving and executing many instructions before the processor might be enabled to calculate or determine where to next branch to within a software-directed process. In novel distinction, in a computational or data processing system operating in accordance with certain optional aspects of the method of the present invention, an instruction may be provided that enables a processor to execute or instantiate the equivalent of dozens or hundreds of prior art instructions. Another optional aspect of the invented method includes providing to the processor an instruction having a portion that specifically determines how a succeeding executable instruction may be read, acquired, received, and/or generated for use by the processor.
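The contrast drawn above may be pictured, purely by way of a hypothetical sketch, as an instruction that carries both a bulk unit of work and a portion that selects the succeeding instruction. The class `WideInstruction`, its fields `work` and `next_index`, and the function `run_wide` are illustrative assumptions only; they are not the disclosed implementation and imply no particular hardware structure.

```python
# Hypothetical sketch only: names and structure are assumptions for illustration.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class WideInstruction:
    work: Callable[[dict], None]                  # bulk computation in one instruction
    next_index: Callable[[dict], Optional[int]]   # portion selecting the next instruction


def run_wide(program, state):
    """Execute instructions until one selects no successor; return the count."""
    index = 0
    executed = 0
    while index is not None:
        instr = program[index]
        instr.work(state)
        executed += 1
        index = instr.next_index(state)  # the instruction itself picks its successor
    return executed


def accumulate(state):
    # Work that took a multi-instruction loop in an op-code model.
    state["total"] += 5 * state["count"]
    state["count"] = 0


def set_flag(state):
    state["flag"] = True


wide_program = [
    WideInstruction(accumulate, lambda s: 1 if s["total"] > 10 else None),
    WideInstruction(set_flag, lambda s: None),
]
state = {"count": 3, "total": 0, "flag": False}
executed = run_wide(wide_program, state)
print(state["total"], state["flag"], executed)  # 15 True 2
```

In this sketch no separate program counter is modified by a branch op-code; each instruction's own `next_index` portion determines which instruction, if any, follows.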
The execution of pluralities or multiplicities of prior art instructions by a prior art processor requires the commitment of multiplicities of system clock cycles to perform certain required operational activities and thus fails to optimally employ the data processing and computational potential of processor operations. There is therefore a long-felt need to provide superior methods and systems that more efficiently and flexibly execute computational and data processing tasks.