Programmable devices are a class of general-purpose integrated circuits that can be configured for a wide variety of applications. Such programmable devices have two basic versions, mask programmable devices, which are programmed only by a manufacturer, and field programmable devices, which are programmable by the end user. In addition, programmable devices can be further categorized as programmable memory devices or programmable logic devices. Programmable memory devices include programmable read only memory (PROM), erasable programmable read only memory (EPROM) and electronically erasable programmable read only memory (EEPROM). Programmable logic devices include programmable logic array (PLA) devices, programmable array logic (PAL) devices, erasable programmable logic devices (EPLD) devices, and programmable gate arrays (FPGA).
As chip capacity continues to increase significantly, the use of field programmable gate arrays (FPGAs) is quickly replacing the use of application specific integrated circuits (ASICs). An ASIC is a specialized integrated circuit that is designed for a particular application and can be implemented as a specialized microprocessor. Notably, FPGAs can be designed using a variety of architectures that can include user configurable input/output blocks (IOBs), and programmable logic blocks having configurable interconnects and switching capability.
The advancement of computer chip technology has also resulted in the development of embedded processors and controllers. An embedded processor or controller can be a microprocessor or microcontroller circuitry that has been integrated into an electronic device as opposed to being built as a standalone module or “plugin card.” Advancement of FPGA technology has led to the development of FPGA-based system-on-chips (SoC) including FPGA-based embedded processor SoCs. An SoC is a fully functional product having its electronic circuitry contained on a single chip. While a microprocessor chip requires ancillary hardware electronic components to process instructions, an SoC would include all required ancillary electronics. For example, an SoC for a cellular telephone can include a microprocessor, encoder, decoder, digital signal processor (DSP), RAM and ROM. It should be understood within contemplation of the present invention that an FPGA-Based SoC does not necessarily include a microprocessor or microcontroller. For example, an SoC for a cellular telephone could include an encoder, decoder, digital signal processor (DSP), RAM and ROM that rely on an external microprocessor. An SoC could also include multiple processing modules coupled to each other via a bus or several busses. It should also be understood herein that “FPGA-based embedded processor SoCs” are a specific subset of FPGA-based SoCs that would include their own processors.
In order for FPGA users to develop FPGA-based SoCs or FPGA-based embedded processor SoCs, it is necessary for them to acquire intellectual property rights for system components and/or related technologies that are utilized to create the FPGA-based SoCs. These system components and/or technologies are called cores or Intellectual Property (IP) cores. An electronic file containing system component information can typically be used to represent the core. More generically, the IP cores can form one or more of the processing modules in an FPGA-based SoCs. The processing modules can either be hardware or software based.
Notwithstanding advantages provided by using FPGA-based SoCs, the development of these SoCs can be very challenging. One of the challenges includes communication among multiple hardware and software processors embedded in a FPGA-based SoC. Typically, such communication occurs over a bus. Unfortunately, communication over a bus involves a large amount of overhead due to bus arbitration times. Therefore, several clock cycles are typically needed for simple communication among processing modules.
Microprocessors typically have fixed instruction sets. Even though microprocessor vendors have usually provided a means to extend the instruction set through the use of accelerators or co-processors, such additional custom instructions are extremely limited due to the capabilities of the processors. Typically, users are allowed to add one or more instructions to perform some operation that is critical to their specific application, such as multiply-and-add or multiply-and-accumulate instructions operating on 1-3 pieces of data. Custom instructions of this form can slow down or stall the processor if they cannot perform their computation within a single clock cycle for the processor. In practice, this limits the complexity of the custom instruction due to the processor clock speed. Thus, there fails to exist not only a method and system for enabling fast linked processing modules with programmable links, but a method and system that can perform multiple custom operations within a single clock cycle on a processor.