Logic functionality may be configurable or programmable; these terms are essentially synonymous. Programming or configuration may be performed at the semiconductor foundry, in which case the device is said to be mask-programmable, resulting in a hard-wired function after configuration. A device may also be field-programmable, meaning that it may be programmed after delivery to the user. Field-programmable logic devices may also be re-programmable, where the logic function may be changed from time to time in the field after purchase. Processors are said to be software programmable, where a program in memory can be changed to alter the functionality; however, for most processors, the basic hardware logic function and instruction set are fixed or hard-wired.
Historically, DSP functionality has taken two forms: software programmable processors with arithmetically oriented instruction sets such as those offered by TI, Analog Devices, Motorola, and Agere (Lucent), and dedicated logic hardware functionality specifically performing arithmetic tasks. In recent years, an alternative approach to programmable DSP functionality has arisen where arrays of arithmetically oriented function modules are connected by reprogrammable routing resources, in a manner similar to that utilized in Field Programmable Gate Arrays (FPGAs), creating reprogrammable array DSP solutions. Reprogrammable array DSP solutions are being offered by companies like PACT, Leopard Logic, and Elixent as embeddable cores and by Chameleon as a discrete component. A core is an embeddable block of semiconductor functionality that can be included in a System-On-Chip (SOC) ASIC (Application Specific Integrated Circuit) design. These reprogrammable array DSP solutions always operate independently of any classical software programmable DSP architecture.
Meanwhile, a different evolution in processor architecture has occurred for RISC (Reduced Instruction Set Computer) processors, where synthesizable processor cores are being offered by companies like ARC and Tensilica with the ability to customize instruction set extensions. Variations on these processors are also offered with multiplier-accumulator functions added, enabling DSP applications to be better addressed. However, these processor cores are only customizable at the time the logic function is synthesized, which means some time prior to the construction of actual silicon. Their instruction set cannot be altered or reconfigured once the silicon implementation has been fabricated.
At the same time, it has been shown by companies such as ARC and Tensilica that the ability to create customized instructions can greatly improve the performance of a processor. Unfortunately, since these instructions are not alterable in the field (once the processor has been delivered to the customer) they cannot adapt to the surprises that arise when real-world phenomena are encountered upon powering-up the first prototype. Such discrepancies are even more prevalent for DSPs since they often deal with real-world phenomena like voice and video, and noisy communications mediums like cable modems, DSL, and wireless where unpredictability is inherent.
A research project summary presented at the Instat/MDR Embedded Processor Forum (Apr. 29, 2002) by Francesco Lertora, a System Architect at ST Microelectronics, had some similarities to the present invention. It was entitled “A Customized Processor for Face Recognition” and demonstrated a custom processor based on Tensilica's Xtensa processor core. Here, they coupled the configurable (not field programmable) instruction extensions of the Tensilica processor to a block of FPGA technology on a custom SOC design. To augment the Tensilica processor, they implemented arithmetic functions in the FPGA to perform DSP-type functions. In this example, the FPGA functionality not only performs operations where results are returned to the RISC processor, it also performs some I/O functions directly, essentially functioning at times as a coprocessor. While not combining a conventional DSP with an FPGA fabric in a tightly-coupled and dedicated manner with the FPGA subordinate to the conventional DSP as embodied in the present invention, this demonstration by ST does reveal some of the benefits of a processor with re-programmable instructions since it was able to considerably accelerate the required functionality. However, ST's chip designers gave in to the temptation to allow the FPGA to perform functions independently. In general, this adds a substantial amount of hardware dependence to the design flow, making it far more difficult for designers to use. DSP designers typically prefer to design in a high-level language like C and not have to deal with hardware dependencies. As soon as the FPGA is allowed to execute tasks in parallel with the conventional software programmable DSP, the overall DSP program must be partitioned into parallel tasks, a complex issue involving intimate knowledge of the hardware.
Another company that has discussed FPGA fabric executing instructions is GateChange. However, the proposed architecture includes an ARM (RISC) processor and also allows the FPGA fabric full co-processing capability, with complete access to the device's I/Os. This certainly does not constrain the FPGA fabric to be fully subordinate to the DSP as in the present invention.
FPGAs have been used for years to construct dedicated DSP functionality, sometimes in conjunction with a conventional DSP but operating as a separate functional element. In recent years, some FPGA suppliers like Xilinx and Altera have added dedicated multiplier functions. These essentially create a heterogeneous fabric where most of the modules are conventional Look-Up Table (LUT) based programmable modules, and some are fixed multiplier functions. This has made these devices more effective in terms of performance and density when arithmetic (DSP) functions are performed in dedicated hardware. These same FPGA suppliers now also offer RISC processors embedded in their FPGA devices. However, their FPGA functionality is not constrained to be subordinate to the processor—in fact their paradigm is just the opposite, with the processor acting as an enhancement to the FPGA function.
It is a generally accepted fact that for conventional, software programmable DSPs, less than 10% of the code often accounts for more than 90% of the execution cycles. It therefore follows that if a software programmable DSP were created with a field-configurable (field-programmable) instruction set, where dedicated functions with a high degree of parallelism can be applied to perform the functions consuming 90% of the cycles, the overall processor performance could be increased significantly.
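The reasoning above can be made concrete with Amdahl's law, which bounds the overall gain when only part of a workload is accelerated. The following is a minimal sketch, not part of the original disclosure: the function name `overall_speedup` and the example acceleration factors are hypothetical, and the 90% figure is simply the rule of thumb quoted above.

```python
def overall_speedup(hot_fraction: float, hot_speedup: float) -> float:
    """Amdahl's law: overall speedup when a fraction of the execution
    cycles (hot_fraction) is accelerated by a factor hot_speedup and
    the remaining cycles run at their original speed.

    Both parameter names are illustrative assumptions.
    """
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must lie in [0, 1]")
    if hot_speedup <= 0.0:
        raise ValueError("hot_speedup must be positive")
    # Normalized new run time = untouched part + accelerated part.
    new_time = (1.0 - hot_fraction) + hot_fraction / hot_speedup
    return 1.0 / new_time


if __name__ == "__main__":
    # Assume 90% of the cycles fall in code that dedicated,
    # field-configurable instructions could accelerate.
    for factor in (2.0, 10.0, 100.0):
        gain = overall_speedup(0.9, factor)
        print(f"hot code {factor:g}x faster -> overall {gain:.2f}x")
```

Under these assumptions the overall gain approaches, but never reaches, 10x, because the remaining 10% of the cycles still execute on the conventional instruction set. This is consistent with leaving the bulk of the code on the software programmable DSP and applying configurable hardware only to the small portion that dominates execution time.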
However, a software programmable DSP with a field programmable instruction set does not exist. It appears that when reprogrammable array DSP solutions are developed, the creators are determined that this technology alone is the solution to the problem and it should be used as a separate functional entity from the conventional software programmable DSP. As offered, reprogrammable array DSP solutions are used for all DSP functions including the large quantity of instructions that normally occupy only 10% of the execution cycles. Unfortunately, this focus ignores the paradigm that exists for DSP development and the fact that DSP programmers—who are typically software engineers with an expertise in math—prefer to work in a software environment without having to be concerned with hardware uniqueness. Reprogrammable array DSP solutions do not fit cleanly into the flow that DSP programmers prefer to use. A software programmable DSP with a field programmable instruction set, on the other hand, would fit well—and increase processor performance significantly at the same time.
Unfortunately, any FPGA fabric that might be used in these solutions consumes between 20 and 40 times as much silicon area as the standard-cell ASIC implementations normally used in SOC design, and at least 10 to 20 times as much silicon area as a Gate Array implementation. Therefore, after a design initially implemented in FPGA has reached production and has proven to be stable, it is often desirable to convert the design to an ASIC fabric of some kind. However, migrating an FPGA design to a lower-cost ASIC implementation is known to be fraught with timing and testability problems. It is known that these problems can be eliminated if designs are synchronous with a common clock; however, in the current development paradigm for FPGAs, this restriction cannot be enforced. As long as hardware designers can add any arbitrary function to an FPGA, these migration problems will persist.
Someday, it may be viable from a cost perspective to use reprogrammable technology for volume production. However, in the meantime, there is a need for DSP solutions that take advantage of flexibility benefits of FPGA technology for development and market entry, while also providing an effective and practical solution for volume production.
Also, when FPGA devices are currently used in conjunction with conventional DSP processors, designers must deal with a two-chip solution along with the inherent partitioning and hardware issues, as well as the complexities of debugging two separate devices that require different debug methodologies. DSP designers, who are typically software engineers with math backgrounds rather than hardware designers, are accustomed to debugging conventional software-programmable DSPs with a software debugging program. These situations create a need for an integrated DSP processor/FPGA solution where the two are integrated together from a silicon standpoint, including an integrated debugging methodology where the conventional software “debugger environment” is extended to include the FPGA or ASIC arithmetic fabric that serves to accelerate specific algorithms.
A DSP device with integrated FPGA functionality does not exist today, and while integrating the two technologies on a single die is desirable, an alternative would be to combine two die in a single package to serve prototyping and initial production applications. Modern IC packaging solutions can make this viable by allowing the creation of a System-in-Package (SIP) solution where two (or more) die are integrated into a single integrated circuit package. This type of solution would be useful anywhere fixed functionality, in particular a processor, is used in both the prototype and the volume production solution, and where it is desirable to have a prototype and initial production solution that also includes field-programmable logic.