An increasing number of devices used in business and in the home are controlled by small embedded microprocessors. Generally, these embedded processors are low-cost and include a limited amount of memory or storage for executing applications. Consequently, the applications executed on these embedded processors must also be relatively small and compact.
It is also desirable that these small applications be interoperable with a large class of devices, such as cellular phones, manufactured by different companies. Interoperability reduces the costs associated with developing software applications and therefore decreases the overall cost of ownership for the device. For example, cellular phone users should be able to transfer applications to each other and download them into their phones for processing. This would greatly enhance the flexibility and feature set of cellular phones even though the phones may be different models designed by different manufacturers.
A general purpose stack based processor fits these requirements well because stack instructions tend to be small and compact. The general purpose stack based processor includes a stack for storing operands and a stack processor which processes instructions by popping one or more operands off the stack, operating on them, and then pushing the results back on the stack for another instruction to process. Essentially, stack based executables are compact because the stack instructions reference operands implicitly on the stack rather than explicitly in the instructions. The storage space saved by not referencing operands such as registers, memory addresses, or immediate values explicitly can be used to store additional stack instructions.
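The pop, operate, and push cycle described above can be illustrated with a minimal sketch. The opcode names (`PUSH`, `ADD`, `MUL`) and the `run` function are hypothetical, chosen only for illustration, and do not correspond to any particular instruction set. Note that `ADD` and `MUL` carry no operand fields; they find their operands implicitly on the stack.

```python
# Hypothetical opcodes for illustration only; not taken from a real instruction set.
PUSH, ADD, MUL = "push", "add", "mul"


def run(program):
    """Execute a list of instruction tuples and return the final operand stack."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == PUSH:
            # PUSH is the only instruction here that carries an explicit value.
            stack.append(instr[1])
        elif op in (ADD, MUL):
            # Operands are referenced implicitly: pop two, operate, push the result.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == ADD else left * right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


# Computes (2 + 3) * 4 without naming any register or memory address.
program = [(PUSH, 2), (PUSH, 3), (ADD,), (PUSH, 4), (MUL,)]
result = run(program)
```

In this sketch the arithmetic instructions are single-element tuples, which mirrors the point that the space not spent on operand references is available for additional instructions.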
Embedding a general purpose stack based processor in a wide variety of devices is also very cost effective. Compared with RISC (reduced instruction set computer) or CISC (complex instruction set computer) processors, stack processor research and development costs are relatively low. Stack processors are well understood and relatively simple to design. Another part of the cost effectiveness comes from developing software that can be shared and used by a wide variety of different devices. By increasing software interoperability between devices, stack based processors can be produced in high volumes and sold at low profit margins, yet still yield high overall profits. For example, software applications consisting of architecturally neutral bytecode instructions can be readily shared when designed for execution on a Java Virtual Machine (JVM) stack based processor such as described in the book, "The Java Virtual Machine Specification" by Tim Lindholm and Frank Yellin, published by Addison-Wesley, 1997. These bytecode instruction based software applications are compact and substantially interoperable with almost any device utilizing, or simulating, a JVM stack based processor.
Unfortunately, general purpose stack based processors are generally not well suited for high-performance multimedia or other real-time processing. In part, performance suffers because a stack based processor must manipulate the stack to gain access to its operands. Generally, numerous machine cycles are spent pushing and popping operands on the stack. For example, graphics processing on a stack based processor is difficult because a single instruction cannot manipulate groups of pixels or data points as needed when performing various digital signal processing based compression/decompression techniques such as MPEG video or Dolby Digital/AC-3 based audio. Consequently, processing groups of pixels on a stack based processor requires numerous stack operations and is inefficient. Potentially, each pixel value would have to be pushed on the stack and operated on individually. Each calculation would be a separate operation, and it would be difficult to take advantage of the redundant calculations that generally occur in image processing and audio processing. Clearly, the additional processing required on a stack based processor would make it difficult to perform these calculations in a time frame acceptable for users expecting real-time multimedia effects.
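To make the per-pixel overhead concrete, the following sketch counts the stack accesses needed to scale a small group of pixel values one at a time. The function `scale_pixels_on_stack`, the `gain` parameter, and the sample values are all hypothetical; the counts apply only to this simplified model and are not measurements of any real processor.

```python
def scale_pixels_on_stack(pixels, gain):
    """Multiply each pixel by gain using only stack operations.

    Returns the scaled pixels together with the number of pushes and pops
    performed, under a simplified, hypothetical stack model.
    """
    stack = []
    pushes = 0
    pops = 0
    scaled = []
    for pixel in pixels:
        # Each pixel and the gain must be pushed before they can be used.
        stack.append(pixel)
        stack.append(gain)
        pushes += 2
        # The multiply pops both operands and pushes one result.
        right = stack.pop()
        left = stack.pop()
        pops += 2
        stack.append(left * right)
        pushes += 1
        # Storing the result pops it off the stack again.
        scaled.append(stack.pop())
        pops += 1
    return scaled, pushes, pops


scaled, pushes, pops = scale_pixels_on_stack([10, 20, 30, 40], gain=2)
```

In this model every pixel costs three pushes and three pops for a single multiplication, and the gain is pushed again for each pixel even though it never changes, which is one example of the redundant work mentioned above.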
Register based processors typically access operands quickly but require much wider instructions and thus larger executables. For example, 5 bits are required to address each register in a register based processor having 32 registers. A typical instruction addressing two source registers, which contain operand values, and a destination register, for storing the results, requires at least 15 bits just to address the necessary registers. This does not include the additional 8 bits required for the opcode and other portions of the instruction. Consequently, even the smallest software application executable on a register based processor may be too large to fit in the available memory or storage area associated with a particular device.
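The bit counts in the preceding paragraph can be reproduced with a short calculation. The function names below are hypothetical, and the sketch assumes a fixed-width encoding in which every register field has the same size.

```python
def register_field_bits(num_registers):
    """Minimum number of bits needed to address one of num_registers registers."""
    return (num_registers - 1).bit_length()


def instruction_bits(num_registers, register_operands, opcode_bits):
    """Total width of an instruction with the given register fields and opcode."""
    return register_operands * register_field_bits(num_registers) + opcode_bits


# 32 registers need 5 bits per register field.
field_bits = register_field_bits(32)

# Two source registers plus one destination register: 3 * 5 = 15 bits.
register_bits = 3 * field_bits

# Adding the 8 bits for the opcode and other portions gives the full width.
total_bits = instruction_bits(32, register_operands=3, opcode_bits=8)
```

Under these assumptions a three-register instruction occupies 23 bits, of which 15 are spent purely on naming registers.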
What is needed is a method and apparatus for coupling a stack based processor to a register based functional unit. The register based functional unit should be capable of performing real-time and graphics operations while the stack based processor is performing stack based instructions.