An increasing number of devices used in business and home are controlled by small embedded microprocessors. Generally, these embedded processors are low-cost and include a limited amount of memory or storage for executing applications. Consequently, the applications executed on these embedded processors must also be relatively small and compact.
It is also desirable that these small applications be interoperable with a large class of devices, such as cellular phones, manufactured by different companies. This reduces the costs associated with developing software applications and therefore decreases the overall cost of ownership for the device. For example, cellular phone users should be able to transfer applications to each other and download them into their phones for processing. This would greatly enhance the flexibility and feature set of cellular phones even though the phones may be different models designed by different manufacturers.
A general purpose stack based processor fits these requirements well because stack instructions tend to be small and compact. The general purpose stack based processor includes a stack for storing operands and a stack processor which processes instructions by popping one or more operands off the stack, operating on them, and then pushing the results back on the stack for another instruction to process. Essentially, stack based executables are compact because the stack instructions reference operands implicitly using the stack rather than explicitly in the instructions. Storage saved by not referencing operands such as registers, memory addresses, or immediate values explicitly can then be used to store additional stack instructions.
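The pop, operate, and push cycle described above can be sketched in a few lines of Python. This is only an illustrative model, not the instruction set of any particular stack processor; the opcode names `push`, `add`, and `mul` and the function `run` are assumptions introduced for this example. Note that only `push` carries an operand, while `add` and `mul` find their operands implicitly on the stack.

```python
def run(program):
    """Execute a list of instruction tuples on a tiny, hypothetical stack machine."""
    stack = []
    for instruction in program:
        opcode, *operands = instruction
        if opcode == "push":
            # The only instruction with an explicit operand.
            stack.append(operands[0])
        elif opcode == "add":
            # Operands are implicit: pop two values, push their sum.
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "mul":
            # Operands are implicit: pop two values, push their product.
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack


# Computes (2 + 3) * 4; the arithmetic instructions name no operands.
program = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
result = run(program)
print(result)  # [20]
```

Because `add` and `mul` encode no register numbers or addresses, each can be represented by an opcode alone, which is the source of the compactness noted above.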
Embedding general purpose stack based processors in a wide variety of devices is also very cost effective. Compared with RISC (reduced instruction set computer) or CISC (complex instruction set computer) processors, stack processor research and development costs are relatively low. Another part of the cost effectiveness comes from developing software that can be shared and used by a wide variety of different devices. By increasing software interoperability between devices, stack based processors can be produced in high volumes and sold at low profit margins while still yielding high overall profits. For example, software applications consisting of architecturally neutral bytecode instructions can be readily shared when designed for execution on a Java Virtual Machine (JVM) stack based processor such as described in the book, "The Java Virtual Machine Specification" by Tim Lindholm and Frank Yellin, published by Addison-Wesley, 1997. These bytecode instruction based software applications are compact and substantially interoperable with almost any device utilizing, or simulating, a JVM stack based processor.
In most cases, these embedded processors are also required to perform many high performance multimedia operations involving digital video and audio. Typically, the embedded processor must decode images and audio stored in a compressed data format called MPEG¹. MPEG employs two basic techniques for video compression: block-based motion compensation for the reduction of temporal redundancy (i.e. several frames in a video sequence remain substantially the same over time) and discrete cosine transform (DCT) coding for the reduction of spatial redundancy (i.e. images within a frame are the same color or have the same intensity levels). To exploit temporal redundancy, MPEG uses intra-frames (I frames), predicted frames (P frames), and bidirectionally interpolated frames (B frames), so that most frames record only their differences from other video images rather than the complete image. To exploit spatial redundancy, the DCT is used to convert pel (picture element) values from the spatial domain (i.e. intensity variations expressed in terms of distance in an image) into the frequency domain (i.e. intensity variations expressed in terms of frequency in an image). In the frequency domain, the pixels in an image are represented as a combination of frequencies using a series of DCT coefficients. If images are moving within a frame, motion-compensated prediction uses motion vectors to represent a current frame as a translation of pixel values from a reference frame. Finally, the remaining pixel values and DCT coefficients are further compressed using run-length coding techniques well known in the art.
¹ MPEG (Moving Picture Experts Group) is a group of people who meet under ISO (International Organization for Standardization) to generate standards for digital video (temporal sequencing of images) and audio compression. MPEG is a nickname for the video/audio standards; the official name is ISO/IEC JTC 1 SC29 WG11, where IEC is the International Electrotechnical Commission, JTC 1 is Joint Technical Committee 1, SC29 is Subcommittee 29, and WG11 is Working Group 11.
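The conversion from the spatial domain to the frequency domain described above can be illustrated with a minimal sketch. It assumes a one-dimensional, orthonormal DCT-II over a single row of eight pel values; MPEG itself applies a two-dimensional transform to 8x8 blocks, so this is a simplification, and the function name `dct_1d` and the sample values are hypothetical.

```python
import math


def dct_1d(samples):
    """Return the orthonormal DCT-II coefficients of a sequence of samples."""
    n = len(samples)
    coefficients = []
    for k in range(n):
        # Correlate the samples with a cosine of frequency index k.
        total = sum(
            value * math.cos(math.pi * (i + 0.5) * k / n)
            for i, value in enumerate(samples)
        )
        # Scale factors that make the transform orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coefficients.append(scale * total)
    return coefficients


# A row of pels with identical intensity: maximal spatial redundancy.
flat_row = [100] * 8
coefficients = dct_1d(flat_row)

# All of the signal ends up in the first (zero-frequency) coefficient;
# the remaining coefficients are zero up to floating-point rounding.
print([round(c, 6) for c in coefficients])
```

A uniform row collapses into a single non-zero coefficient followed by zeros, and long runs of zeros are the kind of data that run-length coding compresses well.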
Unfortunately, general purpose stack based processors are generally not well suited for performing high-performance multimedia or other real time processing. In part, performance suffers because a stack based processor must manipulate the stack to gain access to its operands, and numerous machine cycles are spent pushing and popping operands on the stack. For example, graphics processing on a stack based processor is difficult because a single instruction cannot manipulate groups of pixels or data points as needed when performing various digital signal processing based compression/decompression techniques such as MPEG video or Dolby Digital/AC-3 based audio. Processing groups of pixels on a stack based processor would require numerous stack operations and would be inefficient. Potentially, each pixel value would have to be pushed on the stack and operated on individually. Each calculation would be a separate operation, and it would be difficult to take advantage of the redundant calculations that generally occur in image processing and audio processing. Clearly, the additional processing required on a stack based processor would make it difficult to perform these calculations in a time frame acceptable for users expecting real-time multimedia effects.
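The per-pixel overhead described above can be made concrete with a rough count. The sketch below adds two rows of eight pixel values one element at a time through a stack and tallies the instructions issued. The cost model (one instruction each for a push, an add, and a store) is an assumption made for illustration and does not correspond to measured cycle counts on any real processor; the function name `add_rows_on_stack` is likewise hypothetical.

```python
def add_rows_on_stack(row_a, row_b):
    """Add two pixel rows element by element, counting stack instructions."""
    stack = []
    instruction_count = 0
    result = []
    for a, b in zip(row_a, row_b):
        stack.append(a)                  # push first pixel
        instruction_count += 1
        stack.append(b)                  # push second pixel
        instruction_count += 1
        right = stack.pop()              # add: pop two, push the sum
        left = stack.pop()
        stack.append(left + right)
        instruction_count += 1
        result.append(stack.pop())       # store: pop the sum to memory
        instruction_count += 1
    return result, instruction_count


row_a = [10, 20, 30, 40, 50, 60, 70, 80]
row_b = [1, 2, 3, 4, 5, 6, 7, 8]
summed, instruction_count = add_rows_on_stack(row_a, row_b)
print(summed)             # [11, 22, 33, 44, 55, 66, 77, 88]
print(instruction_count)  # 32
```

Under this model, eight additions cost thirty-two instructions, and the count grows linearly with the number of pixels because no instruction operates on more than one pixel pair at a time.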
General purpose hardware processors are also not well suited for multimedia applications due to inherent architectural limitations. For example, many general purpose processors employ a cache memory and cache controller to manage the use of high speed cache among many applications. Typically, the cache controller swaps data in and out of cache to minimize the number of cache misses. Unfortunately, MPEG video stream decoding and other multimedia operations need cache and other high speed memory for extended periods of time. Further, many multimedia applications need more space than these caches can provide. Thus, the unavailability of memory and frequent cache misses can make multimedia operations, such as MPEG video, appear choppy, slow, and generally low quality.
Several manufacturers have integrated graphics functions into the general purpose processor to increase performance when executing multimedia applications. The UltraSparc processors designed and marketed by Sun Microsystems, Inc. employ VIS (visual instruction set) integrated graphics functions. Designing a central processor with integrated graphics functions is described in U.S. patent application Ser. No. 08/638,390, filed Apr. 26, 1996, entitled "A CENTRAL PROCESSING UNIT WITH INTEGRATED GRAPHICS FUNCTIONS", authored by Robert Yung, and assigned to the assignee of the present invention. Intel Corporation has also designed several processors with integrated graphics functions based upon the MMX instruction set. Many of these solutions attempt to operate the multiple integrated graphics functions in parallel using sophisticated compilers and hardware mechanisms to schedule instructions. However, the complexity of these integrated graphics functions often leads to structural hazards (i.e. contention for the same hardware resources) and data hazards (i.e. data dependencies) which leave portions of the processor idle while waiting for results or resources.
What is needed is a processor complex capable of executing multimedia applications and scalable to a wide range of processing environments.