1. Field of the Invention
The present invention relates generally to processing systems, and more particularly to methods and systems for software extensible multi-processing.
2. Description of the Prior Art
Computationally intensive applications, such as modeling nuclear weaponry, simulating pharmaceutical drug interactions, predicting weather patterns, and other scientific applications, require a large amount of processing power. General computing platforms or engines have been implemented to provide the computational power to perform those applications. Such general computing platforms typically include multiple single-chip processors (i.e., central processor units, or “CPUs”) arranged in a variety of different configurations. The number of CPUs and the interconnection topology typically define those general computing platforms.
To improve the functionality, reduce the cost, increase the speed, etc. of the general computing platforms, the multiprocessors and their architectures are migrating onto a system-on-a-chip (“SOC”). However, conventional approaches to designing multiprocessor architectures are focused on either the general programming environment or on a particular application. These conventional approaches cannot make many assumptions about (i.e., predict) the user's application, nor can they adapt their resources to optimize computations and communications in accordance with that application. This deficiency exists because the number of applications varies widely and each often has requirements that vary dynamically over time, depending on the amount of resources required. Also, those approaches that are focused on one particular application often provide high performance for only that application and thereby are inflexible to a user's changing needs. Further, the traditional approaches do not allow a user to optimize the amount of hardware for the user's specific application, resulting in a multiprocessor architecture with superfluous resources, among other deleterious effects.
Additionally, conventional approaches do not optimize communications among processors of a multiprocessor architecture for increased speeds and/or do not easily allow scalability of the processors of such an architecture. For example, one approach provides for “cache coherency,” which allows for creation of a programming model that is relatively less resource-intensive. With cache coherency, the programming model is similar to programming a uniprocessor. However, cache coherency is expensive in terms of hardware, for example, and does not scale well as the number of nodes increases. Scaling cache coherency beyond four nodes usually requires significant hardware complexity. In contrast, another approach provides for “message passing” to obtain a more scalable solution. But message passing typically requires users to learn a new programming model. Furthermore, message-passing machines and architectures often have additional hardware overhead because each processor element must have its own copy of the program for execution.
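By way of illustration only, the difference between the two programming models discussed above can be sketched in software. The following Python example is a minimal sketch under stated assumptions: threads stand in for processor nodes, a lock-protected variable stands in for coherent shared memory, and a queue stands in for a message-passing channel. The function names `shared_memory_sum` and `message_passing_sum` are hypothetical and are not taken from any system described herein, and the sketch does not model hardware cache coherency protocols or their cost.

```python
import queue
import threading


def shared_memory_sum(values, workers=4):
    """Shared-memory style: every worker updates one common total.

    The code reads much like uniprocessor code; the lock stands in for
    the coordination that coherent shared memory would otherwise hide.
    """
    total = [0]                # single location visible to all workers
    lock = threading.Lock()

    def work(chunk):
        partial = sum(chunk)
        with lock:             # serialize updates to the shared total
            total[0] += partial

    threads = [
        threading.Thread(target=work, args=(values[i::workers],))
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def message_passing_sum(values, workers=4):
    """Message-passing style: workers share nothing and send results.

    Each worker receives its own slice of the data and reports a
    partial sum as an explicit message on a channel (here, a queue).
    """
    results = queue.Queue()

    def work(chunk):
        results.put(sum(chunk))   # explicit send of the partial result

    threads = [
        threading.Thread(target=work, args=(values[i::workers],))
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Explicit receive: collect exactly one message per worker.
    return sum(results.get() for _ in range(workers))
```

Both functions return the same value for the same input. The second version makes every exchange of data explicit, which is the aspect of message passing that typically requires users to learn a different programming model.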
Some multiprocessor systems have used interface protocols, such as HyperTransport from the HyperTransport Technology Consortium of Sunnyvale, Calif., for communications between processors. Other examples of interface protocols used are Peripheral Component Interconnect (PCI) Express and RapidIO from the RapidIO Trade Association of Austin, Tex. These interface protocols have been primarily used in high-performance processing systems such as supercomputers, which are very expensive. The interface protocols have also been used in general-purpose processing systems. In one example, a system used HyperTransport channels in an array of processors from Advanced Micro Devices, Inc. (AMD) of Sunnyvale, Calif. These general-purpose processing systems are more expensive than embedded systems because the general-purpose processing systems have to include additional functionality to run a variety of applications that may change dynamically.