1. Field of the Invention
The present invention relates to a method of optimizing multi-set context switch for embedded processors, and more particularly to a method of optimizing multi-set context switch for embedded VLIW (very long instruction word) processors.
2. Description of the Related Art
The development of embedded systems has grown rapidly in recent years. These embedded systems are widely used in industry in a variety of ways, such as in communication, multimedia and automotive control systems. With current progress in silicon technology, implementing embedded systems using system-on-chip (SoC) designs has become preferable to assembled ASICs due to issues of cost, performance and power consumption. The ITRS (International Technology Roadmap for Semiconductors, http://public.itrs.net) roadmap describes the design trend of the SoC towards multi-core organization, which demands increasing integration of MPU, DSP, I/O cores, etc. FIG. 1 shows a conventional multi-core SoC architecture 1 including an MPU core 11 running the main application, a plurality of dedicated process cores 12 (e.g., DSP), other IP 13 (e.g., memory, peripherals) and an interconnect 14. This architecture is adopted by most designers and products to achieve application-specific granularity and flexibility by using more efficient cores with dedicated functions. In contrast to the classic computer architecture of multiprocessor systems, the programming and interfacing between the heterogeneous processors within the conventional multi-core SoC architecture 1 are usually provided by different models. Moreover, the dedicated process cores 12 (i.e., the included heterogeneous processors) with different instruction sets increase the complexity of developing applications. Programming at the assembly level, as practiced today, will no longer be easy in the future. System software support, such as compilers and operating systems (OS), will play a more important role than ever.
To support more effective application development in a multi-core system, the software organization should be customized for each processor core, as a stack of layers on top of the hardware. In the past, many of the programs developed on the dedicated processors were implemented as mixtures of functional code and specific code that performs minor scheduling and resource management, without separate layers. This non-layered design scheme limits flexibility and portability and has become one of the bottlenecks in SoC software design. As a result, a layer of OS services is usually required to minimize the difficulty of handling multitasking, complex inter-process communication, and miscellaneous resource management. The MPU core 11 (i.e., the main processor core) typically reuses a state-of-the-art embedded OS to support complete services and management at the application level. However, using a generic OS to support function-specific programs running on the dedicated process cores 12 (e.g., DSP) is not realistic for code size and performance reasons. Therefore, in recent years a customized, kernel-style, lightweight OS service has been preferred for supporting programming of the dedicated processors. Texas Instruments, for example, has developed DSP/BIOS for all platforms using their DSP products (Texas Instruments, Inc. TMS320 DSP/BIOS User's Guide, November, 2001). In addition, to reduce the number of read and write ports in the register files of VLIW architectures, and thereby reduce power and cost in designs, distributed register file and multi-bank register architectures are being adopted for high-performance and low-power VLIW DSP processors (refer to Tay-Jyi Lin et al., Proceedings of 2005 IEEE International Symposium on VLSI Design, Automation and Test, 2005, 335-338 and S. Rixner et al., International Symposium on High Performance Computer Architectures, 2000, 375-386).
The distributed register file and multi-bank register architectures present challenges for micro-kernel designs in reducing context switch overhead.