1. Field of the Invention
The present invention relates to the field of computers and, more particularly to a linker method and cache structure for minimizing worst-case-execution-time (WCET) of a computer program for computer architectures utilizing an instruction cache.
2. Description of the Related Art
The execution time of a computer program will vary depending on numerous factors such as, for example, the system hardware the program is running on, the amount of program inputs and outputs and the efficiency of the program code itself. Often times, it is desirable to determine the execution time of a program running on a particular computer system prior to its incorporation into an end product or system. In real-time applications, for example, it is essential to determine a program's worst-case-execution-time (WCET) to ascertain whether the program will meet predetermined timing deadlines and other system performance requirements. The WCET is an estimation of the worst possible execution time for the program and the system it is running on. Once determined, the WCET will indicate to system engineers whether hardware, software or other system changes are needed in order to satisfy system requirements.
One potential way to improve the WCET of a program is to incorporate cache memory, particularly an instruction cache, into the system architecture. This method, however, is rarely used due to the unpredictable nature of caches. A party required to give a WCET guarantee will not use an instruction cache due to this unpredictability. It, however, remains desirable to use an instruction cache to minimize WCET.
Typically, cache memory resides on-chip with the main processor of the system architecture while the memory containing the program instructions resides off-chip. As is known in the art, an executing program's performance is greatly enhanced by placing blocks of program instructions that are to be executed into an instruction cache as opposed to individually and continuously accessing the system's memory to retrieve the instructions. Thus, the use of an instruction cache avoids time consuming memory accesses associated with the retrieval of program instructions since the instructions are already loaded into the processor's cache.
Recently developed tools, such as Princeton University's CINDERELLA, exist for estimating the WCET of a program running on a system having an instruction cache. The execution time of a cached instruction is dependent upon whether the instruction results in a cache hit (i.e., found within the cache) or miss (i.e., not found within the cache) when the processor attempts to execute the instruction. A cache miss generally results from an address conflict within the cache (i.e., the cache address of the instruction was overwritten by another instruction having the same cache address). Upon a cache miss, the processor must retrieve the instruction from main memory and thus, the execution of the instruction takes a longer period of time. This tool performs a dataflow analysis of the program code and generate an integer-linear-programming problem (ILP) for that code. The ILP models the addressing of the cache to determine conflicts that would result in cache misses. This tool determines the WCET by providing a solution for the ILP. A description of such a tool is found in Li et al, "Cache Modeling for Real-time Software: Beyond Direct Mapped Instruction Caches," in Proceeding of the IEEE Real-Time Systems Symposium, December 1996. It is desirable, however, to use these methods to improve the design of an instruction cache to suitably minimize WCET.
Typically, a program is divided into numerous modules called "functions." Functions are groupings of program instructions that perform a specific task or tasks. A program will typically include a "main module" that initiates function calls to perform the tasks of the program. It is well known that splitting a program into individual functions enhances the ability to create, debug and operate the program.
Generally, having all program code in one long contiguous region of memory does not minimize cache conflicts, particularly when numerous individual program functions are involved. FIG. 1a illustrates the situation where a top level function F0 calls a second function F1 which in turn calls a third function F2. Depending upon the address locations of these functions F0, F1, F2, they can all be mapped to the same cache lines causing the second function F1 to thrash out the first function F0 upon its entry into the cache. That is, depending upon the location of the functions F0, F1, F2 in main memory, it is possible that the functions F0, F1, F2 will be placed into the same locations of the cache, causing one function, F1 for example, to overwrite another function, F0, for example, even though the function F1 will return to function F0. When the second function F1 returns, the cache will no longer hold any instructions from the first function F0, resulting in excess cache misses and a large WCET for the program.
It has been determined that relocating program functions F0, F1, F2 in memory could result in better cache mapping and accordingly, better WCET for the program. FIG. 1b illustrates how the three functions F0, F1, F2 can be placed into non-contiguous memory by placing code gaps between them and possibly permuting the order of the functions F0, F1, F2 (in general, code gaps can do anything that permuting the order of the functions can do and more). This would improve the caching of these functions since they would not be mapped into the same cache lines (as they would in the contiguous placement illustrated in FIG. 1a). Although this may provide better cache mapping, it would lead to severe memory wastage requiring the system to have much more memory than necessary and thus, increasing the size and cost of the system. Accordingly, there is a need and desire to minimize the WCET of a program running on a system architecture utilizing an instruction cache inexpensively and without wasting system resources.