Multiprocessor architectures are attracting growing attention as a design approach for making computer systems faster. However, it is difficult for human programmers of multiprocessor systems to keep track of the execution of their programs. Moreover, it is generally desirable to multithread computer programs in order to take maximum advantage of the parallel architecture. The gain in speed potentially achievable through multithreading is, however, at least partially offset by the increased Operating System (OS) overhead incurred by the multithreading setup. Thus, a judgment needs to be made as to when, and at what granularity, multithreading will be worthwhile. There is a need for a software development tool that will help the programmer make such a judgment by, e.g., keeping track of calls to the OS made by the program and gathering statistics that describe the execution of the program on the multiprocessor system. (A program intended to perform an external task will hereafter be referred to as an "application".)
In fact, certain software development tools, known as "profiling tools," are commercially available. These tools add some form of instrumentation, such as counters, to the executable code for measuring or estimating the number of times each basic block of code is executed. Under the assumption that CPU time is allocated with perfect efficiency, these measurements or estimates can be used to infer the amounts of time spent executing various parts of the code. However, the assumptions that underlie the use of these tools are seldom fully justified. Moreover, these tools achieve a resolution of several milliseconds, which is not fine enough for many code optimization problems. Still further, these tools provide no cross-processor coverage, and they provide only limited cross-process coverage.
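By way of illustration only, the counter-based approach described above might be sketched as follows. The sketch is not taken from any particular commercial tool. The block names, the function `instrumented_sum`, and the per-block cycle costs in `ASSUMED_CYCLES` are all hypothetical, and the time estimate relies on the same idealized assumption noted above, namely that each execution of a block costs a fixed amount of CPU time.

```python
from collections import Counter

# Hypothetical cost of one execution of each basic block, in CPU cycles.
# These figures are assumptions for illustration, not measurements.
ASSUMED_CYCLES = {"loop_head": 2, "even_branch": 5, "odd_branch": 3, "exit": 1}

# One counter per basic block, incremented each time the block is entered.
block_counts = Counter()


def instrumented_sum(n):
    """Toy function with a counter increment inserted in each basic block."""
    total = 0
    for i in range(n):
        block_counts["loop_head"] += 1
        if i % 2 == 0:
            block_counts["even_branch"] += 1
            total += i * i
        else:
            block_counts["odd_branch"] += 1
            total += i
    block_counts["exit"] += 1
    return total


def estimate_cycles(counts, cycles_per_block):
    """Infer per-block time as (execution count) x (assumed cost per execution)."""
    return {name: counts[name] * cost for name, cost in cycles_per_block.items()}


result = instrumented_sum(10)
estimates = estimate_cycles(block_counts, ASSUMED_CYCLES)
for name, cycles in estimates.items():
    print(f"{name}: {block_counts[name]} executions, ~{cycles} cycles")
```

The estimate is only as good as the fixed-cost assumption: cache misses, preemption, and time spent inside OS calls are invisible to the counters, which is one reason the inferred times may not be fully justified.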
Also available commercially are analysis tools that can show the user the percentages of time spent in user mode, in system mode, and idle. However, tools of this kind do not reveal how or where (i.e., in which calls to the OS) the application is spending its time when it is in system mode. These tools also fail to provide a comprehensive view of what is occurring, within a given time window, in all of the various processors at once.
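The kind of summary such analysis tools report can be approximated, for a single process, with the following minimal sketch. It assumes Python's standard `os.times()` call as the source of user-mode and system-mode CPU time. The names `mode_split` and `sample_work` are hypothetical, and the third fraction is simply wall-clock time not accounted for by either mode, which is only a rough stand-in for idle time.

```python
import os
import time


def mode_split(work):
    """Run work() and return (user, system, other) as fractions of elapsed time."""
    cpu_before = os.times()
    wall_before = time.perf_counter()
    work()
    cpu_after = os.times()
    wall = time.perf_counter() - wall_before

    user = cpu_after.user - cpu_before.user
    system = cpu_after.system - cpu_before.system
    # Wall-clock time not attributed to either mode (waiting, descheduled, etc.).
    other = max(wall - user - system, 0.0)

    total = user + system + other
    if total == 0.0:
        return 0.0, 0.0, 0.0
    return user / total, system / total, other / total


def sample_work():
    """Hypothetical workload: some computation plus a few calls into the OS."""
    sum(i * i for i in range(200_000))
    for _ in range(1_000):
        os.getpid()


user_frac, system_frac, other_frac = mode_split(sample_work)
print(f"user {user_frac:.1%}, system {system_frac:.1%}, other {other_frac:.1%}")
```

The output is a single three-way split. Nothing in it identifies which OS calls account for the system-mode share, or what any other processor was doing during the same interval.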