Computers have become an essential part of our everyday lives. They can process information quickly and accurately. Because of this, society has embraced computers for critical needs such as banking, space flight, medical care, air traffic control, and the like. A computer's speed and accuracy are paramount in these types of critical transactions. People have also come to expect the same performance from computers in non-critical applications, such as large information storage and retrieval systems. Thus, programs that execute high numbers of transactions per second, such as database programs and the like, also require high performance computing systems. These extreme demands on computing systems have driven great gains in the area of computing performance.
A computing system is generally composed of hardware and software components that interact with each other. The hardware components can be described generally as those parts of the computing system that a person can physically touch. These include processors, memory chips, hard drives, connecting wires and traces, and other supporting hardware devices. Typically, the processing hardware components are constructed so that they can recognize two logical states, namely a “0” state (or low electrical state) and a “1” state (or high electrical state). Employing a number of these states together in a sequence allows data to be stored and processed by the hardware. The software components contain instruction sets that utilize the hardware to accomplish a particular task. They are typically written in “code” that is a high level software language for representing the desired zeroes and ones (or “low” and “high” states). In this manner, software can be written to accurately control the hardware components to return a desired effect.
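As a minimal sketch of how a sequence of two-state values can carry data, the following Python fragment encodes a character as eight “0”/“1” states and decodes it again. The helper names `to_bits` and `from_bits` are hypothetical and are introduced here only for illustration.

```python
def to_bits(value: int, width: int = 8) -> str:
    """Return the sequence of 0/1 states that encodes a non-negative integer."""
    if value < 0 or value >= 1 << width:
        raise ValueError("value does not fit in the requested width")
    return format(value, f"0{width}b")


def from_bits(bits: str) -> int:
    """Recover the integer from its sequence of 0/1 states."""
    return int(bits, 2)


letter = "A"
encoded = to_bits(ord(letter))     # ord("A") is 65
decoded = chr(from_bits(encoded))

print(encoded)  # 01000001
print(decoded)  # A
```

High level code such as this is ultimately translated into the low and high states that the hardware recognizes.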
As can be expected as technology progresses, the line between what is hardware and what is software tends to blur a little. Thus, the concept of “firmware” arises, where the name indicates that it is not quite hardware but also not quite software. Generally speaking, firmware is ones and zeroes that reside in a somewhat permanent state on a hardware component to allow control of the hardware at a low level or “root” level. It is considered “firm” because it does not change often and is utilized for a particular type of hardware component or platform. Firmware typically handles hardware-specific interfaces and the startup sequences of the hardware components.
When computing systems were first developed, it was desirable to have some common software that could handle reading and writing to hard drives and some basic repetitive tasks necessary to operate the computing system. These included diagnostics, data file structures, and human-machine interfaces. A disk operating system was developed initially to handle file structures and basic interfaces. This progressed into what is known today as an “operating system.” Text-based user interfaces have largely given way to graphical user interfaces (“GUI”), which are now considered the norm. Thus, the disk operating system has developed into a full-blown, user-oriented operating system that provides a greater amount of flexibility, ease of use, and control over a computing system than was previously achievable.
With fast hardware and an easy-to-use operating system, all that is needed is a way to get the computing system to behave in a way that gives a desired result. This could be achieved by continuously altering an operating system. However, people typically have different tasks that they want a computing system to perform. So, the operating system remains “common” software, and additional task-specific software, called “application” software (or executable software), is written to perform those specific tasks. For example, if users want to balance their checkbook, they can install financial application software on their computing system and perform that task. Thus, having application software allows the computing system to expand its tasking capabilities without changing its hardware components and/or operating system. Utilizing this type of hardware and software architectural structure allows almost infinite task capability for a given computing system.
The typical limitations on a computing system's task capability are generally dictated by its processor speed. The amount of information a computing system can handle, and how fast it can handle it, usually indicate the performance that the system is capable of achieving. Therefore, increasing the performance of a computing system allows it to be more flexible and to do more work. This can be accomplished at any of the architectural levels of a computing system. Thus, strides have been made in optimizing both hardware components and software components for speed. As competing hardware manufacturers have introduced new and different hardware architectures for increased performance, operating systems and even applications must often change as well to utilize those changes before performance gains can be realized.
One of the first areas of hardware performance gains was the introduction of a data “cache.” This allowed frequently used data to be available quickly to hardware processing components, increasing their speed. Eventually, multi-leveled caches were developed, and some were even placed on the same semiconductor die (“onboard” cache) as the processor to achieve even faster response times. Along with optimizing frequently used data retrieval, manufacturers also worked on increasing the processing speed itself. Processor semiconductor chips were shrunk dramatically in size, and new materials were used to achieve even smaller chips. This allowed extremely fast state (zeroes and ones) changes within the processors. Today, processor speeds have reached beyond 3 gigahertz levels with front side bus speeds well over 500 megahertz. Increasing the bus (or “connection”) speed allows the processors to access “off-board” cache faster, which helps the processor sustain its speed.
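The caching idea described above can be sketched in software. The following Python fragment is only an analogy, not a model of any particular hardware cache: a small cache sits in front of a simulated slow store, and a counter shows how many requests actually had to reach that slow store. The names `read_from_slow_store` and `cached_read`, the cache size, and the access pattern are all assumptions made for illustration.

```python
from functools import lru_cache

slow_reads = 0  # counts trips to the simulated slow backing store


def read_from_slow_store(address: int) -> int:
    """Stand-in for a slow memory or disk access (hypothetical)."""
    global slow_reads
    slow_reads += 1
    return address * 2  # placeholder for the stored data


@lru_cache(maxsize=4)
def cached_read(address: int) -> int:
    """Serve repeated addresses from the cache; go to the slow store on a miss."""
    return read_from_slow_store(address)


# A made-up access pattern in which a few addresses are reused frequently.
for address in [1, 2, 1, 1, 2, 3, 1]:
    cached_read(address)

info = cached_read.cache_info()
print(slow_reads, info.hits, info.misses)  # 3 4 3
```

Of the seven requests, only three reach the slow store; the other four are served from the cache, which is the effect that makes frequently used data available more quickly.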
Typically, increasing a processor's speed may not require extensive changes to an operating system nor to applications that run on a computing system. These types of changes are generally “overall” performance increases that mean faster processing even with unchanged software. Unfortunately, there are physical limitations to this type of performance increase. Semiconductor sizes are nearing atomic levels where eventually it will not be possible to go any smaller. This has created a push in architectural optimization to increase processing in a computing system. Hardware manufacturers have begun to develop computing platforms (systems) with multiple processors instead of just a single processor. They have also introduced single physical packages that contain multiple processing cores in what used to be only a single processor core. Additionally, recent trends have produced processors with multiple “logical” processors that are utilized, for example, in simultaneous multi-threading. These logical processors are not physical processors, but appear as such from a user's perspective. They typically share functional resources such as adders and memory and the like. Caches have begun to be shared between both physical and logical processors. Buses have also been utilized as shared resources for performance gains. Thus, the hardware components in a computing system have grown quite complex in their architecture and can vary greatly with each computing platform.
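As a rough illustration of how logical processors appear to software, the Python standard library reports a count of logical CPUs without distinguishing physical cores from logical ones. The sketch below assumes nothing about the platform; the function name `describe_processing_entities` is hypothetical, and the affinity query is used only where the operating system provides it.

```python
import os


def describe_processing_entities() -> dict:
    """Report what the standard library can see of the platform's processors."""
    logical = os.cpu_count()  # number of logical CPUs, or None if undetermined

    usable = None
    # os.sched_getaffinity is not available on every platform, so check first.
    if hasattr(os, "sched_getaffinity"):
        usable = len(os.sched_getaffinity(0))  # CPUs this process may run on

    return {"logical_cpus": logical, "usable_by_this_process": usable}


report = describe_processing_entities()
print(report)
```

The result varies from machine to machine, which reflects the point that the processing architecture can differ greatly with each computing platform.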
This newer breed of enhanced platform optimization requires changes in software to fully realize the platform's potential. The reason for this is the introduction of multiple processing entities, whether physical, logical, or both. A software application can often increase its performance by utilizing more than one processing entity. This is not always the case, because it requires that the application have internal processes that need not run serially (i.e., in a strict sequence where one action must always precede another), so that multiple processes can execute at the same time. An application must also be aware that it has access to a platform with multiple processing entities, and its code must be written so that it can optimize itself for a particular processing architecture. This requires changes to the software application before a user will obtain increased performance.
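The distinction between serial and independent work can be sketched as follows. In this hypothetical Python example, `running_total` is serial by nature because each step needs the previous result, while `squares_concurrently` hands independent items to several workers sized from the reported processor count. The function names and data are assumptions for illustration; the sketch shows the structure of the work and does not measure a speedup (in standard CPython builds, CPU-bound work generally needs a process pool rather than threads to run in parallel).

```python
import os
from concurrent.futures import ThreadPoolExecutor


def square(n: int) -> int:
    return n * n


def running_total(values):
    """Serial by nature: each step depends on the previous step's result."""
    totals, accumulated = [], 0
    for value in values:
        accumulated += value
        totals.append(accumulated)
    return totals


def squares_concurrently(values, workers=None):
    """Independent items: no item needs another item's result."""
    workers = workers or os.cpu_count() or 1  # adapt to the platform
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, values))  # map preserves input order


data = [1, 2, 3, 4]
print(squares_concurrently(data))  # [1, 4, 9, 16]
print(running_total(data))         # [1, 3, 6, 10]
```

Only the second kind of work can benefit from additional processing entities, and only if the application asks the platform how many are available.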
Because of the constant need to increase computing system speeds, it is very likely that performance strides will continue to be made. Therefore, it is unlikely that the hardware architectures utilized today will be the only ones used in the future. It is more likely that even higher complexity architectures will be developed with even more varying combinations. This will also drive an increase in the complexity of software applications, so that they can adequately exploit the hardware architecture and fully optimize their performance. With this increase in complexity comes an increase in the difficulty of extracting the maximum performance from both the software and the hardware. Although each can be optimized for speed, performance monitoring to evaluate optimal utilization must also be capable of keeping pace with the performance increases of both hardware and software. Today, performance can typically only be monitored by keeping track of idle times and spot checking memory utilization. Multiple logical CPUs and their shared resources only compound the difficulty of tracking resource utilization. Often averages are utilized to estimate total resource usage, resulting either in underutilization or in more processes being allowed than the computing resources can handle.
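A small worked example shows why an average can mislead. The per-CPU utilization figures below are invented for illustration, and the function name `summarize` and the saturation threshold are assumptions; the point is only that a moderate average can hide logical CPUs that are already fully busy.

```python
from statistics import mean


def summarize(per_cpu_utilization, saturation_threshold=0.95):
    """Return the average utilization and the indices of saturated CPUs."""
    average = mean(per_cpu_utilization)
    saturated = [
        index
        for index, utilization in enumerate(per_cpu_utilization)
        if utilization >= saturation_threshold
    ]
    return average, saturated


# Made-up sample: two logical CPUs fully busy, two nearly idle.
samples = [1.00, 1.00, 0.10, 0.10]
average, saturated = summarize(samples)

print(f"average={average:.2f} saturated={saturated}")
# average=0.55 saturated=[0, 1]
```

An estimate based on the 55 percent average would suggest ample headroom, yet work scheduled onto the first two logical CPUs would find them already saturated, while the other two remain underutilized.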