1. Field of the Invention
The present invention relates generally to input/output performance. More specifically, the present invention relates to automatically analyzing input/output performance problems using a multi-level system.
2. Description of the Related Art
Modern software systems have complex, multi-level input/output stack implementations that make input/output performance issues difficult to diagnose and understand. Although much of the data needed for performance analysis is available from various levels of the software stack, this data is usually segregated and, as a result, cannot be easily analyzed.
An exemplary illustration is an input/output performance problem in a data warehousing environment where a database administrator observes poor input/output performance after a migration to different hardware, operating system, or database management system versions. Assuming that such a database management system issues a large number of sequential input/output requests and the storage device is capable of handling those requests at high throughput, the database administrator expects a high transfer rate. However, due to various configuration or operating system problems, the database administrator actually gets a very low transfer rate and high input/output wait time. Many hypotheses may be formed, and time is required to verify each one of them. Exemplary hypotheses include:
(1) If a file system with multiple threads accessing a small set of files is being used, there might be file locking issues or problems with file layout.
(2) Poor query plans could be generated by the optimizer, resulting in sub-optimal input/output patterns.
(3) A Redundant Array of Independent Disks (RAID) array could have degraded, so that many extra input/output operations are required to service reads.
(4) The system could be under memory pressure, and input/output queues could become congested writing out dirty data.
(5) The files being retrieved could be highly fragmented.
The database administrator begins by running a number of existing tools, such as vmstat, iostat, or even a profiler like oprofile, to collect various kinds of data. Then, the administrator must look over all of this data for anomalies that could point to the cause. After looking through the information, the administrator may find that the system has a very high interrupt rate and that the storage device is 100 percent busy throughout the query. That leads the administrator to look at low-level input/output statistics and find that the operating system is unexpectedly driving down a large number of smaller input/output operations. The administrator has now found the basic problem: smaller-than-expected input/output requests are being issued to the device. After a lot of trial and error, the database administrator looks for points in the input/output stack where requests are made and eventually suspects the file system. The administrator may then use more specific tools, such as filefrag, to examine file layout and find severe file fragmentation. In this case, a defragmentation tool may be applied as a corrective action.
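The low-level check in this walkthrough, comparing the size of the requests actually reaching the device against what a sequential workload should produce, can be sketched as follows. This is a minimal illustration and not part of any existing tool: the DiskSample type, the function names, the expected request size, and the 0.5 tolerance are assumptions introduced here. The only fact relied upon is that the Linux block-layer counters exposed in /proc/diskstats (and summarized by iostat) count sectors in 512-byte units.

```python
from dataclasses import dataclass

# Linux block-layer statistics count sectors in 512-byte units,
# regardless of the device's physical sector size.
SECTOR_BYTES = 512


@dataclass
class DiskSample:
    """Hypothetical snapshot of two cumulative per-device counters."""
    reads_completed: int  # number of read requests completed so far
    sectors_read: int     # number of 512-byte sectors read so far


def avg_read_size_kib(before: DiskSample, after: DiskSample) -> float:
    """Average size, in KiB, of the reads completed between two samples."""
    ops = after.reads_completed - before.reads_completed
    if ops <= 0:
        return 0.0  # no reads completed in the interval
    sectors = after.sectors_read - before.sectors_read
    return sectors * SECTOR_BYTES / ops / 1024


def requests_smaller_than_expected(before: DiskSample,
                                   after: DiskSample,
                                   expected_kib: float,
                                   tolerance: float = 0.5) -> bool:
    """Flag an interval whose average read is well below the expected size.

    expected_kib and tolerance are illustrative assumptions; a real
    analysis would derive them from the workload and the device.
    """
    observed = avg_read_size_kib(before, after)
    return 0.0 < observed < expected_kib * tolerance


if __name__ == "__main__":
    before = DiskSample(reads_completed=1000, sectors_read=8000)
    after = DiskSample(reads_completed=3000, sectors_read=24000)
    print(avg_read_size_kib(before, after))
    print(requests_smaller_than_expected(before, after, expected_kib=256))
```

A flagged interval only indicates that requests are small. It does not say which level of the stack split them, which is why the administrator in this scenario still has to inspect the file system separately.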
Although a number of input/output performance tools are available, such as sar, iostat, vmstat, and strace, none of them is able to look at different levels of the input/output stack and analyze the data. Overall, these tools give only hints as to why performance is poor, and none of them is designed to perform any type of multi-level analysis.