In a typical data processing system, a number of software tools require regular access to disk storage and other auxiliary storage. In addition to input and output operations (storage I/O operations) directly resulting from a system user's interaction with an application program, disk access is required for utilities such as indexing, taking a backup to offline storage, and archiving with data movement and compaction. Disk access is also required for periodically executing applications such as virus scanning and spyware detection.
The tools performing each of these file-level operations are typically unaware of the other tools, even when they run in parallel. In currently available personal computers, any disk I/O operation is relatively slow because a read/write element must move physically relative to the disk. When two or more of these tools perform batch processing of a large number of files, their repeated requirements for disk access can lead to very slow system performance. This is problematic when the tools run in parallel and compete for resources. It is also problematic when one batch processing utility is followed by another, because slow system performance can then continue for an unacceptably long period of time.
Considerable work has been done on scheduling of disk access operations. For example, US Patent Application Publication No. 2002/0143847 describes a method of scheduling for a mixed-priority workload comprising high-priority online transaction processing operations and lower-priority monitoring operations. US 2002/0143847 describes issues affecting response times, such as conflicts between requests and parallel queries across multiple storage volumes. The dispatch of processes servicing a low-priority workload is deferred, and low-priority I/O operations are not performed, to ensure that high-priority workload items are not deferred beyond acceptable response time bounds.
European Patent Application Publication Number EP 1193967 also describes a disk scheduling algorithm for a mixed-priority workload. Requests held in queues are reorganized when a new request arrives such that a low-priority request is only serviced if this can be done without violating the deadline constraints of a higher-priority request.
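The general idea of servicing a low-priority request only when no higher-priority deadline is put at risk can be illustrated with a minimal sketch. This is not the algorithm of EP 1193967. The `Request` type, the estimated `service_time` field, and the functions `can_service_low_first` and `next_request` are hypothetical names introduced here for illustration, and the sketch assumes that high-priority requests are otherwise serviced in earliest-deadline-first order.

```python
from dataclasses import dataclass


@dataclass
class Request:
    name: str
    service_time: float             # assumed estimate of time needed to service the request
    deadline: float = float("inf")  # absolute deadline; low-priority requests have none


def can_service_low_first(now, low, high_queue):
    """Return True if servicing `low` first still lets every queued
    high-priority request finish by its deadline (earliest deadline first)."""
    finish = now + low.service_time
    for req in sorted(high_queue, key=lambda r: r.deadline):
        finish += req.service_time
        if finish > req.deadline:
            return False
    return True


def next_request(now, low_queue, high_queue):
    """Pick the next request to service, removing it from its queue."""
    if low_queue and can_service_low_first(now, low_queue[0], high_queue):
        return low_queue.pop(0)
    if high_queue:
        high_queue.sort(key=lambda r: r.deadline)
        return high_queue.pop(0)
    return None  # nothing to service


# Hypothetical workload: two high-priority requests and two low-priority ones.
high = [Request("h1", 2.0, 10.0), Request("h2", 3.0, 6.0)]
low = [Request("l_short", 1.0), Request("l_long", 4.0)]

first = next_request(0.0, low, high)    # short low-priority request fits
second = next_request(1.0, low, high)   # long one would make h2 miss its deadline
```

In this sketch the short low-priority request is serviced first because both deadlines remain reachable afterwards, whereas the long one is held back in favor of the high-priority request with the earliest deadline. Note that the number of disk access operations is unchanged; only their order differs.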
U.S. Pat. No. 6,078,998 is another patent specification describing scheduling with reference to deadlines and priorities, in this case aiming to improve disk utilization and efficiency for a data-intensive application such as a multimedia application by calculating a global optimization of seek time.
U.S. Pat. No. 6,182,197 describes ordering data access requests for shared disk data based on priorities, and re-ordering requests in response to updated priorities.
Each of the above-identified patent specifications describes an attempt to handle different priorities of requests for particular types of process. However, known attempts at scheduling disk access operations have focused on optimizing the order in which to perform the separate tasks, without any attempt to reduce the number of separate disk access operations.