1. Field of the Invention
The present invention relates generally to the field of scheduling tasks, and more particularly to scheduling tasks involving varying staging, processing, and destaging times.
2. Description of the Related Art
The invention has particular applicability to data processing in contexts where the data to be processed is stored on a medium such as tape that is accessed sequentially rather than randomly and therefore may involve significant staging times (the time required to access the data and/or place the data in a temporary storage medium where it may be accessed quickly during processing) as well as processing times. Therefore, the invention will be discussed in connection with data processing tasks (referred to herein as “jobs”), wherein the data is stored on tape. However, the invention is not limited to these contexts and may be used to schedule a wide variety of jobs, including jobs that do not involve a computer in any way as well as computer-related jobs regardless of whether the data is stored on tape.
Many data repositories are or are expected to become huge, possibly terabytes or petabytes in size. Given the current mass storage technology, these data will almost certainly reside on tapes. Even when expected increases in disk capacity are considered, sizeable portions of these repositories are likely to be on tape media. Analyzing, mining, and other data-intensive applications thus comprise tape-resident jobs, that is, jobs that process wholly, or in part, data from tape.
One tape-resident job application scenario of note is EOSDIS (Earth Observing System Data and Information System). EOSDIS will command and control a series of 10 observation satellites, and provide distributed data processing, archival, and distribution across eight DAACs (Distributed Active Archive Centers). Each DAAC consists of a set of processors which share disks that are connected to the archival system. The disks are used as a cache. Typical systems include a platform of 16 processors sharing 512 Gbytes of disk space, and up to three platforms sharing 4 tape drives. EOSDIS has been engineered to ensure that these tape drive subsystems are not a performance bottleneck.
The data are processed to yield data products that are stored on an archival set of tapes. About 0.6 terabytes of data will be collected daily, with a plan to accumulate a comprehensive global 15-year data set containing several petabytes of data. Data files will be downloaded from the satellites and archived on tape. Periodically, a predetermined batch of jobs will be run on possibly several days' worth of data to summarize and analyze the data. The resulting data products allow scientists to study, for example, water vapor, cloud profiles, and solar irradiance. Such batches contain jobs which typically have sophisticated access patterns that involve reading files from and writing to several tapes. The jobs have been planned in great detail; the size and number of files, as well as the number of floating point operations needed for each job, are known. Nearly five thousand tape-resident jobs are executed per day at the eight DAACs. These jobs have a large range of needs, with some jobs being processed in just over two minutes, while others require up to 11 days for processing. In nearly 90% of the tape-resident jobs, more time is spent on moving the data to and from tape than is spent processing the data.
A similar scenario arises in data processing at large corporations, which are increasingly incorporating bulk “bookkeeping” operations. By bookkeeping, we mean running a batch of basic operations such as collating, summarizing, compressing, and archiving data, and more generally, sophisticated data mining, billing, fraud detection, etc. Such bulk bookkeeping processes are run periodically (say daily, weekly, or monthly) on accumulated “sales data” which may be too large to be resident on disks.
Many other instances of processing batches of tape-resident jobs exist when massive data sets are manipulated. Spatial data manipulation within Geographic Information Systems (GIS) is another such instance.
A rather natural question arises, namely, how to order the execution of the tape-resident jobs in a given batch. Not all sequences of execution amongst these jobs make optimum use of the resources. For instance, a poor schedule may force processors to be idle while data is read from tape to disk, whereas a careful scheduler might be able to hide that latency by having scheduled other jobs previously to keep the processors busy. Thus better sequencing strategies may significantly reduce the total time for executing the given batch. These batches of jobs typically take a long time to execute (batches of jobs in EOSDIS, for example, may take several hours or, in cases where a batch is run once a month, may even take a few days), so decreasing the overall running time by even a small fraction may make a significant difference.
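The effect of ordering on total running time can be illustrated with a small sketch. The model used here is an assumption made only for illustration and is not taken from the text: a single tape drive stages jobs one at a time in schedule order, a single processor processes them in the same order, and disk space never prevents staging ahead. The function name `makespan` and the job tuples are hypothetical.

```python
def makespan(order):
    """Finish time of the last job for a given execution order.

    Assumed model (not from the source): one tape drive stages jobs one
    at a time in the given order, one processor processes them in the
    same order, and disk space never limits staging ahead.
    Each job is a (name, staging_time, processing_time) tuple.
    """
    stage_done = 0  # time at which the tape drive finishes the current job
    proc_done = 0   # time at which the processor finishes the current job
    for _name, staging, processing in order:
        stage_done += staging
        # Processing starts once the data is staged and the processor is free.
        proc_done = max(proc_done, stage_done) + processing
    return proc_done


a = ("A", 1, 5)  # short staging, long processing
b = ("B", 4, 2)  # long staging, short processing

print(makespan([a, b]))  # B is staged while A is being processed
print(makespan([b, a]))  # the processor idles while B is staged
```

Under these assumptions the first order finishes at time 8 and the second at time 11, because in the second order the processor waits for the long staging of job B before it can do any work.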
What is needed is a method for efficiently scheduling jobs with varying staging and processing times, such as tape-resident data processing jobs.
The present invention overcomes the aforementioned limitations of the prior art by providing a plurality of methods for efficiently scheduling jobs with varying staging, destaging, and processing times. In the first method, referred to as the merge method, the jobs are first divided into two or more sets based on a predetermined criterion such as whether the processing time or the staging time is longer. Then the jobs in each of the sets are ordered according to another predetermined criterion. Finally, the jobs are scheduled by alternating between the schedules for the sets. In one embodiment, jobs in the set with longer processing times are ordered according to a longest-processing-time-first criterion, while the jobs in the set with longer staging times are ordered according to a longest-staging-time-first criterion.
Thus, in accordance with the invention, the final ordering of jobs will be either: a) the job with the longest processing time from among those jobs with longer processing times than staging times, followed by the job with the longest staging time from among those jobs with longer staging times than processing times, followed by the job with the second longest processing time from among those jobs with longer processing times than staging times, etc.; or b) the job with the longest staging time from among those jobs with longer staging times than processing times, followed by the job with the longest processing time from among those jobs with longer processing times than staging times, followed by the job with the second longest staging time from among those jobs with longer staging times than processing times, etc.
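The merge ordering described above might be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `merge_schedule`, the tuple layout, and the sample jobs are hypothetical. Two points the text does not specify are handled by assumption: jobs whose staging and processing times are equal are placed in the longer-staging set, and once one set is exhausted the remaining jobs of the other set are scheduled in their sorted order.

```python
from itertools import zip_longest


def merge_schedule(jobs, processing_first=True):
    """Order jobs by the merge method.

    Each job is a (name, staging_time, processing_time) tuple.
    Assumption: ties (staging == processing) go to the staging-heavy set.
    """
    # Set 1: processing longer than staging, longest processing time first.
    proc_heavy = sorted((j for j in jobs if j[2] > j[1]),
                        key=lambda j: j[2], reverse=True)
    # Set 2: staging at least as long as processing, longest staging first.
    stage_heavy = sorted((j for j in jobs if j[2] <= j[1]),
                         key=lambda j: j[1], reverse=True)

    first, second = ((proc_heavy, stage_heavy) if processing_first
                     else (stage_heavy, proc_heavy))

    # Alternate between the two sorted lists; zip_longest pads the shorter
    # list with None, which is skipped.
    order = []
    for x, y in zip_longest(first, second):
        if x is not None:
            order.append(x)
        if y is not None:
            order.append(y)
    return order


jobs = [("A", 1, 5), ("B", 4, 2), ("C", 2, 9), ("D", 7, 3), ("E", 6, 1)]
print([j[0] for j in merge_schedule(jobs)])                          # variant a)
print([j[0] for j in merge_schedule(jobs, processing_first=False)])  # variant b)
```

For the sample jobs, variant a) yields C, D, A, E, B and variant b) yields D, C, E, A, B.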
The second method, referred to herein as the reverse method, again divides the jobs into two or more sets based on a predetermined criterion such as whether the processing time or the staging time is longer. Then the jobs in each set are sorted using some predetermined criterion, and the resulting schedules are ordered. For example, the jobs in the set with longer staging times than processing times are ordered by shortest staging time first, while the jobs with longer processing times than staging times are ordered by longest processing time first. The schedule for the jobs with longer processing times is then appended to the schedule for the jobs with longer staging times; this is called the Reverse Johnson method.
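A minimal sketch of the Reverse Johnson example follows. The function name `reverse_johnson` and the sample jobs are hypothetical, and the handling of jobs with equal staging and processing times (placed in the longer-staging set here) is an assumption, since the text does not address ties.

```python
def reverse_johnson(jobs):
    """Order jobs by the Reverse Johnson example of the reverse method.

    Each job is a (name, staging_time, processing_time) tuple.
    Assumption: ties (staging == processing) go to the staging-heavy set.
    """
    # Staging-heavy jobs, shortest staging time first.
    stage_heavy = sorted((j for j in jobs if j[1] >= j[2]),
                         key=lambda j: j[1])
    # Processing-heavy jobs, longest processing time first.
    proc_heavy = sorted((j for j in jobs if j[1] < j[2]),
                        key=lambda j: j[2], reverse=True)
    # The processing-heavy schedule is appended to the staging-heavy one.
    return stage_heavy + proc_heavy


jobs = [("A", 1, 5), ("B", 4, 2), ("C", 2, 9), ("D", 7, 3), ("E", 6, 1)]
print([j[0] for j in reverse_johnson(jobs)])
```

For the sample jobs this yields B, E, D, C, A: the staging-heavy jobs in increasing staging time, then the processing-heavy jobs in decreasing processing time.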
The third method, referred to as the fold method, initially orders the jobs by one criterion and schedules jobs from each end of the list in alternating order. The jobs may initially be ordered by longest staging time first or longest processing time first. The jobs are then scheduled from each end of the list, e.g., first, last, second, second-to-last, third, etc. (or last, first, second-to-last, second, etc.).
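The fold ordering might be sketched as follows. The function name `fold_schedule`, its parameters, and the sample jobs are hypothetical and serve only to illustrate the alternation between the two ends of the sorted list.

```python
def fold_schedule(jobs, by="staging", start_from_front=True):
    """Order jobs by the fold method.

    Each job is a (name, staging_time, processing_time) tuple.
    `by` selects the initial sort key: "staging" or "processing".
    """
    index = 1 if by == "staging" else 2
    # Initial order: longest staging (or processing) time first.
    ordered = sorted(jobs, key=lambda j: j[index], reverse=True)

    result = []
    lo, hi = 0, len(ordered) - 1
    take_front = start_from_front
    # Take jobs alternately from the front and the back of the sorted list.
    while lo <= hi:
        if take_front:
            result.append(ordered[lo])
            lo += 1
        else:
            result.append(ordered[hi])
            hi -= 1
        take_front = not take_front
    return result


jobs = [("A", 1, 5), ("B", 4, 2), ("C", 2, 9), ("D", 7, 3), ("E", 6, 1)]
print([j[0] for j in fold_schedule(jobs)])                          # first, last, second, ...
print([j[0] for j in fold_schedule(jobs, start_from_front=False)])  # last, first, second-to-last, ...
```

Sorted by longest staging time, the sample list is D, E, B, C, A, so folding from the front gives D, A, E, C, B and folding from the back gives A, D, C, E, B.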
In a fourth method, the jobs are ordered by the 3-2 Reverse Johnson method. The name “3-2” refers to the reduction of three variables (staging time, processing time, and destaging time) to two variables. Destaging time is the time required after processing to move the processed data to an appropriate location; in the case of many tape-resident processing jobs, the destaging time is zero, since the end result of the processing job is often a small summary which may be stored on a disk rather than tape. In this method, the staging time for each job is set equal to the staging time plus the processing time, while the processing time is set equal to the processing time plus the destaging time. The reverse method is then performed using the new staging and processing time parameters.
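The 3-2 reduction can be sketched as follows. The function names, the tuple layout, and the sample jobs are hypothetical. The reverse step uses the Reverse Johnson ordering (staging-heavy jobs by shortest staging time, then processing-heavy jobs by longest processing time), with ties placed in the staging-heavy set as an assumption.

```python
def reverse_order(jobs):
    """Reverse Johnson ordering on (name, staging, processing) tuples.

    Assumption: ties (staging == processing) go to the staging-heavy set.
    """
    stage_heavy = sorted((j for j in jobs if j[1] >= j[2]),
                         key=lambda j: j[1])
    proc_heavy = sorted((j for j in jobs if j[1] < j[2]),
                        key=lambda j: j[2], reverse=True)
    return stage_heavy + proc_heavy


def three_two_reverse_johnson(jobs):
    """Order (name, staging, processing, destaging) jobs; returns names."""
    # Reduce three variables to two:
    #   new staging    = staging + processing
    #   new processing = processing + destaging
    reduced = [(name, s + p, p + d) for name, s, p, d in jobs]
    return [name for name, _s, _p in reverse_order(reduced)]


jobs = [
    ("A", 1, 5, 4),
    ("B", 4, 2, 0),
    ("C", 2, 9, 3),
    ("D", 7, 3, 0),
    ("E", 3, 1, 0),
]
print(three_two_reverse_johnson(jobs))
```

For the sample jobs the reduced (staging, processing) pairs are A (6, 9), B (6, 2), C (11, 12), D (10, 3), and E (4, 1), which gives the order E, B, D, C, A. Because the processing time appears in both reduced values, a job falls in the staging-heavy set exactly when its original staging time is at least its destaging time.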