In modern computers, the operating system attempts to predict which section of a file an application will read next. The operating system then acts on this prediction before the application issues the request: the predicted section of the file is brought into the computer's memory from disk ahead of time, and the technique is therefore known as read ahead. Read ahead was developed to reduce the time spent waiting to receive data blocks from disk. For example, if it costs X seconds for an application to issue a synchronous read and wait until the data is retrieved from disk, then each such retrieval adds X seconds to the processing time. By predicting which data blocks the application will request, and issuing asynchronous read requests for the predicted data blocks, the operating system allows the application to continue processing existing data while the reads complete. When the application then requests its next block of the file, it will tend to find that the block is already in memory, and the operation completes much faster.
However, the prediction mechanism of the prior art is limited to positive sequences, i.e., it predicts only whether the next block is one block greater than the last block that was read, e.g. file blocks 1, 2, 3, 4. The rule applied by the prior art mechanism is as follows: if the last block plus one equals the current block, then the operating system issues a read for the current block and an asynchronous read for the next sequential block, i.e., the current block plus one. To support this rule, the operating system (OS) maintains a data structure, typically called a v-node, that is associated with the file that the application is reading. The v-node is used by the OS to track the file. The v-node maintains a list of the blocks that make up the file and their respective physical locations on the disk, as well as the last block read. Thus, the prediction mechanism consults the v-node for the file and determines whether the current request is equal to the last block read plus one. If so, it issues a read for the current request and an asynchronous read for the next block, and updates the last block read entry of the v-node to indicate the predicted request. For example, if the current request is for block 2, and the previous request was for block 1, then a read is issued for block 2 and an asynchronous read is issued for block 3. The last block read entry of the v-node is changed to block 3. If the current request is not equal to the previous request plus one, then only a read is issued for the current request and no asynchronous request is issued. The last block read entry of the v-node is changed to the current request. For example, if the current request is for block 5, and the previous request was for block 1, then a read is issued for block 5 and the last block read entry of the v-node is changed to block 5.
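The single decision step described above can be sketched in Python as follows. This is an illustrative sketch only, not an actual operating system implementation: the names VNode, last_block and handle_read are hypothetical, the v-node is reduced to the one field the rule consults, and no disk I/O is performed. The text does not specify how later requests for already-prefetched blocks interact with the last block read entry, so the sketch models one request at a time.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class VNode:
    """Hypothetical per-file record, reduced to the field the rule needs."""
    last_block: Optional[int] = None


def handle_read(vnode: VNode, current: int) -> Tuple[int, Optional[int]]:
    """Apply the sequential rule to one request.

    Returns (block read synchronously, block read ahead or None).
    """
    if vnode.last_block is not None and current == vnode.last_block + 1:
        prefetch = current + 1
        # As described in the text, the entry records the predicted block.
        vnode.last_block = prefetch
        return current, prefetch
    # No sequence detected: plain read, entry records the current request.
    vnode.last_block = current
    return current, None


# The two examples from the text, each starting with block 1 as the last read.
v = VNode(last_block=1)
print(handle_read(v, 2), v.last_block)   # (2, 3) 3

v = VNode(last_block=1)
print(handle_read(v, 5), v.last_block)   # (5, None) 5
```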
However, there are several problems with this approach, mainly because the prediction mechanism can only detect an application performing sequential block accesses, e.g. file blocks 1, 2, 3, 4, etc.
Note that the prediction mechanism cannot detect an application that reads backwards through the file, e.g. file blocks 6, 5, 4, 3, etc. This has been overcome by merely checking whether the current request is either plus or minus one of the previous read block.
Also note that the mechanism cannot detect an application performing accesses that are strided, e.g. file blocks 1, 3, 5, 7, etc. This problem has been overcome by modifying the v-node to maintain both the last and the second last reads. The prediction mechanism then checks whether the current read request is as distant from the last block read as the last block read is from the second last block read. If so, the application is predicted to read at a constant stride, and an asynchronous read is issued for the next block, which is one stride from the current block. For example, if the last block is block 3, the second last block is block 1, and the current block is block 5, then the mechanism will compare block 5 to block 3 and determine a stride of 2, which is equal to the stride of block 3 from block 1. The OS will then issue a read for block 5 and an asynchronous read for block 7. The last and second last blocks in the v-node will be updated to blocks 7 and 5, respectively. To detect an application that reads backwards with a stride through a file, e.g. file blocks 17, 15, 13, 11, the mechanism checks for either a positive or a negative stride.
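The stride check described above can be sketched in the same way. Again this is only an illustration under stated assumptions: VNode, last_block, second_last_block and handle_read are hypothetical names, a stride of zero (re-reading the same block) is assumed not to count as a pattern, and the history update on a mismatch (shift the last block into the second last entry) is an assumption, since the text specifies the update only for the matching case.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class VNode:
    """Hypothetical per-file record holding the two most recent entries."""
    last_block: Optional[int] = None
    second_last_block: Optional[int] = None


def handle_read(vnode: VNode, current: int) -> Tuple[int, Optional[int]]:
    """Apply the stride rule to one request.

    Returns (block read synchronously, block read ahead or None).
    A signed stride covers both forward and backward readers.
    """
    last, second = vnode.last_block, vnode.second_last_block
    if last is not None and second is not None:
        stride = current - last
        # Assumption: a zero stride is not treated as a pattern.
        if stride != 0 and stride == last - second:
            prefetch = current + stride
            # As in the text: last becomes the predicted block,
            # second last becomes the current block.
            vnode.second_last_block = current
            vnode.last_block = prefetch
            return current, prefetch
    # Assumption: on a mismatch the history simply shifts by one.
    vnode.second_last_block = last
    vnode.last_block = current
    return current, None


# Forward stride example from the text: history 1, 3 and a request for 5.
v = VNode(last_block=3, second_last_block=1)
print(handle_read(v, 5), v.last_block, v.second_last_block)  # (5, 7) 7 5

# Backward stride: history 17, 15 and a request for 13.
v = VNode(last_block=15, second_last_block=17)
print(handle_read(v, 13))  # (13, 11)
```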
Moreover, if more than one application, or more than one thread of a single application, reads the same file at the same time, then the prior art mechanism cannot detect either a sequence or a stride of blocks being read. For example, the prior art mechanism could not detect file blocks read in this order: 1, 100, 2, 101, where blocks 1 and 2 were requested by application A (or thread A of application C), and blocks 100 and 101 were requested by application B (or thread B of application C). This is because the v-node, which tracks the reads, is associated with the file and not with the application or thread. Thus, as the applications or threads alternate, the pattern of entries in the v-node is disrupted. For example, suppose the last read is block 100 and the second last is block 1. When the current request for block 2 is compared with the last and the second last entries, block 2 is not sequentially after block 100, and 2 - 100 gives a stride of -98, which does not equal the stride of +99 given by 100 - 1. Thus, no patterns are detected, and no read aheads are issued. This negatively impacts the performance of those applications or threads. When no predictions are made, the operating system cannot bring data into memory ahead of time, and each application, or each thread in an application, will stall on each read while waiting for the disk operation to complete. This problem will become more pronounced as the industry and consumers begin to use systems that comprise either multiple processes or applications which can utilize multiple threads.
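The disruption described above can be illustrated with a short Python sketch that applies the stride test to a trace of requests. Several points are assumptions made for illustration: the function name stride_hits is hypothetical, the two history entries are taken to hold the blocks actually requested, and the trace extends the example in the text by two further requests (blocks 3 and 102) so that each reader has three accesses.

```python
from typing import List, Optional


def stride_hits(requests: List[int]) -> List[bool]:
    """Report, for each request, whether the file-level stride test matches.

    Assumption: the history holds the two most recently requested blocks.
    """
    hits: List[bool] = []
    second_last: Optional[int] = None
    last: Optional[int] = None
    for current in requests:
        matched = (
            last is not None
            and second_last is not None
            and current != last
            and current - last == last - second_last
        )
        hits.append(matched)
        second_last, last = last, current
    return hits


# Interleaved accesses from readers A and B as seen at the file level.
# Blocks 3 and 102 are a hypothetical extension of the example in the text.
print(stride_hits([1, 100, 2, 101, 3, 102]))
# [False, False, False, False, False, False]

# The same test applied to each reader's requests in isolation.
print(stride_hits([1, 2, 3]))        # [False, False, True]
print(stride_hits([100, 101, 102]))  # [False, False, True]
```

At the file level the observed strides alternate between +99 and -98, so the test never matches, even though each reader taken alone is plainly sequential.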
The following is an example of a multi-threaded system and the problems it encounters using the prior art prediction mechanism. The system is a radar site that scans vertically at three levels, high (H), middle (M), and low (L), and feeds the data into a file. By scanning vertically, the site generates a cross-section of the atmosphere in the form of a data stream of H-M-L-H-M-L-H-M-L-H-M-L, as the site scans and resets, scans and resets, etc. Now suppose the application wants to process the data for any given level, e.g. depict the high level or the low level. The application therefore wants to read the data as H-H-H-H or L-L-L-L, and it requires the prior art stride reader because the distance between the records is constant. However, if the application is multi-threaded, then it is likely to start one thread to process Hs, a second to process Ms, and a third to process Ls. Each thread reads from the same file at the same time, and each is a strided reader. As each thread reads from the file, the last and second last values in the v-node are reset accordingly, and therefore the prior art stride prediction mechanism will never recognize the reading patterns. Moreover, the prior art sequential reader will not pick up an H-M-L-H-M-L pattern, because the Hs may have different processing times from the Ms or the Ls; thus the pattern being read will not be H-M-L-H-M-L, but rather H-M-L-L-M-H-M-M-L, etc. Thus, no predictions are possible, and each block of data is read directly from disk. This may slow the system to the point that a single thread or process would have processed the data faster than multiple threads or processes.
Note that moving the prediction mechanism from the file level to the process or thread level may overcome some of the problems discussed above, but introduces new problems which result in severely degraded performance. In the radar example there will be three threads of the application, with one reading Hs, one reading Ms, and one reading Ls. As each thread is time switched across the CPU(s), their access requests will show that they are each stride readers. Thus, the OS will issue the requests to the disk as a series of Hs, then perhaps Ms and Ls. Reading the information from the disk in this way will incur a large number of seeks, as the disk skips over M and L data to read H data, and likewise for M and L data. Moreover, the OS has not recognized that, at the file level, the access pattern is actually sequential; a file level sequential prediction mechanism would have detected this and performed many read aheads without seeks. Furthermore, the hard disk itself has a cache, which performs sequential read aheads, so that in reading block 1, blocks 2 and 3 are loaded into the disk cache for faster retrieval. By issuing stride reads instead of sequential reads, this feature is defeated, and disk drive performance is reduced. Thus, moving the prediction mechanism to the application level costs more seeks and reduces performance.
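The seek cost described above can be illustrated with a deliberately crude Python model. Everything in it is an assumption made for illustration: the file is taken to be 12 contiguous blocks laid out H-M-L-H-M-L, a seek is counted whenever the next block issued is not the block immediately following the previous one, and the names count_seeks, file_level_order and per_thread_order are hypothetical. Real disk behavior (rotational latency, track layout, disk cache) is not modeled.

```python
from typing import List


def count_seeks(issue_order: List[int]) -> int:
    """Count transitions where the next block does not directly follow the previous one."""
    return sum(
        1 for prev, nxt in zip(issue_order, issue_order[1:]) if nxt != prev + 1
    )


BLOCKS = 12  # hypothetical file; block i holds level "HML"[i % 3]

# File-level sequential read ahead issues the blocks in on-disk order.
file_level_order = list(range(BLOCKS))

# Per-thread stride read ahead issues all Hs, then all Ms, then all Ls.
per_thread_order = [b for level in range(3) for b in range(level, BLOCKS, 3)]

print(per_thread_order)               # [0, 3, 6, 9, 1, 4, 7, 10, 2, 5, 8, 11]
print(count_seeks(file_level_order))  # 0
print(count_seeks(per_thread_order))  # 11
```

Under this model every request issued in per-thread order requires a seek, whereas the file-level sequential order requires none.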
Note that the above problems are particularly relevant when the file is larger than the memory, and thus a portion of the file must reside on a hard disk or other mass storage medium, such as CD-ROM.
Therefore, there is a need in the art for a prediction mechanism that can recognize stride forward, stride backward, sequence forward, and sequence backward patterns hidden within the complex read patterns issued by multi-process or multi-thread systems, which make such patterns appear to be random accesses. Such a system would allow the operating system to accelerate the data on the path.