The present invention relates to a file prefetch control method for use with a computer system. More particularly, the invention relates to a method for controlling how a processor of a computer system prefetches files.
Generally, the operating system (OS) of a computer builds a file system in a randomly accessible secondary memory such as magnetic disks. The operating system reads part of disk contents into a main memory for various uses.
A file system of the operating system described in "The Design and Implementation of the 4.3 BSD Operating System" (Samuel J. Leffler et al., Addison Wesley, pp. 187-221) has its disk volume divided into a plurality of physical disk blocks, the disk volume having been designated at the time the file system is built. Under the operating system, file storage regions are assigned to the individual disk blocks thus prepared. Disk contents are stored into the main memory in units of disk blocks. File read and write operations are carried out using buffers secured in a storage area of the main memory. In addition, the operating system leaves as much content of the most recently accessed disk regions as possible in the main memory. Such measures allow the operating system to minimize the number of write and read operations to and from disks upon access to files, whereby the throughput of the file system is enhanced and the disk access wait time is shortened.
With the above-described type of file system, a write or read operation to or from a file takes place as follows: A disk block containing target data is calculated based on a target file to be accessed and on an offset relative to the target data to be reached. A check is made to see if the content of the target disk block already exists in a file system cache within the main memory. If the block in question is found to exist, the write or read operation is performed to or from the cache. The steps above eliminate an output or input to or from the physical disks, whereby the input/output wait time of the processor is reduced. If the target disk block is not found in the main memory, a buffer is allocated in the main memory and the content of the target block is read into the buffer.
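The lookup sequence described above can be illustrated with a minimal sketch. It is not taken from the cited operating system: the class name `BlockCache`, the 8192-byte `BLOCK_SIZE`, and the callback `read_block_from_disk` are hypothetical, and the mapping from a file offset to a physical disk block is simplified to a logical block number per file.

```python
BLOCK_SIZE = 8192  # assumed disk block size in bytes (illustrative only)


class BlockCache:
    """Hypothetical file system cache keyed by (file, logical block number)."""

    def __init__(self, read_block_from_disk):
        # read_block_from_disk(file_id, block_no) -> bytes of length BLOCK_SIZE
        self._read_block = read_block_from_disk
        self._buffers = {}   # (file_id, block_no) -> block contents
        self.disk_reads = 0  # counts how often the physical disk is touched

    def read_byte(self, file_id, offset):
        # Step 1: calculate the block holding the target data from the offset.
        block_no = offset // BLOCK_SIZE
        key = (file_id, block_no)
        # Step 2: check whether the block already exists in the cache.
        if key not in self._buffers:
            # Step 3 (miss): allocate a buffer and read the block into it.
            self.disk_reads += 1
            self._buffers[key] = self._read_block(file_id, block_no)
        # Step 4: serve the request from the cached buffer.
        return self._buffers[key][offset % BLOCK_SIZE]
```

In this sketch, two reads that fall within the same block cause only one call to the disk callback; the second read is served from the buffer already held in memory.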
It is well known that the larger the disk block, i.e., the unit in which files are assigned to storage regions on disks and in which data are written to and read from those regions, the higher the throughput of the file system. However, enlarging the disk block size tends to increase the regions on disks that cannot be utilized. Illustratively, if a large disk block is assigned in its entirety to a small file or to a small region at the end of a file, the area within the disk block which does not hold file contents is uselessly occupied.
To avert such wasteful practice, the operating system generally divides disk blocks. That is, a file or the last disk block of a file smaller in size than a full block is stored into one of small regions created by dividing a disk block. This technique has been proposed in order to minimize wasteful uses of disk regions.
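The trade-off between whole-block allocation and divided blocks can be made concrete with a small calculation. The sketch below is illustrative only: the function names and the sample sizes are assumptions, and it models only the final, partly filled block of a file being rounded up to either a full block or a whole number of smaller regions.

```python
def allocated_bytes(file_size, block_size, fragment_size=None):
    """Disk space allocated to a file of file_size bytes.

    Full blocks are always allocated whole. The tail (the part smaller than
    one block) is rounded up to a full block, or, if fragment_size is given,
    to a whole number of fragments obtained by dividing a block.
    """
    full_blocks, tail = divmod(file_size, block_size)
    if tail == 0:
        return full_blocks * block_size
    unit = fragment_size if fragment_size else block_size
    tail_units = -(-tail // unit)  # ceiling division
    return full_blocks * block_size + tail_units * unit


def wasted_bytes(file_size, block_size, fragment_size=None):
    """Allocated space that does not hold file contents."""
    return allocated_bytes(file_size, block_size, fragment_size) - file_size
```

Under these assumptions, a 1000-byte file in an 8192-byte block leaves 7192 bytes unused, whereas storing it in a single 1024-byte region of a divided block leaves only 24 bytes unused.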
One disadvantage of the above technique is that individual small regions derived from the division of a disk block accommodate mutually irrelevant files or fragments of such files. Each of the divided small regions is handled as a single disk block. The larger the number of divided disk blocks, the smaller the size of units in which disks are accessed. In particular, if a large number of small files exist, the throughput of the file system in question can deteriorate.
Conventionally, users are not allowed to define those allocated block locations on disks which are to retain contents of small files. Where a specific file whose fragments are distributed in a plurality of disk blocks tends to be accessed exclusively, a disk access time can be appreciably long because the disk blocks holding the contents of the entire file are dispersed over the disks.
Most operating systems perform prefetch operations on a secondary memory by resorting to asynchronous input and output manipulations in order to boost a cache hit rate of the file system in use. Under this scheme, the operating system retains a logical disk block number of the file most recently read into the main memory. If a pointer for the next access operation points to the logical disk block next to the logical disk block whose number has been retained, the operating system assumes the occurrence of sequential file access and reads in advance a plurality of subsequent logical disk blocks into the main memory through asynchronous input and output operations. Logical disk blocks refer to component disk regions of a file divided by the file system in increments of a predetermined disk block size.
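The sequential-access test described above can be sketched as follows. This is a simplified illustration, not the code of any particular operating system: the class name `ReadAhead` and the parameter `prefetch_count` are hypothetical, and the asynchronous input and output operations are represented only by the returned list of block numbers to be prefetched.

```python
class ReadAhead:
    """Hypothetical per-file read-ahead state."""

    def __init__(self, prefetch_count=4):
        self.last_block = None            # logical block most recently read
        self.prefetch_count = prefetch_count

    def on_read(self, block_no):
        """Record a read and return the logical blocks to prefetch, if any."""
        # Sequential access is assumed when the requested block directly
        # follows the block whose number was retained from the last read.
        sequential = (self.last_block is not None
                      and block_no == self.last_block + 1)
        self.last_block = block_no
        if sequential:
            first = block_no + 1
            return list(range(first, first + self.prefetch_count))
        return []
```

Because the state is kept per file and compares only block numbers within that file, a detector of this kind has no way of predicting that a different file will be read next.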
The above prefetch scheme is arranged to judge whether the access operation about to take place is sequential on the basis of the most recently read logical disk block and of the logical disk block to be read this time. It follows that the scheme is effective only in the case of sequential access within a given file. Conventionally, prefetch operations are not performed across a plurality of files likely to be read in sequence. Thus, it can take time to read a plurality of files even if they tend to be read consecutively.
It is therefore an object of the present invention to provide a method for prefetching a plurality of files that are accessed continuously.
It is another object of the present invention to provide a method that allows a large number of small files to utilize the storage regions of a memory efficiently, whereby the throughput of a file system is prevented from deteriorating.
In carrying out the invention and according to one aspect thereof, there is provided a file prefetch control method for use with a computer system, including the steps of: dividing a file into a plurality of partial files furnished with a partial file name each; and converting a request to access any one of the partial files using the corresponding partial file name into a request to access the entire file to which the requested partial file belongs; whereby the file as a whole is read out. With this method, a plurality of partial files which tend to be read out consecutively are managed as a single file. A request to read any one of such partial files is arranged to trigger prefetch of the other partial files that are likely to be read out together.
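The conversion of a partial file request into a whole-file request may be sketched as follows. This is one possible reading under stated assumptions, not the claimed implementation: the class name `PartialFileStore`, the in-memory index of (parent file, offset, length) entries, and the callback `read_whole_file` are all hypothetical.

```python
class PartialFileStore:
    """Hypothetical mapping from partial file names to a containing file."""

    def __init__(self, read_whole_file):
        # read_whole_file(parent_name) -> bytes of the entire file
        self._read_whole_file = read_whole_file
        self._index = {}  # partial name -> (parent name, offset, length)
        self._cache = {}  # parent name -> contents of the entire file
        self.whole_file_reads = 0

    def register(self, parent_name, parts):
        """Divide a file into named partial files given as (name, length)."""
        offset = 0
        for name, length in parts:
            self._index[name] = (parent_name, offset, length)
            offset += length

    def read(self, partial_name):
        # Convert the request for a partial file into a request for the
        # entire file to which the partial file belongs.
        parent_name, offset, length = self._index[partial_name]
        if parent_name not in self._cache:
            self.whole_file_reads += 1
            self._cache[parent_name] = self._read_whole_file(parent_name)
        # The other partial files are now in memory as a side effect.
        return self._cache[parent_name][offset:offset + length]
```

In this sketch, the first read of any partial file brings in the whole containing file, so that later reads of sibling partial files are served from memory without a further read of the underlying file.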
According to another aspect of the invention, there is provided a file prefetch control method for use with a computer system, including the steps of: rendering a plurality of partial files consecutively into a single file; and converting a request to access any one of the partial files into a request to access the single file to which the requested partial file belongs; whereby the whole file is read out. Even where a large number of small files exist, this method allows storage regions of the memory to accommodate the files efficiently and thereby prevents deterioration of the file system throughput.
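The effect of rendering small files consecutively into a single file can be illustrated with a short sketch. The function names, the index layout, and the 4096-byte block size used in the usage note are assumptions made for illustration; whole-block allocation without divided blocks is also assumed.

```python
def pack_files(files):
    """Lay out small files back to back in one blob.

    files maps a partial file name to its contents (bytes). Returns the
    combined contents and an index of name -> (offset, length).
    """
    index = {}
    chunks = []
    offset = 0
    for name, data in files.items():
        index[name] = (offset, len(data))
        chunks.append(data)
        offset += len(data)
    return b"".join(chunks), index


def extract(blob, index, name):
    """Return the contents of one partial file from the combined file."""
    offset, length = index[name]
    return blob[offset:offset + length]


def blocks_needed(size, block_size):
    """Whole disk blocks needed to hold size bytes (ceiling division)."""
    return -(-size // block_size)
```

For example, under these assumptions two files of 100 and 200 bytes stored separately occupy two 4096-byte blocks, whereas the 300-byte combined file occupies one.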