1. Field of Invention
The present invention relates to managing files in a computer, and relates more particularly to a storage device for storing variable length data files of different content and different length, to a method of storing variable length data files in the storage device, to a method of reading from the storage device, and to a program enabling the same.
2. Description of Related Art
Japanese Unexamined Patent Appl. Pub. JP-A-2007-228555 teaches a data storage method for segmenting image data for a scanned document into a plurality of files, compressing the segmented image data using a predetermined standard method (JPEG), and storing the image data with JPEG markers indicating the content of the image data. By changing the image size to which the compression process is applied according to the amount of available memory, this method enables handling scanned images of documents in a standard image format that can also be handled by other printing devices and personal computers, for example, as well as images that are larger than the standard management size.
One method of handling variable length data predicts the maximum length of the variable length data items and reserves fixed-length data fields of a size equal to each predicted maximum item length (such as 255 bytes) in a data file to simulate handling variable length data. Another method records each variable length data item from the beginning of a fixed-length data field, and fills the remaining portion (where variable length data is not recorded) with special data that is not normally used as data that is processed or manipulated. Because either approach lets variable length data be handled as fixed-length data, the location of each variable length data item in the data file can be easily calculated. However, because storage areas sized to the maximum expected length of the variable length data items are always reserved, storage space in the storage device is wasted, and if a variable length data item exceeding the predicted maximum length occurs, handling (processing) the data becomes difficult.
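The fixed-length field approach described above can be sketched as follows. This is a minimal illustration only, not the method of any cited publication; the names SLOT_SIZE, PAD, pack_fixed, and read_fixed are hypothetical, and the filler byte is assumed never to occur in real data.

```python
SLOT_SIZE = 255   # assumed predicted maximum item length, in bytes
PAD = b"\x00"     # assumed filler byte that never appears in real data


def pack_fixed(items):
    """Store each item at the start of its own SLOT_SIZE-byte field, padded with PAD."""
    out = bytearray()
    for item in items:
        if len(item) > SLOT_SIZE:
            # An item longer than the predicted maximum cannot be stored.
            raise ValueError("item exceeds the predicted maximum length")
        out += item + PAD * (SLOT_SIZE - len(item))
    return bytes(out)


def read_fixed(data, index):
    """Locate item `index` by arithmetic alone: offset = index * SLOT_SIZE."""
    start = index * SLOT_SIZE
    return data[start:start + SLOT_SIZE].rstrip(PAD)


items = [b"apple", b"kiwi", b"watermelon"]
data = pack_fixed(items)
# Bytes occupied by padding rather than by item content.
wasted = len(data) - sum(len(item) for item in items)
```

With these three short items, nearly all of the reserved space is padding, which illustrates the wasted storage area noted above, while an item longer than SLOT_SIZE is rejected outright.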
Another method of handling variable length data stores the variable length data items sequentially with no gaps therebetween in the data file, and reads the variable length data items one by one from the beginning of the data file (in actuality reading one word at a time) to find the desired variable length data item. This method can use the storage area of the storage device effectively because it neither predicts the maximum length of the variable length data items nor reserves fixed-length data fields sized to that predicted maximum length. However, reading the desired variable length data item can be time-consuming because the variable length data items must be read and evaluated one at a time from the beginning of the data file until the desired item is found.
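The sequential approach can be sketched in the same way. The text above does not state how one item is delimited from the next, so this sketch assumes a one-byte end marker (END); pack_contiguous and read_sequential are hypothetical names.

```python
END = b"\x00"   # assumed end-of-item marker; the source does not specify one


def pack_contiguous(items):
    """Store items back to back, each followed only by its end marker."""
    return b"".join(item + END for item in items)


def read_sequential(data, index):
    """Scan from the start of the file until item `index` is reached.

    Returns (item, bytes_examined); bytes_examined grows with the
    position of the item in the file.
    """
    pos = 0
    for _ in range(index):
        pos = data.index(END, pos) + 1   # skip over one earlier item
    end = data.index(END, pos)
    return data[pos:end], end + 1


data = pack_contiguous([b"apple", b"kiwi", b"watermelon"])
item, examined = read_sequential(data, 2)
```

No space is reserved beyond the items themselves, but finding the last item requires passing over every byte that precedes it.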
Japanese Unexamined Patent Appl. Pub. JP-A-H06-60120 and Japanese Unexamined Patent Appl. Pub. JP-A-H08-263338 teach examples of solutions for the foregoing problems of the related art.
Japanese Unexamined Patent Appl. Pub. JP-A-H06-60120 teaches a structure including at least one data file storing a plurality of variable length data items, at least one index data file having a 1:1 relationship to a data file and storing the storage location of each variable length data item in the data file, and at least one search criteria file storing an index file name and the storage location of each search criteria entry in the index file. An attribute identifier and an end identifier are respectively added before and after each variable length data item, and auxiliary data of an undefined length that is not searched is added as necessary between the end identifier of one variable length data item and the attribute identifier of the next variable length data item.
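The relationship between a data file and its index data file can be illustrated by the simplified sketch below. It is not the structure of the cited publication: the search criteria file and the auxiliary data are omitted, the identifiers are assumed to be single bytes that do not occur in item content, and ATTR, END, build, and lookup are hypothetical names.

```python
ATTR = b"\x01"   # hypothetical attribute identifier placed before each item
END = b"\x02"    # hypothetical end identifier placed after each item


def build(items):
    """Return (data_file, index_file); index_file[n] is the offset of item n."""
    data = bytearray()
    index = []
    for item in items:
        index.append(len(data))       # storage location of this item
        data += ATTR + item + END
    return bytes(data), index


def lookup(data, index, n):
    """Jump to item n through the index, then read up to its end identifier."""
    start = index[n] + len(ATTR)
    end = data.index(END, start)
    return data[start:end]


data, index = build([b"apple", b"kiwi", b"watermelon"])
```

Here the index holds one storage location per item, so an item can be reached without scanning the items stored before it.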
When creating or updating a data file in which contiguous variable length data items are stored in blocks, Japanese Unexamined Patent Appl. Pub. JP-A-H08-263338 teaches storing the total average data count per block, which is the average number of variable length data items stored in all blocks stored in the data file, and the partial average data count per block, which is the average number of data items stored in each of the blocks from the first block in the data file to each particular block, in the data file. To find a variable length data item of a specified number in the data file, the specified variable length data item is found using the total average data count per block, the partial average data count per block, and the number of the desired variable length data item.
However, because there may be more than one stored data file for one data file in the structure taught in Japanese Unexamined Patent Appl. Pub. JP-A-H06-60120, when a variable length data search is specified the stored data files must be opened one after another on the main storage device until the specified search criteria are found, and the search process therefore cannot be accelerated. Another problem is that, because auxiliary data of an undefined length that is not used for searching is added to the data files as needed, the storage area of the storage device cannot be used effectively.
While the technology taught in Japanese Unexamined Patent Appl. Pub. JP-A-H08-263338 enables effectively using the storage space in the storage device because variable length data items are stored contiguously in the data file, searching for specified variable length data is done by the process described below.
First, the number of the specified variable length data item is divided by the total average data count per block to calculate the block number of the block in which the specified variable length data item is thought to be stored. The block of the calculated block number is then sought, and the variable length data item stored at the beginning of that block is read to get its number. If the number of that variable length data item is greater than the number of the specified variable length data item, the specified variable length data item is stored in a block before that block. The partial average data count per block of the block identified by the foregoing calculation is therefore read, and the number of the specified variable length data item is divided by the partial average data count per block to again calculate the block number of the block in which the specified variable length data item is thought to be stored. This process repeats until the number of the first data item in the calculated block is less than or equal to the number of the specified variable length data item. When that condition is met, the variable length data items are read sequentially from the beginning of that block, and whether the specified variable length data item was found is confirmed.
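The search described above can be sketched as follows, under several assumptions that are not taken from the cited publication: item numbers are 0-based and consecutive, each block is represented only by its item count, and the number of the first item in each block is precomputed rather than read from the block. The description does not say what happens when the recalculated block number is not earlier than the previous one, so this sketch simply steps back one block in that case. The name locate is hypothetical.

```python
def locate(counts, n):
    """Find the block holding item n; counts[i] is the item count of block i.

    Returns (block_index, guesses), where guesses is the number of
    block-number calculations that were needed.
    """
    firsts, partial_avg, total = [], [], 0
    for i, c in enumerate(counts):
        firsts.append(total)                # number of the first item in block i
        total += c
        partial_avg.append(total / (i + 1)) # average over blocks 0..i
    if not 0 <= n < total:
        raise IndexError("item number out of range")
    total_avg = total / len(counts)

    # First estimate: item number divided by the total average count per block.
    guess = min(int(n / total_avg), len(counts) - 1)
    guesses = 1
    while firsts[guess] > n:                # item lies in an earlier block
        new = int(n / partial_avg[guess])
        # Assumed safeguard: always move at least one block toward the start.
        guess = new if new < guess else guess - 1
        guesses += 1
    # Read forward sequentially from the start of the estimated block.
    while n >= firsts[guess] + counts[guess]:
        guess += 1
    return guess, guesses


result = locate([10, 2, 2, 2], 9)
```

In this example the blocks are unevenly filled, so three block-number calculations are needed before the sequential read can begin, which illustrates why the location of an item cannot be determined directly.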
The method taught in Japanese Unexamined Patent Appl. Pub. JP-A-H08-263338 enables faster processing than a method in which variable length data items must be read one at a time (actually in word units) from the beginning of the data file in order to find the specified variable length data item. However, because the calculations must be repeated and the location of the specified variable length data item cannot be determined directly, the search process is not necessarily fast.