The present invention generally relates to file processing methods, data processing apparatuses and storage mediums, and more particularly to a file processing method and a data processing apparatus which compress a file such as a dictionary file related to one or a plurality of dictionaries, encyclopedias and the like, store the compressed file in a storage medium and read the stored file from the storage medium, and to a storage medium which stores a file such as a compressed dictionary file.
Recently, there are storage mediums such as a CD-ROM which prestores information related to a dictionary, encyclopedia or the like. By making access to such a CD-ROM from a computer, it is possible to read and display the information related to the dictionary, encyclopedia or the like. As a result, a large amount of information related to the dictionary, encyclopedia or the like can be stored in a single CD-ROM which is extremely compact. In addition, instead of obtaining the necessary information by opening a dictionary, encyclopedia or the like while using a computer, the necessary information can be read from the CD-ROM, thereby making it possible to greatly reduce the time and trouble to obtain the necessary information.
In a conventional CD-ROM which stores the information related to the dictionary, encyclopedia or the like, a dictionary file is made up of a dictionary data and a data related to index (hereinafter referred to as an index data). For example, in the case of an encyclopedia, the dictionary data includes a data (hereinafter referred to as a text data) related to a text which explains the meaning of a word, a data (hereinafter referred to as an image data) related to an image showing an animal if the word describes the animal, for example, a data (hereinafter referred to as an audio data) related to a sound such as a singing of a bird if the word describes the bird, for example, and the like. On the other hand, the index is used to retrieve a desired dictionary data from the dictionary file, and is provided with respect to the dictionary data. The index is sometimes also referred to as a keyword. The index data includes a pointer related to a heading, a pointer related to an item, and the like. The data related to the heading includes a headword. Further, the data related to the item includes a headword, comment, and the like.
Conventionally, because the storage capacity of the CD-ROM is relatively large, the text data and the index data are stored in the CD-ROM without being compressed. On the other hand, the amount of information included in the audio data and particularly the image data is large, and the audio data and the image data are respectively compressed according to appropriate compression techniques before being stored in the CD-ROM.
However, if one CD-ROM is required for each dictionary or encyclopedia, it is troublesome to utilize the dictionary data. For this reason, it is desirable to store the information related to a plurality of dictionaries, encyclopedias or the like in a single CD-ROM, but in this case, there is a problem in that the amount of information to be stored may exceed the storage capacity of the single CD-ROM even if the dictionary data is compressed. In addition, even in a case where the dictionary file to be stored in the CD-ROM relates to a single dictionary, encyclopedia or the like, as the amount of information of the dictionary file increases, the amount of information to be stored may exceed the storage capacity of the single CD-ROM even when the dictionary data is compressed.
Accordingly, it is conceivable to not only compress the dictionary data but to compress the entire dictionary file, including the index data, when storing the information related to the dictionary, encyclopedia or the like in the CD-ROM. But no method which is capable of efficiently compressing the entire dictionary file by a relatively simple technique and capable of expanding the compressed dictionary file in a short time has yet been proposed. Particularly in the case of the dictionary, encyclopedia or the like, the amount of information related to the index data is large. For this reason, if it takes a long time to carry out the process of restoring the index data when expanding the compressed dictionary file, an access time to the desired index data or dictionary data becomes long, thereby deteriorating the convenience of the dictionary, encyclopedia or the like.
Moreover, when compressing the dictionary data in units of the item of the index or in units of a fixed length, for example, it takes a long time to carry out the process of expanding the dictionary file because the amount of information related to the index data is large particularly in the case of the dictionary, encyclopedia or the like, thereby similarly deteriorating the convenience of the dictionary, encyclopedia or the like. For example, Japanese Laid-Open Patent Application No. 9-26969 proposes a telephone directory retrieval system which employs a method similar to the above. However, this proposed method does not compress the index data. In the case of the telephone directory, the amount of information related to the index data is small compared to the amount of information related to the telephone number, family name, given name, corporate name and address which correspond to the dictionary data. Consequently, the information compression efficiency as a whole would not greatly improve even if the index data of the telephone directory were compressed. Therefore, even if this proposed method were applied to the storage of the information related to the dictionary, encyclopedia or the like into the storage medium, the information compression efficiency of the dictionary file as a whole would not improve considerably.
Accordingly, in a case where the amount of information related to the index data is relatively large even when compared to the amount of information related to the dictionary data, such as the case of the dictionary, encyclopedia or the like, there has been a problem in that it is conventionally impossible to efficiently compress and store the dictionary file in the storage medium and to make access to the compressed dictionary file in a short time by a relatively simple process.
Hence, it is an object of the present invention to provide a file processing method, a data processing apparatus and a storage medium which are capable of efficiently compressing and storing a dictionary file in the storage medium and making access to the compressed dictionary file in a short time by a relatively simple process, even in a case where the amount of information related to an index data is large even when compared to the amount of information related to a dictionary data, such as the case of a dictionary, encyclopedia or the like.
Another object of the present invention is to provide a file processing method comprising a compressing step dividing data and index data with respect to the data into a plurality of sections, and compressing the sections to obtain a compressed file, and a storing step storing the compressed file in a storage medium together with address information of the sections after the compression. According to the present invention, it is possible to efficiently compress and store in the storage medium a file such as a dictionary file which is formed by data including an index, text of each item and the like. In addition, it is possible to carry out a file retrieval at a high speed by a relatively simple process, by expanding the compressed file for every section.
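The compressing step and the storing step described above can be illustrated by the following minimal sketch. It is not the method of the invention itself: the function name compress_in_sections, the 2048-byte default section size and the use of DEFLATE through Python's zlib module are all assumptions made for illustration. The text itself names only the Huffman code and the universal code as usable algorithms.

```python
import zlib

def compress_in_sections(index_data: bytes, data: bytes, section_size: int = 2048):
    """Divide index data and data into sections and compress each section.

    Returns the compressed file and the address information of the
    sections after compression (hypothetical layout, for illustration).
    """
    raw = index_data + data  # the index data is compressed together with the data
    sections = [raw[i:i + section_size] for i in range(0, len(raw), section_size)]

    compressed = bytearray()
    addresses = []  # start offset of each section after compression
    for section in sections:
        addresses.append(len(compressed))
        compressed += zlib.compress(section)  # each section is compressed independently
    addresses.append(len(compressed))  # end marker, so section i spans addresses[i]:addresses[i+1]
    return bytes(compressed), addresses
```

Because every section is compressed independently and its post-compression address is recorded, any one section can later be expanded without touching the others.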
When each section has a fixed length, it becomes unnecessary to include address information prior to the compression in the compressed file, and the data compression efficiency can be improved. On the other hand, when each section has a variable length, and said storing step further stores address information prior to the compression in the storage medium, it is possible to carry out the data expansion at a high speed by setting the section to an appropriate length depending on the data type and section.
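The difference between the two cases can be sketched as follows. The function names and the list original_starts (the stored pre-compression start address of each section) are hypothetical and serve only to illustrate why the fixed-length case needs no pre-compression address information.

```python
from bisect import bisect_right
from typing import Sequence

def find_section_fixed(offset: int, section_size: int) -> int:
    """Fixed-length sections: the section number follows from a division,
    so no pre-compression address needs to be stored."""
    return offset // section_size

def find_section_variable(offset: int, original_starts: Sequence[int]) -> int:
    """Variable-length sections: look up the stored pre-compression start
    addresses (sorted in ascending order) to find the containing section."""
    return bisect_right(original_starts, offset) - 1
```

For example, with 2048-byte sections an uncompressed offset of 5000 falls in section 2, and with stored start addresses 0, 1000, 4096 and 9000 the same offset also falls in section 2.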
When the file processing method further comprises a restoring step reading the compressed file from the storage medium and expanding each of the sections, so as to restore the data and the index data, it is possible to improve the file retrieval speed by using an auxiliary storage unit capable of making a high-speed data access and storing the restored data and index data in the auxiliary storage unit.
When the compressing step uses a compression algorithm and a compression parameter which are common to the data and the index data of each of the sections, it is possible to simplify both the data compression process and the data expansion process. More particularly, it is possible to use the Huffman code, the universal code and the like as the compression algorithm.
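As one possible reading of a common compression algorithm and compression parameter, the sketch below builds a single Huffman code table from the whole file and uses it for index sections and data sections alike. The function names are hypothetical, and bits are held as strings of "0" and "1" for readability rather than packed into bytes, which a practical implementation would do.

```python
import heapq
from collections import Counter

def build_huffman_code(sample: bytes) -> dict:
    """Build one code table (byte value -> bit string) from the sample bytes."""
    freq = Counter(sample)
    if not freq:
        return {}
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, partial code table for that subtree)
    heap = [(count, tie, {symbol: ""}) for tie, (symbol, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        count_a, _, codes_a = heapq.heappop(heap)
        count_b, _, codes_b = heapq.heappop(heap)
        # Merge the two least frequent subtrees, prefixing their codes with 0 and 1
        merged = {s: "0" + c for s, c in codes_a.items()}
        merged.update({s: "1" + c for s, c in codes_b.items()})
        heapq.heappush(heap, (count_a + count_b, tie, merged))
        tie += 1
    return heap[0][2]

def encode(section: bytes, code: dict) -> str:
    """Compress one section with the shared code table."""
    return "".join(code[b] for b in section)

def decode(bits: str, code: dict) -> bytes:
    """Expand one section with the same shared code table."""
    inverse = {v: k for k, v in code.items()}
    out, current = bytearray(), ""
    for bit in bits:
        current += bit
        if current in inverse:  # codes are prefix-free, so the first match is the symbol
            out.append(inverse[current])
            current = ""
    return bytes(out)
```

Since the table is built once from the index data and the data together, the expansion side needs only that one table regardless of which kind of section it is restoring.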
Still another object of the present invention is to provide a file processing method comprising a reading step reading a compressed file from a storage medium together with address information of each of a plurality of sections after compression, for each of the sections, said compressed file being obtained by dividing data and index data with respect to the data into the sections and compressing the sections, and a restoring step expanding the compressed file and restoring the data and the index data. According to the present invention, it is possible to carry out a high-speed file retrieval by a relatively simple process, by carrying out the expansion of the compressed file such as a compressed dictionary file for every section.
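The reading step and the restoring step can be sketched as below, under the assumption that the address information is a list of post-compression offsets with one extra end marker. The function read_section is hypothetical, an in-memory io.BytesIO object stands in for the storage medium, and zlib is used only as a convenient stand-in for the compression algorithm.

```python
import io
import zlib

def read_section(medium, addresses, section_no):
    """Seek to one compressed section and expand only that section."""
    start, end = addresses[section_no], addresses[section_no + 1]
    medium.seek(start)
    return zlib.decompress(medium.read(end - start))

# Build a tiny stand-in for the storage medium: one index section, two data sections.
sections = [b"index: apple, banana", b"apple: a red fruit", b"banana: a yellow fruit"]
blobs = [zlib.compress(s) for s in sections]
addresses = [0]
for blob in blobs:
    addresses.append(addresses[-1] + len(blob))  # address after compression
medium = io.BytesIO(b"".join(blobs))

# Only the requested section is read and expanded.
entry = read_section(medium, addresses, 1)
```

The retrieval cost depends on the size of the requested section rather than on the size of the whole file, which is the basis of the high-speed retrieval described above.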
A further object of the present invention is to provide a data processing apparatus comprising compressing means for dividing data and index data with respect to the data into a plurality of sections, and compressing the sections to obtain a compressed file, and storing means for storing the compressed file in a storage medium together with address information of the sections after the compression. According to the present invention, it is possible to efficiently compress and store in the storage medium a file which is formed by data including an index, text of each item and the like. In addition, it is possible to carry out a file retrieval at a high speed by a relatively simple process, by expanding the compressed file for every section.
Another object of the present invention is to provide a data processing apparatus comprising reading means for reading a compressed file from a storage medium together with address information of each of a plurality of sections after compression, for each of the sections, said compressed file being obtained by dividing data and index data with respect to the data into the sections and compressing the sections, and restoring means for expanding the compressed file and restoring the data and the index data. According to the present invention, it is possible to carry out a high-speed file retrieval by a relatively simple process, by carrying out the expansion of the compressed file for every section.
Still another object of the present invention is to provide a storage medium which stores computer-readable information, comprising reading means for causing a computer to read a compressed file from a storage medium together with address information of each of a plurality of sections after compression, for each of the sections, said compressed file being obtained by dividing data and index data with respect to the data into the sections and compressing the sections, and restoring means for causing the computer to expand the compressed file and restore the data and the index data. According to the present invention, it is possible to carry out a high-speed file retrieval by a relatively simple process, by carrying out the expansion of the compressed file for every section.
A further object of the present invention is to provide a storage medium which stores computer-readable information, comprising a compressed file stored together with address information of each of a plurality of sections after compression, for each of the sections, said compressed file being obtained by dividing data and index data with respect to the data into the sections and compressing the sections, where said compressed file is compressed using a compression algorithm and a compression parameter which are common to the data and the index data of each of the sections. According to the present invention, it is possible to efficiently compress and store a file in the storage medium. In addition, it is possible to carry out a file retrieval at a high speed by a relatively simple process, by expanding the compressed file for every section.
Another object of the present invention is to provide a storage medium which stores computer-readable information, including a program which causes a computer to carry out a compressing procedure for dividing dictionary data and index data with respect to the dictionary data into a plurality of sections, and compressing the sections to obtain a compressed dictionary file, and a storing procedure for storing the compressed dictionary file in the storage medium together with address information of the sections after the compression. According to the present invention, it is possible to retrieve the file at a high speed by carrying out a relatively simple process.
Still another object of the present invention is to provide a computer-readable storage medium storing a compressed file comprising a compressed data region storing compressed data obtained by dividing data and index data with respect to the data into a plurality of sections and compressing the sections, an address information region storing address information after compression of the sections, and a compression parameter region storing a compression parameter used for the compression. According to the present invention, it is possible to retrieve the file by carrying out a relatively simple process.
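One way to picture the three regions is the hypothetical byte layout sketched below. The field widths, the region order and the function names pack_file and unpack_section are assumptions made for illustration. The zlib compression level merely stands in for the compression parameter; in the case of a Huffman code, the parameter region would more plausibly hold the code table.

```python
import struct
import zlib

HEADER = "<BI"  # compression parameter (1 byte) and number of sections (4 bytes)

def pack_file(sections, level=6):
    """Lay out: compression parameter region, address information region,
    compressed data region (hypothetical layout)."""
    blobs = [zlib.compress(s, level) for s in sections]
    offsets, position = [], 0
    for blob in blobs:
        offsets.append(position)  # address of the section after compression
        position += len(blob)
    offsets.append(position)      # end marker
    parameter_region = struct.pack(HEADER, level, len(blobs))
    address_region = struct.pack(f"<{len(offsets)}I", *offsets)
    return parameter_region + address_region + b"".join(blobs)

def unpack_section(buf, section_no):
    """Read the regions back and expand a single section."""
    _level, count = struct.unpack_from(HEADER, buf, 0)
    table_at = struct.calcsize(HEADER)
    offsets = struct.unpack_from(f"<{count + 1}I", buf, table_at)
    data_at = table_at + 4 * (count + 1)  # compressed data region follows the table
    start, end = offsets[section_no], offsets[section_no + 1]
    return zlib.decompress(buf[data_at + start:data_at + end])
```

Keeping the address information in a region of its own means a reader can locate any section from the table alone, before reading any compressed data.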
Therefore, according to the present invention, even when the amount of information related to the index data is large even when compared with the amount of information related to the dictionary data, such as the case of the dictionary, encyclopedia and the like, it is possible to efficiently compress and store the file such as the dictionary file in the storage medium, and the file such as the compressed dictionary file can be accessed within a short time by carrying out the relatively simple process.
Other objects and further features of the present invention will be apparent from the following detailed description when read in conjunction with the accompanying drawings.