A general host system, such as a computer system, includes a nonvolatile large-capacity storage device such as a magnetic hard disk drive (HDD) or a solid-state drive (SSD) based on nonvolatile semiconductor memory.
The storage device includes, for example, an interface, a first memory block, a second memory block, and a controller (see, for example, U.S. Pat. No. 6,377,500).
The first memory block stores files. The second memory block serves as a buffer memory for write and read operations. The first memory block is nonvolatile and has a larger capacity than the second memory block, but its access speed is lower. The second memory block is used to compensate for the difference between the communication speed of the interface and the write/read speed of the first memory block. For example, the first memory block is a nonvolatile flash memory array, and the second memory block is a volatile DRAM or SRAM.
A problem with the conventional storage device is that it has no full-text search function of its own. In response to a content search request, a full-text search function searches the stored files for those that include the search target content and outputs a list of the matching files. Such content normally consists of words. Preferably, the storage device also has an advanced function that receives, as an input, a Boolean operation request combining a plurality of content search results with AND, OR, and NOT, and outputs a file list representing the result of the Boolean operation.
Methods of implementing the full-text search function include the inverted index method (for example, J. Zobel, A. Moffat and K. Ramamohanarao, "Inverted files versus signature files for text indexing," ACM Transactions on Database Systems (TODS), Volume 23, Issue 4 (December 1998), Pages 453-490). In the inverted index method, an index data file called an inverted file, which stores a list of the files that include a given content, is created in advance for each content. The inverted file is updated every time a file is added or deleted. In response to a content search request, the contents of the inverted file corresponding to the search target content are output as the search result. It is therefore unnecessary to scan the contents of all files for each full-text search.
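The inverted index method described above can be illustrated with a minimal sketch. It assumes that content is a whitespace-separated, case-insensitive word and that each inverted file can be modeled as an in-memory set of file names. The class name `InvertedIndex` and its methods are hypothetical and chosen only for illustration; they are not taken from the cited reference.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy inverted index: one 'inverted file' (a set of file names) per word."""

    def __init__(self):
        self._postings = defaultdict(set)  # word -> names of files containing it
        self._files = {}                   # file name -> set of words in that file

    def add_file(self, name, text):
        # Re-adding an existing file replaces its old entries.
        if name in self._files:
            self.delete_file(name)
        words = set(text.lower().split())  # assumption: content == lowercase word
        self._files[name] = words
        for word in words:
            self._postings[word].add(name)

    def delete_file(self, name):
        # Remove the file from every inverted file that lists it.
        for word in self._files.pop(name, set()):
            self._postings[word].discard(name)
            if not self._postings[word]:
                del self._postings[word]

    def search(self, word):
        # A search only reads one inverted file; no stored file is scanned.
        return sorted(self._postings.get(word.lower(), set()))


index = InvertedIndex()
index.add_file("a.txt", "flash memory array")
index.add_file("b.txt", "volatile memory buffer")
print(index.search("memory"))  # ['a.txt', 'b.txt']
index.delete_file("a.txt")
print(index.search("memory"))  # ['b.txt']
print(index.search("flash"))   # []
```

The sketch keeps a per-file word set so that deletion touches only the inverted files that actually list the deleted file, which mirrors the requirement that the inverted files be updated whenever a file is added or deleted.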
Conventionally, to implement the full-text search function using a storage device, the management of the inverted files stored in the storage device and the Boolean operations on a plurality of content search results must be performed using the central processing unit (CPU) or main memory (DRAM) of the host system.
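As an illustration of the Boolean operations on content search results mentioned above, the following sketch combines file lists with AND, OR, and NOT on the host side. It assumes each search result is a list of file names, and that NOT is evaluated against the full list of stored files. The function names and the sample file names are hypothetical.

```python
def boolean_and(result_a, result_b):
    """Files that appear in both search results."""
    return sorted(set(result_a) & set(result_b))


def boolean_or(result_a, result_b):
    """Files that appear in at least one search result."""
    return sorted(set(result_a) | set(result_b))


def boolean_not(result, all_files):
    """Files that do not appear in the search result.

    Assumption: the complement is taken over the list of all stored files.
    """
    return sorted(set(all_files) - set(result))


# Hypothetical search results for two different words.
all_files = ["a.txt", "b.txt", "c.txt"]
hits_flash = ["a.txt", "c.txt"]
hits_dram = ["b.txt", "c.txt"]

print(boolean_and(hits_flash, hits_dram))                           # ['c.txt']
print(boolean_or(hits_flash, hits_dram))                            # ['a.txt', 'b.txt', 'c.txt']
print(boolean_and(hits_flash, boolean_not(hits_dram, all_files)))   # ['a.txt']
```

In the conventional arrangement, every file list passed to these functions would first have to be read out of the storage device into host memory.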
However, because the inverted file data must pass between the host system and the storage device, and that communication is limited by the speed of the host interface, inverted file management and Boolean operations cannot be performed at high speed.