This invention relates to a method and apparatus for data searching in a computer environment, that is to say a method and apparatus for acting upon a search query supplied to a computer by a user and for locating data in accordance with the query. More particularly, but not exclusively, the invention relates to a method and apparatus for locating a text string which may be present in a database of stored text files and which is in accordance with a user supplied search query.
The invention also relates to a computer readable medium operable for supplying instructions to a computer to cause it to operate or become operable in accordance with said method and apparatus.
In order to identify or locate particular documents or blocks of text in a database of text files, it is known to provide a method and apparatus which can receive a user-supplied search request comprising a particular text string and will carry out a hierarchical search through an indexed database to find a matching string within the database. One such known method and apparatus is disclosed in U.S. Pat. No. 5,781,772 to Wilkinson, III et al. Also known are systems able to carry out Boolean searching, in which documents stored in a database are located on the basis of a search query made up of two or more text strings linked by logical operators such as AND, OR and AND NOT. Special logical operators are also sometimes available, for example “near”, where documents are located if two particular words appear next to each other or within a specified number of words of each other in the document.
The result of any large database search may well comprise many, perhaps a very large number of, ‘hits’, this being due to a lack of knowledge or memory on the part of the user and/or the lack of a particular search capability in the search program. Thus, the user may know or remember only part of the information needed to aim the search more precisely, or the search program may not allow discrimination of the order in which specific text strings from the search request appear in the target document.
One object of the invention is to make available a search algorithm which provides an additional functionality or an additional search query format for identifying documents and/or locating blocks of text in a database of text files.
Another object is to provide an apparatus and method for data searching able to better discriminate specific blocks of text identified by a search query.
According to one aspect of the invention, there is provided, in a computer environment, a method for searching data to locate a portion of said data identified by a search query, the method comprising:
receiving a sequence of two or more data fragments expected to be contained within said data;
searching the data to locate matches between the data and the respective data fragments; and
identifying a portion of said data from the address of a match with the first data fragment in the sequence and the address of a match with the last data fragment in the sequence.
Advantageously, the method further includes:
searching the data to locate the first match between the data and the first data fragment in the sequence;
searching the data to locate the last match between the data and the last data fragment in the sequence; and
identifying a portion of said data between the addresses of said first and said last match.
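By way of illustration only, the three steps just listed might be sketched in Python as follows. The function name `locate_span` and the use of character offsets as "addresses" are assumptions made for this sketch and do not appear in the description; the sketch also assumes that the identified portion runs from the start of the first match to the end of the last match, and that the last match must lie after the first.

```python
from typing import List, Optional, Tuple


def locate_span(data: str, fragments: List[str]) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of the portion of `data` running from the
    first match of the first fragment to the end of the last match of the
    last fragment, or None if either fragment cannot be found.

    Hypothetical helper: offsets stand in for the "addresses" of the text.
    """
    if len(fragments) < 2 or any(not f for f in fragments):
        raise ValueError("two or more non-empty data fragments are required")

    first, last = fragments[0], fragments[-1]

    # First match between the data and the first fragment in the sequence.
    start = data.find(first)
    if start == -1:
        return None

    # Last match between the data and the last fragment, searched for only
    # after the end of the first match (an assumption of this sketch).
    last_start = data.rfind(last, start + len(first))
    if last_start == -1:
        return None

    return start, last_start + len(last)
```

For example, with the data `"abc xx def yy def zz"` and the fragments `["abc", "def"]`, the sketch selects the text `"abc xx def yy def"`, that is, everything up to and including the last occurrence of the final fragment.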
The method may also include:
searching the data to locate the first match between the data and the first data fragment in the sequence;
searching the data to locate matches between the data and the or each subsequent data fragment in the sequence; and
identifying a portion of said data from the address of said first match between the data and the first data fragment to the address of the first match between the data and the last data fragment in the sequence subsequent to at least one match between the data and any intermediate data fragment in the sequence.
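One possible reading of this variant is a single left-to-right scan in which each fragment is sought only after the end of the match with the preceding fragment. The following Python sketch works under that assumption; the name `locate_ordered_span` is hypothetical, and offsets into a string again stand in for addresses.

```python
from typing import List, Optional, Tuple


def locate_ordered_span(data: str, fragments: List[str]) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets from the first match of the first fragment
    to the end of the first match of the last fragment that follows a match
    of every intermediate fragment, or None if the sequence is not found.
    """
    if len(fragments) < 2 or any(not f for f in fragments):
        raise ValueError("two or more non-empty data fragments are required")

    # First match between the data and the first fragment.
    start = data.find(fragments[0])
    if start == -1:
        return None
    cursor = start + len(fragments[0])

    # Each subsequent fragment is searched for only beyond the previous
    # match, so an early occurrence of the last fragment that precedes an
    # intermediate fragment is skipped.
    for fragment in fragments[1:]:
        pos = data.find(fragment, cursor)
        if pos == -1:
            return None
        cursor = pos + len(fragment)

    return start, cursor
```

With the data `"abc ghi def ghi"` and the fragments `["abc", "def", "ghi"]`, the first occurrence of `"ghi"` is passed over because no match with `"def"` precedes it, and the identified portion is the whole string.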
In each case, the method may include displaying said data upon a display screen and highlighting said identified portion of data.
According to a second aspect of the invention, there is provided, in a computer environment, a method for searching data to locate a data item within the data, the method comprising:
receiving a search query comprising two or more data fragments contained in sequence in said data item; and
searching said data to locate matches with the respective data fragments which matches are non-overlapping and in the same sequence as in said search query.
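The requirement that the matches be non-overlapping and in the same sequence as in the search query can be illustrated with a regular expression, which is a substitute technique chosen for brevity and is not the method described here. In this sketch the fragments are escaped and joined with a lazy wildcard, so a data item qualifies only if each fragment begins after the previous one has ended. The function name `items_matching_sequence` and the representation of the data as a dictionary of named text items are assumptions.

```python
import re
from typing import Dict, List


def items_matching_sequence(items: Dict[str, str], fragments: List[str]) -> List[str]:
    """Return the names of the data items that contain every fragment,
    in query order and without overlap between consecutive matches.
    """
    if len(fragments) < 2 or any(not f for f in fragments):
        raise ValueError("two or more non-empty data fragments are required")

    # re.escape treats each fragment as literal text; ".*?" permits any
    # amount of intervening text, and DOTALL lets it span line breaks.
    pattern = re.compile(".*?".join(re.escape(f) for f in fragments), re.DOTALL)

    return [name for name, text in items.items() if pattern.search(text)]
```

Under these assumptions an item containing only `"alphabet"` does not satisfy the query `["alpha", "abet"]`, because the only occurrence of `"abet"` overlaps the match with `"alpha"`.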
According to a third aspect of the invention, there is provided, in a computer environment, a method for searching a database to locate a data item, the method comprising:
storing two or more data fragments contained in sequence in said data item;
searching the database to locate the first match with the first data fragment; and
searching the database to locate matches with the or each subsequent data fragment, said searching being directed in dependence upon the location(s) in the database of matches with the or each previous data fragment.
According to a fourth aspect of the invention, there is provided, in a computer environment, a method for searching a database to locate a specific data item, the method comprising:
storing two or more data fragments contained in sequence in said data item;
searching said database to locate the first match with the first data fragment in said sequence and storing the start address of said first match;
from the end address within the database at which said first match is located, searching said database to locate the last match with the last data fragment in said sequence and storing the end address of said last match;
from the said start address of said first match to the start address of said last match, searching said database to locate all matches with the first data fragment in said sequence; and
for each subsequent data fragment in turn, searching the database from the end address of the first match with the previous fragment to the said start address of said last match of said last fragment to locate all matches with each said subsequent data fragment.
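The four searching steps of this aspect might be sketched as follows. The sketch is one interpretation only: the names `find_all` and `windowed_search` are hypothetical, the database is modelled as a single string with character offsets as addresses, and the search window for the last fragment is extended to the stored end address so that the last match itself is recorded, which is an assumption not stated above.

```python
from typing import List, Optional, Tuple


def find_all(data: str, fragment: str, lo: int, hi: int) -> List[int]:
    """Start offsets of every occurrence of `fragment` lying wholly in data[lo:hi]."""
    hits = []
    pos = data.find(fragment, lo, hi)
    while pos != -1:
        hits.append(pos)
        pos = data.find(fragment, pos + 1, hi)
    return hits


def windowed_search(
    data: str, fragments: List[str]
) -> Optional[Tuple[Tuple[int, int], List[List[int]]]]:
    """Return ((start, end), matches), where matches[i] lists the offsets of
    fragment i found inside the window, or None if the sequence is absent."""
    if len(fragments) < 2 or any(not f for f in fragments):
        raise ValueError("two or more non-empty data fragments are required")

    # Step 1: first match with the first fragment; keep its start address.
    first_start = data.find(fragments[0])
    if first_start == -1:
        return None
    first_end = first_start + len(fragments[0])

    # Step 2: from the end of that match, last match with the last fragment.
    last_start = data.rfind(fragments[-1], first_end)
    if last_start == -1:
        return None
    last_end = last_start + len(fragments[-1])

    # Step 3: all matches with the first fragment inside the window.
    matches = [find_all(data, fragments[0], first_start, last_start)]

    # Step 4: each subsequent fragment, searched from the end of the first
    # match with the previous fragment.
    prev_end = first_end
    for i, fragment in enumerate(fragments[1:], start=1):
        # Assumption: the window for the final fragment includes the last match.
        hi = last_end if i == len(fragments) - 1 else last_start
        hits = find_all(data, fragment, prev_end, hi)
        if not hits:
            return None
        matches.append(hits)
        prev_end = hits[0] + len(fragment)

    return (first_start, last_end), matches
```

Bounding every later search by the addresses stored in the first two steps means that no fragment is looked for outside the portion of the database in which a complete, correctly ordered sequence could lie.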
According to a fifth aspect of the invention, there is provided an apparatus for searching data to locate a portion of said data identified by a search query, the apparatus comprising:
input means for receiving a sequence of two or more data fragments;
control means connected to said input means and to a data supply means, and operable for searching data made available by the data supply means to locate matches between the data and the respective data fragments, and for registering information identifying a portion of said data from the address of a match between the data and the first data fragment in the sequence to the address of a match between the data and the last data fragment in the sequence.
According to a sixth aspect of the invention, there is provided a computer readable medium containing a computer program for rendering a computer operable for searching data to locate a portion of the data identified by a user supplied search query, the program comprising:
computer code for enabling the computer to receive a sequence of two or more data fragments;
computer code for directing the computer to search said data to locate matches between the data and the respective data fragments; and
computer code for causing the computer to identify a portion of said data from the location in said data of a first match between the data and the first data fragment in said sequence to the location of a match between the data and the last data fragment in the sequence.