With the rapid development of information technology, more and more people use the Internet to find a variety of information, including the latest news, progress in new technologies, professional and academic papers, and information published or shared on social networks, such as comments, blogs, and discussions. For example, a user may want to find detailed information about a news item through the network, to find an introduction to a certain technology, or to learn about other people's comments on a recently released film. An important tool for meeting these network query requirements is the search engine.
A search engine is a system that automatically collects information from the Internet using a certain strategy and a specific technique and, after organizing and processing the collected information, presents the user with the information relevant to the user's search. Organizing and processing the collected information is important: extracting keywords, creating index documents, and sorting search results according to certain rules are key factors that can affect the search speed.
In search engine technology, the inverted index is a commonly used data structure. By using the inverted index, a list of documents that contain a keyword can be quickly obtained based on the keyword, and search results can be quickly generated and fed back to the user.
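As a rough illustration of the data structure described above, the following is a minimal sketch in Python of a conventional inverted index. The names `build_inverted_index` and `lookup`, the whitespace tokenization, and the sample documents are assumptions made for illustration only; they are not taken from the present disclosure.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each keyword to a sorted list of IDs of the documents containing it.

    `documents` is a hypothetical dict of {doc_id: text}; tokenization is a
    plain lowercase whitespace split, which is a simplifying assumption.
    """
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    # Store each inverted chain (posting list) in ascending doc-ID order.
    return {word: sorted(ids) for word, ids in index.items()}


def lookup(index, keyword):
    """Return the inverted chain for a keyword, or an empty list if absent."""
    return index.get(keyword.lower(), [])


# Hypothetical sample documents.
documents = {
    1: "sport news today",
    2: "new film released",
    3: "film news and comments",
}
index = build_inverted_index(documents)
print(lookup(index, "news"))  # [1, 3]
```

Because the list of matching documents is obtained by a single lookup on the keyword, no document text needs to be scanned at query time.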
However, existing inverted index methods may have certain problems. The overall search time is proportional to the number of data records hit by the keyword. When the inverted chain of a keyword is very long, for example, when a keyword such as “news” or “sport” hits millions of results, it may take at least hundreds of milliseconds, or even several seconds, to complete the search.
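To make the cost concrete, the sketch below shows a conventional two-pointer intersection of two sorted inverted chains, with a counter for the number of comparison steps. It is a generic baseline, not the method of the present disclosure. The function name `intersect_chains`, the step counter, and the synthetic chains (scaled down from the millions of entries mentioned above) are assumptions for illustration.

```python
def intersect_chains(chain_a, chain_b):
    """Intersect two ascending doc-ID lists with a linear two-pointer scan.

    Returns (common_ids, steps), where `steps` counts loop iterations so the
    dependence of the work on chain length can be observed.
    """
    i = j = steps = 0
    common = []
    while i < len(chain_a) and j < len(chain_b):
        steps += 1
        if chain_a[i] == chain_b[j]:
            common.append(chain_a[i])
            i += 1
            j += 1
        elif chain_a[i] < chain_b[j]:
            i += 1
        else:
            j += 1
    return common, steps


# Synthetic chains standing in for two very frequent keywords.
news = list(range(0, 100_000, 2))   # hypothetical doc IDs hit by "news"
sport = list(range(0, 100_000, 3))  # hypothetical doc IDs hit by "sport"

common, steps = intersect_chains(news, sport)
print(len(news), len(sport), len(common), steps)
```

In this baseline, every iteration advances at least one pointer, so the number of steps is bounded by the combined length of the two chains and grows with them. This is consistent with the observation that search time increases with the number of records hit.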
Accordingly, the present disclosure provides a method for fast merging of inverted chains and a related apparatus to at least partially alleviate one or more of the problems set forth above and to solve other problems in the art.