1. Field of the Invention
The invention relates generally to the technical field of generating an extended page snippet of a search result in a search engine, and particularly to a method and apparatus for generating a page snippet in table style.
2. Description of the Related Art
As the Internet business continuously grows, various existing search engines have become indispensable tools that people use to find network resources of interest, for example webpages.
Generally, a search engine operates in the following manner: once a user submits an inquiry through a client, the search engine returns the webpages it has found to the user in a search result page. One important object of the search engine is to provide the set of links desired by the user for a specific search inquiry, and another object is to inform the user clearly and quickly of the content associated with each link. Therefore, when the search result is returned, the search result page contains, besides a title and a uniform resource locator (URL) of the webpage, a short text description related to the webpage. This short text description is usually referred to as a page snippet. In general, the search engine extracts the page snippet from the webpage by extracting and combining text segments that include a keyword involved in the inquiry. In the search result page, the search engine differentiates the display of the inquired keyword from other text in the page snippet by various means, such as highlighting, underlining, a different font, and the like, in order to draw the user's attention and help the user determine whether to click through to the webpage. The page snippet in the prior art reflects a correlation between the webpage and the inquiry to a certain extent. However, the page snippet in the prior art consists only of the text segments containing the inquired keyword, and the selection of a text segment does not take account of the content other than the keyword in that text segment. Nor does it take account of the table format information of the text segment.
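The keyword-based extraction described above may be illustrated by the following minimal Python sketch. It is an illustration only and is not taken from any particular search engine: the function name build_snippet, the sentence-splitting rule, the " ... " separator, and the use of ** ** markers as a stand-in for highlighting are all assumptions made for this example.

```python
import re


def build_snippet(text, keywords, max_segments=2):
    """Combine the text segments that contain a query keyword (hypothetical helper)."""
    if not keywords:
        return ""
    # Assumption: a "text segment" is a sentence ending in '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    # Keep only the segments in which at least one keyword occurs.
    hits = [s for s in sentences if pattern.search(s)]
    snippet = " ... ".join(hits[:max_segments])
    # Wrap each keyword occurrence in ** ** as a stand-in for highlighting.
    return pattern.sub(lambda m: f"**{m.group(0)}**", snippet)


page_text = "Cognos publishes reports. The weather is fine. Each report has a table."
print(build_snippet(page_text, ["report"]))
```

In this sketch the segments are chosen solely because they contain the keyword, and the result is flat text: any row and column structure that the source page may have had is lost, which is the limitation noted above.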
However, a table is an important data source. Widely used types of data suited to presentation in a table include traditional Web table data, for example information on members, companies, situations, merchandise, movies, and music, in both bordered and non-bordered tables. The application of business intelligence (BI) causes a large amount of enterprise data to be generated as reports (in formats such as Web reports, PDF, Excel®, Word and the like), and many enterprise-level BI analysis and presentation tools, such as IBM Cognos® and the like, generate and publish large numbers of reports. There is a strong demand for searching such massive data within an enterprise or on the Internet. Moreover, by means of file parsing tools, various mainstream search engines have already made documents in Excel, Word and similar formats retrievable.
In order to improve the user experience, the prior art also provides a search result preview function, which presents webpage information as a picture. As search engine technology matures, the room for modification keeps shrinking and improvement and innovation become increasingly difficult, so even a small modification may mean a great improvement to the user experience. However, the snippet is different from the preview. The preview does not generate, on the basis of the inquiry, a relevant segment that lets the end user understand the result quickly; it simply outputs the content of the original webpage. The snippet is used to quickly judge the correlation with the inquired word, whereas the preview is used to judge the correlation further after the snippet has been assessed; the two are used at different stages. The display space of the snippet is very small, while the display space of the preview is very large. The snippet is displayed by default, whereas the preview is displayed only after the mouse pointer is moved to a particular position (such as a title, a snippet, a network address and the like) to trigger it, and it appears after a delay that depends on the displayed content and the network speed. Thus, for those skilled in the art, the snippet and the preview are entirely different technical solutions.
Accordingly, with respect to a table data source, the table format information is also an extremely important part, which helps the user quickly understand the search result through the page snippet. The search technology needs to be further improved so as to present the table format information in the page snippet at least to a certain extent.