1. Field of the Invention
The present invention generally relates to search programs and more particularly to an improved search method and system that clusters hypertext documents.
2. Description of the Related Art
The World-Wide-Web has attained a gargantuan size (Lawrence, S., and Giles, C. L. Searching the World Wide Web. Science 280, 5360 (1998), 98, incorporated herein by reference) and a central place in the information economy of today. Hypertext is the lingua franca of the web. Moreover, scientific literature, patents, and law cases may be thought of as logically hyperlinked. Consequently, searching and organizing unstructured collections of hypertext documents is a major contemporary scientific and technological challenge.
Given a “broad-topic query” (Kleinberg, J. Authoritative sources in a hyperlinked environment, in ACM-SIAM SODA (1998), incorporated herein by reference), a typical search engine may return a large number of relevant documents. Without effective summarization, it is a hopeless and enervating task to sort through all the returned documents in search of high-quality, representative information resources. Therefore, there is a need for an automated system that summarizes the large volume of hypertext documents returned during Internet searches.