1. Field of the Invention
The present invention relates to search results characterization in a search engine and more particularly to search results characterization in a Web log (“blog”) search engine.
2. Description of the Related Art
Content distribution serves a core function of the Internet. From the earliest days of Internet computing, tools such as “Archie” and “Gopher” provided content retrieval mechanisms in which content, namely academic and technical publications, could be located and retrieved, even if the identity of a retrieved publication had not been known a priori. Nearly two decades ago, with the development and commercial deployment of the World Wide Web (the “Web”), content searching tools experienced a dramatic leap forward with the development of several commercially accessible search engines specifically geared to content distributed over the Web. Even today, search engine technology for Web-based content continues to evolve in ways unimaginable even just a few years ago.
In the prototypical search engine, in a process often referred to as “spidering”, a computer program periodically (or today, continuously) probes Web-accessible content sources, namely Web sites, parses the textual content of the content sources, and incorporates the parsed content into an index. Thereafter, query terms can be received through a generic user interface (UI) and the index can be consulted to identify indexed content containing one or more of the query terms, also referred to as search terms. Finally, a result set can be presented in the UI to the querying end user. Optionally, the relevancy of each result can be provided in the result set, indicating the percentage of query terms appearing in that result. Further, the result set can be sorted according to relevance so that the most relevant results appear at the beginning of the list for ease of access by the querying end user.
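By way of illustration only, the index-and-query flow described above can be sketched as follows. The sketch assumes that the spidered content is already held in memory as plain text keyed by a page identifier. The names `tokenize`, `build_index` and `search`, as well as the sample pages, are hypothetical and are not drawn from any particular search engine.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return (doc_id, relevancy) pairs, most relevant first.

    Relevancy is the percentage of distinct query terms that appear
    in the document, as in the prototypical engine described above.
    """
    terms = set(tokenize(query))
    if not terms:
        return []
    hits = defaultdict(int)
    for term in terms:
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    results = [
        (doc_id, 100.0 * count / len(terms)) for doc_id, count in hits.items()
    ]
    # Sort by descending relevancy; break ties by document id.
    results.sort(key=lambda pair: (-pair[1], pair[0]))
    return results


# Hypothetical sample content standing in for spidered Web pages.
pages = {
    "a.example/home": "Gopher and Archie were early retrieval tools",
    "b.example/blog": "A blog about search engine design",
    "c.example/faq": "Search tools for the Web",
}
index = build_index(pages)

for doc_id, relevancy in search(index, "search engine tools"):
    print(f"{relevancy:5.1f}%  {doc_id}")
```

Notably, the relevancy computed in this sketch depends only on the textual content of each page. Nothing about the author of the content enters into the ranking.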
While search engine technology has formed part and parcel of the daily Internet experience for the typical end user in respect to content on the Web, the efficacy of the traditional search engine has not translated well to the “Blogosphere”. The term “Blogosphere” refers to the collection of Web logs (“blogs”) accessible through the Web or outside of the Web. As is well known, a blog is essentially an open diary produced by an author expressing thoughts either amorphously or, more typically, in accordance with a theme. Blog postings, and indeed the entirety of a blog, are therefore often associated with one author or a collective of authors. Thus, while the content of a basic Web page may be the only important aspect of the Web page from the perspective of an end user searching Web content, in the Blogosphere the nature of the author of a blog can be as important as the content of the blog. Yet, the conventional search engine does not account for the nature of the author in performing content searching.