1. Field of the Invention
The current invention relates to information retrieval, and, in particular, to providing ranked results in response to a search query, such as an Internet-based blog search.
2. Description of the Related Art
The Internet is a network of networks that provides access to innumerable instances of many types of information. One segment of the Internet is the World Wide Web (WWW), which is a collection of interlinked hypertext documents and services accessible via uniform resource locator (URL) addresses. The web comprises many types of information since any information that can be transmitted digitally can be provided via the web. One category of information available on the web, i.e., online, is the blog, which is a word derived from the term “web log.”
A blog is an online journal whose content is provided by a blogger. The blogger is often an individual, but can also be a group of individuals. A blog is updated through posts provided by the blogger. Posts typically comprise text and graphics, but can also include any information that can be digitized, such as audio and video. Some blogs are dedicated to a single topic, such as restaurant reviews or political commentary, while other blogs cover multiple topics, as dictated by their bloggers' whims. Most blogs are hosted by specialized blog websites, such as Blogger, at www.blogger.com, or WordPress, at wordpress.org. Such specialized blog websites simplify creating, maintaining, and posting to blogs.
Blogs may contain information that is useful to a user researching a topic online. Users can research the topic online by entering search terms in a search engine such as Google, at www.google.com, or Yahoo!, at www.yahoo.com. Blog posts, which are online documents, are typically searchable by search engines, and search engines are often able to determine whether a retrieved document that matches the particular search criteria is a blog document or not. Search engines can typically access metadata for a blog post, where metadata in an online document is information that is not readily visible to a user browsing the online document. Blog metadata can include information about the blogger, such as the blogger's nominal home location. Individual blog posts can also contain metadata. A particular blog post can include location information pertinent to that blog post in its readily visible text. The blog post can also include location and other information in metadata associated with the blog post.
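By way of illustration only, the following minimal sketch shows one way a program could read metadata that is not readily visible to a user browsing a blog post. It assumes, purely for the example, that the metadata is carried in HTML meta tags having name and content attributes. The class name MetaTagParser, the function name extract_metadata, and the sample tag names and values are hypothetical and are not drawn from any particular search engine or blog website.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects name/content pairs from <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        # Only <meta> tags are of interest; all other tags are ignored.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        content = attr_map.get("content")
        # Assumption: metadata of interest uses the name/content form.
        if name and content is not None:
            self.metadata[name] = content


def extract_metadata(html_text):
    """Return a dict of meta-tag names to their content values."""
    parser = MetaTagParser()
    parser.feed(html_text)
    parser.close()
    return parser.metadata


# Hypothetical blog post: the author and location appear only in metadata.
sample_post = (
    "<html><head>"
    '<meta name="author" content="Example Blogger">'
    '<meta name="geo.placename" content="Springfield">'
    "</head><body><p>Post text</p></body></html>"
)
post_metadata = extract_metadata(sample_post)
```

In this sketch, location information found in the readily visible text of the post would have to be obtained separately, since only the meta tags are examined.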
Blogs and blog posts vary enormously in utility to a user performing an online search. Some blog posts include a blogger's thoughtful and accurate assessment of a topic. Blogs comprising such blog posts can be considered high-quality blogs and can be useful to others interested in the subjects that those blogs discuss. Some blog posts are rants or paeans that provide a misleading overview of their subject. Blogs substantially composed of such blog posts can be considered low-quality blogs and are not likely to be useful to others interested in the subjects discussed by the low-quality blogs. There are known methods for providing an assessment of the quality of a blog based on measurable factors. For example, the number of links, in either direction, between the blog and known high-quality sources can be used as an indicator of quality since high-quality sources are likely to direct their readers to other high-quality sources, or at least not to low-quality sources. Conversely, links, in either direction, between the blog and known low-quality sources can indicate a low-quality blog since low-quality sources are likely to direct their readers to other low-quality sources.
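The link-based assessment described above can be sketched, under stated assumptions, as a simple score. The sketch below is not any particular known method. It assumes that sets of known high-quality and low-quality sources are already available, and it uses a hypothetical scoring formula, namely the difference between the counts of high-quality and low-quality links divided by the number of links to or from sources of known quality. The function name link_quality_score and all source names are illustrative.

```python
from typing import Iterable, Set


def link_quality_score(linked_sources: Iterable[str],
                       high_quality: Set[str],
                       low_quality: Set[str]) -> float:
    """Return a score in [-1.0, 1.0] for a blog (hypothetical formula).

    linked_sources lists every source the blog links to or is linked from.
    Sources of unknown quality do not affect the score.
    """
    high = 0
    low = 0
    for source in linked_sources:
        if source in high_quality:
            high += 1
        elif source in low_quality:
            low += 1
    known = high + low
    if known == 0:
        # No evidence either way, so the score is neutral.
        return 0.0
    return (high - low) / known


# Hypothetical example: two links involve known high-quality sources,
# one involves a known low-quality source, and one is of unknown quality.
example_score = link_quality_score(
    ["news.example", "reference.example", "spam.example", "other.example"],
    high_quality={"news.example", "reference.example"},
    low_quality={"spam.example"},
)
```

A positive score suggests that the blog is associated mostly with high-quality sources, while a negative score suggests the opposite. A practical system would likely weight links by direction and by the strength of the known quality assessment.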
As new blog posts continue to be posted, either to existing blogs or new blogs, novel ways of ranking search results may increase the likelihood that the most relevant results are the results that are displayed first to the user performing the search. This is particularly true for a user who performs the search from a mobile device, on which investigating less-relevant results is likely to be more wasteful of the user's and the device's resources.