The present disclosure is related to the field of web searching, and more particularly to systems and methods in which web search results are ranked and ordered.
A search engine results page, or SERP, is the listing of web pages returned by a web search engine in response to a keyword query. The results normally include a list of web pages, each in the form of an identifying name serving as a link to the underlying uniform resource locator, or URL, which is effectively an address for the web page on the World Wide Web (or the “web”), and a short description or excerpt of the web page showing where the keyword matched the content within the web page. As used herein, “web” shall be used as shorthand for the World Wide Web, and shall refer to the system of interlinked or linkable content located on a wide variety of storage media which may be accessed via an interconnected network of computers and similar devices referred to as the Internet. Furthermore, the term “internet content”, or simply “content”, as used herein shall refer to text, video, audio, and virtually any other type of information and format and combinations thereof, which is accessible via the World Wide Web, and which may be sorted and indexed for relevance and rank in response to a web search request.
Once the SERP is obtained for a keyword query, the user may navigate the web to those results and view them, for example using a web browser program, share one or more of the resulting links, for example by email or text, share certain content found on the linked pages, again by email or text, and so forth. That is, one or more of many common software tools may be used to access, share, copy, edit, tag, comment on, etc. the entries in the SERP. Users may also browse directly to content through links and referrals, not necessarily from a SERP.
The concept of “Sharing” has evolved over the last two decades. Initially, sharing meant providing a user with the ability to email an item of content to a friend via a convenient control (e.g., button) on a website or built into the web browser application. Sharing has now evolved to include a plethora of third-party websites for bookmarking, saving, or posting to other locations such as social networking sites, where friends and family can view the content being referenced. Sharing is distinct from other social interactions like commenting in that a user has explicitly attached their reputation to the content and implicitly endorsed it. This sharing is done either in a private manner (e.g., emailing a friend), in a semi-public manner (e.g., posting to a social network), or in a public manner (e.g., posting to social news sites), so that other users can benefit from the work done by the users sharing.
In many cases, more than one of these tools is presented to a user in a single interface for convenience and simplicity of use. For example, as shown in FIG. 1, a Yahoo! News web page 10 (http://news.yahoo.com/) presents a user with the option of sharing an article being read with another user by way of a “send” pull-down menu 12. In the example of FIG. 1, the only available conduit for sending the article being read is email, although other such conduits are available, such as sending the link via SMS text messaging. Another example of sharing of content from a Yahoo! News web page 14 is illustrated in FIG. 2, in which a “share” pull-down menu 14 presents the user with a number of options for appending the link to the content being read to a social networking page, such as Digg, Facebook, etc. In addition to sharing the Yahoo! News pages, users may also comment on, and “vote” for, content via the “Buzz Up” link 18 shown in FIGS. 1 and 2. These convenient methods for sharing recognize the value users place on, and their desire for, interacting around information. Web site operators that provide these types of convenient tools for users validate the fact that people want to comment on, modify and combine (“remix”), share, and use the content in ways that are suitable to them.
The SERP is a result of running what is referred to as a web search engine (or simply a search engine). This type of application assists users with finding, among the staggering number of web pages, images, video, audio, web log (blog), and myriad other file types (content) accessible via the Internet, those relevant to their interests. Common search engines include Google, Yahoo! Search, Ask, Microsoft Live Search, etc. Such search engines typically have components that autonomously search the web (referred to as web crawlers), building directories of search terms and associated addresses at which content related to those terms can be found. While each search engine operator has its own, generally proprietary, relevance algorithms, each search engine must make a determination as to whether an item of content in its directory is relevant to a user's search request, and if so present that content to the user in response to the search request. In addition to relevance, there is a desire to rank those search results determined to be relevant such that the most relevant content is presented to the user first, the second most relevant content presented second, and so on.
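By way of illustration only, a directory of search terms and associated addresses of the kind described above might be sketched as a simple inverted index. The function names build_directory and lookup, the tokenization rule, and the example.com addresses are assumptions made for this sketch and are not drawn from any particular search engine.

```python
import re
from collections import defaultdict


def build_directory(pages):
    """Map each term to the set of addresses whose text contains it.

    `pages` is a dict of address -> page text (a stand-in for crawler output).
    """
    directory = defaultdict(set)
    for url, text in pages.items():
        # Hypothetical tokenization: lowercase alphanumeric runs.
        for term in set(re.findall(r"[a-z0-9]+", text.lower())):
            directory[term].add(url)
    return directory


def lookup(directory, query):
    """Return the addresses that contain every term of the query."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not terms:
        return set()
    result = set(directory.get(terms[0], set()))
    for term in terms[1:]:
        result &= directory.get(term, set())
    return result


if __name__ == "__main__":
    crawled = {
        "http://example.com/a": "Jaguar habitat in rainforests",
        "http://example.com/b": "Jaguar car reviews",
    }
    print(lookup(build_directory(crawled), "jaguar car"))
```

Such a lookup only establishes which content matches the request; it says nothing about the order in which matching content should be presented, which is the separate question of rank.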
To date, relevance and rank have primarily been determined based on reference to characteristics of the content intrinsic to its web presence. That is, attributes such as how many other websites link to a subject website, how popular those linking websites are, how many times the item of content contains a searched keyword, and so forth are used to assign a numerical weight to elements of a set of search results, and the element with the highest weight is presented first. PageRank, a technique employed by the Google search engine (see U.S. Pat. No. 6,285,999, incorporated herein by reference) is an example of such a relevance and ranking process. It will be noted that the typical relevance and ranking process employs only attributes intrinsic to the content such as presence of keywords, keyword density, keyword placement, etc., and not attributes extrinsic to the content, such as the actual use of the content, in assigning ranking weights. For the purposes hereof, we term this limited class of data analysis for relevance and ranking “internal reference”, as in reference only to attributes internal to the web.
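The assignment of a numerical weight from internal reference attributes, followed by ordering on that weight, may be sketched as follows. This is a minimal illustration and is not the PageRank process referenced above; the Page fields, the WEIGHTS values, and the example.com addresses are hypothetical and chosen only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    inbound_links: int        # how many other websites link to this page
    linker_popularity: float  # assumed 0..1 popularity of the linking sites
    keyword_count: int        # occurrences of the searched keyword


# Hypothetical weights; real engines use proprietary values and many more factors.
WEIGHTS = {"inbound_links": 1.0, "linker_popularity": 5.0, "keyword_count": 0.5}


def internal_weight(page):
    """Combine internal reference attributes into a single numerical weight."""
    return (
        WEIGHTS["inbound_links"] * page.inbound_links
        + WEIGHTS["linker_popularity"] * page.linker_popularity
        + WEIGHTS["keyword_count"] * page.keyword_count
    )


def rank(pages):
    """Order results so that the element with the highest weight comes first."""
    return sorted(pages, key=internal_weight, reverse=True)


if __name__ == "__main__":
    results = [
        Page("http://example.com/a", 10, 0.2, 2),
        Page("http://example.com/b", 3, 0.9, 4),
        Page("http://example.com/c", 12, 0.5, 0),
    ]
    for page in rank(results):
        print(page.url, internal_weight(page))
```

Every input to internal_weight in this sketch is an attribute internal to the web; no attribute describing actual use of the content enters the weight.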
Ranking of search results historically has been done as follows. A software tool referred to as a web crawler autonomously examines websites, and an algorithm is used to sort and order what the crawler uncovers. One very common algorithm employs the location (e.g., HTML title tag) and frequency of keywords on a web page to determine whether the web page discusses and is “about” the keyword. For example, there is an assumption that the more a term is used near the top or start of a web page, the more relevant that page is to the searched term, and a further assumption that a word used more frequently on a web page relative to other words on that page is more relevant to the overall content of that page, and thus of greater interest when ranking results of pages in which a searched term appears. But because each search engine has a different algorithm, the same search performed by different search engines produces different results. Even so, the majority of common search engines refer to such location/frequency data and other internal factors when doing their relevance and ranking analysis.
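The location and frequency assumptions described above may be illustrated with the following sketch. The function keyword_score, its parameters (title_weight, early_fraction, early_weight), and their default values are hypothetical; they serve only to show how title placement, early placement, and relative frequency could each contribute to a score.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(keyword, title, body,
                  title_weight=3.0, early_fraction=0.2, early_weight=2.0):
    """Score one page for one keyword from location and relative frequency.

    All weights are illustrative assumptions, not values used by any real engine.
    """
    kw = keyword.lower()
    title_tokens = tokenize(title)
    body_tokens = tokenize(body)
    score = 0.0
    # Location: presence in the title (e.g., the HTML title tag).
    if kw in title_tokens:
        score += title_weight
    if body_tokens:
        # Location: density of the keyword near the top of the page.
        cutoff = max(1, int(len(body_tokens) * early_fraction))
        score += early_weight * body_tokens[:cutoff].count(kw) / cutoff
        # Frequency: share of all words on the page that are the keyword.
        score += body_tokens.count(kw) / len(body_tokens)
    return score


if __name__ == "__main__":
    a = keyword_score("jaguar", "Jaguar habitat guide",
                      "The jaguar is a large cat. Jaguar populations live "
                      "in rainforests and wetlands.")
    b = keyword_score("jaguar", "Car reviews",
                      "We compare many sedans and coupes this year "
                      "including one jaguar")
    print(a > b)
```

In this sketch, a page that uses the term in its title and near its start scores higher than a page that mentions the term once near its end, which reflects the two assumptions stated above. Different choices of weights would order the same pages differently, consistent with different search engines producing different results for the same search.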
However, we have recognized that many more factors may be useful or even key in determining relevance and ranking of search results. In the taxonomy of reference factors, there are internal and external factors. For the purposes hereof, we refer to one class of external factors as “social reference” factors, as described further below. Prior web search relevance and ranking processes have largely overlooked the importance of the social reference factors.
A number of search engines do consider certain social reference factors, such as the number of third-party comments appended to an article after its web posting, the number of links added to an article after its web posting, and the re-writing of an article with a link back to the original (so-called Fresh content). Furthermore, non-search applications, such as the aforementioned Yahoo! Buzz, track and employ the number of email shares, user ranking, and so forth in determining, for example, which stories to present on a news web page. There are also a number of human decision-making resources for indexing, organizing, and reviewing content (e.g., www.stumpedia.com, www.findingdulcinea.com, www.mahalo.com, and www.wikipedia.com).
However, there are many other social reference factors not yet appreciated as being of great value in determining relevance and ranking, specifically of web content search results. Of particular interest herein are algorithmic relevance and ranking processes for web content search results that utilize external, social reference data (i.e., that do not require a user to make the relevance and ranking decisions). There is a need in the art for an improved algorithmic search analysis process, and a system employing same, that utilizes additional external social reference factors in determining relevance and ranking of web content search results.