Internet content is a mixture of informative, artistic, entertainment, and for-profit content. One of the primary benefits of the Internet is that anyone can quickly, easily, and economically create content that is hosted and served to others in the form of one or more websites. However, this content can also be intermixed with false information, spam, malware, adware, and other elements that are intended to defraud, harass, deceive, or otherwise mislead content consumers. It can be difficult for a user that is a content consumer to distinguish between credible and non-credible content. As a result, the user may be misled into buying a good or service that the user does not want or that differs from the user's expectations, may provide confidential information to a non-credible entity that can then misappropriate that information for nefarious purposes, or may simply consume information that is represented as truthful and accurate but is instead falsified or misrepresented.
The burden of distinguishing between credible and non-credible websites is mostly left to the user. Some users identify a credible website that hosts credible content as one that is authoritative or a primary source for the content that it distributes. Examples of authoritative or primary source content creators include www.cnn.com, www.microsoft.com, and www.uspto.gov. Some users identify a credible website as belonging to an established and trusted business or one that offers goods and services from credible sources (i.e., well-known and established dealers, manufacturers, service providers, etc.). For example, users that purchase goods from www.apple.com are assured that the Apple® products they purchase are made and warranted by the manufacturer. Still other users identify a credible website based on the amount of spam, malware, adware, and the like that is present on such a website. The more banner advertisements and pop-ups a website presents, the more likely it is that the website is not credible. In any of these and other cases, the credibility of a website is based on the knowledge and experience of the user. Less Internet-savvy users are more likely to be duped by non-credible websites and are thus more prone to fraud, misappropriation of confidential information, and consumption and dissemination of falsified or inaccurate information.
Some automated tools currently exist to better aid the user in gauging the credibility of a website. One such tool is the search engine. Search engines such as Google, Bing, and Yahoo identify websites that are of potential interest to a user based on one or more query parameters provided by the user. The search engine ranks the websites that match a user query based on relevancy factors. As some examples, these relevancy factors account for the closeness of the user's query parameters to words that appear in a website and for a website's popularity as determined by the number of incoming links to the website. However, these relevancy factors and the search engine rankings that are derived from them usually do not account for credibility. As a result, some users incorrectly assume that the highest ranked websites (i.e., the websites presented first in the search engine results) not only provide the most relevant content, but also provide content that is trustworthy, accurate, and credible. Some users also assume that the first few search engine results are usually authoritative or are the primary source of the content that they distribute and that the distributed content is spam-free.
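The gap described above can be illustrated with a minimal sketch. It assumes a toy scoring function that combines only the two example relevancy factors named here, keyword closeness and incoming-link popularity. The `Page` type, the `relevancy_score` and `rank` functions, the weighting, and the example URLs and link counts are all hypothetical and do not represent how any actual search engine ranks websites.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    url: str            # hypothetical example URL
    text: str           # visible text of the page
    inbound_links: int  # assumed count of incoming links


def relevancy_score(query: str, page: Page) -> float:
    """Toy score: fraction of query terms found on the page plus a popularity term.

    Note that nothing in this score measures credibility.
    """
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    page_terms = set(page.text.lower().split())
    keyword_match = len(query_terms & page_terms) / len(query_terms)
    popularity = math.log1p(page.inbound_links)
    return keyword_match + popularity


def rank(query: str, pages: List[Page]) -> List[Page]:
    """Order pages from highest to lowest toy relevancy score."""
    return sorted(pages, key=lambda p: relevancy_score(query, p), reverse=True)


pages = [
    Page("https://manufacturer.example", "official laptop store with warranty", 40),
    Page("https://spam.example", "cheap laptop laptop deals cheap laptop", 900),
]
ranked = rank("cheap laptop", pages)
for page in ranked:
    print(page.url, round(relevancy_score("cheap laptop", page), 2))
```

In this sketch the keyword-stuffed, heavily linked page is listed first, because neither factor captures whether the page is trustworthy, accurate, or spam-free.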
Besides the fact that search engine rankings do not account for website credibility, the rankings can also be manipulated such that a non-credible website appears higher in the rankings, thereby causing a user to improperly perceive that website as being more credible than it is. Search engine manipulation is the byproduct of abuses in search engine optimization. More specifically, non-credible websites can be optimized with content, keywords, links, etc. such that they are ranked highly when a user searches for certain keywords, even though the websites may in fact have little to do with the searched-for keywords, or the websites are relevant to the searched-for keywords but contain spam, are intended to defraud, or contain inaccurate or falsified information. In summary, search engine optimization can be used to make a website that is irrelevant, full of spam, or that contains other non-credible content appear in the search rankings to be more relevant than it is and, as a result, appear to some users as being more credible than it is.
Search engine optimization can also have the effect of making credible websites appear to be less credible. For instance, a website creator may repeat various keywords unnecessarily, create extraneous content, and perform other optimizations that improve the website's ranking in the search engines but that pollute the website with confusing and unnecessary information that makes the desired content hard to find. Such a website can be perceived as being less credible when the sought-after content is hard to find or is surrounded by unnecessary text (e.g., repeated words or phrases), hyperlinks, or visual elements, even though the unnecessary text, hyperlinks, and visual elements are needed to improve the website's rankings in the various search engines. Stated differently, search engine optimization results in websites that are optimized for search engines and not for the people that consume the content from those websites. As a result, the subjective criteria used by people to gauge a website's credibility are sometimes ignored or treated as a secondary concern in search engine rankings and on search engine optimized websites.
A further shortcoming of using search engine results as an indicator of website credibility is the fact that some search engines do not consider the amount of “spam” elements on a website when ranking the website. For instance, a particular website may be the highest ranked website because of the amount of content it contains relating to a particular subject and because of the number of links that point to that particular website. However, this same website may be littered with banner ads, annoying flashing graphics unrelated to the primary content, poor contrast between text and background images that makes the actual text difficult to read, large videos or graphics that increase the download time for the website, pop-ups, etc. When a user visits such a website and is bombarded with these and other spam elements, the user may immediately identify the site as non-credible. Consequently, the user is less likely to complete a commercial transaction at that website, resulting in financial loss to the website. Also, the user is less likely to remain at the website to consume content or become interested in advertising or other promotions of the website. These and other factors highlight the importance of having a website that is not only relevant, but that is also perceived as being credible.
Today, review websites exist to assist a user in ascertaining the credibility of a website. At these review websites, users rate a website and express their opinions about that website such that others can ascertain the credibility of the website based on the experiences of others. However, the problem with understanding credibility through this approach is that the credibility data is at a third party site. Therefore, the user must first look up a website of interest at the third party site in order to ascertain that website's credibility before accessing the website. Another shortcoming of such review websites is that the credibility data at these websites is not derived using the same set of rules or criteria for all websites that are similarly classified. For example, an overly critical reviewer may find fault with an irrelevant feature of a first website and an overly sympathetic reviewer may ignore a glaring issue of a second website. A small set of only negative reviews may also fail to adequately convey the credibility of a website. The reviews may come from users that are not from the primary demographic to which the website caters. Accordingly, user submitted reviews cannot be used to accurately gauge the credibility of a website. More importantly, the subjectivity and inconsistent sampling of reviews do not allow a website administrator or user to comparatively gauge the credibility of one website relative to other similarly classified websites.
Accordingly, there is a need to better promote website credibility so that the content that is placed on the Internet is optimized for people. To promote people-optimized websites, there is a need to automatically identify and quantify factors that people use to gauge credibility. Such factors extend beyond the relevancy of the content and include the presentation of the content as well as the accuracy, trustworthiness, and safety of the content being presented. There is a need to provide the identified and quantified factors to website administrators so that they may appreciate the website elements that beneficially and detrimentally affect their websites' credibility and so that the administrators can take directed action to better optimize the credibility of their websites. There is also a need to better enable users to distinguish credible websites from non-credible websites. In so doing, users are provided a better online experience and are protected from non-credible sites.