The use of the Internet, and in particular the World Wide Web (WWW, or simply "Web"), has increased exponentially in recent years. Today, there are hundreds of millions of Internet users who use the Web for a variety of purposes, including browsing news sources, e-mail, school research, paid subscription viewing, auctions, and on-line purchasing. As a result, the demand for new content has also increased exponentially, resulting in the generation of billions of web pages hosted by millions of web sites.
The availability of billions of web pages creates a problem for both the user and the content provider. From the user's perspective, he or she would like to be able to browse web pages that have a particular content, such as details about a particular category of food (e.g., wine) or remedies for medical ailments. At the same time, the web site host generally would like to attract as many users to its site or sites as possible, especially if the host participates in e-commerce or generates revenue from advertisements. With literally billions of competing Web pages, how does a host target users to its site(s)?
In one respect, both of the foregoing problems are solved using the same mechanism—search engines. Search sites, such as Google®, AltaVista®, Overture®, Lycos®, etc., enable users to easily locate desired content via keyword searches describing the content. For example, if a user desires to obtain information about a particular event, the user merely needs to enter one or more keywords that are descriptive of the event into a search engine, and a list of search results containing descriptions of relevant web pages and links to those pages is returned to the user's browser.
In general, there are two main activities performed at a search site. The first activity is content indexing, which comprises gathering web page content (and/or indicia indicative of the content), storing the content in very large databases, and indexing the content so that it may be easily searched. Typically, this is accomplished by using web "spiders" that gather web page content via HTML parsing and the like. In other instances, a web site may wish to guarantee that its content is indexed by paying an inclusion fee to a search site. Typically, such fees are per URL, and are often valuable for dynamic URL pages and pages whose content changes frequently. In general, the page content identification data that is stored by a given search engine differs, wherein some search engines store the entire page content, while others store less information, such as titles, headers, the first 20 words, etc. In addition, HTML, the language used to render Web pages, permits the use of "meta-tags," which enable a web designer to include content descriptions in the HTML that are not displayed on the rendered page. Thus, meta-tags are another useful way to extract content description information for web pages.
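As a rough illustration of the meta-tag extraction described above, the following sketch uses Python's standard-library HTML parser to pull the title and meta-tag content out of a page. The sample page and its keywords are hypothetical; real spiders handle malformed markup, character encodings, and far more fields.

```python
from html.parser import HTMLParser

class MetaTagExtractor(HTMLParser):
    """Collects the <title> text and meta-tag name/content pairs
    from an HTML page, i.e., content description that is present
    in the markup but not displayed on the rendered page."""
    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and "name" in attrs and "content" in attrs:
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

# Hypothetical page for illustration only.
page = """<html><head><title>Wine Basics</title>
<meta name="keywords" content="wine, vineyards, tasting">
<meta name="description" content="An introduction to wine.">
</head><body>Visible page text.</body></html>"""

parser = MetaTagExtractor()
parser.feed(page)
print(parser.title)             # Wine Basics
print(parser.meta["keywords"])  # wine, vineyards, tasting
```

A spider would feed each fetched page through such a parser and store the extracted title, meta-tag content, and (depending on the engine) some or all of the body text in its index database.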
The second activity performed by the search site is content searching. In most instances, the searching is based on one or more "keywords" or phrases included in a search request entered by a user. In response to a keyword search request, the search engine queries its database to identify indexed pages that have content pertaining to the keyword. The identified pages are then ranked based on proprietary algorithms, wherein the search engine returns results with a confidence or relevancy ranking, with the highest rankings appearing at the top of the list. In other words, the search engine orders the search results according to how closely it determines the content of those pages matches the search query.
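The database lookup step above is commonly implemented with an inverted index, which maps each keyword to the pages containing it. A minimal sketch, using hypothetical page IDs and text purely for illustration:

```python
from collections import defaultdict

# Hypothetical indexed pages (IDs and text are invented examples).
pages = {
    "p1": "wine tasting notes and vineyard tours",
    "p2": "home remedies for common medical ailments",
    "p3": "wine pairing guide for regional cuisine",
}

# Build the inverted index: keyword -> set of page IDs containing it.
index = defaultdict(set)
for page_id, text in pages.items():
    for word in text.split():
        index[word].add(page_id)

def search(keyword):
    """Return the IDs of indexed pages containing the keyword."""
    return sorted(index.get(keyword, set()))

print(search("wine"))      # ['p1', 'p3']
print(search("remedies"))  # ['p2']
```

A production engine would then pass the matching pages through its proprietary ranking algorithm before returning them; this sketch stops at the retrieval step.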
Most search engines use search term frequency as a primary way of determining whether a document is relevant. For instance, if you're researching diabetes and the word "diabetes" appears multiple times in a Web document, it's reasonable to assume that the document will contain useful information on diabetes. Therefore, a document that repeats the word "diabetes" over and over is likely to turn up near the top of a search results list. Some search engines consider both the frequency and the positioning of keywords to determine relevancy, reasoning that if the keywords appear early in the document, or in the headers, this increases the likelihood that the document is relevant. For example, Lycos® ranks hits according to how many times a search's keywords appear in its indices of the document and in which fields they appear (i.e., in headers, titles, or text). It also takes into consideration whether the documents that emerge as hits are frequently linked to by other documents on the Web, reasoning that if other people consider them important, you should, too.
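The frequency-plus-position scoring described above can be sketched as follows. The specific weights (a title bonus and an early-appearance bonus) are illustrative assumptions, not any search engine's actual algorithm.

```python
def relevance(title, body, keyword):
    """Toy relevance score: term frequency in the body, plus
    assumed bonuses for a keyword in the title or appearing
    early in the body (weights are invented for illustration)."""
    kw = keyword.lower()
    words = body.lower().split()
    score = words.count(kw)             # frequency: each body hit counts once
    if kw in title.lower().split():
        score += 3                      # title occurrences weighted more
    if kw in words and words.index(kw) < 5:
        score += 2                      # early appearance bonus
    return score

# Hypothetical documents (title, body) for illustration only.
docs = {
    "d1": ("Diabetes Overview", "diabetes symptoms and diabetes treatment options"),
    "d2": ("General Health Tips", "advice that also mentions diabetes once"),
}
ranked = sorted(docs, key=lambda d: relevance(*docs[d], "diabetes"), reverse=True)
print(ranked)   # ['d1', 'd2']
```

The document that repeats the keyword, carries it in the title, and mentions it first ranks highest, matching the intuition in the text; link-based signals such as those Lycos® considers are outside this sketch.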
Even when good search keywords are used, the returned search results are often lacking. Many times, the search result lists leave users confused, since, to the user, the results seem completely irrelevant to the search term entered. Basically, this is because search engine technology has yet to reach the point where humans and computers understand each other well enough to communicate clearly. For example, most search engines cannot determine the difference between words that have the same spelling but different meanings. Although some search engines are geared toward "concept-based" or "natural language" searching, their results are still lacking.
Now to return to the other problem—that is, the problem of a web site host attracting traffic. A common way to receive traffic is via search results. However, when common keywords are used for search terms, thousands or even millions of results may be returned for a given search. Since most users will only wade through a few pages (if that many) of search page results, pages that are not listed on the first few pages (or even among the first few in the list on the first results page) will never be viewed. Although the probability of a higher ranking may improve with the use of meta-tags, the vast number of web pages (including other web pages containing similar meta-tags) that may be returned for a given search typically prevents the addition of meta-tags alone from yielding adequate ranking improvement.
This problem is also addressed by the search engines. However, this time the service comes at a cost. The solution offered by the search engines is known as “paid searches.” Under paid searches, clients, such as electronic storefronts, retailers, and the like, pay a search engine to return web page results that include their web pages at or near the top of the results list based on search hits containing one or more keywords. The paid search results are generally returned in one of two forms: 1) they are included in the context of “normal” search results (i.e., they appear the same as any other search result); or 2) they appear separate from the “normal” search results, often in a manner similar to (but less intrusive than) banner ads. For example, Google® provides paid keyword search results that are rendered adjacent to its normal result listings.
Typically, the client pays a "per-click" charge to the search engine each time a user clicks on a link in the list of search results (or in the separated results) that will take the user to the client's site, otherwise known as a "clickthrough." Generally, paid searches have been shown to be much more cost-effective than banner advertisements (which are typically charged each time a page containing a banner ad is viewed, rather than by how many times such an ad is clicked). As a result, the use of paid searches has become increasingly popular with search engine clients. In response to the increased usage, the search engines have gone to a "bid" model, wherein each client places a bid for each keyword. Under this model, when multiple clients pay for the same keyword, the search results are ordered based on the ordering of the bids (i.e., highest to lowest bid).
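The bid model and per-click charging described above can be sketched in a few lines. The client names, bid amounts, and account balances below are invented for illustration.

```python
# Hypothetical per-click bids by three clients for one keyword.
bids = {
    "storefront-a": 0.42,
    "storefront-b": 0.55,
    "storefront-c": 0.31,
}

def paid_results(bids):
    """Order paid listings for a keyword from highest to lowest bid,
    as in the bid model: the top bidder appears first."""
    return sorted(bids, key=bids.get, reverse=True)

def charge_clickthrough(balances, client, bids):
    """Debit a client's account its per-click bid when a user
    clicks the client's paid listing (a 'clickthrough')."""
    balances[client] -= bids[client]
    return balances[client]

print(paid_results(bids))   # ['storefront-b', 'storefront-a', 'storefront-c']
balances = {"storefront-b": 10.00}
charge_clickthrough(balances, "storefront-b", bids)
```

Each clickthrough debits only the clicked client, which is why paid search costs track delivered traffic rather than impressions, unlike banner advertising.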
A primary consideration for a paid search client is whether its keywords are cost-effective; that is, whether the marginal profit derived from the increased traffic to the client's storefront exceeds the amount being paid for the keyword. Another consideration is keyword selection and search result appearance. While several companies presently provide services that track keyword effectiveness, there are no existing solutions for automatically generating effective keywords and performing other aspects of keyword management.
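The cost-effectiveness test above reduces to a simple comparison: profit attributable to clickthrough traffic versus the per-click fees paid for it. A minimal sketch, with all figures hypothetical:

```python
def keyword_margin(clicks, sales, profit_per_sale, cost_per_click):
    """Marginal profit from a paid keyword: profit on the sales the
    clickthroughs produced, minus the per-click fees paid to the
    search engine. A keyword is cost-effective while this stays
    positive. All inputs here are hypothetical figures."""
    revenue = sales * profit_per_sale
    cost = clicks * cost_per_click
    return revenue - cost

# 1,000 clickthroughs yielding 20 sales at $30 profit each,
# bought at a $0.50 per-click bid:
print(keyword_margin(clicks=1000, sales=20,
                     profit_per_sale=30.0, cost_per_click=0.50))   # 100.0
```

With only 5 sales from the same traffic, the margin turns negative, signaling that the client should lower its bid or drop the keyword; automating this kind of analysis across many keywords is the management gap the text identifies.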