For many people, using the World Wide Web (“web”) has become a daily routine. However, as the amount of information on the web has increased, locating the desired information has become challenging. Compounding the problem, the number of new users inexperienced at web searching is growing as well.
Search engines infer the user's interest from search terms or keywords entered by the user. Once the user enters the keywords, the search engine provides links to subject matter relevant to those keywords. The search engine accomplishes this by matching the keywords in the search query against a keyword index of web pages contained in the search engine's database. When the index includes a search keyword, that keyword is a “hit,” and the URL of the corresponding web page is returned to the user.
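The matching step described above can be illustrated with a minimal sketch. It assumes a simple in-memory index that maps each word to the URLs of the pages containing it, and it assumes a page is returned when any one query keyword matches. The page URLs, page text, and the function names `build_index` and `search` are hypothetical and chosen only for illustration; they do not describe any particular search engine.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs of pages indexed under any keyword in the query."""
    hits = set()
    for keyword in query.lower().split():
        # A keyword found in the index is a "hit"; collect its URLs.
        hits |= index.get(keyword, set())
    return sorted(hits)


# Hypothetical pages, for illustration only.
pages = {
    "http://example.com/a": "automobile sales and service",
    "http://example.com/b": "used car listings",
    "http://example.com/c": "auto repair tips",
}
index = build_index(pages)
results = search(index, "automobile sales")
```

In this sketch, a query keyword that does not appear verbatim in the index contributes no URLs, because the lookup is an exact string comparison.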
Unfortunately, this process for identifying web pages relevant to a search keyword is not optimal for finding all relevant matches. The keywords stored in the search engine's index are closely tied to the exact words appearing on the web page. Current search engine technology has limited ability to find pages with different but conceptually related or synonymous keywords. For example, using an exact match process with the search term “automobile sales” will limit the search to pages containing “automobile” or “sales.” However, numerous pages on the web may have used the terms “auto sales” or “car sales” to represent the same concept. In this case, the user will find only those pages that use the term “automobile” and is thereby limited in his/her ability to find all information related to the subject matter.
Some attempts have been made to develop processes for identifying the important words and phrases to use in a web page's copy so that users can find the page with a broader range of conceptually related or synonymous search terms. A taxonomy or thesaurus may be used to expand a web page's targeted set of keywords by identifying additional related words to include in the page's text. Moreover, a demand data analysis may be completed to refine the expanded keywords by determining which keywords users are most likely to enter in a search engine. Although such attempts have been made, the resulting processes are not fully automated or integrated. The current, non-automated process for identifying a complete set of relevant keywords to optimize and place on landing pages is labor-intensive and consequently not feasible for large volumes of text.
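The two steps mentioned above, thesaurus expansion followed by demand-based refinement, might be sketched as follows. This is an illustrative sketch only: the thesaurus entries, the search counts, the threshold, and the function names `expand` and `refine` are all hypothetical values and names invented for the example, not data or methods taken from any existing process.

```python
# Hypothetical thesaurus: each word maps to related or synonymous words.
THESAURUS = {
    "automobile": ["auto", "car"],
}

# Hypothetical demand data: how often users entered each phrase.
SEARCH_DEMAND = {
    "automobile sales": 120,
    "auto sales": 900,
    "car sales": 1500,
}


def expand(phrase, thesaurus):
    """Return the phrase plus variants with one word swapped for a related word."""
    words = phrase.lower().split()
    variants = [" ".join(words)]
    for i, word in enumerate(words):
        for related in thesaurus.get(word, []):
            variants.append(" ".join(words[:i] + [related] + words[i + 1:]))
    return variants


def refine(candidates, demand, min_searches):
    """Keep candidates that meet the demand threshold, most-searched first."""
    kept = [c for c in candidates if demand.get(c, 0) >= min_searches]
    return sorted(kept, key=lambda c: demand.get(c, 0), reverse=True)


candidates = expand("automobile sales", THESAURUS)
keywords = refine(candidates, SEARCH_DEMAND, min_searches=500)
```

With these made-up numbers, the expansion step yields three candidate phrases and the refinement step retains only the two that users search for most often. Performing both steps by hand for every page is the labor-intensive part that motivates automation.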
As a result, there is a need for an automated process for analyzing documents to identify keywords for use in document landing pages. An automated keyword analysis process removes the labor involved in finding the keywords in a web page or document, adding related keywords, refining the keywords, and placing them into the web page's or document's corresponding landing page. With an automated keyword analysis system and method, a user of a search engine has a higher probability of producing relevant “hits” on the web pages enhanced by the automated analysis. By automatically creating related subject matter keywords, use of the web is simplified, and new and experienced users alike can find the information they need on the ever-expanding web.