Keyword assignment to a web page is a crucial step for web page classification and search. The keywords must be representative enough to capture the information contained in the page and must be common and socially acceptable enough to be of practical use (e.g., identifying a relevant web page for a user according to user-provided search keywords).
Usually a web page contains a few keywords that are assigned to it by the designer. For example, keywords may be found under the HTML tag “title” or the meta tags “keywords” or “description.” These keywords are not necessarily acceptable enough to be of practical use, since different web designers assign them differently and for different purposes.
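As an illustration of where designer-assigned keywords reside, the following minimal sketch collects the title text and the keywords and description meta tags from a page using Python's standard html.parser module. The names DesignerKeywordParser and designer_keywords are hypothetical and chosen only for this example, and the sketch assumes that the keywords meta tag holds a comma-separated list.

```python
from html.parser import HTMLParser


class DesignerKeywordParser(HTMLParser):
    """Collects the title text and selected meta tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.meta[name] = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


def designer_keywords(html):
    """Return the designer-assigned title, keyword list, and description."""
    parser = DesignerKeywordParser()
    parser.feed(html)
    parser.close()
    raw = parser.meta.get("keywords", "")
    # Assumption: keywords are comma-separated; empty entries are dropped.
    keywords = [k.strip() for k in raw.split(",") if k.strip()]
    return {
        "title": parser.title.strip(),
        "keywords": keywords,
        "description": parser.meta.get("description", ""),
    }
```

A call such as designer_keywords(page_source) returns whatever the designer supplied and nothing more, which is why the result varies from one designer to the next.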
Several different techniques may be used for keyword assignment to a web page. In an artificial intelligence-based technique, an algorithm analyzes a web page to learn the characteristics of the web page and assigns keywords to the web page accordingly. The algorithm improves as the number of web pages analyzed increases. In a data mining-based technique, an algorithm looks for trends within the data present in a page and then identifies key attributes of the page. In a keyword density-based technique, an algorithm sorts through the words that are present in a web page and assigns keywords to the web page based on the resulting density function.
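The keyword density-based technique can be sketched as follows. This is only one plausible reading, not a definitive implementation: density is assumed here to be a word's count divided by the total number of counted words, and the function name keyword_density, the small STOP_WORDS set, and the top_n parameter are hypothetical choices made for the example.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real system would use a
# much larger one.
STOP_WORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on", "that", "it"}
)


def keyword_density(text, top_n=3):
    """Return up to top_n (word, density) pairs, highest density first."""
    words = [
        w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS
    ]
    if not words:
        return []  # a page with no usable words yields no keywords
    counts = Counter(words)
    total = len(words)
    # Sort by descending count, breaking ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, count / total) for word, count in ranked[:top_n]]
```

Because the ranking is computed entirely from the words in the page text, the whole page must be rescanned whenever its content changes, and a page with very little text produces few or no candidates.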
These techniques are computationally intensive and require large storage space per page because they must analyze page content. In addition, any modification to page content necessitates a reanalysis of the entire page. Further, because these techniques depend on the content of the page, they are not suitable for keyword assignment to a web page having few words, such as a page that is dynamically constructed using JavaScript (e.g., a Google™ Map page).