The past decade has been marked by a technological revolution driven by the convergence of the data processing industry with the consumer electronics industry. The effect has, in turn, driven technologies that have been known and available but relatively quiescent over the years. A major one of these technologies is the Internet or Web related distribution of documents, media and programs. The convergence of the electronic entertainment and consumer industries with data processing exponentially accelerated the demand for wide ranging communication distribution channels, and the Web or Internet, which had quietly existed for over a generation as a loose academic and government data distribution facility, reached “critical mass” and commenced a period of phenomenal expansion. With this expansion, businesses and consumers have direct access to all manner of documents, media and computer programs.
In addition, Hypertext Markup Language (HTML), which had been the documentation language of the Internet or Web for years, offered direct links between pages and other documentation on the Web and a variety of related data sources that were at first text and images and then evolved into other media (e.g. JPEG images and MPEG video), i.e. “hypermedia”. Web documents may also include applets and other programming routines. (The term Web documents as used herein is meant to include all such data documents.) This expanded the use of the Internet or Web even further.
A major problem encountered by all Web users is the amount of time wasted in misdirection, e.g. the “blind alleys” that the user often traverses in trying to get to an appropriate Web site or Web document. It is clearly in the interest of all businesses and organizations that use the Web to have their customers and clients reach their intended destinations on the Web as quickly as possible.
A significant source of this wasted time is the Web page (the basic document page of the Web) itself. In the case of Web pages, we do not have the situation of a relatively small group of professional designers working out the human factors. Rather, in the era of the Web, anyone and everyone can design a Web page, and pages are frequently designed by developers without usability skills. As a result, Web pages are frequently set up and designed in an eclectic manner. Often Web pages are set up through loose business, professional, social and educational configurations with general trade or public input.
Due in part to this diversity in Web page or document creation, terms and words used in these Web documents very often have different meanings in different industries, businesses, technologies and arts. Consequently, when conventional keyword searches are run on search engines, they often reach databases or source domains on the Web, and even data source paths within such domains, that have virtually nothing to do with the subject matter that the requesting user intended to search. It is not unusual for a search submitting a combination of two or three words to return tens of thousands of hits because the same terms are used in many industries, businesses and technologies. For example, a user who wishes to search for film coating decay in the preservation of motion picture films is likely to receive over 130,000 hits, drawn not only from the motion picture industry but also from fruit growing, dentistry and liquid pipe erosion.
Past expedients for reducing such excessive numbers of hits have included increasing the number of keywords in the search statement, which risks making the search too restrictive. Alternatively, the user may provide an exclusion list of words. This likewise may be too restrictive. Further, the exclusion list must be re-entered in each subsequent search that uses a combination of keywords.
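The two expedients described above can be illustrated with a minimal sketch. The corpus, document names and the `keyword_search` function below are hypothetical and invented purely for illustration; they do not represent any particular search engine. The sketch assumes a simple model in which a document is a hit when it contains every keyword and none of the excluded words.

```python
import re

# Hypothetical toy corpus; the identifiers and text are invented for illustration.
DOCS = {
    "film-archive": "film coating decay in archived motion picture reels",
    "orchard": "wax film coating slows decay in stored fruit",
    "dental": "enamel coating film and tooth decay prevention",
    "pipeline": "protective film coating limits decay of liquid pipe walls",
}


def tokenize(text):
    """Split text into a set of lowercase alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_search(docs, keywords, exclude=()):
    """Return sorted ids of documents containing all keywords and no excluded word."""
    wanted = {k.lower() for k in keywords}
    banned = {e.lower() for e in exclude}
    hits = []
    for doc_id, text in docs.items():
        words = tokenize(text)
        # A hit must contain every keyword and none of the excluded words.
        if wanted <= words and not (banned & words):
            hits.append(doc_id)
    return sorted(hits)


# The same three keywords match documents from four unrelated fields.
broad = keyword_search(DOCS, ["film", "coating", "decay"])

# Adding a keyword can make the search too restrictive: here nothing matches.
narrowed = keyword_search(DOCS, ["film", "coating", "decay", "preservation"])

# An exclusion list removes the unrelated fields, but it has to be supplied
# again on every subsequent call.
excluded = keyword_search(
    DOCS, ["film", "coating", "decay"], exclude=["fruit", "tooth", "pipe"]
)
```

In this toy model the extra keyword eliminates even the relevant document, and the exclusion list is an argument the caller must pass on every query, which mirrors the drawbacks noted above.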