Electronic storage of information in computerized databases and file servers has all but replaced the traditional library as the primary source of recorded knowledge. To find an information source, a user provides locating information about the subject matter of interest. This locating information can include the author, title, publication date, or keywords that might appear in a written abstract of the information source. Because the locating information describes the information source, it is commonly referred to as metadata. Historically, the written word was the primary medium, found in books, newspapers, magazines, and other periodicals. Today, the types of media for recording data have expanded to include magnetic tape, photography, video tape, digital books, computer generated reports, digital audio, digital video, computerized databases, and internet web pages. Computer based indices have replaced card catalogs as the preferred means for locating various information sources. Most newly recorded data is in electronic form and is available via networked computers.
Networked computers enable rapid data sharing. The network connections can be optical, copper wire, or wireless. Networks can be localized intranets, referred to as local area networks. Networks can also include many external computers distributed over a wide physical area, referred to as wide area networks. To share data, the networked computers use compatible communications protocols. The most common protocols include the hypertext transfer protocol (HTTP), which uses the transmission control protocol/internet protocol (TCP/IP). The largest and most common collection of networked computers is the internet. HTTP is the protocol used on the world wide web (WWW), which utilizes the hypertext markup language (HTML) to format and display text, audio, and video data from a data source, most often using a WWW browser. The most common method of displaying information communicated through the WWW is in the form of HTML web pages.
Viewing the web content data of a particular web page requires a reference to the location of the web page. The web page content data is stored electronically in memory storage devices of a web server. Web servers have domain name addresses that enable retrieval of the information from their local storage. If the desired web content data is on the internet, the web server storing the desired web content data must first be identified. On the internet, each web server system is identified by a unique internet protocol address (IPA). Because numeric addresses are difficult for humans to remember, alias names are used in lieu of the IPA. These alias names are commonly referred to as domain names. The domain name system (DNS) keeps track of which IPAs are represented by the respective domain names. Once a domain name is known, a user can specify the exact file of interest containing the desired web content data by giving the complete domain name and the directory path in a uniform resource locator (URL).
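By way of illustration only, the relationship between a URL, its domain name, and its directory path can be sketched in Python as follows. The URL shown and the function names split_url and resolve_domain are hypothetical and chosen for this example; the DNS lookup relies on the resolver of the host system and requires a network connection, so it is defined but not invoked here.

```python
import socket
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Separate a URL into its protocol scheme, domain name, and directory path."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,    # protocol, e.g. "http"
        "domain": parts.hostname,  # alias name used in lieu of the numeric address
        "path": parts.path,        # directory path to the file of interest
    }


def resolve_domain(domain: str) -> str:
    """Ask DNS (through the system resolver) for an IPv4 address of the domain."""
    return socket.gethostbyname(domain)


# Hypothetical URL used only for illustration.
example = split_url("http://www.example.com/reports/2001/summary.html")
print(example["domain"], example["path"])
```

Calling resolve_domain(example["domain"]) on a networked computer would return the numeric address that the domain name currently represents.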
To locate desired web content data, the user must either specify the exact URL and then manually review the document, or perform a search based on some search criteria. The most common search method is the use of web based search engines. Search engines typically use keywords in Boolean combinations to specify search criteria. Boolean combined keyword searches are routinely used and provide users with a simple and convenient way of searching for desired web content data. However, Boolean combined keyword searches using search engines often produce millions of URL locations, many of which are irrelevant web pages pointing to irrelevant web content data. A search engine match result is also referred to as a hit, whether or not it is relevant to the requester. A user often has to manually review many irrelevant search hits in order to locate relevant search hits. Additionally, typical Boolean combined keyword searches do not provide users with a convenient means to routinely search web pages linked to web page hits. Human review is the most effective way to determine whether an information source meets a particular need, but humans often lack the time to perform recurring searches for desired data. While a one time search may be executed by a user, users often have to disadvantageously repeat the identical search process, for example, on a daily basis, in order to monitor changes in web content data. Web based search engines do not provide a means to perform automated routine searches based upon user defined search criteria. These and other disadvantages are solved or reduced using the invention.
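For illustration of the background discussed above, a Boolean combined keyword search and the comparison of hits between repeated searches can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not a description of any particular search engine: the in-memory pages, their URLs, and the names tokenize, matches, search, and new_hits are all hypothetical.

```python
import re


def tokenize(text):
    """Reduce page text to a set of lowercase keywords."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(text, all_of=(), any_of=(), none_of=()):
    """Boolean criteria: every all_of (AND), at least one any_of (OR), no none_of (NOT)."""
    words = tokenize(text)
    if not all(k.lower() in words for k in all_of):
        return False
    if any_of and not any(k.lower() in words for k in any_of):
        return False
    if any(k.lower() in words for k in none_of):
        return False
    return True


def search(pages, **criteria):
    """Return the sorted URLs (hits) whose content satisfies the criteria."""
    return sorted(url for url, text in pages.items() if matches(text, **criteria))


def new_hits(previous, current):
    """Hits present in the current search that were absent from the previous one."""
    return sorted(set(current) - set(previous))


# Hypothetical pages standing in for retrieved web content data.
pages = {
    "http://www.example.com/a.html": "Satellite launch schedule for the spring",
    "http://www.example.com/b.html": "Satellite television listings",
    "http://www.example.com/c.html": "Launch party photos",
}
hits = search(pages, all_of=["satellite"], any_of=["launch", "orbit"], none_of=["television"])
print(hits)
```

Repeating the same search at a later time and passing the earlier and later hit lists to new_hits shows which hits are new, which is the manual comparison a user would otherwise have to repeat on each occasion.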