This invention relates generally to the field of information management, and more particularly to a method and system for confidentially tracking and reporting information available on global computer networks.
The Internet has experienced exponential growth and the number of interconnected computers is quickly approaching one billion worldwide. As such, the Internet provides unprecedented access to massive volumes of information and resources. An entity resource, such as a company, organization, periodical, etc., presents information to the Internet by uploading the information to a server that is connected to one of the interconnected networks and has a registered Internet Protocol (IP) address. Often, an entity organizes its information on the server as a hierarchy of pages composed with hypertext markup language (HTML). Along with general information, each page may contain links to other informative items including graphics, documents or even links to other websites. Users can easily access an entity's information using a graphical software program referred to as a browser. Because the Internet is essentially a vast web of interconnected computers, databases, systems and networks, an entity's information is often referred to as its "website". For this reason, the Internet and its interconnected websites are often referred to as the World Wide Web. Finding relevant information on the Internet, including the millions of websites and the billions of individual web pages, is a difficult task that has been inadequately addressed.
Many companies have developed search engines in an attempt to ease the location and retrieval of information from the Internet. Examples of current search systems include the AltaVista™ search engine developed by Digital Equipment Corp., Lycos™, Infoseek™, Excite™ and Yahoo™. Most conventional search systems consist of two components. First, a data gathering component, known as a webcrawler or robot, systematically traverses the Internet and retrieves information from various websites. Often, the webcrawler moves from website to website traversing every link found. As the individual websites are accessed, each page of information is retrieved, analyzed and stored for subsequent searching and retrieval. After retrieving and examining each page of a website, the webcrawler moves on to another site on the Internet. While the webcrawler is traversing various websites and retrieving the pages of information, the webcrawler indexes the information presented by each page and stores a link to each page and the corresponding index information in a repository such as a database.
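By way of illustration only, the conventional webcrawler described above can be sketched as follows. The sketch assumes a small in-memory stand-in for the Internet (the hypothetical `PAGES` mapping and `example` URLs are not part of any real system) and a simple word-to-link index; it is not the implementation of any named search engine.

```python
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "http://a.example/": ("acme announces new widget",
                          ["http://a.example/news", "http://b.example/"]),
    "http://a.example/news": ("widget recall notice", ["http://a.example/"]),
    "http://b.example/": ("bolt supplier price list", []),
}


def crawl(start_url, pages):
    """Traverse every link breadth-first and index each retrieved page."""
    index = defaultdict(set)  # word -> set of links to pages containing it
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        text, links = pages[url]          # "retrieve" the page
        for word in text.split():         # index the page content
            index[word].add(url)
        for link in links:                # follow every link not yet visited
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


index = crawl("http://a.example/", PAGES)
```

In this sketch the repository is simply the returned `index`; a conventional system would typically persist the links and index information in a database.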
The second component of conventional search systems is the search engine. The search engine provides an interface for selecting the links stored in the repository in order to identify web pages with desired content. For example, the above mentioned search engines allow a user to enter various search criteria. The search engine probes the stored index information generated by the webcrawler according to the search criteria. The search engine presents to the user any stored links having corresponding index information that satisfies the entered search criteria. The user is able to view the actual page located on the original website by following the link to the actual website.
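As a minimal sketch of this second component, the following assumes the repository is a mapping from index words to stored links (the hypothetical `INDEX` below) and that search criteria are whitespace-separated terms that must all match. Real search engines apply far richer criteria; this only illustrates probing stored index information.

```python
# Hypothetical stored index information: word -> links to pages containing it.
INDEX = {
    "widget": {"http://a.example/", "http://a.example/news"},
    "recall": {"http://a.example/news"},
    "bolt": {"http://b.example/"},
}


def search(index, criteria):
    """Return the stored links whose index information satisfies every term."""
    terms = criteria.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())  # keep links matching all terms
    return results


links = search(INDEX, "widget recall")
```

The user would then follow each returned link to view the actual page on the original website.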
The present invention is directed to a method and system for systematically tracking a defined set of network resources on a global computing network. The method and system can be arranged to deterministically guarantee that any information from the sites is relevant and current. The method and system also can be arranged to increase the confidentiality of search parameters and the identities of parties seeking information.
In one embodiment, the present invention provides a computer-implemented method for gathering information from network resources on a global computer network, the method comprising assigning search times to the network resources, the search times designating times at which the network resources are to be searched within a monitoring period, categorizing the network resources into industry groups, generating search items, each of the search items defining a search for particular information and designating one or more of the industry groups, identifying, at a given search time, the network resources that have been assigned the given search time and categorized into industry groups designated by one or more of the search items, retrieving and storing information from the identified network resources, and performing the searches defined by one or more of the search items on the stored information.
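The steps of this embodiment can be sketched as follows, purely as an illustration under stated assumptions: search times are modeled as an hour within a daily monitoring period, a search is a simple keyword match, and retrieval is delegated to a caller-supplied `fetch` function. The names `Resource`, `SearchItem`, `identify` and `run_searches` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    url: str
    search_time: int       # assumed: hour (0-23) within a daily monitoring period
    industry_group: str


@dataclass(frozen=True)
class SearchItem:
    keyword: str                 # assumed: a search is a simple keyword match
    industry_groups: frozenset   # industry groups designated by this item


def identify(resources, items, given_time):
    """Resources assigned the given time and in a designated industry group."""
    designated = set().union(*(item.industry_groups for item in items))
    return [r for r in resources
            if r.search_time == given_time and r.industry_group in designated]


def run_searches(resources, items, given_time, fetch):
    """Retrieve and store the identified resources, then search the store."""
    identified = identify(resources, items, given_time)
    store = {r.url: fetch(r.url) for r in identified}
    hits = []
    for item in items:
        for r in identified:
            if r.industry_group in item.industry_groups and item.keyword in store[r.url]:
                hits.append((item.keyword, r.url))
    return hits


# Hypothetical data standing in for real network resources.
FAKE_WEB = {
    "http://steel.example/": "steel merger announced",
    "http://bank.example/": "bank merger rumor",
    "http://ore.example/": "ore merger talks",
}
resources = [
    Resource("http://steel.example/", 2, "metals"),
    Resource("http://bank.example/", 2, "finance"),
    Resource("http://ore.example/", 5, "metals"),
]
items = [SearchItem("merger", frozenset({"metals"}))]
hits = run_searches(resources, items, 2, FAKE_WEB.get)
```

In this sketch only the steel site is retrieved at search time 2: the bank site is in an undesignated industry group, and the ore site is assigned a different search time.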
In another embodiment, the present invention provides a method for gathering information from network resources on a global computer network, the method comprising assigning search times to the network resources, the search times designating times at which the network resources are to be searched within a monitoring period, generating search items, each of the search items defining a search for particular information and designating one or more of the network resources, identifying, at a given one of the search times, the network resources that have been assigned the given search time and which are designated by one or more of the search items, retrieving and storing information from the identified network resources, whereby information from the network resources that have not been assigned the given search time or are not designated by one or more of the search items is not retrieved and stored, and performing the searches defined by one or more of the search items on the stored information.
In a further embodiment, the present invention provides a method for gathering information from network resources on a global computer network, the method comprising generating a set of search items, each of the search items defining a search for particular information and designating one or more of the network resources, retrieving and storing information from the network resources designated by one or more of the search items, performing the searches defined by one or more of the search items on the stored information, and presenting results of the searches.
In an added embodiment, the present invention provides a method for gathering information from network resources on a global computer network, the method comprising categorizing the network resources into industry groups, generating a set of search items, each of the search items defining a search for particular information and designating one or more of the industry groups, retrieving and storing information from the network resources associated with the industry groups designated by one or more of the search items, performing the searches defined by the search items on the stored information, and presenting results of the searches.
In another embodiment, the present invention provides a method for gathering information from network resources on a global computer network, the method comprising selecting a set of network resources residing on the global computer network, assigning a search time to each of the network resources, the search time indicating a time within a monitoring period in which the network resource is to be searched, generating a set of search items, each of the search items defining parameters for a search and designating one or more of the network resources to be searched, determining, at approximately the search time for each of the network resources, whether the respective network resource is designated for searching by at least one of the search items, retrieving and storing information from the network resources designated by at least one of the search items, performing the searches defined by the search items on the stored information, and presenting results of the searches to users.
In a further embodiment, the present invention provides a software system for monitoring network resources residing on a global computer network over a time interval, the system comprising a database storing resource identifiers that correspond to particular network resources, and search items that define a search for information and specify one or more of the network resources, a system executive that constructs a set of the resource identifiers scheduled to be searched, and a set of the search items specifying at least one of the network resources corresponding to one of the resource identifiers of the constructed resource identifier set, a collection controller, for each of the resource identifiers of the constructed set of resource identifiers, the collection controller retrieving information presented by the networked resource corresponding to the resource identifier, a search controller for receiving the information retrieved by each of the collection controllers, and a search instance, for each search item of the search item list, wherein the search controller instantiates each search instance to perform the search defined by the respective search item on the information received from the collection controllers for the network resource specified by the respective search item.
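One possible arrangement of these components is sketched below. It is a simplified illustration, not the claimed system: the database is assumed to be a plain dictionary, scheduling is assumed to be an integer time slot per resource identifier, a search is assumed to be a substring match, and the `fetch` callable is a hypothetical stand-in for network retrieval.

```python
class CollectionController:
    """Retrieves the information presented by one network resource."""

    def __init__(self, resource_id, fetch):
        self.resource_id = resource_id
        self.fetch = fetch

    def collect(self):
        return self.resource_id, self.fetch(self.resource_id)


class SearchInstance:
    """Performs the search defined by a single search item."""

    def __init__(self, item):
        self.item = item

    def run(self, resource_id, content):
        if resource_id in self.item["resources"] and self.item["criteria"] in content:
            return (self.item["criteria"], resource_id)
        return None


class SearchController:
    """Receives collected information and instantiates one search per item."""

    def __init__(self, items):
        self.items = items

    def receive(self, resource_id, content):
        hits = (SearchInstance(item).run(resource_id, content) for item in self.items)
        return [hit for hit in hits if hit is not None]


class SystemExecutive:
    """Builds the scheduled identifier set and the matching search item set."""

    def __init__(self, database, fetch):
        self.database = database
        self.fetch = fetch

    def run(self, time_slot):
        ids = [rid for rid, slot in self.database["resource_ids"].items()
               if slot == time_slot]
        items = [item for item in self.database["search_items"]
                 if set(item["resources"]) & set(ids)]
        controller = SearchController(items)
        results = []
        for rid in ids:
            results.extend(controller.receive(*CollectionController(rid, self.fetch).collect()))
        return results


# Hypothetical database contents and page data.
DATABASE = {
    "resource_ids": {"http://steel.example/": 2, "http://bank.example/": 2,
                     "http://ore.example/": 5},
    "search_items": [
        {"criteria": "merger", "resources": ["http://steel.example/", "http://ore.example/"]},
        {"criteria": "dividend", "resources": ["http://bank.example/"]},
    ],
}
PAGES = {
    "http://steel.example/": "steel merger announced",
    "http://bank.example/": "bank merger rumor",
    "http://ore.example/": "ore dividend declared",
}
results = SystemExecutive(DATABASE, PAGES.get).run(2)
```

Here each search instance only examines information from the network resources its search item specifies, so the bank page is not reported for the "merger" item even though it contains that word.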
In an added embodiment, the present invention provides a method for monitoring information presented by at least one of a plurality of networked computers comprising storing a plurality of identifiers, wherein each identifier corresponds to one of the plurality of networked computers, storing a plurality of search items, wherein each search item includes search criteria and at least one networked computer to be monitored, generating a set of identifiers to be searched, generating a set of search items monitoring at least one of the networked computers corresponding to one of the identifiers of the identifier set, retrieving information presented by each of the networked computers corresponding to an identifier of the identifier set, and searching the retrieved information according to the search criteria of each search item of the search item set monitoring the networked computer corresponding to the retrieved information.
Other advantages, features, and embodiments of the present invention will become apparent from the following detailed description and claims.