1. Field of the Invention
The present invention relates to data analysis of online applications. Specifically, the present invention relates to automated monitoring or manipulation of specific data within an online resource.
2. Description of the Background Art
The proliferation of the Internet and the presentation of data in various formats on the Internet are well known. The Internet is used to convey a variety of educational, personal, scientific, and commercial information from a large number of sources to a large number of viewers. Due to the sheer volume of this information, monitoring these sources for changes or modifying data within these sources can be a time-consuming and daunting task.
Data may be displayed over the Internet in various formats. The most common format used is HyperText Markup Language (HTML). Oftentimes, the content within a particular web page or HTML file may vary. For example, Java-based advertisement banners often appear within a single web page (e.g., search engine results), but the content within these banners is intermittently changed. Additionally, data within an online resource may frequently be updated. For example, vendors often display prices of products within a web page so that customers may quickly locate the current price of a particular product. These prices may be stored within a database, which is accessed by an online resource so that the public may view its contents. The fact that the content within web pages and other online resources is constantly changing is well known within the art.
It is very important to many individuals and companies to maintain the most current information available on the Internet. In order to maintain current data, online resources need to be monitored so that new information can be identified and, subsequently, the individual or company can be notified of the new information. For example, businesses that sell products need to continually monitor the costs of each component within each product that they sell. This task may require a large amount of time if a large number of products are sold or if a large number of components are contained within a single product. As another example, commodity brokers need to monitor commodity prices in various markets. Currently, this monitoring is either done manually by an individual or performed by software that compares a web page to an archived copy of the web page to determine whether changes were made. However, both methods fail to target the specific important information within a web page that may have changed.
Additionally, data within an online resource may need to be continually updated. For example, a supplier of a particular computer component may wish to update clients' databases regarding the cost of the component. If the supplier has a large number of clients, this updating task may be rather daunting. Typically, this process is done manually, either by physically mailing an update or by emailing one. In either case, the process requires a large amount of time.
A number of software products are available that monitor online resources for changes. According to a pre-defined schedule, a software agent fetches the resource or metadata about the resource, and performs an analysis to determine what has changed. Typically, the retrieved resource is compared to an archived copy of the resource to determine changes. However, this software is unable to specifically target and identify relevant or important data within the resource. As a result, a user is notified if any change at all has occurred to the resource. This inability to identify specific data results in the user frequently being notified unnecessarily. For example, ad banners within a web page are changed frequently. Each such change is reported to the user, requiring the user to filter through a large number of insignificant changes in order to find the important ones.
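The archive comparison described above can be illustrated with a minimal sketch. It is not taken from any particular product; the function names, the use of a SHA-256 digest, and the sample page strings are all assumptions chosen for illustration. It shows why a whole-resource comparison reports a change even when only an ad banner differs.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Return a digest of the entire resource (hypothetical helper)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def has_changed(archived: str, fetched: str) -> bool:
    """Whole-resource comparison: any difference at all counts as a change."""
    return fingerprint(archived) != fingerprint(fetched)


# Illustrative pages (assumed content): only the banner differs,
# while the price the user cares about is identical.
archived_page = "<div id='ad'>Banner A</div><p>Widget price: $10.00</p>"
fetched_page = "<div id='ad'>Banner B</div><p>Widget price: $10.00</p>"

# The comparison cannot tell that the price is unchanged,
# so the user would be notified anyway.
banner_only_change_reported = has_changed(archived_page, fetched_page)
```

In this sketch the comparison has no notion of which part of the page matters, so a banner rotation and a price change are indistinguishable.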
Other software products are available that include code to recognize specific content within a web page by locating the content relative to its position on the web page. For example, this software may recognize certain ad banners or header information from specific search engines because it appears above the rest of the content in the web page (i.e., at the top of the web page). However, if the layout of the web page is changed or a new type of ad banner is used, then the software heuristics must be recoded to adjust to the location changes within the web page. Because web page layouts are constantly adjusted, a large amount of work is required to continually maintain operational software heuristics that reflect the most current web page layout for each online resource that is monitored.
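The fragility of position-based recognition can be sketched as follows. This is a hypothetical illustration only: the `TextCollector` class, the `text_at_position` helper, the hard-coded `PRICE_POSITION`, and the sample layouts are assumptions, not the heuristics of any actual product. The sketch locates the price purely by its ordinal position among the page's text fragments.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the non-empty text fragments of a page in document order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def text_at_position(html: str, index: int) -> str:
    """Return the text fragment found at a fixed position (hypothetical heuristic)."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return collector.chunks[index]


# Hard-coded assumption: the price is the second text fragment on the page.
PRICE_POSITION = 1

old_layout = "<div>Banner ad</div><p>$10.00</p><p>In stock</p>"
# The same data after a second banner is added above the price.
new_layout = (
    "<div>Banner ad</div><div>Second banner</div>"
    "<p>$10.00</p><p>In stock</p>"
)

price_before = text_at_position(old_layout, PRICE_POSITION)  # "$10.00"
price_after = text_at_position(new_layout, PRICE_POSITION)   # "Second banner"
```

After the layout change the heuristic silently returns banner text instead of the price, which is why such heuristics must be recoded whenever a monitored page is rearranged.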
Non-HTML files may also need to be monitored for changes. For example, a computer manufacturer may want to monitor a database containing current prices for computer components. Software applications are currently available that may monitor specific data within a database by identifying data relative to its location within the database. However, these software applications are not generic, and no data-driven solution exists for monitoring the database. Therefore, if the database is restructured, then the software heuristics must be changed.
Each of these monitoring software applications functions only within a single format. For example, an HTML monitoring software application cannot monitor databases, and a database monitoring software application cannot monitor HTML-based files. Therefore, an individual must purchase and maintain multiple monitoring software applications in order to monitor files that may be in different formats. Purchasing and maintaining multiple monitoring software applications is both costly and time-consuming for a company or individual.
The same problems and difficulties arise when online resources must be updated remotely. Specifically, the lack of uniformity between multiple online resources makes it extremely difficult to automatically update specific data fields within each online resource. As a result, there is a need for a system and method to automatically monitor or update multiple online resources stored in different formats.