The World Wide Web (or “Web”) is a vast source of information. However, much of this information is duplicated or overlapping, because numerous people may have investigated a problem, issue, or subject and created content about it on the Web. To compile information about the problem, issue, or subject, a user searches websites and blogs.
It is desirable to be able to capture data from the Web during a search and then place it in a document containing the information obtained about the problem, issue, or subject. Typically, this document is stored locally on a user's machine. One problem with this is that the information is usually available only to a select few. Usually no one else benefits from the research, and the effort taken to compile the information is wasted.
Another problem with locally storing information captured from the Web in a document produced by word processing software is that the stored information is not updated when the information on the website is updated. In other words, when the website changes, the information that the user obtained from the website and placed in a word processing document does not change. Data copied from a website and stored in a word processing document is a snapshot of the information at the time it was captured. If the data on the website changes, the captured data can quickly become stale.
One reason that the data is not updated is that much of the information on the Web is unstructured data, which means that the data is not in a database or in a database format. For example, an online encyclopedia website just contains text and images. By contrast, the data on websites that use structured data is updated regularly. However, websites that rely on structured data are limited in the information they can obtain from the Web.