Computer software for automated retrieval of information from web sites typically involves a programmer who writes a script for each web site from which information is to be retrieved. One problem with this approach is that the party writing the script is typically not the party operating the web site from which the information is being retrieved. If the owner of the web site changes the layout of that web site, the script that retrieves information from that web site may no longer work properly.
If information retrieval from a large number of web sites is desired, the aforementioned problem can add significant ongoing expense.
Additionally, users of the information can become frustrated when it is unavailable because the script that retrieves it has stopped working properly. If many web sites are changed around the same time, the programmers required to change the scripts may become backlogged, forcing users to wait longer and longer for updated information to be retrieved from the web site or sites in which they are interested. The delays involved can cause users to feel that the information is too out of date and unreliable to be useful, causing them to stop using it altogether.
There is also a tension that can arise with teams of programmers writing scripts. The business retrieving the information may find it difficult to justify the cost of writing a script to retrieve information from web sites in which only a few users are interested. However, if users cannot obtain the information they need from any source that has it, they may find the subset of information that they can get is not sufficiently complete to justify the use of any of it. Thus, an entity that retrieves the information is either forced to devote an inordinate amount of resources to writing scripts for web sites in which only a small number of users are interested, or finds the market for its information services severely limited, because, if the number of users is large, there may be a vast number of little used web sites, each of which could interest at least some of the entity's potential users.
What is needed is a system and method that can allow information to be retrieved cost effectively, even from little used web sites; can provide for scripts for information retrieval to be reliably updated faster than the team of employed programmers available to update such scripts may be able to implement such updates; can assemble an updated script that works for a given web site from portions of several scripts that may have been received from different users; and can help users provide information useful for building scripts or portions of scripts, so that such users can provide such information quickly and/or easily.