This invention relates in general to electronic directories and, more specifically, to gathering information for a networked directory.
There is a desire to provide electronic directories on the Internet to allow searching for information. Conventional directories search the whole Internet by “crawling” from link to link and cataloging the information that is encountered. These crawling software bots or “crawlers” traverse the Internet constantly in an attempt to keep the directory information current. One pass through the Internet can take months.
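The link-to-link traversal described above can be illustrated with a minimal sketch. It is not taken from any particular directory's crawler. The `PAGES` table, its URLs, and the `crawl` function are hypothetical names invented for this example, and an in-memory table stands in for the Internet so that no network access is needed.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# These pages and URLs are invented for illustration only.
PAGES = {
    "http://a.example/": ("home page", ["http://a.example/tv", "http://b.example/"]),
    "http://a.example/tv": ("weekly television show", ["http://a.example/"]),
    "http://b.example/": ("another site", ["http://b.example/missing"]),
}


def crawl(start, pages):
    """Follow links breadth-first from start, cataloging each page once."""
    catalog = {}           # URL -> text recorded at the time of the visit
    broken = []            # linked URLs that have no page behind them
    seen = {start}         # URLs already queued, so no page is visited twice
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url not in pages:
            broken.append(url)
            continue
        text, links = pages[url]
        catalog[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return catalog, broken


catalog, broken = crawl("http://a.example/", PAGES)
```

In this sketch the catalog is only a snapshot: any page that changes after it is visited remains out of date in the catalog until the next full pass reaches it again.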
Many electronic directories have crawlers associated with them that gather information across the Internet. When a crawler encounters information, that information is passed back to the electronic directory for cataloging. In this way, the crawlers consume tremendous Internet bandwidth that would otherwise be available to others.
Information cataloged in electronic directories is often stale. Clicking on the links provided by a directory often reveals that many links are broken or that the information in the catalog no longer accurately describes the referenced web site. The contents of the Internet change more quickly than crawling can uncover the changes. For example, a web page that describes a weekly television show may change every week, but the crawler will catalog it far less often. Broken links and stale information reduce the usefulness of electronic directories on the Internet.