Networks have become an essential part of the corporate computing environment. The Internet is a worldwide network that interconnects computer networks with one another. Recently, a new method of communication has become very popular on the Internet. This method involves client applications known as "Web Browsers" and server applications known as "Web Servers". The collective set of all the Web Servers in the world forms the "World Wide Web". The World Wide Web is a client/server application. Web Servers and Web Browsers use the Hypertext Transfer Protocol (HTTP) to exchange information. The information is formatted in the Hypertext Markup Language (HTML). HTML files and other network files are identified by their Uniform Resource Locators (URLs).
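As an illustration only, the relationship between a URL and the HTTP request a Web Browser sends to a Web Server can be sketched as follows. This is a minimal sketch, not part of any described product: the function names describe_url and build_request and the address www.example.com are hypothetical, and the request shown is a simplified HTTP/1.0 form.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the parts a browser needs to locate a file.

    Hypothetical helper for illustration; not taken from the source text.
    """
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # protocol, e.g. "http"
        "host": parts.hostname,      # the Web Server to contact
        "path": parts.path or "/",   # the file requested from that server
    }


def build_request(url: str) -> str:
    """Build the text of a simplified HTTP/1.0 GET request for the URL."""
    info = describe_url(url)
    return f"GET {info['path']} HTTP/1.0\r\nHost: {info['host']}\r\n\r\n"


# Assumed example address; any HTML file on a Web Server would do.
info = describe_url("http://www.example.com/products/index.html")
request = build_request("http://www.example.com/products/index.html")
```

Here the URL names the protocol, the server, and the file, and the request text is what the client would transmit to the server to obtain the HTML file.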
Companies often use internal networks, modeled on the Internet and the World Wide Web protocols, to form an intranet for sharing information within the company. In addition, companies spend large amounts of money to put information on the Internet for use by their customers and potential customers. Unfortunately, only those people who have a computer with Internet access can view this information. It would be helpful if this information could be bundled in other formats (e.g., CD-ROMs, diskettes, e-mail messages) for distribution. In order to bundle this information, it is necessary to retrieve the information from the network in a systematic manner. None of the existing products is designed to retrieve information in a systematic manner for bundling and distribution in other formats.
Thus there exists a need for a method of extracting network content in a systematic manner.