As documents move from a paper format to an electronic format, the compilation and subsequent retrieval of the ever-increasing number of electronic documents become increasingly complex. Distribution of electronic documents over the Internet and a growing number of distributed corporate intranets only adds to this complexity. Finding and categorizing electronic documents scattered about such a distributed environment becomes increasingly important as knowledge continues to migrate into the electronic world.
Many search systems have attempted to analyze electronic documents for the purpose of categorizing and intelligently describing the documents for later retrieval. These systems have had limited success to date. The process of reading an electronic document and conceptualizing its contents into categories is a daunting technical challenge.
It would therefore be advantageous to provide a system for and a method of locating electronic documents stored within a distributed environment, categorizing the located electronic documents according to their content, and indexing the categorized electronic documents for easier retrieval.