1. Field of the Invention
The invention relates to distributed content storage and management, and more particularly, to storage and content indexing of files located on electronic information sources.
2. Background of the Invention
Distributed content storage and management presents a significant challenge for all types of businesses—small and large, service and products-oriented, technical and non-technical. As the Information Age emerges, the need to be able to efficiently manage distributed content has increased, and will continue to increase. Distributed content refers to files that are distributed throughout electronic devices within an organization. For example, an organization may have a local area network with twenty desktop computers connected to the network. Each of the desktop computers will contain files—program files, data files, and other types of files. The business may also have users with personal digital assistants (PDAs) and/or laptops that contain files. These files collectively represent the distributed content of the organization.
Essentially, two disparate approaches to distributed content storage and management have emerged. One approach relates to backing up files, principally for the purpose of being able to restore files if a network or computer crashes. Under the back-up approach, the focus is on preserving the data by copying it and getting it “far away” from its original location, so that it cannot be accidentally or maliciously destroyed or damaged. Generally, this has meant that back-up files are stored on tape or other forms of detached storage devices, preferably in a separate physical location from the original source of the file. Given the desire to keep the data safe or “far away,” file organization is by file name or by the volume where the data is stored, and accessing or retrieving files stored in a back-up system is often slow or difficult, and in some cases practically impossible. Furthermore, because the backed-up files are not regularly accessed or used, when a back-up system does fail, often no one will notice and data can potentially be lost.
The other approach to distributed content management relates to content management of files. The content management approach is focused on controlling the creation, access and modification of a limited set of pre-determined files or groups of files. For example, one approach to content management may involve crude indexing and recording information about user-created document files, such as files created with Microsoft Word or Excel. Within current content management approaches, systems typically require a choice by a user to submit a file to the content management system. Requiring such an explicit choice by a user limits the ability of a system to capture all appropriate files and makes it impossible for an organization to ensure that it has control and awareness of all electronic content within the organization.
Neither approach fully meets the growing need to effectively manage distributed content. In user environments where only a back-up system is in place, access to stored files is difficult and access to information about a specific file is often impossible. In user environments where only a content management system exists, many files are left unprotected (i.e., not backed up) and the indexing and searching capabilities are limited. In user environments where a back-up system and a content management system are both used, cost inefficiencies are introduced through redundancies. Moreover, even when both a back-up system and a content management system of the kinds in use today are in place, the ability to manage and control the electronic content of an organization remains limited.
What is needed is a system to cost-effectively store and manage all forms of distributed content.
What are also needed are efficient methods to store distributed content to reduce redundant and inefficient storage of backed-up files.
What are also needed are efficient methods to gather data related to file content that will spawn further user applications made possible by the sophisticated indexing of the invention.