In recent years, technology related to data processing systems has seen progressive advances. In certain areas, such as data storage and data management, it is now common for a corporate system to include a large number of system components that are individually configured to handle different types of tasks. For instance, one computer may be configured to manage and process e-mail, another computer may be configured to archive compressed data, and another computer may be configured to store and control user access to electronic documents. In such a distributed computing environment, in which different software applications and different data file formats are distributed among many nodes in a network, a continuing need exists for a management system to coordinate data processed in each component.
There exist a large number of software applications and systems that provide a variety of services, such as data format conversion, data compression, and e-mail handling. For instance, there are many software applications configured to process e-mail messages, such as an e-mail server application. As can be appreciated by one skilled in the art, most existing e-mail server applications are capable of receiving, sending, and storing e-mail messages. In addition, most existing e-mail server applications are capable of selectively retrieving e-mail records based on a user-configured query.
While existing systems, such as an e-mail server, are effective in executing their specific functions, they have several disadvantages. One disadvantage stems from the fact that an individual system, such as an e-mail server, cannot efficiently coordinate its data retrieval capabilities with other software applications or other individual systems. For example, it may be desirable to retrieve specific image records from an e-mail database, decompress the image records, and then perform an optical character recognition (OCR) process on the image records. Such a task may be carried out by the use of a customized program or a script; however, these existing solutions for coordinating functions between different systems may require a substantial amount of human resources to design and implement.
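By way of illustration only, the kind of customized script described above can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular existing system: the in-memory `MAIL_DB` list, the `fetch_image_records` and `placeholder_ocr` functions, and the use of zlib compression are all hypothetical stand-ins for a real e-mail database, query interface, OCR engine, and compression format.

```python
import zlib

# Hypothetical stand-in for an e-mail database:
# each entry is (record_id, record_kind, compressed_payload).
MAIL_DB = [
    ("msg-1", "image", zlib.compress(b"scanned invoice")),
    ("msg-2", "text", zlib.compress(b"meeting notes")),
    ("msg-3", "image", zlib.compress(b"signed contract")),
]


def fetch_image_records(db):
    """Step 1: select only the image records (stands in for a server query)."""
    return [(rid, payload) for rid, kind, payload in db if kind == "image"]


def placeholder_ocr(image_bytes):
    """Step 3: placeholder for an OCR engine; it only upper-cases the bytes."""
    return image_bytes.decode("ascii").upper()


def run_glue_script(db):
    """Chain retrieval, decompression, and OCR by hand, one record at a time."""
    results = {}
    for rid, payload in fetch_image_records(db):
        raw = zlib.decompress(payload)  # Step 2: decompress the record
        results[rid] = placeholder_ocr(raw)
    return results


if __name__ == "__main__":
    print(run_glue_script(MAIL_DB))
```

Even in this toy form, every step is hard-wired to one data source, one compression format, and one downstream process, which suggests why such scripts tend to be costly to design and maintain as the number of systems grows.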
In addition to the above-described problems, existing systems do not effectively coordinate process workloads between the various components of a data processing system. For instance, using the example described above, a customized program or a script may be designed to extract compressed e-mail records from an e-mail server and then send the records to a separate data conversion system for decompression. In such a task, the processes executed by the data conversion system may take longer than the processes executed by the e-mail server. This mismatch in processing time may cause process bottlenecks, and thus various inefficiencies during the execution of each task. Additional problems arise when such solutions are implemented on a distributed system, since the various systems may reside on different computer platforms, e.g., Unix-based versus Windows-based systems.
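The effect of such a processing-time mismatch can be illustrated with a small calculation. The following is a minimal sketch, assuming a simplified two-stage model in which records are extracted at a fixed interval and converted one at a time; the function name `simulate_pipeline` and the timing values are hypothetical and chosen only for illustration.

```python
def simulate_pipeline(num_records, extract_secs, convert_secs):
    """Model a two-stage pipeline: extraction feeds a single converter.

    Returns (total_secs, max_backlog), where max_backlog is the largest
    number of extracted records waiting for conversion to start.
    """
    extracted_at = []   # time at which each record leaves the first stage
    convert_start = []  # time at which each record enters the second stage
    convert_done = 0
    for i in range(num_records):
        ready = (i + 1) * extract_secs
        # Conversion cannot start until the record is extracted AND the
        # converter has finished the previous record.
        start = max(ready, convert_done)
        convert_done = start + convert_secs
        extracted_at.append(ready)
        convert_start.append(start)

    max_backlog = 0
    for i, now in enumerate(extracted_at):
        # Count records already extracted whose conversion has not begun.
        waiting = sum(1 for j in range(i + 1) if convert_start[j] > now)
        max_backlog = max(max_backlog, waiting)
    return convert_done, max_backlog


if __name__ == "__main__":
    print(simulate_pipeline(10, 1, 3))  # conversion 3x slower -> (31, 6)
    print(simulate_pipeline(10, 1, 1))  # balanced stages -> (11, 0)
```

In this model, when conversion takes three times as long as extraction, ten records take 31 seconds and up to six records sit idle in a backlog, whereas balanced stages finish in 11 seconds with no backlog.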
The inefficiency of existing data processing systems is further compounded when executing large-scale data processing tasks, such as data processing and data collection tasks related to litigation discovery. As litigants and regulatory agencies have increasingly focused evidence discovery on data stored in computer systems, the amount of resources required for electronic evidence collection has increased substantially. Accordingly, the discovery process of identifying, locating, collecting, and reviewing voluminous amounts of potentially relevant data has become increasingly difficult. Most existing programs do not have the capability to efficiently process such large quantities of data on distributed systems.
Most existing systems and programs also fail to provide capabilities for implementing a unified record management policy on distributed systems storing various types of data. For example, existing systems do not provide an efficient way to apply a unified record retention or destruction policy to a system in which various types of employees have stored numerous files on a number of computers, including personal digital assistants (PDAs), desktop computers, servers, or the like. Moreover, existing systems do not effectively manage individual data records that are embedded in other stored files, such as a specific data field in a word processing document, a single cell in an Excel® spreadsheet, a specific attachment linked to an e-mail, or the like. Given the complexity of most existing computer systems, there has been a long-standing need for a system and method that can efficiently implement a unified record management policy over a plurality of existing computers having many different systems and computing platforms. Moreover, given the increased focus of evidence discovery on data stored on computer systems, there also exists a continuing need for a system that offers a proactive approach, allowing a business entity to properly collect and manage data records so as to reduce its exposure to discovery conflicts in future litigation.
Based on the above-described deficiencies associated with existing systems, there is a need for a system and method that can efficiently manage, retrieve, process, and store data held on a number of networked computers. There also exists a need for a data management system that can efficiently determine the availability of the resources of a distributed data processing system. In addition, there exists a need for a system that can efficiently implement a unified record management policy over a plurality of networks having many different operating platforms.