The present invention generally relates to systems and methods for electronic information retrieval and more particularly to systems and methods for retrieving information from logically and geographically distributed and incompatible storage devices.
Historically, corporations have used paper, microfilm and microfiche media for the long term storage of information important to the corporation. Each of these types of storage media can take up a massive amount of physical storage space and requires considerable effort when stored information must be retrieved. Electronic storage archives have been developed that provide large electronic repositories and facilitate relatively easy retrieval of electronic files. Typically, these electronic storage archives allow the long term archival of document bitmap images, computer generated reports, office documents (e.g., word processing documents and spreadsheets), audio and video files, etc.
The hardware typically incorporated in an electronic archive comprises a general purpose computer and storage devices (such as magnetic disks, optical disks and magnetic tape subsystems). The hardware is typically operated and accessed by software comprising an operating system, database management systems, hierarchical storage management (HSM) software and archive management software. There are at least four significant limitations associated with current long term archival systems. First, larger corporations will invariably require several geographically diverse, heterogeneous archival systems in order to support the various operations of the corporation throughout the country and the world. For example, the corporation's research and development facility in London, England has a separate archival system from the archival system for one of the corporation's manufacturing sites in Dallas, Tex. Even if each of the archive facilities has a heterogeneous archival system (e.g., a database manager), the hardware and the software comprising the archival system at the two sites are invariably provided by two different vendors whose proprietary products are not interoperable (i.e., the software at the London site cannot be used to access the information stored at the Dallas site).
A related second problem is that even if the hardware and the software at the London and Dallas sites are from the same vendor, the corporation will typically not have any mechanism for managing information accesses at the enterprise level, that is, for treating all of the corporation's archives as a single resource regardless of location.
A third significant problem is that an electronic document stored in one format can only be used by the specific retrieval applications that support that document storage format. Frequently, retrieval applications have very different formatting requirements, thus creating further compatibility problems. For example, a check image contained in the archive facility of a bank is typically in TIFF-JPEG or TIFF-G4 format while the image of a bank statement is typically in IBM AFP, Xerox Metacode or Adobe PDF format. The retrieval application (e.g., a Netscape or Microsoft browser) or device (e.g., a Palm PC or smartphone) frequently cannot display images in the format in which the images are stored. Although both electronic files are images, they cannot be retrieved by the same retrieval application. This compatibility problem severely limits the range of retrieval solutions and frequently increases the cost and time of building custom file conversion functions.
A final significant limitation with current archive systems is that these systems impose great challenges in applying enterprise level management and control processes, including consolidated usage tracking and billing information; performance measurement and management; uniform access, retrieval application and security; and a uniform look and feel for document displays.
In light of the above problems associated with traditional archive retrieval systems, the present invention manages information retrievals from all of an enterprise's archives across all operating locations. All of the electronic archives, regardless of location, configuration or vendor makeup, are linked to provide a single global framework for managing archive access. The invention thus provides system developers with a single “virtual archive” for accessing all of the enterprise's stored data, without the need for location dependent programming code.
A first aspect of the present invention is the user interface. With respect to the interface, the present invention provides a single, consistent and user-friendly point of access. This is accomplished through the use of an intranet access portal. This single entry point for users is preferably enabled using a browser, which provides the user with access to several retrieval applications. By the use of a single entry point, users are able to access multiple applications through a single sign-on and password.
A second significant aspect of the present invention is the use of logical tables (“meta-descriptors”) that are used to direct information retrieval requests to the physical electronic archives. By the use of these tables, no change whatsoever (hardware or software) is required to the archives. The tables provide a high degree of location independence to information retrieval applications by creating a “virtual archive.” This concept of a “virtual archive” provides for rapid application development and deployment, resulting in lower development and maintenance costs. The virtual archive furthermore allows for data aggregation (regardless of location) so that a user can have data from multiple physical locations on a single screen in a single view.
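The routing role of the meta-descriptor tables can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions and not the implementation of the invention: the field names, the archive identifiers, the sample table rows and the `fetch` callback are all hypothetical, and the source does not specify the layout of the tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaDescriptor:
    """One row of a hypothetical meta-descriptor table (field names are assumptions)."""
    document_type: str   # logical name used by retrieval applications
    archive_id: str      # physical archive that holds documents of this type
    location: str        # site of the physical archive
    storage_format: str  # format in which the archive stores the document


# Illustrative table contents only; real tables would be maintained outside the archives.
META_DESCRIPTORS = [
    MetaDescriptor("check_image", "archive-dallas", "Dallas", "TIFF-G4"),
    MetaDescriptor("bank_statement", "archive-london", "London", "AFP"),
    MetaDescriptor("bank_statement", "archive-dallas", "Dallas", "PDF"),
]


def resolve_archives(document_type, table=META_DESCRIPTORS):
    """Return every descriptor for a logical document type, in table order."""
    return [d for d in table if d.document_type == document_type]


def retrieve(document_type, key, fetch):
    """Aggregate results from all archives holding the requested document type.

    `fetch(archive_id, key)` is a caller-supplied function standing in for the
    vendor-specific access code of each physical archive.
    """
    return [
        {
            "archive": d.archive_id,
            "location": d.location,
            "format": d.storage_format,
            "data": fetch(d.archive_id, key),
        }
        for d in resolve_archives(document_type)
    ]
```

In this sketch the retrieval application names only the logical document type. The table supplies the location and format, so adding or moving an archive changes table rows and not application code.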
A third aspect of the present invention is the functionality of reformatting and repackaging the retrieved information. This reformatting is required because of the above-described incompatibility between the format of the stored information and the distribution media. A final function performed by the present invention is automatic disaster recovery.
A further significant aspect of the present invention is the use of statistical analysis techniques to provide the requester with a predicted response time based on the historical performance of request queues. Depending on the requested object type, the storage media of the requested object, overall archive workload factors, equipment (e.g., the number and availability of tape drives), etc., the response time may be sub-second or several minutes. Using empirical performance statistics, multiple performance profile models (PPMs) are developed. Each retrieval request is classified with a matching PPM, and a delay factor (in seconds or minutes) is sent to the requesting application or user whenever response delays are expected.
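The classification of a request against a PPM and the resulting delay factor can be sketched as follows. This Python sketch rests on assumptions: the source does not name the statistic used, so a plain mean of historical response times stands in for the statistical analysis, requests are classified only by object type and storage media (workload and equipment factors are omitted), and the class names and the one-second notification threshold are hypothetical.

```python
from statistics import mean


class PerformanceProfileModel:
    """Hypothetical PPM: historical response times for one class of request."""

    def __init__(self, object_type, storage_media):
        self.object_type = object_type
        self.storage_media = storage_media
        self.samples = []  # observed response times, in seconds

    def record(self, seconds):
        self.samples.append(seconds)

    def predicted_delay(self):
        # Assumption: a simple mean stands in for the unspecified statistics.
        return mean(self.samples) if self.samples else None


class ResponseTimePredictor:
    """Classifies requests by (object type, storage media) and predicts delay."""

    def __init__(self, notify_threshold_seconds=1.0):
        self.models = {}
        self.notify_threshold_seconds = notify_threshold_seconds

    def record(self, object_type, storage_media, seconds):
        key = (object_type, storage_media)
        if key not in self.models:
            self.models[key] = PerformanceProfileModel(object_type, storage_media)
        self.models[key].record(seconds)

    def delay_notice(self, object_type, storage_media):
        """Return the predicted delay in seconds, or None if no delay is expected."""
        model = self.models.get((object_type, storage_media))
        if model is None:
            return None  # no history, so no prediction is made
        predicted = model.predicted_delay()
        if predicted is None or predicted < self.notify_threshold_seconds:
            return None  # sub-threshold responses need no notice
        return predicted
```

A request for an object on tape with a history of multi-minute retrievals would produce a delay notice, while a request served from magnetic disk in well under a second would produce none.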
The present invention provides significant advantages to a corporation over existing archive systems. Document archives can be consolidated at strategic locations globally. Each archive location can serve the archival needs of all product and service lines of the corporation and provide generic storage capability covering a broad range of objects including office documents, document images, computer print reports, etc. Each business division of the corporation can leverage and share document management products developed by other divisions at much reduced cost and lead time. The present invention allows many business divisions to have a presence at multiple global geographical locations. A document archival infrastructure that can be leveraged on a global basis will facilitate the corporation's global service reach objective. Many new information retrieval products (e.g., customer Internet retrievals) can be provided through a single customer access point regardless of physical storage locations. This level of transparency in customer access to consolidated global information can be critical to a corporation's competitiveness in the new information age.