Digital processing, mass storage, and other computer-related technologies have profoundly impacted modern society. Many business and other activities require access to, and use of, mass-stored data to carry out normal operations. Data generated during business or other enterprise operations also regularly needs to be archived so that it is available for subsequent retrieval.
Sometimes, an enterprise utilizes one or more on-site storage devices, such as computer servers, at which to store data, including archival data. Such a storage server is sometimes also networked to other computer stations of the enterprise by network connections, either local area network (LAN) or wide area network (WAN) connections. Users of network-connected computer stations are able, if authenticated and authorized, to access the stored data. Such a server is sometimes referred to as a repository of data. More generally, any device at which content is stored is referred to as a repository. In an Enterprise Content Management (ECM) system, the repository is sometimes referred to as an ECM repository.
Sometimes, data is stored at dedicated data centers, either integral with, or remote from, an enterprise facility. A data center typically is positioned at a location having a stable, and sometimes also redundant, power supply with capacity sufficient to power the storage and other processing devices maintained at the data center. Ambient conditions at the data center are also typically controlled to ensure that they do not adversely affect operation of the devices maintained there.
Data centers sometimes contain third-party Enterprise Content Management (ECM) data repositories, which typically store large-volume and bulk data, sometimes in volumes of terabytes or greater (e.g., petabytes). At a data center that contains an ECM system, a system operator or administrator of such a repository conventionally utilizes vendor-provided proprietary technology for the storage of, access to, and transfer of content. An ECM system typically contains a combination of unstructured data, i.e., content such as images, documents, pictures, sound files, video, etc., and the structured data needed to manage such content. Structured data typically comprises data that can be organized in databases, e.g., arranged in rows and columns. The volume of unstructured data often exceeds that of the structured data by several orders of magnitude. ECM systems, therefore, oftentimes have a very large data storage footprint.
ECM systems often store the combination of content and structured data in a proprietary format. Storage in a proprietary format generally limits the content ingest and export functions to the vendor's tools and programming interfaces. The vendors often do not provide a published data dictionary. Due to the typically proprietary nature of the technology, once content is stored at a data-center repository of an ECM system, the content, in its entirety, can be moved to another ECM repository that uses a different proprietary technology only with great difficulty. Due to this difficulty in transferring the data, the content owner is sometimes constrained to continue to store the content in the same vendor's repository, even if the content owner has significant motivation or desire to store the content in another vendor's repository.
Conventional data import and data export tools available for transferring content typically are custom-written and have only limited features. For instance, sometimes only import capabilities are provided, and no export capabilities are provided. This limits the manner by which content is later exportable. To the extent that the content is later transferred, i.e., exported, from an ECM repository, the export tools, generally custom-written export utilities, regularly are unable to transfer significant amounts of content at high transfer rates. The custom utilities sometimes are required to transform the formats of the stored content to meet the requirements of another ECM repository to which the content is to be transferred. When custom-written, such utilities are generally highly proprietary and not reusable.
Additionally, import and export tools conventionally available to transfer content generally provide little control capability. The conventional tools and mechanisms, when used to transfer significant volumes of content, do not typically include control mechanisms permitting batch-volume management of the content transfer, such as stop and restart capabilities or transfer rate change capabilities. Such conventional tools also provide minimal monitoring capabilities. Often, the control utility can be engaged by the system operator only at startup and does not allow for dynamic control thereafter.
It is apparent, in light of the foregoing, that existing content import and export tools suffer from various deficiencies that limit their usefulness.
If an improved manner could be provided by which to transfer content, i.e., to import content to an ECM repository and to export content from an ECM repository, content owners would be better able to take advantage of ECM system improvements available at state-of-the-art data centers containing such repositories. Such an improvement would ease content transfer across different vendor repository types, making the most economical repositories commercially available to the advantage of the customer. By making it easier to transfer content across repositories, or to ingest content into several different repository types at a data center, data center owners can offer ECM functions more like a utility or a service than like a proprietary system locked into one vendor's technology. ECM offered as a utility would allow for image archiving and content management for several customers on one or many systems, with no awareness required of the technical software layer underneath the utility.
It is in light of this background information related to mass storage of content that the significant improvements of the present invention have evolved.