A typical process in computer information systems is to transfer data stored in a storage system to application server computers, process the transferred data in the server computers, and then store the processed results back in the storage system. Some such processes read large amounts of data from storage systems but generate results of more manageable size. For example, indexing files to enable searching of the file contents, such as full-text searching, requires transferring various files from a storage system to an index server, parsing the transferred files, extracting text from the files, and storing each unique word identified in the files in an index database. The extracted text is usually small (e.g., in the kilobyte range), but the original files to be transferred and processed can often be quite large (e.g., in the megabyte range or larger) because the files contain not only text but also other data, such as images, sounds, movies, and the like.
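The indexing step described above can be illustrated with a minimal sketch. It assumes the text has already been extracted from each file, and it uses an in-memory dictionary in place of an index database. The function names and sample file names are hypothetical and are not taken from any particular index server.

```python
import re
from collections import defaultdict


def extract_words(text):
    """Return the set of unique lowercase words in a piece of extracted text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(documents):
    """Map each unique word to the sorted list of file names containing it.

    `documents` maps a file name to the text already extracted from that file
    (an assumption of this sketch; parsing real file formats is not shown).
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in extract_words(text):
            index[word].add(name)
    return {word: sorted(names) for word, names in index.items()}


# Hypothetical extracted text; the original files could be far larger
# because they may also carry images, sounds, or movies.
documents = {
    "report.doc": "Quarterly storage report",
    "notes.txt": "Storage system notes",
}
index = build_index(documents)
```

In this sketch, looking up `index["storage"]` lists both sample files. Only the small extracted text reaches the index, even though the whole of each original file would have to be transferred to the index server to obtain it.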
The large size of these files consumes network bandwidth and processing resources in the index server when the files are transferred, parsed, and processed, particularly when there is a massive number of files to be indexed. For example, in a large archive storage system storing petabytes of data, the problem can become severe, causing the indexing process to take a very long time and consume a large amount of network and server resources. Another example of such a process is creating a thumbnail image from a larger image. In both cases, the problem is rooted in the transfer of large amounts of data through the network, which takes up available bandwidth that might be better used for other purposes, and in the consumption of large amounts of processing resources in the application servers, which also might be used for other purposes.
A number of different types of communication protocols and methods for facilitating access between servers and storage systems are currently in use. These include the NFS (Network File System) protocol, the CIFS (Common Internet File System) protocol, and the like. For example, the NFS and CIFS protocols are widely used by file servers such as Network Attached Storage (NAS) devices. In addition, the Extensible Access Method (XAM) interface standard has recently been defined. The XAM standard defines an interface method for access across applications, management systems, and storage systems to create uniform access to information. XAM annotates objects with metadata, providing for the management of information at a high level. This coupling of objects with metadata allows policy services to make informed decisions about the management of objects in storage without referring back to the application and without impacting the application. XAM provides a standardized interface and metadata to communicate with object storage devices to achieve interoperability, storage transparency, and increased efficiency. While the above protocols and methods are discussed in exemplary embodiments of the invention, the invention is not limited to any particular communication methods or protocols. Related art includes US Pat. Appl. No. US2005/0278293 to Imaichi et al., filed Jan. 18, 2005, entitled “Document Retrieval System, Search Server, and Search Client”, and “Information Management-Extensible Access Method (XAM)—Part 1: Architecture, Version 1.0, Working Draft”, Storage Networking Industry Association, San Francisco, Calif., Apr. 2, 2008, the entire disclosures of which are incorporated herein by reference.