1. Field of the Invention
The present invention pertains to data storage and retrieval, and in particular to a method and system of storing a client provided data file along with an associated data file virtualization policy for retrieval; and a method and system for the retrieval of data from the client provided data file.
2. Discussion of Related Art
Interchange of computer data using a client program and a server program is a well-known technology. A client program communicates with a server program using a communication protocol over a network, for example a LAN, a WAN or the Internet. Examples of communication protocols are TCP, UDP, HTTP, HTTPS, socket-based communication, and HTTP 1.1 WebDAV. A client program sends a request for data to a server program. Based on that request, the server program sends data to the client program in response.
The client program and the server program may be running on the same computer or on separate computers. For example, a client program may be running on a client computer while a server program may be running on a server computer. The server computer may be a computer system having one or more processors. However, the client program and the server program may also be running on the same computer system. In addition, a client program may be running on one or more computers. Similarly, a server program may be running on one or more computers. The computers running client programs and server programs are connected to each other in some form over the network.
Server and client programs follow some type of communication protocol to be able to understand each other. In one such approach, the client side asks the server side about its capabilities. The server side then responds to the client side with a list of services it offers. The client may utilize those services to fulfill its goals by making additional requests to the server.
A client program includes a set of one or more computer programs. A server program includes a set of one or more computer programs.
The HTTP protocol is a popular and well-known standard for communication over a computer network, for example a LAN, a WAN, the Internet or the World Wide Web (WWW). A current HTTP protocol version is HTTP 1.1 and is described in the Internet Engineering Task Force Specification IETF RFC 2616. An extension to the HTTP 1.1 protocol is HTTP 1.1 WebDAV. This protocol is described in IETF RFC 4918.
The HTTP 1.1 WebDAV protocol in its simplest form allows a computer to read from and write to web resources on a remote storage device using the WWW. A web resource can be one or more files. The protocol also supports, over the WWW, the equivalent of hierarchical folder listings, file and folder metadata reporting, file and folder deletion and similar features that existing Portable Operating System Interface (POSIX)-based file systems offer. In addition, the protocol supports file versioning over the WWW. The protocol allows client programs to connect to remote storage solutions over the WWW and provision data at the remote location as if it were a network mounted POSIX file system.
For example, the HTTP protocol supports the OPTIONS request. It allows the server to provide a list of WebDAV commands that it supports and how the commands are supported. The WebDAV protocol requires the implementation of some requests. The implementation of other or additional WebDAV requests is optional. The PROPFIND request in WebDAV is used to retrieve properties and metadata from a resource. It is equivalent to getting properties and metadata about a file and getting a hierarchical directory or folder list. The MKCOL request in WebDAV is used to create collections; for example, a collection could be a directory or folder. The GET request in WebDAV is used to retrieve a complete or partial resource, for example a file, from a remote location on the WWW. The PUT request in WebDAV is used to store a complete or partial resource, for example a file, at a remote location on the WWW. The COPY request in WebDAV duplicates a resource, for example a file. Details regarding the various requests that can be implemented in WebDAV can be found in the IETF RFC 4918 specification.
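As a rough illustration of the OPTIONS, MKCOL and GET requests described above, the following Python sketch runs a toy server and a client in one process. The handler class, the in-memory stores, the resource paths and the helper names are assumptions made for illustration only; the toy server answers just these three requests and is not a compliant RFC 4918 implementation.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory stores standing in for remote storage.
RESOURCES = {"/notes.txt": b"hello webdav"}
COLLECTIONS = {"/"}


class ToyDavHandler(BaseHTTPRequestHandler):
    """Toy handler: OPTIONS, MKCOL and GET only (illustrative, not compliant)."""

    def _empty(self, status, extra_headers=()):
        # Send a response with no body.
        self.send_response(status)
        for name, value in extra_headers:
            self.send_header(name, value)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_OPTIONS(self):
        # Advertise only the requests this toy server actually implements.
        self._empty(200, [("Allow", "OPTIONS, GET, MKCOL")])

    def do_MKCOL(self):
        if self.path in COLLECTIONS:
            self._empty(405)  # collection already exists
        else:
            COLLECTIONS.add(self.path)
            self._empty(201)  # Created

    def do_GET(self):
        body = RESOURCES.get(self.path)
        if body is None:
            self._empty(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def send(port, method, path):
    """Issue one request on a fresh connection; return (status, Allow, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Allow"), resp.read()
    finally:
        conn.close()


def run_toy_exchange():
    server = HTTPServer(("127.0.0.1", 0), ToyDavHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        _, allow, _ = send(port, "OPTIONS", "/")           # discover capabilities
        mkcol_status, _, _ = send(port, "MKCOL", "/docs")  # create a collection
        _, _, body = send(port, "GET", "/notes.txt")       # retrieve a resource
        return allow, mkcol_status, body
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_toy_exchange()` returns the advertised request list, the MKCOL status code and the bytes of the retrieved resource. A real client would inspect the OPTIONS response before deciding which optional requests to issue.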
On a POSIX-based file system, there are various techniques and methods available for reading and writing files. One of these methods is synchronous or asynchronous unbuffered direct I/O.
Synchronous I/O implies that if a program issues a POSIX request to read or write, control is returned to the program only once the request has completed, successfully or unsuccessfully, and not before. Asynchronous I/O implies that as soon as a program issues a POSIX request to read or write, control is returned to the program. The actual read or write operation is completed at a later time, and at that time the program that originally issued the request is alerted regarding the completion status of the operation.
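The difference between the two models can be sketched in Python as follows. This is only an emulation under stated assumptions: the asynchronous path uses a thread pool and a completion callback rather than an operating system asynchronous I/O facility, and the function names (`read_range`, `sync_read`, `async_read`, `demo_sync_async`) are hypothetical.

```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor


def read_range(path, offset, size):
    """Blocking read of up to `size` bytes starting at `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def sync_read(path, offset, size):
    # Synchronous model: control returns only after the read completes or fails.
    return read_range(path, offset, size)


def async_read(executor, path, offset, size, on_done):
    # Asynchronous model (emulated): control returns at once, and
    # on_done(error, data) is called later with the completion status.
    future = executor.submit(read_range, path, offset, size)

    def notify(done_future):
        error = done_future.exception()
        data = None if error else done_future.result()
        on_done(error, data)

    future.add_done_callback(notify)
    return future


def demo_sync_async():
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"0123456789")
        path = tmp.name
    try:
        sync_data = sync_read(path, 2, 4)

        finished = threading.Event()
        outcome = {}

        def on_done(error, data):
            outcome["error"] = error
            outcome["data"] = data
            finished.set()

        with ThreadPoolExecutor(max_workers=1) as pool:
            async_read(pool, path, 6, 4, on_done)
            # The caller is free to do other work here before waiting.
            finished.wait(timeout=5)
        return sync_data, outcome
    finally:
        os.unlink(path)
```

In the synchronous call the result is the return value; in the asynchronous call the result arrives through the callback, which is where the issuing program learns whether the operation succeeded.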
Unbuffered direct I/O implies that if a program issues a POSIX request to read or write, the operating system does not do anything besides reading the data from storage into memory or writing the data from memory to storage. It does not perform caching of any sort.
Synchronous or asynchronous unbuffered direct I/O has constraints imposed by file system implementations. Some file systems permit a single I/O request of this type to be no larger than a certain number of bytes. For example, in certain versions of the Microsoft® Windows® operating system, a single direct I/O read request can be no larger than approximately 64 megabytes. A read or write operation operates on a file starting from a specific offset and is of a specific size in bytes. Most file systems require that a single I/O request of this type start at an offset, and have a size, that are each an integer multiple of the storage device block size. For example, a storage device block size may be 512 bytes or 4096 bytes. Most operating systems also require that the computer memory allocated for such an operation begin and end at an operating system memory page boundary, for example a 4096-byte page boundary.
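The offset and size constraints above amount to simple arithmetic, sketched here in Python. The function names and the default block size are assumptions chosen for illustration; the sketch only computes aligned request ranges and does not itself perform direct I/O or allocate page-aligned memory.

```python
def align_request(offset, size, block_size=4096):
    """Widen a byte range so its offset and size are multiples of block_size.

    Returns (aligned_offset, aligned_size, skip), where `skip` is the number
    of leading bytes in the aligned range that precede the requested offset.
    """
    if offset < 0 or size < 0 or block_size <= 0:
        raise ValueError("offset and size must be >= 0 and block_size > 0")
    start = (offset // block_size) * block_size           # round down
    end = -(-(offset + size) // block_size) * block_size  # round up
    return start, end - start, offset - start


def split_request(start, length, max_bytes, block_size=4096):
    """Split an aligned range into pieces no larger than max_bytes."""
    if max_bytes <= 0 or max_bytes % block_size != 0:
        raise ValueError("max_bytes must be a positive multiple of block_size")
    if start % block_size != 0 or length % block_size != 0:
        raise ValueError("range must already be block-aligned")
    pieces = []
    position, end = start, start + length
    while position < end:
        chunk = min(max_bytes, end - position)
        pieces.append((position, chunk))
        position += chunk
    return pieces
```

For instance, a request for 3000 bytes at offset 5000 with 4096-byte blocks widens to one block at offset 4096, with the wanted data beginning 904 bytes into that block. Because `max_bytes` is required to be a multiple of the block size, every piece produced by `split_request` stays aligned.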
Popular WebDAV implementations generally utilize some form of buffered I/O. These implementations also utilize published or invented caching techniques to further improve performance. Caching methods are intended to improve aggregate performance based on the pattern of access.