The present invention relates to accessing objects in an object store, and more specifically, this invention relates to accelerated access to objects that have been stored in an object store which utilizes a file storage system for its implementation.
Object store is a storage technique that is useful in many different storage systems, including cloud storage, where object store is the most prominent type of storage used (e.g., the object-based storage solution by IBM Corp. (SoftLayer), Azure by Microsoft Corp., S3 by Amazon.com, Inc., etc.). Object storage is mainly intended to handle the exponential growth of unstructured data. Unlike traditional storage of files in network-attached storage (NAS) or blocks in a storage area network (SAN), object store uses data objects. Each object is assigned a unique object identifier (ID), and each object contains its own metadata (part of which is user-defined metadata, such as tags associated with the object) along with the actual data, thereby removing the need for centralized indexing. The user-defined metadata (or tags) is a strong differentiator of object store-based storage techniques and is used extensively throughout object store. Object store thus enables massive scalability and geographic independence of storage locations, while maintaining reasonable costs.
The important elements of an object store are “objects” and “containers.” The objects and containers in the object store are identified with an object ID. The object ID is a universally unique ID (UUID) given to a particular object or container in a particular object store. The key purpose of the object ID is to allow for easy retrieval of objects and containers.
Depending on the particular implementation of object store, the object ID may be referred to as an object name, an object key, or by some other name, but it will be referred to as an object ID throughout. Accessing an object or container is made easy: only the object ID needs to be provided when requesting the object or container. This hides the implementation details from the end user and simplifies management of the objects. Most implementations of object store use a file system to store the objects and containers. In file system semantics, a full path is provided to access a particular file or directory. Consequently, depending on their location in the file system, the objects and/or containers will have different path names. This is easily observed when an object's association is moved from one container to another: the same object gets a different path, which makes path-based management difficult. Therefore, using an object ID is the preferred mechanism for accessing the objects and containers.
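The contrast between a stable object ID and a location-dependent path name can be illustrated with a minimal Python sketch. The root directory, container names, and the helper `file_path` are hypothetical and chosen only for illustration; they are not taken from any particular object store implementation.

```python
import uuid
from pathlib import PurePosixPath

# Hypothetical root directory of the backing file system (an assumption).
STORE_ROOT = PurePosixPath("/srv/objectstore")

def file_path(container: str, name: str) -> PurePosixPath:
    """Return the path of an object file inside its container directory."""
    return STORE_ROOT / container / name

# The object ID is assigned once and does not depend on location.
object_id = uuid.uuid4()

# The path name, by contrast, changes when the object is associated
# with a different container.
path_before = file_path("photos", "cat.jpg")
path_after = file_path("archive", "cat.jpg")
```

In this sketch a client holding only `object_id` is unaffected by the move, whereas a client holding `path_before` would need to be told about `path_after`.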
One of the biggest needs in cloud-based object store is to improve overall performance, which calls for optimizing the object store system. One way to do so is to optimize the object retrieval process. In particular, in an object store implementation such as one based on the Cloud Data Management Interface (CDMI), OpenStack Swift, etc., the objects are stored on a file system, with each object mapping to a file and each container mapping to a directory. Inside the file system, there are no objects, and everything is accessed and operated on as a file. To the outside world, objects are allowed, and preferred, to be referred to by object IDs. These object IDs are unique, and both objects and containers may be accessed with an object ID.
When an object or container is accessed using an object ID, the object store server (which responds to the requests) needs a mapping from the object ID to the file path name; the path name may then be used for further processing. Given the huge number of objects stored in a cloud environment, mapping from object ID to file path name becomes time consuming and is one of the constraining factors for object store performance related to retrieval of data.
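The server-side resolution step described above can be sketched as follows. This is a simplified illustration, not the behavior of any specific object store: the functions `put_object` and `get_object` are hypothetical, the mapping is a plain Python dictionary, and a temporary directory stands in for the backing file system.

```python
import tempfile
import uuid
from pathlib import Path

def put_object(root, mapping, container, name, data):
    """Store data as a file in the container directory and return a new object ID."""
    path = Path(root) / container / name
    path.parent.mkdir(parents=True, exist_ok=True)  # container maps to a directory
    path.write_bytes(data)                          # object maps to a file
    object_id = str(uuid.uuid4())
    mapping[object_id] = path                       # record object ID -> path name
    return object_id

def get_object(mapping, object_id):
    """Resolve the object ID to a path name, then read the file at that path."""
    path = mapping[object_id]  # the lookup step whose cost grows with scale
    return path.read_bytes()

with tempfile.TemporaryDirectory() as root:
    id_to_path = {}
    oid = put_object(root, id_to_path, "photos", "cat.jpg", b"example bytes")
    data = get_object(id_to_path, oid)
```

Every request passes through the lookup in `get_object` before any file access occurs, which is why the cost of that lookup matters for retrieval performance.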
A common approach to this issue is to maintain an in-memory copy of this mapping. When a very large number of such records is stored, this becomes impractical because of the memory consumed. Another mechanism is to create a database that holds this mapping information, with the object ID as the primary key. This second method uses less memory but is more computationally intensive: the huge list of object IDs may need to be searched in order to find the one object ID in the request and then retrieve the associated file name.
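The database-backed variant can be sketched with Python's standard `sqlite3` module. The table name `id_map`, its columns, and the helpers `register` and `resolve` are assumptions made for illustration; an in-memory SQLite database is used here only to keep the example self-contained, whereas the approach described above would keep the table on disk.

```python
import sqlite3
import uuid

# ":memory:" keeps the sketch self-contained; an on-disk file would be used in practice.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE id_map (object_id TEXT PRIMARY KEY, file_path TEXT NOT NULL)"
)

def register(object_id, file_path):
    """Record the path name under which an object ID is stored."""
    conn.execute("INSERT INTO id_map VALUES (?, ?)", (object_id, file_path))

def resolve(object_id):
    """Search the table for the object ID and return its path name, or None."""
    row = conn.execute(
        "SELECT file_path FROM id_map WHERE object_id = ?", (object_id,)
    ).fetchone()
    return row[0] if row else None

oid = str(uuid.uuid4())
register(oid, "/srv/objectstore/photos/cat.jpg")
found = resolve(oid)
missing = resolve(str(uuid.uuid4()))  # an ID that was never registered
```

Each call to `resolve` issues a query against the full set of stored object IDs, which illustrates the per-request search cost that this approach trades for lower memory use.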
Hence, smarter object ID generation and mapping would be beneficial to help optimize the system with regard to performance and memory.