With recent advances in network technologies such as Gigabit fiber optic networks and the proliferation of wireless technologies (for example Wireless Fidelity (WiFi) and Worldwide Interoperability for Microwave Access (WiMax)), data can be accessed in a much shorter time than ever before. As a result, thousands of megabytes of email messages, e-commerce transactions, multimedia files and other data can be generated and uploaded to a network in a single day. All of this data must be stored, putting unprecedented pressure on the storage industry to develop more efficient storage technology for managing and storing network data.
In response to these pressures, the storage industry has already moved away from the old Direct Attached Storage (DAS) architecture to a Network Attached Storage (NAS) architecture and/or a Storage Area Network (SAN) architecture for managing data. However, both the NAS and SAN architectures have well-known limitations, such as those discussed in the publication “Object-based storage: The next wave of storage technology and devices”, Intel White Paper, and the publication “Object-based storage”, Mike Mesnier et al., IEEE Communications Magazine, Vol 41, Pages 84 to 90, August 2003. The NAS architecture provides file sharing for heterogeneous network platforms by using a file server to handle all the metadata (data that describes data), but its throughput is limited by the file server. SAN architectures overcome some of the limitations of NAS architectures by providing direct access to the storage devices. However, SAN architectures may compromise security for better performance, and may also suffer from compatibility drawbacks when attempting file sharing between different platforms. As such, a next generation storage technology termed an Object-Based Storage System, as described in the two publications cited above, has been proposed to overcome the deficiencies in NAS and SAN.
The Object-Based Storage System combines the advantages of the SAN and NAS architectures by providing scalable, high-performance block-based access together with secure object sharing across networks of heterogeneous operating systems. Files are treated as objects and stored in object-based storage devices (OSDs). The OSD architecture treats storage neither as blocks nor as files, but as objects. For example, an object could be a single database record or table, or the entire database itself. An object may contain a file, or just a portion of a file. Like other general storage systems, the OSD has its own file system, an object-based storage device file system (OSDFS), that handles the storage of objects. A good file system not only provides high performance and high throughput for the storage system, but also maintains high utilization of the storage system.
Many object-based storage systems adopt a general-purpose file system for the OSD, for example the Second Extended File System (ext2) as disclosed in “Design and implementation of the second extended file system”, R. Card, T. Ts'o, and S. Tweedie, Proceedings of the First Dutch International Symposium on Linux, 1994, and the Third Extended File System (ext3) as disclosed in “Whitepaper: Red Hat's New Journaling File System: ext3”, Michael K. Johnson.
Ext2 is one of the two file systems that are included in the standard Linux kernel, the other being the First Extended File System (ext). Ext2 was designed and implemented to fix some problems present in ext. In addition to the standard Unix features, ext2 supports some extensions which are not usually present in Unix file systems. The ext3 file system is a set of incremental enhancements to the ext2 file system that provide other advantages.
However, the workloads encountered by OSDs are quite different from the workloads of a general-purpose file system. As such, the design of the object-based file system may be essential in improving the performance of the overall large-scale object-based storage system.
New designs and methods have been proposed to improve the performance of object-based file systems. In the publication “OBFS: A File System for Object-based Storage Devices”, Feng Wang et al., 21st IEEE/12th NASA Goddard Conference on Mass Storage Systems and Technologies (MSST2004), April 2004, an object-based file system (OBFS) was designed specifically to handle OSD workloads. The workloads were categorized into small and large objects. Based on this categorization, the OBFS stored the small objects in a small region consisting of a bitmap area and an onode table, where the metadata of each object is stored, and stored the large objects in a large region, utilizing embedded onodes to reduce the seek time of the hard disk. An onode includes the size of the object on disk, the object size, and an o_block array in which the locations of the data are stored. However, the OBFS described in this reference adopted a synchronous update scheme for writing small objects, so that each write involved a seek to the onode table. Reading data likewise required the hard disk to seek first to the onode table and then to the data area. Each data access therefore involved two seeks, resulting in relatively slow reading of data.
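The onode fields and the small/large categorization described above can be illustrated with a minimal sketch. It is not the OBFS implementation: the names `Onode`, `region_for` and `make_onode`, the 512 KB cutoff, the 4 KB block size and the contiguous block layout are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical values, chosen only for illustration; the text above does not state them.
LARGE_OBJECT_THRESHOLD = 512 * 1024  # cutoff in bytes between small and large objects
BLOCK_SIZE = 4096                    # block size in bytes


@dataclass
class Onode:
    """Per-object metadata with the three fields named in the text."""
    size_on_disk: int                 # space occupied on disk, in bytes
    object_size: int                  # logical size of the object, in bytes
    o_block: List[int] = field(default_factory=list)  # locations of the data blocks


def region_for(object_size: int) -> str:
    """Classify an object into the small or the large region by its size."""
    return "large" if object_size >= LARGE_OBJECT_THRESHOLD else "small"


def make_onode(object_size: int, first_block: int) -> Onode:
    """Build an onode for an object laid out contiguously from first_block (assumed layout)."""
    n_blocks = -(-object_size // BLOCK_SIZE)  # ceiling division
    return Onode(
        size_on_disk=n_blocks * BLOCK_SIZE,
        object_size=object_size,
        o_block=list(range(first_block, first_block + n_blocks)),
    )
```

In this sketch a 5000-byte object occupies two blocks, so its size on disk exceeds its object size, which is why an onode would record both values.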
The publication “Leveraging Intra-object Locality with EBOFS”, Sage A. Weil, University of California, Santa Cruz, describes an extent-based object file system (EBOFS) that uses extents as the allocation unit and B+ trees to maintain both an object free list and an object lookup table. To reduce the hard disk's seeking overhead, EBOFS groups the free extents into a series of buckets based on the free extent size. However, the grouping of extents in the free list is a design concern in EBOFS, since a poor grouping decision will degrade performance.
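The idea of grouping free extents into buckets by size can be sketched as follows. This is an illustration only and not the EBOFS design: the power-of-two bucket boundaries are an assumption, plain Python lists stand in for the B+ trees, and the names `FreeExtentBuckets` and `bucket_index` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Extent = Tuple[int, int]  # (start_block, length_in_blocks)


def bucket_index(length: int) -> int:
    """Assumed grouping rule: bucket k holds extents with 2**k <= length < 2**(k+1)."""
    return length.bit_length() - 1


class FreeExtentBuckets:
    """Free list grouped by extent size (lists stand in for B+ trees)."""

    def __init__(self) -> None:
        self.buckets: Dict[int, List[Extent]] = defaultdict(list)

    def add(self, start: int, length: int) -> None:
        """Record a free extent in the bucket that matches its length."""
        if length <= 0:
            raise ValueError("extent length must be positive")
        self.buckets[bucket_index(length)].append((start, length))

    def allocate(self, length: int) -> Optional[Extent]:
        """Take `length` blocks from the smallest bucket that can satisfy the request."""
        if length <= 0:
            raise ValueError("requested length must be positive")
        wanted = bucket_index(length)
        for k in sorted(self.buckets):
            if k < wanted:
                continue  # every extent in a lower bucket is too short
            for i, (start, free_len) in enumerate(self.buckets[k]):
                if free_len >= length:
                    del self.buckets[k][i]
                    if free_len > length:
                        # Return the unused tail to the bucket for its new size.
                        self.add(start + length, free_len - length)
                    return (start, length)
        return None  # no free extent is large enough
```

The choice of bucket boundaries determines how many extents must be scanned per request and how often extents are split, which is one way to see why a poor grouping decision degrades performance.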
Therefore, there is still a need for an alternative design for an object-based storage device file system, and for methods of allocating storage, searching data and optimizing performance in such a file system, so as to achieve high throughput and high disk utilization.