The invention relates to a server with an adaptable and configurable file system.
The ever-increasing capability of computers to store and manage information has made them increasingly indispensable in modern businesses. The popularity of these machines has led in turn to the widespread sharing and communication of data such as electronic mail and documents over one or more computer networks, including local area networks and wide area networks such as the Internet. To support the sharing of data, client-server architectures which support "enterprise" computing typically provide one or more servers which communicate with a number of personal computers, workstations, and other devices such as mass storage subsystems, network printers and interfaces to the public telephony system over the computer networks. The users perform processing in connection with data and programs that may be stored in the network mass storage subsystems through the network attached personal computers and workstations. In such an arrangement, the personal computers/workstations, operating as clients, download the data and programs from the network mass storage subsystems for processing and upload the resulting data to the network mass storage subsystems for storage.
In the server, a file system such as the Unix file system provides services for managing the space of storage media. It provides a logical framework to the users of a computer system for accessing data stored in the storage media. The logical framework usually includes a hierarchy of directory structures to locate a collection of files that contain user-named programs or data. The use of directories and files relieves the users of the need to find the actual physical locations of the stored information in a storage medium. The logical framework may be stored as "metadata," or control information for the file, such as file size and type and pointers to the actual data.
The file system dynamically constructs various data structures in the server's memory, as well as others that are stored with the file system itself on the storage device such as in the memory of attached personal computers and workstations. Typically, the required data structures are loaded from the disk storage device into a memory buffer when the file system is first accessed (mount time). These structures may be dynamically modified in the memory buffer. When the last access to a file system is made (unmount time), all related data structures remaining in the memory buffer are flushed to the various data storage devices.
The access speed of data in the server depends not only on the access methodology, but also on the data flow in the server. Thus, the way data is physically written to or read from disk, the layout of the file system, the size of the caches deployed, the way the pointers to the data blocks are stored, the flush rate of the caches, and the file system paging algorithm affect the efficiency of the server in servicing requests directed at it. If the performance of the server becomes unacceptable, the performance may be improved by changing one or more of the above server parameters. However, conventional systems which attempt to automatically optimize the server parameters do not have a global view of the application and thus may make local optimizations without any knowledge about the environment or the application.
One factor affecting the system performance is the size of the cache. With a limited cache memory, a multitude of requests over a variety of data segments can easily exhaust the capability of the disk cache system to retain the desirable data in the cache memory. Often, data that may be reused in the near future is flushed prematurely to make room in the cache memory for handling new requests, leading to an increase in the number of disk accesses to fill the cache. The increase in disk activity, also known as thrashing, institutes a self-defeating cycle in which feeding the cache with data previously flushed has a disproportionate impact on the disk drive utilization. A related factor affecting the hit rate is the cache memory block size allocation. An allocation of a relatively large block of memory reduces the quantity of individually allocatable memory blocks. In systems having multiple concurrent tasks and processes that require access to a large number of data files, a reduction in the number of individually allocatable blocks increases the rate of cache block depletion, once more leading to thrashing which decreases the overall disk system throughput. Although additional memory can be added to the disk cache to alleviate the above-mentioned problems, an upper limit exists as to the size of the disk cache that is cost effective.
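The thrashing effect described above can be illustrated with a small simulation. The following is a minimal sketch, assuming a least-recently-used (LRU) replacement policy, which the text does not specify; the function name `lru_hit_rate` and the access pattern are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, capacity):
    """Return the fraction of accesses served from an LRU cache.

    LRU is an assumed replacement policy; the text does not name one.
    """
    cache = OrderedDict()  # keys are block numbers, oldest entry first
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return hits / len(accesses) if accesses else 0.0


# A working set of 5 blocks scanned cyclically 20 times.
scan = list(range(5)) * 20
fits = lru_hit_rate(scan, capacity=5)    # cache holds the whole working set
thrash = lru_hit_rate(scan, capacity=4)  # cache is one block too small
print(fits, thrash)
```

In this sketch, a cache that is only one block smaller than the working set evicts each block just before it is needed again, so every access goes to disk.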
Another factor affecting the performance of the disk subsystem is the read-ahead policy for prefetching data associated with requests. Prefetching enhances performance when sequential data requests are encountered. However, in the event that the data access occurs in a random manner, the prefetching policy may be ineffective as data brought into the cache is not likely to be used again soon. Additionally, the prefetching policy may cause a bottleneck on the disk data path, as each attempt to prefetch unneeded data consumes valuable data transfer bandwidth in the server. Thus, an automatic prefetch of data in a system with a large percentage of random I/O operations may degrade the overall system performance.
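The cost of prefetching under random access can be made concrete with a short sketch. It assumes a fixed read-ahead window and an unbounded cache, neither of which is stated in the text; the function name `readahead_waste` and the request patterns are hypothetical.

```python
def readahead_waste(requests, window):
    """Return (blocks_fetched, blocks_used) for a fixed read-ahead policy.

    On each miss, the requested block and the next `window` blocks are
    fetched. The cache is unbounded, which is a simplifying assumption.
    """
    cached = set()
    used = set()
    fetched = 0
    for block in requests:
        if block not in cached:
            # Miss: transfer the requested block plus the read-ahead window.
            new_blocks = set(range(block, block + window + 1)) - cached
            cached |= new_blocks
            fetched += len(new_blocks)
        used.add(block)
    return fetched, len(used)


sequential = list(range(64))               # blocks 0..63 in order
scattered = [i * 1000 for i in range(64)]  # widely separated blocks

print(readahead_waste(sequential, window=7))
print(readahead_waste(scattered, window=7))
```

For the sequential pattern every prefetched block is later used, while for the scattered pattern each request transfers eight blocks and uses only one, which is the bandwidth loss the paragraph above describes.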
During operation, the server must be capable of concurrently retrieving different data files for different clients, regardless of whether the files are large or small, contain actual data or metadata, or are continuous or non-continuous. However, most applications request data in patterns that are quite predictable. For example, in seismic, weather prediction, or multimedia applications, the data is typically voluminous and is typically not needed again immediately afterward. Since the data is typically used only once, caching it often provides little benefit. In another application, serving Web pages, each Web page is infrequently updated, the data storage size of the Web page is typically small, and the number of accesses or hits for popular Web sites is typically high. During operation, conventional file systems typically bring pages associated with the accessed Web site into memory and serve the Web page(s) associated with the Web site. However, the memory containing the page(s) may be flushed relatively soon to make space for page(s) associated with another Web site. On the next access of the original Web site, the pages need to be reloaded. In these cases, the automatic optimization may be suboptimal or unnecessary, leading to inefficiencies in such systems.
The access speed of data in servers with Network Attached Storage (NAS) systems depends not only on the network access methodology, but also on the data flow within the server. Thus, the way the data is physically written to or read from the disk, the layout of the file systems, and the paging characteristics of the file system affect system performance. Many file systems, such as the Unix File System (UFS), the Write Anywhere File Layout (WAFL), and the Lazy Write File System (LWFS), may optimize performance using techniques such as pre-allocation of blocks in the case of sequential writes, delayed block allocation in the case of random access, and queuing of disk blocks within streams, among others. However, these systems make certain assumptions about the way the user data is characterized, classify data as sequential, random or metadata, and process data requests in accordance with those assumptions.
The present invention provides a file system which can be adapted to the characteristics of the access and storage methodology of the user's data. The user can tune the operation of the file system as well as obtain intelligent information from the file system on the characteristics of the data. The user is given options in the kernel (which require a system reboot) and options at mount time to select the way the file system should behave while handling various data sets.
In one aspect, an apparatus and a method manage data stored on one or more data storage devices using an adaptive file system by characterizing the data on the data storage devices managed by the file system; and tuning the file system by selecting one or more options to configure a kernel during boot-up and an operating system during mount time.
Implementations of the invention include one or more of the following. One of the options may optimize the file system for sequential read/write operations by disabling caching of the data; and performing read/write operations directly to the data storage device. Blocks of data may be pre-allocated. One of the options may optimize the file system for large file random read operations by determining an average block size of the large file; and reading ahead blocks of data based on the determined average block size. One of the options may optimize the file system for large file random write operations by writing data directly to the data storage device. A page to be overwritten by the large file random write operation may be buffered. One of the options may optimize the file system for small file random read/write operations by performing a delayed read/write operation. Yet another option may optimize the file system for accessing metadata by generating a search parameter; and performing a search in accordance with the search parameter. The search parameter may compare either left-most or right-most characters of a file name. Another option may optimize the file system for sequential read operations by determining all files in a directory; and prefetching each file in the directory into a buffer.
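Two of the options above lend themselves to a short illustration: sizing read-ahead from an average block size, and a metadata search that compares the left-most or right-most characters of a file name. The following is a minimal sketch and not the claimed implementation. The function names, the 4096-byte file system block size, and the choice to average observed request sizes and round up to whole blocks are all assumptions made for illustration.

```python
from math import ceil


def average_readahead(read_sizes, fs_block=4096):
    """Return a read-ahead size in bytes derived from observed read sizes.

    Assumption: the "average block size" is approximated by the mean of
    observed request sizes, rounded up to a whole number of fs blocks.
    """
    if not read_sizes:
        return fs_block  # no history yet: fall back to a single block
    average = sum(read_sizes) / len(read_sizes)
    return ceil(average / fs_block) * fs_block


def name_matches(file_name, key, side="left"):
    """Compare `key` with the left-most or right-most characters of a name."""
    if side == "left":
        return file_name.startswith(key)
    if side == "right":
        return file_name.endswith(key)
    raise ValueError("side must be 'left' or 'right'")


print(average_readahead([8192, 16384, 12000]))
print(name_matches("report.dat", "rep", side="left"))
print(name_matches("report.dat", ".dat", side="right"))
```

Comparing only one end of the name lets a search parameter select files by a shared prefix or by a shared extension without examining the whole name.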
In another aspect, a computer system includes an interconnect bus; one or more processors coupled to the interconnect bus and adapted to be configured for server specific functionalities including network processing, file processing, storage processing and application processing; a configuration processor coupled to the interconnect bus and to the processors, the configuration processor dynamically assigning processor functionalities upon request; one or more data storage devices coupled to the processors and managed by a file system; means for characterizing the data on the data storage devices managed by the file system; and means for tuning the file system by selecting one or more options in an operating system kernel and a mount table.
Advantages of the system include the following. The server can be tuned for specific applications. The tuning process is simple, and only requires the user to select from a list of options as to the characterization of the processing load. Alternatively, the data may be characterized by automatically gathering and analyzing application data. The data in the file system can be sorted or retrieved depending on the characteristics of the data to get high performance without any overhead.
The file system can be configured from a host processor, which provides a single point of administration for system utilities and tools, including monitoring and tuning software. Since these activities are independent of file input/output operations, network file system (NFS) requests are serviced simultaneously with no performance degradation. This allows systems administrators to complete system management functions such as file backup and restore when convenient during normal system operation instead of during off hours.
The resulting server is powerful, scalable and reliable enough to allow users to consolidate their data for different applications onto one high performance system instead of scores of smaller, less reliable systems. This consolidation of data resources onto a powerful server brings a number of advantages to the client-server environment. The consolidation of data reduces the need to replicate data and to manage the consistency of the replicated data. Data is available more quickly and reliably than in a conventional client-server architecture.
Other features and advantages will be apparent from the following description and the claims.