A computer system can generally be divided into four components: the hardware, the operating system, the application programs and the users. The hardware (central processing unit (CPU), memory and input/output (I/O) devices) provides the basic computing resources. The application programs (database systems, games, business programs, etc.) define the ways in which these resources are used to solve the computing problems of the users. The operating system controls and coordinates the use of the hardware among the various application programs for the various users. In doing so, one goal of the operating system is to make the computer system convenient to use. A secondary goal is to use the hardware in an efficient manner.
The Unix operating system is one example of an operating system that is currently used by many enterprise computer systems. Unix was designed to be a simple time-sharing system, with a hierarchical file system, which supported multiple processes. A process is the execution of a program and consists of a pattern of bytes that the CPU interprets as machine instructions (text), data and stack. A stack defines a set of hardware registers or a reserved amount of main memory that is used for arithmetic calculations.
Unix consists of two separable parts: the “kernel” and the “system programs.” System programs consist of system libraries, compilers, interpreters, shells and other such programs which provide useful functions to the user. The kernel is the central controlling program that provides basic system facilities. The Unix kernel creates and manages processes, provides functions to access file-systems, and supplies communications facilities.
The Unix kernel is the only part of Unix that a user cannot replace. The kernel also provides the file system, CPU scheduling, memory management and other operating-system functions by responding to “system-calls.” Conceptually, the kernel is situated between the hardware and the users. System calls are the means for the programmer to communicate with the kernel.
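The role of system calls as the programmer's interface to the kernel can be illustrated with a minimal sketch. It assumes a Unix-like host and uses the Python `os` module, whose `open`, `write`, `read` and `close` functions are thin wrappers around the system calls of the same names. The helper name `write_then_read` and the file name are hypothetical and chosen only for illustration.

```python
import os
import tempfile


def write_then_read(path: str, payload: bytes) -> bytes:
    """Write payload to path, then read it back, using system-call wrappers."""
    # os.open wraps the open() system call and returns a raw file descriptor.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)  # write() system call
    finally:
        os.close(fd)  # close() system call

    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, len(payload))  # read() system call
    finally:
        os.close(fd)


# Hypothetical usage: the kernel, not the program, touches the file system.
with tempfile.TemporaryDirectory() as tmp:
    example_path = os.path.join(tmp, "example.txt")
    result = write_then_read(example_path, b"hello kernel")
```

In this sketch the program never manipulates the file system directly; every access passes through the kernel by way of a system call.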
FIG. 1 is a block diagram illustration of a prior art computer system 100 having a processor 110, file-system 120, memory 130, operating system 140, kernel 150, applications 160 and I/O devices 170. The prior art system 100 shown in FIG. 1 employs an access mechanism in which one or more applications may be initiating access to the same file objects in file system 120.
The prior art system 100 implements a shared resource environment which allows such system resources as memory 130 and file-system 120 to be shared between application processes in applications 160. The kernel 150 includes interfaces that allow application processes to access virtual memory in memory 130.
In the system 100 shown in FIG. 1, the kernel interface is limited in its ability to support multiple and varied applications that access the file-system 120, despite the shared environment of system 100. This is because new file-system technologies have limited applicability due to the difficulty of adapting other parts of the system to accommodate them. For example, multiple page size support in the virtual memory system of system 100 is not supported by a UFS file-system or other file-systems. Similarly, on-line backups via snapshots cannot handle files that have locked pages.
The difficulties with current kernel interfaces are primarily due to their shared state. File-systems must implement knowledge about the virtual memory system in order to operate correctly. This knowledge is vague and dynamic with respect to a source base. The corresponding difficulty lies within the virtual memory system, in that it can only assume generic file-system behavior; a file-system typically has no way to indicate that it can utilize a new behavior.
As distributed systems become prevalent, it is important that the number of system downtimes be substantially reduced. CPU and network speeds, RAM and disk sizes will increase and so will access to these devices. The interface between the file-system 120 and memory 130 implementation therefore becomes important.
The file system 120 represents a logical grouping of files at a mount point in the computer system 100. It is represented to users as a top-level entry in the file system table. Within the file system 120 are file objects, which correspond to individual files or directories contained within the file system 120. A file object provides basic information about the object, such as name and parent, and what operations can be performed on it, such as moving, deleting and so on.
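The notion of a file object can be sketched as a small data structure. The following is an illustrative model only, not the structure used by any particular kernel: the class name `FileObject`, its attributes, and the methods `create`, `move` and `delete` are assumptions introduced to show how a name, a parent and a set of permitted operations might fit together.

```python
from typing import Dict, Optional


class FileObject:
    """Hypothetical file object: a name, a parent, and basic operations."""

    def __init__(self, name: str, is_directory: bool = False) -> None:
        self.name = name
        self.is_directory = is_directory
        self.parent: Optional["FileObject"] = None
        self.children: Dict[str, "FileObject"] = {}

    def create(self, name: str, is_directory: bool = False) -> "FileObject":
        """Create a child object inside this directory."""
        if not self.is_directory:
            raise NotADirectoryError(self.name)
        child = FileObject(name, is_directory)
        child.parent = self
        self.children[name] = child
        return child

    def path(self) -> str:
        """Build the full path by walking up to the mount point."""
        if self.parent is None:
            return self.name
        return self.parent.path().rstrip("/") + "/" + self.name

    def delete(self) -> None:
        """Detach this object from its parent directory."""
        if self.parent is None:
            raise ValueError("cannot delete the mount point")
        del self.parent.children[self.name]
        self.parent = None

    def move(self, new_parent: "FileObject") -> None:
        """Move this object under another directory."""
        if not new_parent.is_directory:
            raise NotADirectoryError(new_parent.name)
        self.delete()
        self.parent = new_parent
        new_parent.children[self.name] = self


# Hypothetical usage with an assumed mount point "/mnt".
root = FileObject("/mnt", is_directory=True)
docs = root.create("docs", is_directory=True)
report = docs.create("report.txt")
original_path = report.path()
report.move(root)
moved_path = report.path()
```

Here the mount point is the only object without a parent, which mirrors the description of the file system as a logical grouping of files rooted at a mount point.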
Multiple accesses to the same file objects or portions thereof in the file system 120 often result in a conflict situation where one or more sequential activities may attempt to access the same state of an object in contradictory ways. This may result in an invalid state of the object being accessed unless the various sequential activities which share access to the object implement a protocol to prevent an access sequence from being interleaved or occurring simultaneously. Such a prior art protocol is called a synchronization protocol, and objects sought by two or more such activities are often called critical resources. By obtaining access permission to a critical resource according to a synchronization protocol, a sequential activity is said to enter a critical section.
There are a number of well-known synchronization protocols or mechanisms. One of the most common is known as a “mutex” (mutual exclusion). A mutex is a data object which encapsulates a list of activities waiting for permission to access a critical resource. A synchronization protocol such as a mutex relies on monolithic locks, which restrict access to the entire set of items in the object. While such locks ease the implementation of an arbitrator of such items (for example, file system 120), they unnecessarily limit parallel access to such items.
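The idea of a mutex guarding a critical section can be illustrated with a minimal sketch. It uses the standard `threading.Lock` of Python as the mutex; the class `SharedCounter` and the function `run_workers` are hypothetical names, and the shared counter merely stands in for the state of a critical resource.

```python
import threading


class SharedCounter:
    """Hypothetical critical resource protected by a single mutex."""

    def __init__(self) -> None:
        self.value = 0
        self._mutex = threading.Lock()

    def increment(self) -> None:
        # Entering the "with" block acquires the mutex: the critical section.
        with self._mutex:
            current = self.value
            self.value = current + 1
        # Leaving the block releases the mutex for the next waiting activity.


def run_workers(workers: int, iterations: int) -> int:
    """Run several sequential activities that all update the same state."""
    counter = SharedCounter()

    def work() -> None:
        for _ in range(iterations):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


total = run_workers(workers=4, iterations=10000)
```

Because the read and the update of `value` occur inside the critical section, the accesses of different threads cannot interleave, and the final count equals the number of increments requested.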
Limiting parallel access results in lower object utilization than would otherwise be obtainable. An operation upon a set of items may be required to wait for a slow device to transfer data. During that waiting time, other items within the object in the file system 120 may be immediately accessible, but any accessors would be inhibited until the first operation is completed.
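The cost of a monolithic lock can be made concrete with a small sketch. It models one store whose items all share a single lock, and, purely for contrast, a second store with one lock per item. The per-item variant is an assumption introduced for comparison and is not described in the text above; all class and function names are hypothetical.

```python
import threading
from typing import Iterable


class MonolithicStore:
    """Every item is guarded by the same lock."""

    def __init__(self, names: Iterable[str]) -> None:
        self._names = set(names)
        self._lock = threading.Lock()

    def lock_for(self, name: str) -> threading.Lock:
        if name not in self._names:
            raise KeyError(name)
        return self._lock


class PerItemStore:
    """Each item has its own lock (shown only for contrast)."""

    def __init__(self, names: Iterable[str]) -> None:
        self._locks = {name: threading.Lock() for name in names}

    def lock_for(self, name: str) -> threading.Lock:
        return self._locks[name]


def other_item_accessible(store, busy: str, other: str) -> bool:
    """Report whether `other` can be accessed while `busy` is held.

    Holding the lock for `busy` stands in for an operation that is
    waiting on a slow device to transfer data.
    """
    with store.lock_for(busy):
        lock = store.lock_for(other)
        acquired = lock.acquire(blocking=False)  # try without waiting
        if acquired:
            lock.release()
        return acquired


monolithic_result = other_item_accessible(MonolithicStore(["a", "b"]), "a", "b")
per_item_result = other_item_accessible(PerItemStore(["a", "b"]), "a", "b")
```

Under the monolithic scheme the accessor of item `b` is refused even though `b` itself is idle, which is the inhibited access described above; under the per-item scheme the same request succeeds.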
Having a monolithic locking mechanism is inefficient, as it blindly serializes all accesses to file objects whether such accesses are reads or writes. Serializing access to critical resources, as implemented in the exclusionary locking mechanism of the prior art, also creates bottlenecks in the file system when several access requests are made to one or two objects. This unnecessarily leads to system degradation. Thus, it is desirable to provide controlled access to file objects which could reduce the overhead incurred by the inefficient serialization of current access protocols.