Most end-user applications consist of three basic components: a user interface, a computational function, and information storage. In the early evolution of computer systems, the computing model most frequently employed was that of the non-distributed application, wherein these components were typically integrated in one system so as to be indistinguishable.
As the art developed, it became evident that numerous benefits could be obtained from distributed computing models wherein applications might be distributed across computer systems, the simplest approach being to break the application down into the hereinbefore noted components. It became apparent that a user interface, for example, might be remote from the computational function, and, in like manner, information storage might be remote, as in a distributed file system, and that even the routines which process the information in the files could be remote from the functions which manage the physical storage media. It was further realized as the art developed, that it may be desirable to even distribute portions of a component itself among multiple machines, the foregoing being aspects of what is known as distributed computing models.
Thus, high-powered individual computers and connectivity solutions (such as LANs and WANs) are now dramatically changing the way computers process information. The previously described isolated, dedicated, early single-user systems are no longer the norm. Today, users expect to reach beyond their desktop computers to exploit greater features, functionality, and performance. Distributed computing solutions have existed since the mid-1980s as, for example, in the Open Network Computing (ONC) and Network Computing System (NCS) systems from Sun Microsystems and Hewlett-Packard, respectively. Despite their availability, distributed applications were a relatively scarce commodity because of the difficulty of programming them, with existing tools being only piecemeal solutions.
The distributed computing revolution challenged the computer industry to provide interoperability among components in heterogeneous, networked environments. Interoperability required more than mere connectivity to a network in order to allow applications to exploit the potential resources in the networked environment and to provide dramatically improved performance. Developers, in exploiting this environment, required a complete set of tools within an architectural framework to ease development. As a result, a distributed computing environment arose to meet these needs in the form of a comprehensive integrated set of services known as the Distributed Computing Environment (DCE) by the Open Software Foundation (OSF), which works across multiple systems while remaining independent of any single system.
Numerous characteristics serve to define the DCE model. Among these are that DCE services are protected from a single point of failure by copying the services and important files to additional hosts in the network. Though many interdependencies exist among DCE components, decentralized control allows management of each component independently from the others, whereby independently manageable sub-environments called cells are provided. Such decentralization minimizes bottlenecks, in that workloads are distributed among multiple hosts. The DCE system further provides flexibility in the decentralization, permitting changing, adding, or reconfiguring hardware and software without impacting the surrounding environment. Although closely integrated, DCE's modular structure permits tailoring DCE configurations by installing DCE servers on computers with appropriate resources. DCE further harnesses latent computing power sitting idle in machines on the network by means of remote procedure calls (RPC) and DCE threads. Moreover, by the aforementioned technique of locating critical application servers on multiple hosts and replicating important files onto other systems, critical work may be kept in process while some systems fail, thereby providing increased availability.
Still further benefits to the DCE environment addressed limitations on local data storage by providing a distributed file service (DFS) which provides a single view of all files in an organization both to UNIX and non-UNIX systems to all users. This important DFS aspect of DCE will be hereinafter described in greater detail. The DCE environment further provides, by means of such DFS, a database storing the location of all files in the file service. When files move, the DFS automatically updates the database of the new location. Similarly, distributed application clients may use the DCE directory service to locate associated servers, thereby providing services which track data and programs. Moreover, the aforementioned DCE RPC hides differences in data requirements by converting data to appropriate forms needed by clients and servers, thereby accommodating heterogeneous data. These and other benefits to the DCE systems and, in particular the OSF implementation, gave rise to an increased interest in adoption of such systems with a concomitant need for further developments of the components thereof, such as the aforementioned DFS to be described in greater detail. Further background on distributed computing and DCE systems may be found in "Understanding DCE", Rosenberry, Kenney, and Fisher, by O'Reilly and Associates, Inc., Publishers, copyright 1992 and "OSF's Distributed Computing Environment", by Ram Kumar, AIXpert, copyright IBM Corporation, fall, 1991.
An overall high-level view of the DCE architecture may be seen with reference to FIG. 2. It is layered between the operating system 10 and applications 12. Within DCE, three infrastructure components permeate all other components, namely the aforementioned remote procedure call (RPC) and presentation services 14 (which allow developers to program for a distributed environment as easily as a stand-alone system), security service 16 (ensuring against unauthorized access), and management services 18 (providing utilities to manage DCE services).
Additional components of DCE depicted in FIG. 2 are as follows. A directory service 20 provides a way for users to name and locate objects, and is centrally positioned as a keystone of the architecture. It gives distributed system users a well known, central repository in which to store information which may be retrieved from anywhere in the distributed system. A time service 22 provides a consistent view of time in the distributed environment. Other fundamental services, 24, act as a place holder for future services. The distributed file services (DFS) to be hereinafter described further, 26, provides a consistent unified view of all files in the distributed system. A diskless support service, 28, extends DCE to low-cost, diskless nodes. Other distributed services 30, will provide services likely to be offered in the future including spooling services, transaction services, and object-oriented environments.
To build distributed applications, developers need an easy-to-use programming model, such as the remote procedure call (RPC) 14, to take advantage of the distributed computing architectures. RPC allows developers to partition various tasks required by an application into separate procedure modules which may be executed on different systems. This offers benefits of an easy-to-use programming model, balanced, distributed use of computing resources, and the ability to run applications across diverse software and hardware platforms as previously described.
Still referring to FIG. 2, threads 32 allow multiple sequential flows of execution within a single process. Such threads provide a simple concurrency paradigm maintaining a synchronous model inside each thread while ensuring that synchronous events take place. In a client-server environment they allow servers to handle multiple clients simultaneously, and further allow clients to make multiple requests simultaneously, thereby providing better service availability. More detail regarding the RPC model 14 and its interaction with directory services 20 may be obtained in the aforementioned reference by Kumar.
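The manner in which threads permit a server to handle multiple clients simultaneously, while each thread retains a simple synchronous flow, may be illustrated by the following minimal sketch. It employs Python's standard thread pool rather than DCE threads, and the names handle_request and serve are hypothetical and are introduced solely for purposes of illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_request(request: str) -> str:
    """Process one client request; the code inside each thread is plain synchronous code."""
    # Illustrative stand-in for real server work (an assumption, not DCE behavior).
    return request.upper()


def serve(requests, workers: int = 4):
    """Handle several client requests concurrently within a single process."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() runs the handlers on separate threads and returns the
        # replies in the same order as the incoming requests.
        return list(pool.map(handle_request, requests))


if __name__ == "__main__":
    print(serve(["open", "read", "close"]))
```

Each request is processed by an ordinary sequential function, and the concurrency arises solely from running several such flows at once.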
It will be recalled that the DCE is thus a layer of software which masks differences between various kinds of hosts. Referring to FIG. 3, this layering may perhaps be more readily comprehended than in FIG. 2. The DCE layer 34 will be seen as sitting on top of the host operating system and networking services 10, and offers its services to applications 42 thereabove. From the conceptual model of DCE in FIG. 3, the relationship may be seen between the DCE distributed services (security 16, directory 40, and time 22) and the RPC 14 and thread services 32. RPC and threads are base DCE services available on systems on which DCE is installed. From the layered appearance of FIG. 3, it will be more readily appreciated that the DCE file service 26 is itself similar to an application which employs the underlying DCE services for distribution. It will also be appreciated that the directory 40 actually includes a cell directory service (CDS) component 36 and an X.500 global directory service (GDS) 38, which programs utilize by calling the X.500 directory service (XDS) application programming interface 40.
Continuing with FIG. 3, it further illustrates how an application 42 may utilize DCE APIs. A distributed application does not require use of all DCE APIs but rather only utilizes those which it requires. Thus, in the illustration of FIG. 3, the application 42 might only require the RPC 14, security 16, XDS 40, and operating system APIs.
One important aspect of DCE relating directly to the present invention is the distributed file system (DFS) which will be described now in greater detail. An overview of DFS may be obtained in "An Overview of the OSF Distributed File System" by Lebovitz, AIXpert, February, 1992, copyright IBM and in the previous cite to the article by Kumar, incorporated herein by reference.
A distributed file system is an application allowing a user on one computer to easily access files on another. To the user, the distributed file system appears as a large, local file system.
Such distributed file systems have been in use for at least 20 years, with early development occurring at the Palo Alto Research Center (PARC) of the Xerox Corporation, wherein distributed file systems for LANs were experimented with in the early 1970's.
Most distributed file systems are similar to the original Xerox system. Files stored on a workstation disk are local files, and those stored on the central file system are referred to as remote files. When a workstation user tries to access a remote file, a message requesting information is sent through the network to the central file system. When the central file system receives the message, it obtains the requested information from its disk drive and sends it back to the workstation in another message. A user modifies a remote file in a similar way. Modified information is sent to the central file system in a network message. When the file system receives the message, it writes the modified information to its local disk.
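The message exchange described in the preceding paragraph may be sketched as follows. This is a minimal illustration only, in which the network is replaced by a direct function call; the class names, the message layout, and the messages_sent counter are hypothetical and do not correspond to any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class CentralFileSystem:
    """Stands in for the central file system and its local disk."""
    disk: dict = field(default_factory=dict)

    def handle(self, message: dict) -> dict:
        # Each request arrives as a network message (here, a plain dict).
        if message["op"] == "read":
            return {"status": "ok", "data": self.disk.get(message["path"], b"")}
        if message["op"] == "write":
            self.disk[message["path"]] = message["data"]
            return {"status": "ok"}
        return {"status": "error", "reason": "unknown operation"}


@dataclass
class Workstation:
    """A client whose every remote access costs one round trip to the server."""
    server: CentralFileSystem
    messages_sent: int = 0

    def _send(self, message: dict) -> dict:
        self.messages_sent += 1
        return self.server.handle(message)

    def read_remote(self, path: str) -> bytes:
        return self._send({"op": "read", "path": path})["data"]

    def write_remote(self, path: str, data: bytes) -> None:
        self._send({"op": "write", "path": path, "data": data})


if __name__ == "__main__":
    ws = Workstation(CentralFileSystem())
    ws.write_remote("/reports/q1", b"draft")
    print(ws.read_remote("/reports/q1"), ws.messages_sent)
```

Because nothing is retained on the workstation, repeated reads of the same file each generate a further message to the central file system.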
A major drawback of such distributed file systems is the amount of resources they utilize. They require more network and remote file system resources than local resources, creating two problems. First, performance on the local computer is only as good as the performance of the central file system; for example, increasing the power of the desktop computer does not necessarily increase overall system performance. Secondly, such a system cannot grow gracefully to an enterprise-wide system.
There are numerous more modern remote file systems presently in use which include the network file system (NFS) from Sun Microsystems, various distributed PC network file systems, notably from Apple, Banyan, Microsoft, and Novell, and various vendor-specific systems such as DECnet from Digital Equipment Corporation and Domain from Hewlett-Packard Corporation.
The evolution of computer technology has vastly increased the power of individual workstations and PCs. However, computers have evolved more quickly than distributed file systems which do not fully exploit the wealth of resources available. Several drawbacks accordingly are present in modern distributed file systems. These include lack of consistent and uniform file naming space, inconvenient multiple security systems in need of an integrated approach, data inconsistencies, lack of performance and scalability, weaknesses in system management and administration, incompatibility with file system standards, and lack of support for wide area networks.
As part of the hereinbefore described OSF's DCE, the distributed file system (DFS) component thereof took a new approach to building an enterprise-wide distributed file system. It employs modern high performance workstations and PCs, and exploits performance-enhancing techniques like caching and replication, thereby distinguishing DFS from earlier file systems. As but one example, by integrating a distributed file service 26 with the DCE directory services 20, DFS thereby allowed users to access files in a consistent manner from different workstations in a distributed computing environment. These directory services thereby ensured that the system utilized a uniform naming convention for all files stored in DFS. As previously noted, DCE directory services are based upon industry standards, ensuring that every computer resource in the world may be identified and accessed with a unique and consistent name, such resources including computers, application services, and files. Referring to FIG. 4, there is depicted an illustration of how a file located somewhere in a worldwide DCE file system may readily be located using the aforementioned global directory service (GDS).
Continuing with the background description of DFS, further information regarding DFS may be obtained from "The Distributed File System (DFS) for AIX/6000", document #GG24-4255-00, IBM Corporation, copyright May, 1994. As previously noted, DFS technology provides the ability to access and store data at remote sites, similar to the technique used with NFS. It extends the view of a local file system, which is therefore limited in size, to a distributed file system of almost unlimited size located on several remote systems. Several advantages touched on previously are thereby provided over a centralized system, including providing access to files from anywhere in the world (FIG. 4), higher availability through replication, and providing users on systems the ability to access data from a nearly unlimited data space. As such, DFS is considered an essential part of DCE, and was basically an enhanced version of the Andrew File System (AFS) technology marketed by the Transarc Corporation and recently integrated into the base DCE technology.
Turning to FIG. 5, the distributed file system from DCE DFS is a collection of several file systems illustrated at 44-48 located on distributed systems. All such file systems are mounted into a single virtual file system space with a single namespace. The end-user thereby has direct access to all files in this distributed file system without knowing where the physical files reside. Still referring to FIG. 5, it will further be noted that a hierarchical structure is provided consisting of directories and files as is known from other Unix systems. The root of the DFS file structure is a junction 50 in the DCE naming space, and the multiple file sets 44-48 may reside on different servers and may be mounted into the DFS namespace.
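The manner in which several file sets on different servers may appear as a single namespace can be sketched as follows. This is a simplified illustration which assumes a longest-prefix match of mount points; the mount points, the server names, and the function resolve are hypothetical and are not taken from DFS itself.

```python
def resolve(mounts: dict, path: str):
    """Return (server, path relative to the mount point) for the longest matching mount point."""
    best = None
    for mount_point in mounts:
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the most specific (longest) mount point.
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        raise FileNotFoundError(path)
    return mounts[best], path[len(best):].lstrip("/")


if __name__ == "__main__":
    # Hypothetical mount table: each file set is mounted at one point in the tree.
    mounts = {
        "/": "server-a",
        "/users": "server-b",
        "/users/alice": "server-c",
    }
    print(resolve(mounts, "/users/alice/notes.txt"))
    print(resolve(mounts, "/users/bob/todo.txt"))
```

The end-user supplies only the path; which server holds the file set is determined by the lookup and never appears in the name.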
Concerning overall DFS operation, DFS is built upon the concept of a client-server architecture. Turning to FIG. 6, a plurality of clients 52 and servers 54 are shown. The server provides data and the client uses the data. Communication between the server and client is handled with the previously described DCE remote procedure calls (RPC) 14. Systems in a DFS environment which own file systems export them and users on other systems access such file systems. Such file exporting machines are called DFS file server systems (e.g. DFS servers 54) and the importing machines are called DFS client systems (DFS clients 52). A machine can be both a server and client. FIG. 6 further shows the client/server nature of DFS.
Each DFS server system runs a corresponding file set exporter 54A which makes file systems 56 available to the DFS file space. DCE cells may have one or more DFS file servers, with two being shown in FIG. 6. As noted previously, a system may function as both a DFS client and DFS server. Each DFS client runs one or more client applications 58 which in turn may access cache on its respective client.
Turning now to FIG. 7, DFS clients run the cache manager 62, which caches data from the file exporter 54A in memory or on a local disk, such cache manager providing the important function of improving performance and availability. DFS server systems desiring to export file sets must register their export file sets at a system known as the file set location server 64. The file set location server maintains a database 66 of all file sets. This database is utilized to keep track of the physical locations where file sets are stored. If a DFS client 52 requires access to one of the file sets, it sends a request first to the file set location server/database 64, 66 to inquire about the physical location of the file set. After receiving location information, the client 52 then contacts the actual particular DFS file server 54 wherein the particular file set or file system 56 resides.
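The sequence just described (registration of exported file sets, a location inquiry by the client, and a subsequent request to the file server, with results retained by the cache manager) may be sketched as follows. This is a minimal illustration under stated assumptions: all class and method names are hypothetical, the network is replaced by direct calls, and any mechanism for keeping cached data consistent with the server is omitted.

```python
class FileSetLocationServer:
    """Keeps a database mapping each file set to the server that exports it."""

    def __init__(self):
        self._database = {}

    def register(self, fileset: str, server_name: str) -> None:
        self._database[fileset] = server_name

    def locate(self, fileset: str) -> str:
        return self._database[fileset]


class FileServer:
    """Exports file sets and counts the requests it receives."""

    def __init__(self, name: str, location_server: FileSetLocationServer):
        self.name = name
        self.requests = 0
        self._filesets = {}
        self._location_server = location_server

    def export(self, fileset: str, files: dict) -> None:
        self._filesets[fileset] = dict(files)
        # A server registers each exported file set with the location server.
        self._location_server.register(fileset, self.name)

    def fetch(self, fileset: str, filename: str) -> bytes:
        self.requests += 1
        return self._filesets[fileset][filename]


class DfsClient:
    """Client with a simple in-memory cache (no consistency handling)."""

    def __init__(self, location_server: FileSetLocationServer, servers: dict):
        self._location_server = location_server
        self._servers = servers  # server name -> FileServer
        self._cache = {}

    def read(self, fileset: str, filename: str) -> bytes:
        key = (fileset, filename)
        if key not in self._cache:
            # Step 1: ask the location server where the file set resides.
            server = self._servers[self._location_server.locate(fileset)]
            # Step 2: contact that file server and cache the result.
            self._cache[key] = server.fetch(fileset, filename)
        return self._cache[key]


if __name__ == "__main__":
    fls = FileSetLocationServer()
    srv = FileServer("server-1", fls)
    srv.export("user.alice", {"notes.txt": b"hello"})
    client = DfsClient(fls, {"server-1": srv})
    print(client.read("user.alice", "notes.txt"), srv.requests)
    print(client.read("user.alice", "notes.txt"), srv.requests)
```

In this sketch the second read is satisfied from the client's cache, so the file server receives only one request.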
In addition to the previously discussed DFS machine roles providing the basic function of DFS file server machines, DFS client machines, and file set location servers, there are other functions or services which may be provided by one or more machines well known in the art. These services may be categorized into machine roles, not all of which are required. Moreover, a machine may actually serve more than one role, although more likely they will be spread out throughout a cell. Such machines known in the art include file server machines, private file server machines, file set location servers, DFS clients, system control machines, binary distribution machines, backup database machines, and tape coordinator machines, all of which are described in further detail in the aforementioned DFS for AIX/6000 publication.
Numerous basic benefits are provided by DCE DFS over other types of distributed file systems. These include providing a uniform file space, caching on the DFS client machine, finer granularity for access control, ability to establish binary distribution machines, ability to work on other vendor platforms, built-in backup capability, and diverse administration options.
As to the DFS relation to DCE, DFS is built on top of the underlying DCE services as previously described, taking advantage of the lower level services of DCE such as RPC, security services, and directory and time services. Before DFS may be configured on a machine, it requires that the following DCE components be installed, configured, and running in the cell: security server, directory server, and DCE time servers, all of which are also further described in detail in the aforementioned AIX article.
Referring now to FIG. 8, it first illustrates the required DCE components previously mentioned for running in a cell before DFS may be configured on a machine, namely the security server 62, cell directory server 64, and DCE time servers 66. The further purpose of FIG. 8 is to not only show the aforementioned components for a DCE configuration, but additionally the DFS components which have been added to implement a DCE DFS cell, thereby also illustrating the different DFS machine roles. As previously described, DCE cell requirements are for at least one cell directory server 64, one security server 62 and at least three DCE distributed time servers 66. However, these are DCE requirements for a cell and not DFS requirements. The DCE cell may operate with the above requirements. However, in order to add the DFS capability to the DCE cell, the following additional components are required. First, all DFS clients 68 and servers such as DFS file servers 70 must be minimally configured as DCE clients. Secondly, at least one file set location database machine 72 is required and at least one system control machine 74 per administrative domain. A backup database machine and tape coordinator 76 are optional. Further details regarding these additional components to implement DFS on DCE may also be found in the aforesaid AIX publications.
With the foregoing background in mind regarding a DCE DFS cell implementation, the problem addressed by the subject invention may now be further understood. It is at times desirable to operate differing operating systems in a DCE environment, such as the OS/2 (Trademark of the IBM Corporation) operating system provided by the IBM Corporation. The file systems of such operating systems typically (and specifically in the case of OS/2) include what are known as "attributes" of various types, such as "extended" attributes and "file" attributes, to be hereinafter described in great detail. One problem with the widely accepted OSF DCE DFS file system is that it does not provide native support for attributes such as the OS/2 style attributes. Nevertheless, this functionality must be provided for OS/2 clients prior to the availability of such support for these attributes through DFS protocols.
The OS/2 client expects to attach standard file attributes (FAs) directly to a file. Moreover, OS/2 clients further expect to be able to retain additional information about files and directories in named entities known in the art as "extended" attributes (EAs). However, the OSF DCE DFS knows nothing about OS/2-specific FAs or EAs, and accordingly a need arose to accommodate such capability. An additional constraint on the system design was that such attributes must not be visible to the OS/2 DFS clients except through the normal OS/2 application programming interfaces. In other words, they must not be visible in the namespace which OS/2 DFS clients may access.
OS/2 extended attributes were implemented in the file allocation table (FAT) file system by storing all the EA data for a single OS/2 drive in a single file in the root directory of the drive. However, in addition to performance and accessing problems associated with a single file containing all EA data, the FAT solution was unworkable in DCE DFS environments because of security requirements. The DFS solution requires users to have the proper permission before accessing EA/FA data. This is true even if the user is on a non-OS/2 DFS system which does not support EAs. Grouping EA data from multiple users into a single file, as might otherwise be required without the invention, would render the provision for users to have proper permission unfeasible on a non-OS/2 system. Moreover, the single file concept, as previously noted, would give rise in many instances to the performance bottleneck noted when it is considered that hundreds and perhaps thousands of distributed file system users might need access to that single file at the same time.