In many information processing applications, a server application running on a host or server computer in a distributed network provides processing services or functions for client applications running on terminal or workstation computers of the network which are operated by a multitude of users. Common examples of such server applications include software for processing class registrations at a university, travel reservations, money transfers and other services at a bank, and sales at a business. In these examples, the processing services provided by the server application may update databases of class schedules, hotel reservations, account balances, order shipments, payments, or inventory for actions initiated by the individual users at their respective stations.
A first issue in many of these server applications is the ability to handle heavy processing loads. In the above server application examples, for instance, the updates for a large number of users may be submitted to the server application within a short time period. As each update may consume significant processing time, each additional user of the server application can slow the response or time to complete updates for all other users, thus reducing the quality of service for all users. Eventually, the load may exceed the processing capacity, possibly resulting in system failure, down time, and lost data. The degree to which a server application can support incremental increases in user load while preserving performance is sometimes referred to as scalability.
One factor that affects server application scalability is the server application's use of memory to store user data while performing processing for that user. The server computer's memory is a limited resource that is shared among all the users of the server application. Because server computer memory is a shared fixed resource, the duration that the server application stores one user's data affects the availability of the memory to store other users' data. By minimizing the duration that data is stored in the server computer memory for each user, the server application is able to support many more clients with a given server computer memory capacity.
One approach to enhancing scalability is for the server application to keep user data in memory only during the course of a single interaction or exchange with a client (e.g., while processing one remote procedure call or message exchange from the client). The server application keeps a current state of each user's data (also referred to as the user's “state”) in secondary storage (e.g., hard disk and other large capacity storage devices), and loads the state into the server computer's main memory only as long as necessary to perform a discrete processing operation on the data responsive to the single interaction. After completing processing of the single interaction, the server application again stores the state into secondary storage. This practice effectively shares the scarce main memory resources among the users. Using this practice, the server application can accommodate more users.
Under this “surgical strike” or “get in/get out” style of programming, the server application generally consisted of a group of functions or procedures that could be called remotely by client applications at the users' workstations to perform the discrete processing operations in a single interaction between the client and server application. In general, the user's state was loaded into main memory at the start of the function, and stored away at the function's completion. Also, the function's parameter list would contain all input data from the client application that was needed for the processing operation. This would sometimes lead to server application functions with extensive parameter lists. For example, a simple function in a banking server application for a money transfer might include parameters for the amount to be transferred, the account number to debit, the account number to credit, the authorization number, the check number, the teller id, the branch id, etc.
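By way of illustration only, the “get in/get out” style described above can be sketched as follows. The sketch is written in Python and assumes a JSON file as a stand-in for secondary storage; the names STORE_PATH, load_state, save_state and transfer_money, as well as the account numbers, are hypothetical and do not correspond to any particular banking application.

```python
import json
import os
import tempfile

# Hypothetical stand-in for secondary storage: one JSON file of balances.
STORE_PATH = os.path.join(tempfile.mkdtemp(), "accounts.json")


def save_state(balances):
    """Store the account balances into secondary storage."""
    with open(STORE_PATH, "w") as f:
        json.dump(balances, f)


def load_state():
    """Load the account balances from secondary storage into main memory."""
    with open(STORE_PATH) as f:
        return json.load(f)


def transfer_money(amount, debit_account, credit_account,
                   authorization_number, check_number, teller_id, branch_id):
    """Perform one discrete operation within a single client interaction.

    All input arrives in the parameter list. The last four parameters are
    accepted only to mirror the extensive parameter list described above;
    this sketch does not use them.
    """
    balances = load_state()                  # get in: load the state
    if balances[debit_account] < amount:
        return False                         # no state is retained in memory
    balances[debit_account] -= amount
    balances[credit_account] += amount
    save_state(balances)                     # get out: store the state
    return True


save_state({"A-100": 500, "B-200": 50})
ok = transfer_money(200, "A-100", "B-200",
                    authorization_number="AUTH-1", check_number="1001",
                    teller_id="T-7", branch_id="BR-3")
```

Between calls to transfer_money, nothing about the user remains in main memory, which is the property that this style of programming relies upon for scalability.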
Programming models generally known as object-oriented programming provide many benefits that have been shown to increase programmers' productivity, but are in many ways antithetical to the just discussed approach to enhancing scalability. In object-oriented programming, programs are written as a collection of object classes which each model real world or abstract items by combining data to represent the item's properties with functions to represent the item's functionality. More specifically, an object is an instance of a programmer-defined type referred to as a class, which exhibits the characteristics of data encapsulation, polymorphism and inheritance. Data encapsulation refers to the combining of data (also referred to as properties of an object) with methods that operate on the data (also referred to as member functions of an object) into a unitary software component (i.e., the object), such that the object hides its internal composition, structure and operation and exposes its functionality to client programs that utilize the object only through one or more interfaces. An interface of the object is a group of semantically related member functions of the object. In other words, the client programs do not access the object's data directly, but must instead call functions on the object's interfaces to operate on the data.
Polymorphism refers to the ability to view (i.e., interact with) two similar objects through a common interface, thereby eliminating the need to differentiate between two objects. Inheritance refers to the derivation of different classes of objects from a base class, where the derived classes inherit the properties and characteristics of the base class.
Object-oriented programming generally has advantages in ease of programming, extensibility, reuse of code, and integration of software from different vendors and (in some object-oriented programming models) across programming languages. However, object-oriented programming techniques generally are antithetical to the above-discussed approach to enhancing server application scalability by minimizing the duration of state to single client/server application interactions. In object-oriented programming, the client program accesses an object by obtaining a pointer or reference to an instance of the object in memory. The client program retains this object reference throughout the course of its interaction with the object, which allows the client program to call member functions on the object. So long as any client programs have a reference to an object's instance, data associated with the instance is maintained in memory to avoid the client issuing a call to an invalid memory reference. At the very least, even where the client program calls only a single member function, the object instance is kept in memory between the client program's initial call to request a reference to the object instance and the client program's call to release the reference (between which the client program issues one or more calls to member functions of the object using the reference). In other words, the client program has control over the object's lifetime. The object is kept in memory until the client's reference to the object is released.
Also, object-oriented programming encourages setting an object's properties using separate member functions. For example, a money transfer object may provide a set of member functions that includes a SetDebitAccount( ) function, a SetCreditAccount( ) function, a SetTransferAmount( ) function, etc. that the client program calls to set the object's properties. Finally, the client program may call a TransferMoney( ) function to cause the object to perform the money transfer operation using the accumulated object properties (also referred to as the object's state). Again, while the client program issues these separate calls, the object is maintained in memory. In a server application, this programming style can drastically reduce the server application's scalability.
A further disadvantage of object-oriented programming of server applications is that each separate operation with or use of an object often requires creating a separate instance of the object. This is because the accumulated properties that are set for one operation with an object typically differ from the settings of the properties in another operation. In the above money transfer object example, for instance, separate money transfer operations usually involve different account numbers and transfer amounts. Since the accumulated state of an object is retained, the client program either instantiates a new instance of the object for a subsequent money transfer or carefully resets each property of the previously used object instance to avoid carrying properties set in the previous money transfer over to the subsequent transfer. However, instantiating each object also is expensive in terms of processing time and thus further reduces server application scalability.
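The property-setting style and the carry-over hazard described in the two preceding paragraphs can be illustrated with the following minimal Python sketch. The MoneyTransfer class is a hypothetical analogue of the money transfer object discussed above, with method names adapted to Python conventions; the in-memory balances dictionary is an assumption made only to keep the sketch self-contained.

```python
class MoneyTransfer:
    """Hypothetical object whose properties accumulate across client calls."""

    def __init__(self):
        self.debit_account = None
        self.credit_account = None
        self.amount = None

    def set_debit_account(self, account):
        self.debit_account = account

    def set_credit_account(self, account):
        self.credit_account = account

    def set_transfer_amount(self, amount):
        self.amount = amount

    def transfer_money(self, balances):
        """Perform the transfer using the accumulated properties."""
        balances[self.debit_account] -= self.amount
        balances[self.credit_account] += self.amount


balances = {"A": 100, "B": 0, "C": 0}

# The instance is held in memory from creation until the client releases it,
# across every one of the separate calls below.
transfer = MoneyTransfer()
transfer.set_debit_account("A")
transfer.set_credit_account("B")
transfer.set_transfer_amount(30)
transfer.transfer_money(balances)

# Reuse of the same instance: the client resets the credit account but not
# the amount, so the amount from the previous transfer (30) carries over.
transfer.set_credit_account("C")
transfer.transfer_money(balances)
```

In this sketch the second transfer moves 30 from account "A" to account "C" even though the client never set an amount for it, which is the kind of carry-over that a client avoids only by creating a new instance or resetting every property.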
A second issue is that server applications often require coordinating activities on multiple computers, by separate processes on one computer, and even within a single process. For example, a money transfer operation in a banking application may involve updates to account information held in separate databases that reside on separate computers. Desirably, groups of activities that form parts of an operation are coordinated so as to take effect as a single indivisible unit of work, commonly referred to as a transaction. In many applications, performing sets of activities as a transaction becomes a business necessity. For example, if only one account is updated in a money transfer operation due to a system failure, the bank in effect creates or loses money for a customer.
A transaction is a collection of actions that conform to a set of properties (referred to as the “ACID” properties) which include atomicity, consistency, isolation, and durability. Atomicity means that all activities in a transaction either take effect together as a unit, or all fail. Consistency means that after a transaction executes, the system is left in a stable or correct state (i.e., if giving effect to the activities in a transaction would not result in a correct stable state, the system is returned to its initial pre-transaction state). Isolation means the transaction is not affected by any other concurrently executing transactions (accesses by transactions to shared resources are serialized, and changes to shared resources are not visible outside the transaction until the transaction completes). Durability means that the effects of a transaction are permanent and survive system failures. For additional background information on transaction processing, see, inter alia, Jim Gray and Andreas Reuter, Transaction Processing Concepts and Techniques, Morgan Kaufmann, 1993.
In many current systems, services or extensions of an operating system referred to as a transaction manager or transaction processing (TP) monitor implement transactions. A transaction is initiated by a client program, such as in a call to a “begin_transaction” application programming interface (API) of the transaction monitor. Thereafter, the client initiates activities of a server application or applications, which are performed under control of the TP monitor. The client ends the transaction by calling either a “commit_transaction” or “abort_transaction” API of the TP monitor. On receiving the “commit_transaction” API call, the TP monitor commits the work accomplished by the various server application activities in the transaction, such as by effecting updates to databases and other shared resources. Otherwise, a call to the “abort_transaction” API causes the TP monitor to “roll back” all work in the transaction, returning the system to its pre-transaction state.
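The begin, commit and abort pattern described above can be illustrated, by analogy only, with the sqlite3 module of the Python standard library, in which the SQL statements BEGIN, COMMIT and ROLLBACK play the roles of the “begin_transaction”, “commit_transaction” and “abort_transaction” calls. The sketch is not a TP monitor; the accounts table, the transfer function and the balances helper are hypothetical.

```python
import sqlite3

# isolation_level=None places the connection in autocommit mode, so the
# explicit BEGIN / COMMIT / ROLLBACK statements below delimit the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 100), ('B', 0)")


def transfer(conn, debit, credit, amount):
    """Debit one account and credit another as a single unit of work."""
    conn.execute("BEGIN")                                  # begin
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, debit))
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (debit,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, credit))
        conn.execute("COMMIT")                             # commit
        return True
    except Exception:
        conn.execute("ROLLBACK")                           # abort: roll back
        return False


def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))


first = transfer(conn, "A", "B", 60)    # succeeds and is committed
second = transfer(conn, "A", "B", 60)   # would overdraw "A"; rolled back
```

After the second call, the debit that was applied before the failure is undone by the rollback, so the database is returned to its state before that transaction began.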
In systems where transactions involve activities of server applications on multiple computers, a two-phase commit protocol often is used. In general, the two-phase commit protocol centralizes the decision to commit, but gives a right of veto to each participant in the transaction. In a typical implementation, a commit manager node (also known as a root node or transaction coordinator) has centralized control of the decision to commit, which may for example be the TP monitor on the client's computer. Other participants in the transaction, such as TP monitors on computers where a server application performs part of the work in a transaction, are referred to as subordinate nodes. In a first phase of commit, the commit manager node sends “prepare_to_commit” commands to all subordinate nodes. In response, the subordinate nodes perform their portion of the work in a transaction and return “ready_to_commit” messages to the commit manager node. When all subordinate nodes return ready_to_commit messages to the commit manager node, the commit manager node starts the second phase of commit. In this second phase, the commit manager node logs or records the decision to commit in durable storage, and then orders all the subordinate nodes to commit their work making the results of their work durable. On committing their individual portions of the work, the subordinate nodes send confirmation messages to the commit manager node. When all subordinate nodes confirm committing their work, the commit manager node reports to the client that the transaction was completed successfully. On the other hand, if any subordinate node returns a refusal to commit during the first phase, the commit manager node orders all other subordinate nodes to roll back their work, aborting the transaction. Also, if any subordinate node fails in the second phase, the uncommitted work is maintained in durable storage and finally committed during failure recovery.
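The two phases described above can be sketched in Python as follows. This is a simplified, single-process illustration under stated assumptions: the Subordinate and CommitManager classes are hypothetical, method calls stand in for network messages, a Python list stands in for durable storage, and failure recovery in the second phase is not modeled. For simplicity the sketch orders every subordinate node, including the one that refused, to roll back.

```python
class Subordinate:
    """Hypothetical subordinate node holding one portion of the work."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare_to_commit(self):
        # Phase one: perform the work and vote. Returning False is the veto.
        self.state = "ready_to_commit" if self.can_commit else "refused"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def roll_back(self):
        self.state = "rolled_back"


class CommitManager:
    """Hypothetical commit manager node with central control of the decision."""

    def __init__(self, subordinates):
        self.subordinates = subordinates
        self.log = []  # stands in for durable storage

    def run(self):
        # Phase one: collect a vote from every subordinate node.
        votes = [node.prepare_to_commit() for node in self.subordinates]
        if all(votes):
            # Phase two: record the decision, then order the commit.
            self.log.append("commit")
            for node in self.subordinates:
                node.commit()
            return True
        # Any refusal aborts the transaction.
        self.log.append("abort")
        for node in self.subordinates:
            node.roll_back()
        return False


good = [Subordinate("debit_db"), Subordinate("credit_db")]
committed = CommitManager(good).run()

vetoed = [Subordinate("debit_db"), Subordinate("credit_db", can_commit=False)]
aborted = CommitManager(vetoed).run()
```

The first run commits on both nodes because every vote is affirmative; in the second run a single refusal is sufficient to roll back the work of all nodes.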
In transaction processing, it is critical that the client does not prematurely commit a server application's work in a transaction (such as database updates). For example, where the activity of a server application in a transaction is to generate a sales order entry, the server application may impose a requirement that a valid sales order entry have an order header (with customer identifying information filled in) and at least one order item. The client therefore should not commit a transaction in which the server application generates a sales order before both an order header and at least one order item in the sales order have been generated. Such application-specific requirements exist for a large variety of server application activities in transactions.
Historically, premature client committal of a server application's work in a transaction generally was avoided in two ways. First, the server application can be programmed such that its work in a transaction is never left in an incomplete state when returning from a client's call. For example, the server application may implement its sales order generation code in a single procedure which either generates a valid sales order complete with both an order header and at least one order item, or returns a failure code to cause the client to abort the transaction. The server application's work thus is never left in an incomplete state between calls from the client, and would not be committed prematurely if the client committed the transaction between calls to the server application.
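A minimal Python sketch of such a single procedure follows. The function name generate_sales_order, the dictionary layout of the order, and the use of None as the failure code are assumptions made for illustration.

```python
def generate_sales_order(customer_id, items):
    """Generate a complete sales order in one call, or report failure.

    Returns an order with a header and at least one item, or None as a
    failure code that tells the client to abort the transaction.
    """
    if not customer_id or not items:
        return None
    header = {"customer_id": customer_id}
    return {"header": header, "items": list(items)}


complete = generate_sales_order("C-17", ["widget"])
rejected = generate_sales_order("C-17", [])
```

Because the header and the items are produced in the same call, there is no point between client calls at which a sales order exists with a header but no items.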
Second, the client and server application typically were developed together by one programmer or a group of programmers in a single company. Consequently, since the programmers of the server application and client were known to each other, the server application programmers could ensure that the client was not programmed to commit a transaction between calls that might leave the server application's work in an incomplete state. For example, the client's programmers could be told by the server application programmers not to call the commit_transaction API after a call to the server application that sets up an order header and before a call to the server application that adds an order item.
These historical approaches to avoiding premature committal of server application work in a transaction are less effective for, and in some ways antithetical to, component-based server applications that are programmed using object-oriented programming techniques (described above). Object-oriented programming generally has advantages in ease of programming, extensibility, reuse of code, and integration of software from different vendors and (in some object-oriented programming models) across programming languages. However, object-oriented programming techniques generally are antithetical to the above-described historical approaches to avoiding premature committal of server application work.
First, object-oriented programming techniques encourage accomplishing work in multiple client-to-object interactions. Specifically, object-oriented programming encourages setting an object's properties in calls to separate member functions, then carrying out the work with the set properties in a call to a final member function. For example, a money transfer object may provide a set of member functions that includes a SetDebitAccount( ) function, a SetCreditAccount( ) function, a SetTransferAmount( ) function, etc. that the client program calls to set the object's properties. Finally, the client program may call a TransferMoney( ) function to cause the object to perform the money transfer operation using the accumulated object properties (also referred to as the object's state). Between these separate client-object interactions, server application work may often be left in an incomplete state. The object-oriented programming style thus is contrary to the above-described approach to avoiding premature committal wherein server application work is never left incomplete on return from a client call.
Second, object-oriented programming also encourages integration of objects supplied from unrelated developers and companies. When the server application is built from object classes supplied from unrelated vendors, there is less opportunity for direct collaboration between developers of the server application and the client. Without direct collaboration, the developer of an object used in a server application generally cannot ensure that the developer of a client will not commit a transaction between calls to the server application object which leave the server application's work in an incomplete state. Thus, the second above-described approach to avoiding premature committal also is less effective in component-based server applications.
Additionally, in the prior transaction processing systems discussed above, transactions are initiated and completed by explicit programming in the client program, such as by calls to the begin_transaction, commit_transaction and abort_transaction APIs of the transaction monitor. This adds to complexity and increases the burden of programming the server application and client program. Specifically, the client program must be programmed to properly initiate and complete a transaction whenever it uses a server application to perform work that requires a transaction (e.g., work which involves multiple database updates that must be completed together as an atomic unit of work). The server application, on the other hand, relies on its clients to properly manage transactions, and cannot guarantee that all client programs properly initiate and complete transactions when using the server application. The server application therefore must be programmed to handle the special case where the client fails to initiate a needed transaction when using the server application.
The requirement of a client program to explicitly initiate and complete transactions can pose further difficulties in programming models in which the server application is implemented as separate software components, such as in object-oriented programming (“OOP”). Object-oriented programming generally has advantages in ease of programming, extensibility, reuse of code, and integration of software from different vendors and (in some object-oriented programming models) across programming languages. However, object-oriented programming models can increase the complexity and thus programming difficulty of the server application where transaction processing requires explicit initiation and completion by client programs. In particular, by encouraging integration of software components from different vendors, an object-oriented programming model makes it more difficult for programmers to ensure that the client program properly initiates and completes transactions involving the server application's work. Components that are integrated to form the server application and client programs may be supplied by programmers and vendors who do not directly collaborate, such that it is no longer possible to enforce proper behavior of other components by knocking on a colleague's door down the hall. In the absence of direct collaboration, the programmers often must carefully program the components to handle cases where transactions are not properly initiated and completed by the components' clients.
A third issue is that, in a server application that is used by a large number of people, it is often useful to discriminate between what different users and groups of users are able to do with the server application. For example, in an on-line bookstore server application that provides processing services for entering book orders, order cancellations, and book returns, it may serve a useful business purpose to allow any user (e.g., sales clerks or customers) to access book order entry processing services, but only some users to access order cancellation processing services (e.g., a bookstore manager) or book return processing services (e.g., returns department staff).
Network operating systems on which server applications are typically run provide sophisticated security features, such as for controlling which users can log on to use a computer system, or have permission to access particular resources of the computer system (e.g., files, system services, devices, etc.). In the Microsoft Windows NT operating system, for example, each user is assigned a user id which has an associated password. A system administrator also can assign sets of users to user groups, and designate which users and user groups are permitted access to system objects that represent computer resources, such as files, folders, and devices. During a logon procedure, the user is required to enter the user id along with its associated password to gain access to the computer system. When the user launches a program, the Windows NT operating system associates the user id with the process in which the program is run (along with the process' threads). When a thread executing on the user's behalf then accesses a system resource, the Windows NT operating system performs an authorization check to verify that the user id associated with the thread has permission to access the resource. (See, Custer, Inside Windows NT 22, 55–57, 74–81 and 321–326 (Microsoft Press 1993).)
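The kind of discrimination described in the bookstore example can be sketched in Python as a table that maps categories of users to the processing services they may access. The category names, service names and the is_allowed function are hypothetical and are chosen only to mirror the example above.

```python
# Hypothetical mapping from category of user to permitted processing services.
PERMISSIONS = {
    "customer": {"enter_order"},
    "sales_clerk": {"enter_order"},
    "manager": {"enter_order", "cancel_order"},
    "returns_staff": {"enter_order", "return_book"},
}


def is_allowed(category, service):
    """Return True if users in the given category may access the service."""
    return service in PERMISSIONS.get(category, set())


clerk_can_order = is_allowed("sales_clerk", "enter_order")
clerk_can_cancel = is_allowed("sales_clerk", "cancel_order")
manager_can_cancel = is_allowed("manager", "cancel_order")
```

An unknown category maps to an empty set of services, so access is denied by default in this sketch.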
A thread is the basic entity to which the operating system allocates processing time on the computer's central processing unit. A thread can execute any part of an application's code, including a part currently being executed by another thread. All threads of a process share the virtual address space, global variables, and operating-system resources of the process. (See, e.g., Tucker Jr., Allen B. (editor), The Computer Science and Engineering Handbook 1662–1665 (CRC Press 1997).)
The Windows NT operating system also provides a way, known as impersonation, to authenticate access from a remote user to resources of a server computer in a distributed network. When a request is received from a remote computer for processing on the server computer, a thread that services the request on the server computer can assume the user id from the thread on the remote computer that made the request. The Windows NT operating system then performs authorization checks on accesses by the servicing thread to system resources of the server computer based on the user id. (See, Siyan, Windows NT Server 4, Professional Reference 1061 (New Riders 1996).)
The use of such operating system security features to control access to particular processing services in a server application presents cumbersome distribution and deployment issues. The user ids and user groups are configured administratively for each computer station and/or network, and thus vary between computers and networks. When the particular user ids or groups that will be configured on a computer system are known at the time of developing a server application, the server application can be designed to control access to particular processing services and data based on those user ids and groups. Alternatively, specific user ids or groups that a server application uses as the basis for access control can be configured on a computer system upon deployment of the server application on the computer system. These approaches may be satisfactory in cases where development and deployment are done jointly, such as by in-house or contracted developers. However, the approaches prove more cumbersome when server application development and deployment are carried out separately, such as where an independent software vendor develops a server application targeted for general distribution and eventual installation at diverse customer sites. On the one hand, the server application developer does not know which user ids and groups will be configured on the end customers' computer systems. On the other hand, if the server application relies on specific user ids or groups, the developer must force system administrators to configure them, which at a minimum could lead to an administratively unwieldy number of user configurations and at worst poses a security risk on the computer systems of the developer's customers.
A fourth issue is that, because these server applications service a large number of users, the server applications must be programmed to deal with problems of concurrent shared access by multiple users. Shared access by multiple users creates a number of well-known problems in correctly synchronizing updates by the users to durable data, isolating processing of one user from that of another, etc. These shared access problems are similar to those faced by users of a joint checking account when one user fails to notify the other of changes to the account balance before a check is written, possibly resulting in an overdraft. For example, a server application for an on-line bookstore faces a shared access problem where two customers concurrently place an order for the same book, and there is only one copy of the book in inventory. If the on-line bookstore application fails to update an inventory database to reflect sale of the book to the first customer before inventory is checked for the second customer's order, then the single book in inventory might be sold to both customers.
A number of concurrency isolation mechanisms for dealing with shared access problems in computer programs are known, including locks, semaphores, condition variables, barriers, joins, and like programming constructs that regulate concurrent access to program code and data. (See, e.g., Tucker Jr., Allen B. (editor), The Computer Science and Engineering Handbook, pp. 1662–1665, CRC Press 1997.) However, even with use of these concurrency isolation mechanisms, the task of programming a server application to deal with shared access problems is complex and difficult. Developers of server applications estimate that 30–40% of the development effort is spent on providing infrastructure, including for dealing with shared access problems, as opposed to implementing the business logic of the processing services that the server application is meant to provide. Further, concurrency isolation mechanisms are among the more sophisticated aspects of programming, and typically require the efforts of highly skilled programmers.
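As one illustration of such a mechanism, the following Python sketch uses a lock from the standard threading module to make the inventory check and the inventory update of the bookstore example a single indivisible step. The Inventory class and the book title are hypothetical, and an in-memory dictionary stands in for the inventory database.

```python
import threading


class Inventory:
    """Hypothetical in-memory inventory guarded by a single lock."""

    def __init__(self, copies):
        self._copies = dict(copies)
        self._lock = threading.Lock()

    def place_order(self, title):
        # The check and the update occur under one lock, so a second order
        # cannot read the count between them.
        with self._lock:
            if self._copies.get(title, 0) > 0:
                self._copies[title] -= 1
                return True
            return False


inventory = Inventory({"Example Book": 1})
results = []


def customer():
    results.append(inventory.place_order("Example Book"))


threads = [threading.Thread(target=customer) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever thread acquires the lock first obtains the single copy, and the other order is refused. Even in so small a case, the programmer must identify every path that touches the shared data and guard it consistently, which is the difficulty noted above.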
By contrast, applications on a single user computing platform are relatively much easier to program. The programmer need not account for shared access problems, and need not implement complex infrastructure to regulate access to code and data by multiple users. Programmers of single user applications thus can concentrate on implementing the “business logic” or data processing services of the application itself. As a result, programmers of single user applications can realize higher productivity, and do not require the added time and expense to acquire the skills necessary to work with sophisticated concurrency isolation mechanisms.
A programming model that is now commonly used in single user computing platforms is object-oriented programming (OOP). Object-oriented programming generally has advantages in ease of programming, extensibility, reuse of code, and integration of software from different vendors and (in some object-oriented programming models) across programming languages. However, object-oriented programming in itself does not solve shared access problems in a multiple user computing system. Thus, when object-oriented programming techniques are used to program server applications, concurrency isolation mechanisms are still needed to regulate shared access of multiple users.
For example, a user of an object typically accomplishes processing work over multiple interactions with the object. First, the user calls member functions through interfaces of the object that set various data properties of the object (also referred to herein as the “data state” of the object). Then, the user calls one or more member functions to initiate processing based on the previously set data properties. If a second user accesses the object between a first user's interactions with the object, the second user can potentially alter the object's data state causing unintended consequences when the first user initiates processing based on the data state.
The present invention provides for management of software components in an object execution environment, such as for transaction processing, access control, concurrency control, and other externally managed operations, using implicitly associated context objects to store intrinsic context properties of the software components. When the server application component is run in the execution environment of an embodiment of the invention illustrated herein, an object management system maintains a component context object associated with the application component. The component context object provides context for the execution of the application component in the execution environment. The component context object has a lifetime that is coextensive with that of the application component. The object management system creates the component context object when the application component is initially created, and destroys the component context object after the application component is destroyed (i.e., after the last reference to the application component is released).
The component context object contains intrinsic properties of the application component that are determined at the component's creation. These properties include a client id, an activity id, and a transaction reference. The client id refers to the client program that initiated creation of the application component. The activity id refers to an activity that includes the application component. An activity is a set of components executing on behalf of a base client, within which only a single logical thread of execution is allowed. The transaction reference indicates a transaction property object that represents a transaction (i.e., an atomic unit of work that is either done in its entirety or not at all) in which the application component participates. The component context object is implemented as a COM object that runs under control of the object management system. The component context object provides an “IObjectContext” interface, described in more detail below, that has member functions called by the application component for use in transaction processing, in creating additional application components that inherit the component's context properties, and in access control based on abstract user classes (roles).
In the illustrated execution environment, the object management system maintains an implicit association of the component context object to the application component. In other words, the object management system does not pass a reference of the component context object to the client program which uses the application component. Rather, the object management system maintains the component's association with the context object, and accesses the component context object when needed during the client program's access to the application component. Thus, the client program is freed from explicitly referencing the component context object while creating and using the application component.
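The implicit association can be illustrated with the following Python sketch, which is a loose analogue only and not the COM implementation of the illustrated embodiment. The ObjectManager, ComponentContext and OrderComponent classes and their methods are hypothetical; the sketch assumes that the object management system keys each context on the identity of its component and assigns a new activity id to each component it creates.

```python
import itertools


class ComponentContext:
    """Hypothetical holder of a component's intrinsic context properties."""

    def __init__(self, client_id, activity_id, transaction):
        self.client_id = client_id
        self.activity_id = activity_id
        self.transaction = transaction


class ObjectManager:
    """Hypothetical object management system that owns the association."""

    def __init__(self):
        self._contexts = {}                     # id(component) -> context
        self._activity_ids = itertools.count(1)

    def create_component(self, component_class, client_id, transaction=None):
        component = component_class()
        # The context is created together with the component.
        self._contexts[id(component)] = ComponentContext(
            client_id, next(self._activity_ids), transaction)
        return component      # the client receives only the component

    def get_object_context(self, component):
        """Look up the context on behalf of the component itself."""
        return self._contexts[id(component)]

    def release(self, component):
        # The context is destroyed when the component is released.
        del self._contexts[id(component)]


class OrderComponent:
    """Hypothetical application component with no context attribute."""

    def created_by(self, manager):
        return manager.get_object_context(self).client_id


manager = ObjectManager()
order = manager.create_component(
    OrderComponent, client_id="client-42", transaction="txn-1")
creator = order.created_by(manager)
```

The client in this sketch holds a reference to the order component alone; the context is reached only through the manager, which corresponds to the implicit association described above.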
Additional features and advantages of the invention will be made apparent from the following detailed description of an illustrated embodiment which proceeds with reference to the accompanying drawings.