The present invention relates generally to distributed computer systems and, more particularly, to the management of entries in a data structure that are accessible by multiple entities.
A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. The following notice applies to the software and data as described below and in the drawings hereto: Copyright © 1999-2000, Microsoft Corporation, All Rights Reserved.
A distributed computer system, such as but not limited to the Internet, is characterized by rapid, real-time interchange among many dissimilar processes executing simultaneously on a large array of dissimilar and geographically diverse processors. A distributed computer system's resources are usually spatially separated, and the execution of its applications often involves multiple execution threads that can be widely separated in time. To address some of the challenges of writing applications for use on distributed computer systems, tuple space based coordination languages were developed.
The first tuple space based coordination language was Linda, a language that was originally developed by Dr. David Gelernter. See, for example, "Linda In Context", N. Carriero and D. Gelernter, Communications of the ACM, 32(4): 444-458, 1989; and N. Carriero and D. Gelernter, "Tuple Analysis and Partial Evaluation Strategies in the Linda Precompiler", in Languages and Compilers for Parallel Computing, Research Monographs in Parallel and Distributed Computing, D. Gelernter, A. Nicolau, and D. Padua, editors, MIT Press, 1990, pages 114-125.
A "tuple space" is a globally shared, associatively addressed memory space that is organized as a grouping of tuples. A "tuple" is the basic element of a tuple space system. In the context of a tuple space based coordination language like Linda, a tuple is a vector having fields or values of certain types. In a broader sense, a "tuple" is an entry in an information storage system. For example, a row in a relational database system can be referred to as a tuple.
In Linda-like languages, constructs called "templates" are used to associatively address tuples via matching techniques. A template matches a tuple if they have an equal number of fields and if each template field matches the corresponding tuple field.
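The matching rule just described can be sketched in Python. The text does not spell out when an individual template field matches a tuple field, so this sketch assumes the usual Linda convention: an actual (value) field matches an equal value of the same type, and a formal (typed placeholder) field matches any value of its type. The names `Formal`, `field_matches`, and `matches` are hypothetical and chosen only for illustration.

```python
from typing import Any, Tuple


class Formal:
    """Hypothetical placeholder field: matches any value of the given type."""

    def __init__(self, type_: type):
        self.type_ = type_


def field_matches(template_field: Any, tuple_field: Any) -> bool:
    # ASSUMPTION: a formal field matches on exact type only.
    if isinstance(template_field, Formal):
        return type(tuple_field) is template_field.type_
    # ASSUMPTION: an actual field matches on exact type and equal value.
    return (type(template_field) is type(tuple_field)
            and template_field == tuple_field)


def matches(template: Tuple[Any, ...], tup: Tuple[Any, ...]) -> bool:
    # Equal number of fields, and every template field matches its counterpart.
    return len(template) == len(tup) and all(
        field_matches(t, f) for t, f in zip(template, tup))
```

For example, under these assumptions the template `("point", Formal(int), Formal(int))` matches the tuple `("point", 3, 4)`, but not `("point", 3)` (different field count) and not `("point", "3", 4)` (wrong field type).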
Tuple space based coordination languages provide a simple yet powerful mechanism for inter-process communication and synchronization, which is the crux of parallel and distributed programming. A process with data to share generates a tuple and places it into the tuple space. A process requiring data simply requests a tuple from the tuple space.
Although not quite as efficient as message-passing systems, tuple space programs are typically easier to write and maintain for a number of reasons including the following:
(1) Destination uncoupling (fully anonymous communication): the creator of a tuple requires no knowledge about the future use of that tuple or its destination.
(2) Spatial uncoupling: because tuples are retrieved using an associative addressing scheme, multiple address-space-disjoint processes can access tuples in the same way.
(3) Temporal uncoupling: tuples have their own life span, independent of the processes that generated them or any processes that may read them. This enables time-disjoint processes to communicate seamlessly.
Research into tuple space systems has proceeded at a steady pace for the past fifteen years, but its focus has been primarily on high-performance parallel computing systems. Recently, interest in tuple space has developed among researchers in distributed systems. For example, Sun Microsystems, Inc. has released, as part of the Jini connection technology, JavaSpaces, which is based on a tuple space based coordination language, as described in the JavaSpaces Specification, Version 1.0.1, Sun Microsystems, Inc., 1999. Some other known implementations and derivatives of the Linda language are WCL, PageSpace, TuSCon, Jada, TSpaces, KLAIM, Lime, C-Linda, Paradise, Melinda, Ease, Lucinda, FORTRAN-Linda, LindaLISP, and Prolog-Linda.
Tuple space based coordination languages can provide the essential features (spatial and temporal separation) required for many different types of distributed applications, especially for use over the Internet. Developed by scientists and academicians, the Internet was originally used to share research information and to collaborate. However, the Internet now encompasses millions of world-wide computers networked together.
The Internet functions as an object-based multimedia system. It allows for the creation, storage, and delivery of multimedia objects. Numerous on-line service providers, such as Microsoft Network, CompuServe, Prodigy, and America Online, have linked to the Internet. This enables their customers to access a variety of products and services available from independent content providers and other Internet users. For example, a typical customer can access electronic mail, news services, travel services, investment and banking services, online stores and malls, entertainment services, and many other services on the Internet.
There are two distinct types of implementations of tuple space based coordination languages (e.g., Linda), characterized as being either "closed" or "open". The closed implementations use compile time analysis of object and source code to provide highly efficient closed programs. The open implementations allow processes, agents, and programs to coordinate through tuple spaces without the run-time system requiring any prior knowledge. Essentially, the open implementations provide a persistent data store.
The Linda language uses three standard instructions or primitives. These are (with their informal semantics):
(1) out(tuple) Insert a tuple into a tuple space.
(2) in(template) If a tuple exists that matches the template, then remove the tuple and return it to the agent performing the in. If no matching tuple is available, then the primitive blocks until a matching tuple is available.
(3) rd(template) If a tuple exists that matches the template, then return a copy of the tuple to the agent that performed the rd. If there is no matching tuple, then the primitive blocks until a matching tuple is available.
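The informal semantics of the three primitives can be sketched as a small in-memory tuple space in Python. This is an illustration under stated assumptions, not an implementation taken from the specification: the class name `TupleSpace` is hypothetical, `in` is spelled `in_` because `in` is a reserved word in Python, a template field that is a Python type (such as `int`) is assumed to act as a typed placeholder, and the optional `timeout` argument is an addition made only so that the blocking behavior can be exercised without hanging.

```python
import threading


def matches(template, tup):
    # ASSUMPTION: a type object in the template is a typed placeholder;
    # any other template field must equal the tuple field.
    if len(template) != len(tup):
        return False
    for t, f in zip(template, tup):
        if isinstance(t, type):
            if not isinstance(f, t):
                return False
        elif t != f:
            return False
    return True


class TupleSpace:
    """Hypothetical single-process tuple space with out, in_, and rd."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def _find(self, template):
        for tup in self._tuples:
            if matches(template, tup):
                return tup
        return None

    def out(self, tup):
        # Insert a tuple and wake any agents blocked in in_ or rd.
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    def _wait_for_match(self, template, timeout):
        # Block until a matching tuple exists (or the assumed timeout expires).
        found = self._cond.wait_for(
            lambda: self._find(template) is not None, timeout)
        if not found:
            raise TimeoutError("no matching tuple became available")
        return self._find(template)

    def in_(self, template, timeout=None):
        # Remove the matching tuple and return it to the caller.
        with self._cond:
            tup = self._wait_for_match(template, timeout)
            self._tuples.remove(tup)
            return tup

    def rd(self, template, timeout=None):
        # Return the matching tuple without removing it; Python tuples are
        # immutable, so returning the stored object is as safe as a copy.
        with self._cond:
            return self._wait_for_match(template, timeout)
```

In this sketch, after `ts.out(("count", 1))` a call to `ts.rd(("count", int))` returns the tuple and leaves it in place, whereas `ts.in_(("count", int))` returns it and removes it, so a later `rd` with the same template blocks until another agent performs a matching `out`.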
The informal semantics of the in primitive leads implementers to remove the tuple that is returned to the agent from the tuple space when the in primitive is performed. However, this causes an efficiency problem in a distributed computer system where multiple users may be concurrently trying to access a particular tuple, because once the tuple has been removed the other users are blocked from accessing it until a matching tuple is available.
Thus there is a significant need to modify existing tuple space based coordination languages to improve their efficiency of execution on large, open implementations, particularly on distributed computer systems, so that users are not unnecessarily blocked from accessing tuples.
There is also a significant need to modify existing tuple space based coordination languages to improve their efficiency of execution on closed implementations, particularly on high-performance parallel computer systems, so that users are not unnecessarily blocked from accessing tuples.
The above-mentioned shortcomings, disadvantages, and problems are addressed by the present invention, which will be understood by reading and studying the following specification.
The present invention includes a number of different aspects for improving the performance of a data processing system. For the purposes of describing this invention, the term "performance" is intended to include within its meaning not only the operational performance, but also the function, structure, operation, code, and behavior of a data processing system.
While the invention has utility in increasing the performance of distributed and parallel-processing systems, its utility is not limited to such, and it has utility in increasing the performance of a wide spectrum of data processing systems, particularly those that utilize tuple space based coordination languages.
An optimization to tuple space based coordination languages has been developed for improving concurrency within data processing systems utilizing such languages. With improved concurrency, more processes can run on such systems concurrently rather than having to wait for one or more other processes to finish.
The present invention solves the above stated problems of existing tuple space based coordination languages by providing an efficient optimization that can be implemented as an extension to such languages. The extension increases concurrency by lessening tuple removal, without requiring compile time analysis, altering existing primitives, or adding new primitives.
The optimization is based upon certain conditions which, if met, enable a tuple to remain visible in tuple space, thereby reducing the number of occasions on which accesses to the tuple cause blocking. As a result, other processes can continue to read the tuple while a first process is updating it.
Subsequent access to a tuple that has been accessed by another agent is permitted, but only under certain conditions. Generally, if any of these conditions is not true, then the subsequent access is blocked. The term "agent" is used herein to mean the entities or components of the computer system that communicate, including, but not limited to, applications, programs, processes, or "traditional" agents.
The optimization provided by the present invention is sometimes referred to as "tuple ghosting", because a tuple is permitted to remain in tuple space for limited purposes and under certain conditions, when under normal circumstances it should have been deleted.
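The general idea of tuple ghosting can be illustrated with a deliberately simplified Python sketch. The conditions under which a ghosted tuple remains visible are not enumerated in this summary, so the sketch substitutes a single ASSUMED condition purely for illustration: a tuple removed by an agent stays readable by other agents until the removing agent performs its next tuple space operation. The class name `GhostingTupleSpace`, the explicit `agent` argument, and the choice to return `None` where a real primitive would block are all hypothetical.

```python
def matches(template, tup):
    # ASSUMPTION: a type object in the template is a typed placeholder;
    # any other template field must equal the tuple field.
    if len(template) != len(tup):
        return False
    for t, f in zip(template, tup):
        if isinstance(t, type):
            if not isinstance(f, t):
                return False
        elif t != f:
            return False
    return True


class GhostingTupleSpace:
    """Illustrative only; returns None where a real primitive would block."""

    def __init__(self):
        self._live = []    # ordinary tuples, visible to in_ and rd
        self._ghosts = {}  # agent id -> tuple that agent removed with in_

    def _drop_ghost(self, agent):
        # ASSUMED condition: an agent's ghost vanishes at that agent's
        # next tuple space operation.
        self._ghosts.pop(agent, None)

    def out(self, agent, tup):
        self._drop_ghost(agent)
        self._live.append(tuple(tup))

    def in_(self, agent, template):
        self._drop_ghost(agent)
        # Only live tuples can be removed; a ghost cannot be removed again.
        for tup in self._live:
            if matches(template, tup):
                self._live.remove(tup)
                self._ghosts[agent] = tup  # keep it visible as a ghost
                return tup
        return None  # a real in would block here

    def rd(self, agent, template):
        self._drop_ghost(agent)
        # Readers may see live tuples and ghosts left by other agents.
        for tup in self._live + list(self._ghosts.values()):
            if matches(template, tup):
                return tup
        return None  # a real rd would block here
```

Under these assumptions, if agent A removes `("counter", 5)` in order to update it, agent B can still read `("counter", 5)` in the meantime instead of blocking; once A performs `out` with `("counter", 6)`, the ghost disappears and B reads the new value. A second attempt to remove the ghosted tuple would still have to wait.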
In another embodiment of the invention, a run-time optimization modifies the conditions if the execution is in a closed system that is known not to contain deadlock, further improving performance.
The present invention describes systems, clients, servers, methods, computer-readable media, and data structures, all of varying scope. In addition to the advantages of the present invention described in this summary, further advantages of the invention will become apparent by reference to the drawings and by reading the detailed description that follows.