A Computer Program Listing Appendix, containing one (1) file on compact disc, is included with this application.
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The present invention relates generally to data access and processing in a distributed computing system and, more particularly, to a system implementing methodology for improving data streaming of objects in distributed computer environments.
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added or removed from data files, information retrieved from or updated in such files, and so forth, all without user knowledge of underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level. The general construction and operation of a database management system is known in the art. See e.g., Date, C., An Introduction to Database Systems, Volume I and II, Addison Wesley, 1990; the disclosure of which is hereby incorporated by reference.
DBMS systems have long since moved from a centralized mainframe environment to a de-centralized or distributed environment. One or more PC “client” systems, for instance, may be connected via a network to one or more server-based database systems (SQL database server). Well-known examples of computer networks include local-area networks (LANs) where the computers are geographically close together (e.g., in the same building), and wide-area networks (WANs) where the computers are farther apart and are connected by telephone lines or radio waves.
Often, networks are configured as “client/server” networks, such that each computer on the network is either a “client” or a “server.” Servers are powerful computers or processes dedicated to managing shared resources, such as storage (i.e., disk drives), printers, modems, or the like. Servers are often dedicated, meaning that they perform no other tasks besides their server tasks. For instance, a database server is a computer system that manages database information, including processing database queries from various clients. The client part of this client-server architecture typically comprises PCs or workstations which rely on a server to perform some operations. Typically, a client runs a “client application” that relies on a server to perform some operations, such as returning particular database information. Often, client-server architecture is thought of as a “two-tier architecture,” one in which the user interface runs on the client or “front end” and the database is stored on the server or “back end.” The actual business rules or application logic driving operation of the application can run on either the client or the server (or even be partitioned between the two). In a typical deployment of such a system, a client application, such as one created by an information service (IS) shop, resides on all of the client or end-user machines. Such client applications interact with host database engines (e.g., Sybase® Adaptive Server™), executing business logic which traditionally ran at the client machines.
More recently, the development model has shifted from standard client/server or two-tier development to a three-tier (or n-tier), component-based development model. This newer client/server architecture introduces three well-defined and separate processes, each typically running on a different platform. A “first tier” provides the user interface, which runs on the user's computer (i.e., the client). Next, a “second tier” provides the functional modules that actually process data. This middle tier typically runs on a server, often called an “application server.” A “third tier” furnishes a database management system (DBMS) that stores the data required by the middle tier. This tier may run on a second server called the database server.
The three-tier design has many advantages over traditional two-tier or single-tier designs. For example, the added modularity makes it easier to modify or replace one tier without affecting the other tiers. Separating the application functions from the database functions makes it easier to implement load balancing. Thus, by partitioning applications cleanly into presentation, application logic, and data sections, the result will be enhanced scalability, reusability, security, and manageability.
In a typical client/server environment, the client knows about the database directly and can submit a database query for retrieving a result set which is generally returned as a tabular data set. In a three-tier environment, particularly a component-based one, the client never communicates directly with the database. Instead, the client typically communicates through one or more components. Components themselves are defined using one or more interfaces, where each interface is a collection of methods. In general, components return information via output parameters. In the conventional, standard client/server development model, in contrast, information is often returned from databases in the form of tabular result sets, via a database interface such as Open Database Connectivity (i.e., ODBC, available from Microsoft Corp. of Redmond, Washington) or Java Database Connectivity (i.e., JDBC, available from Sun Microsystems of Mountain View, California). A typical three-tier environment would, for example, include a middle tier comprising business objects implementing business rules (logic) for a particular organization. The business objects, not the client, communicate with the database.
For their part, application writers or developers like to write object-oriented programs using modern object-oriented programming techniques. At the same time, however, these developers prefer to have their data (i.e., the data employed by the application) stored in a database having relational tables, as that is an easy way of storing and retrieving data. A particular problem arises when one wants to retrieve data from the database for use (e.g., manipulation) within one's program: how is this “flat” data converted into objects? In this regard, “object” refers to the specific programming construct that defines associated data members and methods (typically, including data hiding and containment), such as an object instantiated from a C++ class, a Java class, an Object Pascal class, or the like.
Today, there are products available to perform object/relational mapping of that nature. Typically, such products operate as an additional, add-in tool that, after examination of the underlying table(s), builds corresponding wrappers for achieving a degree of object/relational mapping. As a simple example, given an employee table, such a tool might, for instance, display a user interface facilitating the creation of an employee C++ class definition for use in the user's application code. One approach, for example, would be to generate public members that correspond to table columns, with perhaps some automated checking of business rules.
The foregoing approach is disadvantageous, however. For instance, since the approach is not well integrated with the underlying database system, the user is not able to manipulate objects using functionality that is otherwise available to the database system. For example, the user is not able to employ an SQL query having a predicate that refers to the objects, including individual fields and methods of those objects. As another problem, present-day tools still require a fair degree of manual intervention by the user. Thus, although such tools provide graphical user interfaces (GUI) for assisting with the task of browsing table definitions and constructing associated objects (e.g., by assisting with object definitions during code generation), such an approach still requires the user to manually assist in carrying out object mappings. As another problem, current object/relational mapping tools provide little or no inter-operability. Once the user has selected a particular tool, he or she will likely have to “stick” with that tool for the duration of program development. If the schema of the underlying database changes, the client applications created as a result of that tool will likely be broken, or at least require significant modification.
What is instead desired is the ability to create objects, particularly Java objects, in one's application and store those objects within tables, or locate an object created by another application and retrieve it into one's application for local processing. In other words, developers want the ability to manipulate such objects locally, instead of having to operate on such objects (particularly, database objects) at a distance, such as using proprietary extensions to SQL. Specifically, developers want the ability to bring database objects to the clients, in result sets, as database cursors, as stored procedure output parameters, and the like, for local manipulation. The present invention fulfills this and other needs.
A distributed (e.g., client/server) computing environment is described which, in accordance with the present invention, simplifies the use of objects in distributed applications. In particular, the invention provides methodology for streaming to clients objects (e.g., Java objects) stored and managed remotely (e.g., objects stored and managed in relational databases), so that the objects may be executed or otherwise manipulated locally at the clients.
The present invention may be implemented by taking an existing streaming protocol, such as Sybase Tabular Data Stream (TDS) protocol or other comparable streaming protocol, and extending it in the following manner. The protocol is extended to include a “chunked” data type, so that within a data stream the system can have individual data items which are themselves streams of indeterminate length. This streaming data type is an undifferentiated data type or simply a “BLOB” (i.e., binary large object). Using the BLOB extension, the system provides a set of BLOB subtypes which take advantage of existing object streaming mechanisms (e.g., Java streaming) but convey additional information in the form of self-describing metadata (which precedes all TDS data types). This extended metadata is present in ROW_FORMAT (i.e., row format) and PARAM_FORMAT (i.e., parameter format) tokens. This metadata contains all necessary information on the BLOB data for clients and servers to narrow the BLOB data itself to the appropriate subtype and extract the semantically correct values from it.
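The notion of a “chunked” data item of indeterminate length can be illustrated with a minimal Java sketch. The framing used here (a 4-byte length prefix before each chunk and a zero-length chunk as terminator) and the names ChunkedBlob, writeChunked, and readChunked are assumptions made purely for illustration; they are not taken from the TDS specification and do not represent its actual wire format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical chunk framing for illustration only; NOT the actual TDS wire format. */
final class ChunkedBlob {

    /** Copies source to out as length-prefixed chunks, ending with a zero-length chunk. */
    static void writeChunked(InputStream source, DataOutputStream out, int chunkSize)
            throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buf = new byte[chunkSize];
        int n;
        // The sender never needs to know the total length in advance.
        while ((n = source.read(buf)) > 0) {
            out.writeInt(n);       // assumed 4-byte chunk length
            out.write(buf, 0, n);  // chunk payload
        }
        out.writeInt(0);           // assumed terminator: zero-length chunk
    }

    /** Reads chunks until the zero-length terminator and returns the reassembled bytes. */
    static byte[] readChunked(DataInputStream in) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        int n;
        while ((n = in.readInt()) != 0) {
            if (n < 0) {
                throw new IOException("negative chunk length: " + n);
            }
            byte[] chunk = new byte[n];
            in.readFully(chunk);
            result.write(chunk, 0, n);
        }
        return result.toByteArray();
    }
}
```

Because each chunk carries its own length, a data item of this kind can sit inside a larger data stream alongside fixed-length items without the sender having to buffer the whole value first.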
One of the BLOB subtypes is defined as JAVA_OBJECT1. Here, the BLOB data contains the serialized value of a Java Object using the Java version 1 serialization format (as defined by Sun Microsystems, of Mountain View, California). Additionally, the capabilities of the existing negotiation part of the login sequence are extended so that the client and server sides of a TDS connection can be sure when it is appropriate to send JAVA_OBJECT1 data. In this manner, client and server products can now use this extended version of a tabular streaming protocol to exchange serialized Java Objects as parameters or row data.
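By way of illustration only, the following Java sketch shows how a value might be turned into JAVA_OBJECT1-style BLOB data using standard Java serialization (java.io.ObjectOutputStream and java.io.ObjectInputStream), and how a sender might refuse to emit such data when the peer has not advertised support for it. The class JavaObjectBlob, the Employee class, and the boolean flag standing in for the outcome of capability negotiation are hypothetical names introduced for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Illustrative sketch only; names and the capability flag are assumptions. */
final class JavaObjectBlob {

    /** Example serializable object, echoing the employee record used in the background. */
    static final class Employee implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final double salary;

        Employee(String name, double salary) {
            this.name = name;
            this.salary = salary;
        }
    }

    /**
     * Serializes value into BLOB bytes. The flag is a hypothetical stand-in for
     * the result of the login-time capability negotiation.
     */
    static byte[] toBlob(Serializable value, boolean peerSupportsJavaObjects)
            throws IOException {
        if (!peerSupportsJavaObjects) {
            throw new IllegalStateException("peer did not negotiate JAVA_OBJECT1 support");
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    /** Recreates a local copy of the object from BLOB bytes. */
    static Object fromBlob(byte[] blob) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return in.readObject();
        }
    }
}
```

The receiving side must have the object's class available locally for readObject to succeed, and a production system would typically restrict which classes may be deserialized from data it does not trust.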
An improved method of the present invention for allowing a client to retrieve an executable object stored in a database table residing on a database server, embodied in a system comprising a computer network having a database server and a client, includes the following steps. The system provides a streaming protocol for effecting communication between the client and the database server, wherein the protocol supports streaming of an undifferentiated data type, and wherein the protocol provides subtypes providing metadata that conveys additional information for extracting an executable object from a stream. A request is received from the client for retrieving information from the database server, where the request includes a request to transport to the client a particular executable object stored at the database table. In response to the request, the system operates to retrieve a result set from the database server corresponding to the requested particular executable object and stream the result set to the client; the result set includes the particular executable object together with corresponding metadata. Upon receipt of the result set at the client, the metadata is used for recreating at the client a local copy of the particular executable object.
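The final step of the foregoing method, in which metadata accompanying the result set is used to recreate the object at the client, may be sketched as follows. This is a simplified illustration under stated assumptions: the ColumnFormat class is a hypothetical stand-in for the per-column metadata carried in a row format token, the BlobSubtype enumeration is invented for the sketch, and no network transport is shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; all types here are hypothetical. */
final class ResultSetSketch {

    /** Assumed subtypes: raw bytes, or a serialized Java object. */
    enum BlobSubtype { RAW, JAVA_OBJECT1 }

    /** Hypothetical per-column metadata that precedes the row data. */
    static final class ColumnFormat {
        final String name;
        final BlobSubtype subtype;

        ColumnFormat(String name, BlobSubtype subtype) {
            this.name = name;
            this.subtype = subtype;
        }
    }

    /** Server side: encodes one column value according to its metadata. */
    static byte[] encode(Object value, ColumnFormat format) throws IOException {
        if (format.subtype == BlobSubtype.JAVA_OBJECT1) {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        }
        return (byte[]) value; // RAW columns are passed through unchanged
    }

    /** Client side: narrows BLOB data to the subtype named in the metadata. */
    static Object decode(byte[] data, ColumnFormat format)
            throws IOException, ClassNotFoundException {
        if (format.subtype == BlobSubtype.JAVA_OBJECT1) {
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
                return in.readObject(); // local copy of the stored object
            }
        }
        return data;
    }

    /** Client side: decodes every column of one row using the matching metadata. */
    static List<Object> decodeRow(List<ColumnFormat> formats, List<byte[]> row)
            throws IOException, ClassNotFoundException {
        if (formats.size() != row.size()) {
            throw new IllegalArgumentException("row does not match its format");
        }
        List<Object> values = new ArrayList<Object>();
        for (int i = 0; i < row.size(); i++) {
            values.add(decode(row.get(i), formats.get(i)));
        }
        return values;
    }
}
```

In this sketch the client never inspects the BLOB bytes to guess their meaning; the decision to deserialize is driven entirely by the metadata that arrives ahead of the row, which mirrors the self-describing approach described above.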