1. Field of the Invention
This invention relates to object management in relational database systems for storing and manipulating any kind of data on the internet. With the advancement of the internet and the world wide web, a large number of different types of objects (text, file, audio, video, image as well as relational data) are being created every day. One can look at the internet as a huge database storing different types of data. Querying such a large database from many different perspectives is a nontrivial task. Additionally, database transactions over the web, internet commerce, security and distributed multi-tier application architectures are creating demand for new technology solutions. This invention relates to these specific technology needs.
2. Description of the Prior Art
Information technology is shifting towards multi-tier solutions. In the last decade, the client/server computing paradigm began with the idea of separating database management from applications. The design of the relational model was primarily oriented towards data independence from the application perspective, achieved by putting the databases on a shared server. Referential integrity was maintained in the database server rather than in the application logic. The communication mechanism involved SQL calls travelling from clients to the server. However, the two-tier client/server model does not scale well to support the large numbers of users, high transaction volumes and unpredictable workloads of internet applications. New application architectures are distributed and component oriented so that they can adapt to rapid changes in business and technology. Emerging application architectures are multi-tiered, involving thin clients (for example, browsers requiring no application installation or support), application servers to manage the business logic, and data sources on various platforms. A component oriented architecture will easily integrate legacy, current and emerging technologies. Seamless integration and communication among various components require an extensive infrastructure or middleware architecture. This middle tier goes by various names: transaction server, application server, component server and business rule server. The basic abilities required of this middle tier include scalability, adaptability, recoverability and manageability.
The web is the computing platform of the future, as more platform independent applications are being distributed over the web. As the web evolves, there will be more dynamic information retrieval and electronic commerce, for which the middle tier software should support high volume secure transactions. Internet communications are mainly based upon HTTP (Hypertext Transfer Protocol), CGI (Common Gateway Interface), IIOP (Internet Inter-ORB Protocol) and JDBC (Java Database Connectivity). HTTP is the main communication mechanism among web browsers and servers. HTTP is a stateless protocol, which implies that there is no way for the client and the server to know each other's state. Database operations such as scrolling over result sets or updating tables need maintenance of system and transaction state. Since the web is stateless, each transaction consumes time and resources in the setup and closing of network and database connections. For large online transaction processing applications, this overhead will be significant. CGI scripts are used to create new HTML (Hypertext Markup Language) pages for static data access from databases. This mechanism is also stateless. JDBC is an application programming interface (API) for Java. It provides a synchronous communication mechanism to most relational databases through a common API.
Internet Inter-ORB Protocol (IIOP) is a dynamic invocation interface for the web. IIOP is the most promising protocol. This protocol maintains the session between the client and the server objects until either side disconnects. It provides persistent connectivity over the web for distributed objects. Common Object Request Broker Architecture (CORBA) specifies the Object Request Broker (ORB) that allows applications to communicate with one another no matter where they reside on the web. The IIOP specification defines a set of data formatting rules, called CDR (Common Data Representation), which is tailored to the data types supported in the CORBA interface definition language (IDL). Using CDR data formatting rules, the IIOP specification defines a set of message types that support all of the ORB semantics defined in CORBA. The IIOP specification requires that object request brokers send messages over a TCP/IP connection, a standard connection oriented transport protocol. Moving from HTTP to IIOP will be transparent to end users, except for the fact that with IIOP the applications will become more sophisticated and have better performance. As web related CORBA standards progress, there are going to be standard URL formats for object references and requests. This gives less sophisticated users access to powerful object oriented services through the web.
Uniform resource locators (URLs) representing references to CORBA objects can be treated as complex values for abstract types. URLs are currently used to reference web objects such as text, image and sound on the web. URLs are frequently embedded in HTML pages, where a browser can navigate through the resource locator to find and manipulate web objects. Extending the use of URLs to represent CORBA objects (more specifically CORBA business objects) and embedding such URLs as attribute values in tables or other hypertext documents will make CORBA objects behave like uniform web entities such as text, image and audio. A browser can navigate through such a URL representing a CORBA object and apply business logic to retrieve computed values after querying a relational database dynamically. This capability to store and manipulate distributed CORBA and web objects represented in the form of URLs leads to interesting possibilities for databases and middle tier software.
In a middle tier application server, certain features are necessary. These common features are scalability, security, transaction management, concurrency and serialization, state management with persistence, exception/fault resolution, composite object creation with multiple components, object life cycle management including transparent persistence, dynamic location of objects and referential integrity. These capabilities are also the essential ingredients of a relational database server. A relational database management system deals with transactions, concurrency, recovery, fault tolerance, security etc. A relational database server thus provides almost all the services and capabilities necessary for an application server. It is possible to define abstract types with CORBA business objects located by URL references over the web and then to store the URL references in an object oriented relational database which defines and maintains abstract attribute types. Universal relational databases supporting object definitions are currently available as products. Such databases, with proper modeling, extensions and modifications, can address the needs of a multi-tier transaction and application model for the web.
A database schema is the type description for the tables and attributes. Such a schema can be partitioned over the web in such a way that disparate business logic and business objects can exist with relevant data and relational views over the web. Unifying the object paradigm and the relational model paradigm is the mainstream effort across the industry. A unified model for distributed relational databases integrated with the object model is the key to many storage and manipulation issues for web objects.
Universal relational database servers are available from different database vendors to offer general extensibility. One can extend the types of attributes in tables and integrate user defined routines written in high level programming languages. A data type is a descriptor that is assigned to a variable or column indicating the type of data that it can hold. The data type system of a universal server handles the interaction with the data types. To specify a data type, the universal server needs to determine the following: (1) What layout or internal structure can the database server use to store the data type values on disk? (2) What operations (such as multiplication or string concatenation) are applicable to the specified data type? (3) Which access methods should the database server use for the data type? An access method defines how to handle the following: (a) storage and retrieval of a particular data type in a table (a primary access method), and (b) storage and retrieval of a particular data type in an index (a secondary access method).
One such universal server available as a product offers the facility of user defined routines. A user-defined routine (UDR) is a routine that a user creates and registers in the system catalog tables and that is invoked within a SQL statement or another routine. A function is a routine that optionally accepts a set of arguments and returns a set of values. A function can be used in SQL expressions. A procedure is a routine that optionally accepts a set of arguments and does not return any values. A procedure cannot be used in SQL expressions because it does not return a value. A UDR can be either a function or a procedure. The ability to integrate user defined routines and functions is the extensibility feature offered by universal servers.
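The registration and invocation of a user defined function can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for a universal server; the table `items` and the routine `discounted_price` are hypothetical names chosen for illustration and do not come from any vendor's product.

```python
import sqlite3

# Hypothetical user defined function: accepts arguments and returns a value,
# so it can appear inside a SQL expression.
def discounted_price(price, percent):
    return round(price * (1 - percent / 100.0), 2)

conn = sqlite3.connect(":memory:")

# Register the routine with the database engine under a SQL-visible name.
# (SQLite keeps this registration per connection, not in catalog tables.)
conn.create_function("discounted_price", 2, discounted_price)

conn.execute("CREATE TABLE items (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("widget", 20.0), ("gadget", 50.0)],
)

# The registered function is invoked from within a SQL statement.
rows = conn.execute(
    "SELECT name, discounted_price(price, 10) FROM items ORDER BY name"
).fetchall()
print(rows)
```

SQLite offers only functions of this kind; a procedure in the sense used above, which returns no value, would instead be invoked as a statement of its own rather than inside an expression.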
A database schema in a universal server can contain clusters, database links, triggers, stored procedures, indexes, tables, views, snapshot logs, packages, object types, object tables, object views and other definitions. Schema objects or parts of schema objects can be local or remote. This means that it is possible in some universal servers to access objects or parts of objects from a schema other than the local schema owned by the user. It should be noted that one must be granted privileges to refer to objects in other schemas. By default, any object or part of an object is resolved in the user's own schema. The syntax for a remote schema object reference is schema.object.part@dblink, where the dblink qualifier allows the user to refer to an object in a database other than the local database. This specific syntax for remote database referencing is supported by the distributed option of a specific database manufacturer's universal server. One can create a database link with the CREATE DATABASE LINK command. The command allows one to specify the name of the database link, a connect string to access the remote database and a user name/password to connect to it. This information is stored in the data dictionary. This facility for accessing schemas and objects in remote databases is available only within the distributed framework offered by a specific database vendor. User defined routines or functions cannot be applied to such remote schema objects. In another vendor's universal database, virtual table interfaces extend the sources of data available to users by adding new access methods. One type of access method is a gateway used by the database server to access data stored inside a source that is external to the server. Gateways are intended to unify all existing heterogeneous data distributed throughout an organization. One can access other vendors' database tables, data stored in sequential files and remote data stored across a network. However, these access methods are limited to tables only and it is not possible to send a query to such remote data sources.
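The structure of the schema.object.part@dblink reference form can be made concrete with a small parsing sketch. The function `parse_reference` and the type `SchemaObjectRef` are hypothetical helpers, not part of any vendor's server, and the handling of a two-part name (read here as schema.object) is an assumption; a real server resolves such names from context.

```python
from typing import NamedTuple, Optional

class SchemaObjectRef(NamedTuple):
    schema: Optional[str]       # None means the user's own schema (the default)
    object_name: str
    part: Optional[str]
    dblink: Optional[str]       # None means the local database

def parse_reference(text: str) -> SchemaObjectRef:
    """Split a reference of the form schema.object.part@dblink."""
    # Everything after the first "@" is the database link name.
    name, sep, dblink = text.partition("@")
    if sep and not dblink:
        raise ValueError("empty database link in %r" % text)
    pieces = name.split(".")
    if not all(pieces) or len(pieces) > 3:
        raise ValueError("malformed reference %r" % text)
    link = dblink if sep else None
    if len(pieces) == 1:
        return SchemaObjectRef(None, pieces[0], None, link)
    if len(pieces) == 2:
        # Assumption: a two-part name is schema.object, not object.part.
        return SchemaObjectRef(pieces[0], pieces[1], None, link)
    return SchemaObjectRef(pieces[0], pieces[1], pieces[2], link)

remote = parse_reference("hr.employees.salary@sales_db")
local = parse_reference("employees")
print(remote)
print(local)
```

A reference without the @dblink qualifier yields `dblink=None`, which mirrors the default resolution against the local database described above.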
Data types, as mentioned earlier, are always defined for columns in a table in a universal relational schema. Types can be built-in or user defined. Internal or built-in data types can be VARCHAR (a variable length character string with a maximum size in bytes), CLOB (a character large object, up to a maximum size, containing single byte characters), BLOB (a binary large object with a maximum size), BFILE (a pointer to a large binary file stored outside the database) etc. Also, in one vendor's database product, internal large object (LOB) data types may be included in one of these categories and can store data such as text, image, video, spatial data, etc. Internal LOB columns contain LOB locators that can refer to out-of-line or inline LOB values. Selecting a LOB column value returns the LOB locator and not the entire LOB value. Different operations, in the form of packages and functions, are performed through these locators. Multiple LOB data type columns can be defined in a table, and all SQL operations are possible over such tables and attributes. A LOB locator can be stored in the table column either with or without the actual LOB value. BLOB and CLOB values are stored in separate table spaces, and BFILE data is stored in an external file on the server. These type definitions, however, confine LOBs to the database server space; they cannot reach different databases at different locations on the internet. The concept of a LOB locator is therefore not generic for referencing any kind of web object at disparate locations.
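The locator mechanism described above can be sketched as follows. This is a simplified model under stated assumptions, not any vendor's implementation: the classes `LobLocator` and `LobStorage` and the sample row are hypothetical, and a plain dictionary stands in for the separate storage area that holds out-of-line LOB values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LobLocator:
    """Small handle stored in the table column in place of the LOB value."""
    lob_id: int

class LobStorage:
    """Stand-in for the storage area holding out-of-line LOB values."""

    def __init__(self):
        self._values = {}
        self._next_id = 1

    def store(self, data: bytes) -> LobLocator:
        locator = LobLocator(self._next_id)
        self._values[locator.lob_id] = data
        self._next_id += 1
        return locator

    # Operations are performed through the locator, not on the raw value.
    def length(self, locator: LobLocator) -> int:
        return len(self._values[locator.lob_id])

    def read(self, locator: LobLocator, offset: int, amount: int) -> bytes:
        return self._values[locator.lob_id][offset:offset + amount]

storage = LobStorage()
row = {"title": "product photo", "image": storage.store(b"GIF89a" + bytes(1000))}

selected = row["image"]                  # selecting the column yields only the locator
header = storage.read(selected, 0, 6)    # fetch a small piece through the locator
print(selected, storage.length(selected), header)
```

The locator is meaningful only to the storage that issued it: passing `selected` to a different `LobStorage` instance raises `KeyError`, which parallels the limitation that a LOB locator cannot reference objects outside its own server.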
User defined types are also currently supported by different universal servers from different vendors. User defined types use internal built-in types and other user defined data types as the building blocks of types that model the structure and behavior of data in applications. Usually commands such as CREATE TYPE, CREATE ROW TYPE and CREATE OPAQUE TYPE are used to create an object type, a nested table type and other complex user defined types. These types can be associated with operations or methods in order to operate on the instances of those types. An object identifier (OID) allows the corresponding object to be referenced from other objects or from relational tables, all in the same server. A built-in data type called REF represents such a reference in one of the available universal servers. A REF is a container for an object identifier and points to an object. The content of a REF can be replaced with a reference to a different object. A table can have top-level REF columns or can have REF attributes embedded within an object column. In general, if a table has a REF column, each REF value in the column could reference a row in a different table. The scope of such references is restricted. These object references can be used to refer to a view type (representing a query) or any other user defined type. These references are limited to the data and object spaces within a server and cannot reach any object or any other database on the web.
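The behavior of a REF as a replaceable container for an object identifier can be sketched in a few lines. The names `Ref`, `ObjectTable` and the sample data are hypothetical and serve only as an illustration of the concept; they do not reproduce the REF type of any particular server.

```python
from dataclasses import dataclass

@dataclass
class Ref:
    """Container for an object identifier (OID); its content can be replaced."""
    oid: int

class ObjectTable:
    """Stand-in for an object table held inside a single server."""

    def __init__(self):
        self._rows = {}

    def insert(self, oid: int, value: dict) -> None:
        self._rows[oid] = value

    def deref(self, ref: Ref) -> dict:
        # Resolution succeeds only for OIDs known to this server.
        return self._rows[ref.oid]

customers = ObjectTable()
customers.insert(101, {"name": "Acme"})
customers.insert(102, {"name": "Globex"})

# A relational row with a top-level REF column pointing at a customer object.
order = {"order_no": 1, "customer": Ref(101)}
first = customers.deref(order["customer"])

# Replace the content of the REF so that it points to a different object.
order["customer"].oid = 102
second = customers.deref(order["customer"])
print(first, second)
```

Because `deref` consults only the local object table, an OID issued elsewhere cannot be resolved, which corresponds to the restricted scope of such references noted above.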
As described above, there is a clear need in the art for efficient relational database management systems that a) support the distributed object paradigm for business application logic and b) support heterogeneous data over the internet. There are further needs for a universal framework for internet transactions, security, various access techniques and object support in SQL for manipulating legacy databases.