This invention relates to accessing information in a database.
A database is a body of information that is logically organized so that it can be retrieved, stored and searched in a coherent manner by a "database engine," a collection of software methods for retrieving or manipulating data in the database. Databases generally fall into three categories: relational databases, object-oriented databases and object-relational databases.
A relational database (RDB) is a collection of fixed-field two-dimensional tables that can be related (or "joined") to each other in virtually any manner a database developer chooses. The structure of a relational database can be modified by selectively redefining the relationships between the tables. A database engine may perform complex searches on a relational database quickly and easily by using any of various database query protocols such as the method expressed by the Structured Query Language (SQL) or by other mechanisms. The relationships between the tables enable results of a search to be automatically cross-referenced to corresponding information in other tables in the database. As shown in FIG. 1, for example, a relational database 100 includes a customer table 102 which is joined by a logical link 103 to an order table 104 which in turn is joined by a logical link 105 to an inventory table 106. A user may query the database 100, for example, for all order numbers higher than a threshold value. Because the order table 104 is joined with the customer table 102 and the inventory table 106, the list of order numbers resulting from the query can be retrieved and displayed along with the respective customer names and inventory items that correspond to the identified order numbers.
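By way of illustration only, the threshold query on the joined tables of FIG. 1 might be sketched as follows using Python's built-in sqlite3 module. The table names, column names and sample rows are assumptions made for this sketch and are not taken from the figure.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical customer, orders and inventory tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE inventory (item_id INTEGER PRIMARY KEY, item TEXT);
        CREATE TABLE orders (
            order_no INTEGER PRIMARY KEY,
            cust_id  INTEGER REFERENCES customer(cust_id),
            item_id  INTEGER REFERENCES inventory(item_id)
        );
        INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO inventory VALUES (10, 'widget'), (11, 'gear');
        INSERT INTO orders VALUES (100, 1, 10), (101, 2, 11), (102, 1, 11);
        """
    )
    return conn


def orders_above(conn, threshold):
    """Return (order number, customer name, inventory item) for orders above the threshold."""
    cur = conn.execute(
        """
        SELECT o.order_no, c.name, i.item
        FROM orders AS o
        JOIN customer AS c ON c.cust_id = o.cust_id   -- plays the role of link 103
        JOIN inventory AS i ON i.item_id = o.item_id  -- plays the role of link 105
        WHERE o.order_no > ?
        ORDER BY o.order_no
        """,
        (threshold,),
    )
    return cur.fetchall()
```

Because the joins are part of the query, each matching order number comes back already paired with its customer name and inventory item.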
An object-oriented database (OODB) is a collection of "objects," that is, software elements that contain both data and rules for manipulating that data. In contrast to a relational database, which can store only character-type data, an OODB can store data of virtually any type (text, 3D graphic images, video clips, etc.). An OODB stores its constituent objects in a hierarchy of classes with associated rules so that the OODB contains much of the logic it needs to do useful work. A relational database, in contrast, contains only data and must rely on external application software to perform useful functions with the data.
An object-relational database (ORDB) is a hybrid of the other two types. Non-character data (e.g., an image file) may be stored and retrieved in an ORDB as a binary large object (BLOB), an undifferentiated mass of data. Rules for manipulating the data contained within a BLOB (e.g., a utility for viewing image files) may be stored either within the database or external to it depending on the particular ORDB implementation. The Informix® Universal Server (IUS®) is an example of an object-relational database management system (ORDBMS) that internally stores the rules for manipulating BLOBs so that they may be treated as "native" data types, that is, data types that the ORDBMS itself has the capability to manipulate.
Information within a relational or an object-relational database typically is accessed by SQL-compliant computer programs that are written to accomplish a specific function. For example, a user may write a SQL program that retrieves a list of customer names from a database which stores customer information. Alternatively, many different application programs are available that support database queries and allow a user to interactively formulate a database query by specifying an arbitrary set of criteria (e.g., the names of all out-of-state customers with overdue accounts). This type of application program presents the user's database query to the database engine, which retrieves the requested information from the database. Such application programs are referred to as "database-aware" because they have the ability to interact with and manipulate databases.
Most application programs, in contrast, are "database-unaware," meaning that they cannot access information stored in a database. Rather, database-unaware applications rely on file systems, such as the Network File System (NFS) developed by Sun Microsystems, Inc., for storing and retrieving information in discrete files. A database-unaware program stores each separate document in a separate disk file identified by the user of the application. In FIG. 2, for example, a file system 200 has two disk drives mounted: drive 202 which is mapped to the label a: and drive 204 which is mapped to the label b:. Each of the a: and b: drives includes one or more directories (docs on the a: drive 202; dir1 and dir2 on the b: drive 204) which in turn may have subdirectories (subdir1 in dir1; subdir2 and subdir3 in dir2) and so on with virtually any level of hierarchical nesting being possible. Files 206-212 may exist at any of the various directory or subdirectory levels within the file system. The labels a: and b: represent the "namespace" of the file system. That is, all filename paths that begin with a: or b: are within the file system's namespace. As shown in FIG. 2, for example, a document that lists names of out-of-state customers is stored in the file system's namespace at a location defined by the filename path
a:/docs/cust_outstate.txt
which means that a file 211 named "cust_outstate" of the type "txt" is stored in a directory named docs on a disk drive 202 mapped to the label a:. Another document that lists names of customers with overdue accounts is stored in a separate disk file located at the filename path
a:/docs/cust_overdue.txt.
These two files are separate and distinct entities that are not related or joined in the sense that tables in a database are related.
In one aspect of the invention, information in a database is accessed with a computer system by making one or more database objects (e.g., a table or a row) available as one or more file system objects (e.g., directories, files or links) to an application, for example, a database-unaware application. The database may be relational, object-relational or object-oriented. If multiple file system objects are made available, collectively they may represent a hierarchical file system. A file system request issued by the application that corresponds to the file system object is transformed into a database operation, for example, an SQL query, which is performed on the database with a database engine.
Information associated with the database object which is retrieved as a result of the database operation may be formatted into one or more file system objects and returned to the application. The particular formatting of the retrieved information may be defined in an extension module, which also may include information that defines the specific manner in which the file system request should be transformed into a database query. The database operations, including formatting of a database query, retrieving information and formatting it into file system objects, are performed transparently to the application.
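By way of illustration only, the transformation of a file system request into a database operation, and the formatting of the result into file system objects, might be sketched as follows. The path layout (a table as a directory, a row as a file), the EXPOSED_TABLES mapping and the tab-separated file format are assumptions made for this sketch; in the system described, such choices would be defined by an extension module.

```python
import sqlite3

# Hypothetical mapping of exposed table name -> key column used to name row "files".
EXPOSED_TABLES = {"customer": "cust_id"}


def path_to_query(path):
    """Translate a path such as /customer or /customer/2 into (kind, SQL, parameters)."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] not in EXPOSED_TABLES:
        raise FileNotFoundError(path)
    table, key = parts[0], EXPOSED_TABLES[parts[0]]
    if len(parts) == 1:
        # A table is presented as a directory whose entries are the row keys.
        return ("dir", f"SELECT {key} FROM {table} ORDER BY {key}", ())
    if len(parts) == 2:
        try:
            row_key = int(parts[1])
        except ValueError:
            raise FileNotFoundError(path) from None
        # A row is presented as a file.
        return ("file", f"SELECT * FROM {table} WHERE {key} = ?", (row_key,))
    raise FileNotFoundError(path)


def read_path(conn, path):
    """Run the translated query and format the result as a directory listing or file text."""
    kind, sql, params = path_to_query(path)
    rows = conn.execute(sql, params).fetchall()
    if kind == "dir":
        return [str(row[0]) for row in rows]
    if not rows:
        raise FileNotFoundError(path)
    return "\t".join(str(value) for value in rows[0]) + "\n"
```

Table names are taken only from the fixed mapping and row keys are passed as bound parameters, so no part of the requested path is spliced into the SQL text.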
Upon receiving the file system objects, the application may display them on a display screen of a computer, for example, as graphical representations of file system objects. The database object that is made available may be presented as multiple file system objects in formats understandable by different applications. Conversely, a single file system object may correspond to multiple database objects.
In another aspect, a computer-based data repository management system includes a database of information, a file system-based application program for manipulating data, and a file system interface to the database which provides the file system-based application, which otherwise may be database-unaware, with access to information in the database. The data repository management system may further include a database management system which manages information in the database either in addition to, or instead of, the file system-based application.
The data repository management system may include a module for differentiating file system requests directed to the file system from file system requests directed to the file system interface. The file system interface may include one or more extension modules containing one or more file objects, each file object including information for converting database objects into file system objects.
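By way of illustration only, a module that differentiates the two kinds of request might decide by checking whether the requested path falls under a mount point reserved for the file system interface. The RequestRouter class and the /ixfs mount point used below are hypothetical names chosen for this sketch.

```python
from pathlib import PurePosixPath


class RequestRouter:
    """Route a path either to the ordinary file system or to the database interface."""

    def __init__(self, db_mount_points):
        # Paths at or below any of these mount points belong to the database interface.
        self._mounts = [PurePosixPath(m) for m in db_mount_points]

    def route(self, path):
        p = PurePosixPath(path)
        for mount in self._mounts:
            if p == mount or mount in p.parents:
                # Hand the interface the path relative to its own root.
                return ("database", str(p.relative_to(mount)))
        return ("filesystem", str(p))
```

For example, with RequestRouter(["/ixfs"]), the path /ixfs/customer/1 would be routed to the database interface as customer/1, while /home/docs/a.txt would pass through to the ordinary file system unchanged. Matching is done on whole path components, so a sibling directory such as /ixfsfoo is not captured.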
In another aspect, information in a database is accessed with a computer system by encoding a file handle with information that specifies a database object in a database. In response to a file system request issued by an application, the encoded file handle is transmitted and then decoded to identify the database object associated with the file system request. The encoding may be based on the NFS protocol. The encoded information may include information that corresponds to the issued file system request and which identifies an extension module, a database table and row, metadata, a pointer to a database object, or a combination thereof.
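By way of illustration only, encoding and decoding such a file handle might be sketched as follows, assuming a fixed 32-byte opaque handle of the size used by NFS version 2. The field layout (an extension module identifier, a table identifier and a row identifier) and the names used are assumptions made for this sketch and are not the layout of any actual implementation.

```python
import struct
from typing import NamedTuple

FH_SIZE = 32  # NFS version 2 file handles are fixed-size 32-byte opaque values

# Hypothetical layout, network byte order: module id (16 bit), table id (16 bit), row id (64 bit).
_LAYOUT = struct.Struct("!HHQ")


class DbObjectRef(NamedTuple):
    module_id: int
    table_id: int
    row_id: int


def encode_handle(ref):
    """Pack a database object reference into a zero-padded, fixed-size file handle."""
    packed = _LAYOUT.pack(ref.module_id, ref.table_id, ref.row_id)
    return packed.ljust(FH_SIZE, b"\0")


def decode_handle(handle):
    """Recover the database object reference from a handle produced by encode_handle."""
    if len(handle) != FH_SIZE:
        raise ValueError("unexpected file handle size")
    return DbObjectRef(*_LAYOUT.unpack_from(handle))
```

Because the client treats the handle as opaque and simply returns it with each later request, the server side can decode it to identify the database object without keeping per-client state.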
Advantages of the file system interface described here may include one or more of the following. Applications that rely on a file system as a data repository, or which are otherwise database-unaware (i.e., unable to access data in a database), are enabled to access information in a database in a transparent manner. These database-unaware applications can share data seamlessly both with database-aware applications and with other database-unaware applications. Under IXFS, a database may appear to an application as just another local or remote file system that is no different in form or character from the other file systems available to the application. No change to the application's program code, the database or the database engine is required. As a result, users of database-unaware applications are provided with database functionality without having to invest the time and cost typically associated with database-aware tools.
A system administrator may use the IXFS system to combine disparate data storage technologies (e.g., file-based systems with database systems) in creating a unified data repository strategy that spans an enterprise. The enterprise's investment in legacy data repositories is maintained because data present in the repositories may easily be transferred to a database as the enterprise moves to the relational or object-relational model of data storage. Moreover, the enterprise's investment in database-unaware applications is enhanced because IXFS enables them to be used to manage data stored in a database.
The ability for a database-unaware application to access information in a database combines the simplicity of the file system paradigm with the sophistication and effectiveness of database manipulation techniques. This capability is particularly useful for Internet World Wide Web applications in which a user seeks to access a large store of data using, for example, the hypertext transfer protocol (HTTP). In contrast to a common gateway interface (CGI) script, which spawns an external application to retrieve data from a database in response to a URL (Uniform Resource Locator) encoded request, the IXFS system converts such a request into a form that may be executed by a database engine directly, quickly and transparently.
The ability to represent an arbitrary collection of tables in a database as various file system objects provides a software developer with a rich and flexible set of tools. The extensible nature of IXFS allows it to be tailored to virtually any type of application so that the database will appear as a collection of file system objects that are consistent with the application's other file system objects.
Other advantages and features will become apparent from the following description, including the drawings and claims.