A database is an organized collection of data, gathered together into discrete records. Data can be retrieved and/or sorted and consulted to obtain information or generate reports. To facilitate better retrieval and sorting of data, each record is usually organized as a set of data elements. Access to the database to store, retrieve, and change records is typically provided to local or remote clients via a database management system (DBMS). Databases may be relational (RDBMS), object (ODBMS), or object-relational (ORDBMS).
The structure and type of data held in the database are defined by the database's schema. The schema may also describe any mathematical or logical relationships among the data elements. Some schemas provide for security and data validation. For example, a schema may be defined that only allows a particular element of a record to hold positive integers. Or, a schema may be defined that confirms that if one column is filled with data, a second column is filled with data as well.
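The two validation rules just described can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table name `contact` and the columns `name`, `age`, `city`, and `postal` are hypothetical and chosen only for illustration; other database systems express such constraints differently.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE contact (
        name   TEXT NOT NULL,
        -- Rule 1: this element may only hold positive integers.
        age    INTEGER NOT NULL CHECK (typeof(age) = 'integer' AND age > 0),
        city   TEXT,
        postal TEXT,
        -- Rule 2: if city is filled, postal must be filled as well.
        CHECK (city IS NULL OR postal IS NOT NULL)
    )
    """
)


def try_insert(row):
    """Insert one row; return True if the schema accepts it, False otherwise."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO contact VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert(("Ada", 36, "London", "N1"))   # satisfies both rules
negative = try_insert(("Bob", -4, None, None))       # violates rule 1
unpaired = try_insert(("Cy", 20, "Paris", None))     # violates rule 2
```

In this sketch the database itself, rather than the client application, rejects the nonconforming rows, which is the sense in which a schema may provide for data validation.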
The majority of the effort involved in building and maintaining a database is typically the collection of the data itself. To avoid duplication of effort, it is often desirable to share existing database information among databases managed by different commercial entities. Unfortunately, such sharing requires an efficient means of implementation.
For example, rather than generate separate contact lists for use in different application/OS/devices, an existing contact list may be moved from one application/OS/device to another (e.g. from a personal computer to a handheld/PDA, or cell phone).
If the source and destination databases have the same schema or are designed to be compatible (e.g. if they have matching export/import modules or are designed to the same interchange standard), the data may be moved directly from the source to the destination without transformation or loss of data. The problem, however, is that there is a great deal of data resident in databases with incompatible schemas.
If the source and destination databases have dissimilar schemas or are otherwise incompatible, data may be transferred by exporting the data from the source system to an intermediate low-level format (typically with the loss of some data) and importing the data, expressed in the low-level format, into the destination.
For example, a popular low-level interchange format is the comma-separated values (CSV) format. Like most low-level interchange formats, CSV does not encode the nature of the data (e.g. whether it is a binary number or alphabetic-only text) but rather simply the most readable interpretation of the data. It is left to the destination application to resolve the encoding of the data, and to store the encoded data in an internal representation.
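The loss of type information in a CSV round trip can be seen in a short sketch using Python's standard csv module. The record contents and the `guess_type` helper are hypothetical; the helper stands in for whatever assumption a destination application might make when resolving the encoding.

```python
import csv
import io

# Source record with typed values (contents are illustrative assumptions).
source_rows = [{"name": "Ada", "age": 36, "zip": "02139", "active": True}]

# Export to CSV: every value is written as its readable text form.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["name", "age", "zip", "active"])
writer.writeheader()
writer.writerows(source_rows)

# Import from CSV: every value comes back as a plain string.
buffer.seek(0)
imported = list(csv.DictReader(buffer))


def guess_type(text):
    """Hypothetical destination-side rule: treat anything numeric as an integer."""
    try:
        return int(text)
    except ValueError:
        return text


age_guess = guess_type(imported[0]["age"])  # integer recovered as intended
zip_guess = guess_type(imported[0]["zip"])  # postal code loses its leading zero
```

Here the integer `age` and the boolean `active` both arrive as strings, and a naive numeric guess at the destination turns the textual postal code into a number, silently altering the data.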
To resolve such encoding issues, destination applications may rely on assumptions about the format of imported data. However, even subtle discrepancies in the database schema (even in column headings) can prevent the direct transportation of information from the source to the destination. For example, it might be assumed that a person's name would be stored in three separate columns with corresponding column headings such as “first,” “middle,” and “last.” However, the source application might label the first name column “FirstName” while the destination application expects “Name1.” In addition, even the national language used may differ between otherwise identical applications, and user defined fields (UDFs) have almost no hope of being transported without human interaction. Hence, the resolution of such encoding issues typically requires significant human interaction (either by writing code to transform the data from the low-level format to the appropriate destination format or by altering the data itself). It may also result in the loss of data in the transformation to the low-level interchange format.
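The kind of hand-written transformation code referred to above might resemble the following sketch. Only the headings “FirstName” and “Name1” come from the example in the text; the remaining headings, the `ShoeSize` user defined field, and the function name are assumptions made for illustration.

```python
# Hand-maintained mapping from source headings to destination headings.
# Only "FirstName" -> "Name1" is taken from the text; the rest is assumed.
HEADER_MAP = {
    "FirstName": "Name1",
    "MiddleName": "Name2",
    "LastName": "Name3",
}


def translate_record(record, header_map):
    """Rename known headings; set aside anything the mapping does not cover."""
    translated, unmapped = {}, {}
    for heading, value in record.items():
        if heading in header_map:
            translated[header_map[heading]] = value
        else:
            # User defined fields end up here and need human review.
            unmapped[heading] = value
    return translated, unmapped


source_record = {"FirstName": "Ada", "LastName": "Lovelace", "ShoeSize": "7"}
translated, unmapped = translate_record(source_record, HEADER_MAP)
```

The mapping must be written and maintained by a person for each pair of applications, and the user defined field cannot be placed automatically, which illustrates why such transfers typically require human interaction.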
Importing data from one database to another also involves a duplication of data. That is, data that is stored in the first database is transformed if necessary, and copied to the second database. While this duplication has its advantages in some cases, it is disadvantageous in others because it increases the storage requirements of the second database, and may raise security issues. While data stored in one database may be useful to a second database, the data stored in the first database may be of a sensitive nature, and wholesale migration of that data from the first database to the second database may be impermissible for privacy or security reasons. This difficulty is especially likely when the first and second databases are managed and/or controlled by different commercial entities, since any privacy agreement effective with the owner of one database will not necessarily permit the same use by another commercial entity.
The foregoing lack of schema standardization (including column headings that reflect the data content) is further exacerbated by a lack of any connected method for encoding the relationship of data between databases and any required supporting code or logic. These problems make it especially difficult for different commercial entities to share database information.
Further, even if an efficient means for sharing information among databases managed by different commercial entities existed, there is no efficient and flexible means for informing other users that such information exists and is available for use, or for otherwise brokering the provision of such data between different commercial entities.
U.S. Patent publication 2002/0165812, for example, discloses a system and method for selling contingent information. However, the disclosed database does not include a globally unique identifier that is part of the database itself rather than merely an index to items in the database. Consequently, the system does not permit one data construct to use and be used by other data constructs authored by other entities.
U.S. Pat. No. 5,491,818, issued Feb. 13, 1996 to Malatesta et al., discloses a system for migrating application data definition catalog changes to a system level data catalog. This system does not use a globally unique inter-database identifier for each data construct, nor is that identifier used to allow one data construct to use and be used by another data construct authored by a commercially distinct entity.
U.S. Pat. No. 5,629,980, issued May 13, 1997 to Stefik et al., discloses a digital work distribution system. The system permits the owner of the digital work to offer it to others, but does not utilize anything analogous to a globally unique identifier, which provides a reference for one data construct to use and be used by other data constructs from other entities.
U.S. Pat. No. 5,752,242, issued May 12, 1998 to Havens, discloses a system and method for automated retrieval of information. Information sources include user parameters that specify user attributes. A library includes filters that each specify one or more parameters for an associated attribute, and a translator selects filters from the library based on the user parameters. This reference, however, does not disclose a globally unique identifier that provides a reference for one data construct to be used by other data constructs from commercially distinct entities.
U.S. Pat. No. 5,974,417, issued Oct. 26, 1999, and U.S. Pat. No. 6,044,372, issued Mar. 28, 2000, disclose systems and methods for publishing events to a network in which publishers publish information to which subscribing entities can subscribe. These references, however, also do not disclose a system that allows one data construct to use and be used by others by reference to a globally unique identifier in a brokering context.
U.S. Pat. No. 6,304,874, issued to Corley et al. on Oct. 16, 2001, discloses an access system for distributed storage. An indexing system is used, but the index is not globally unique between databases, and is not part of the database information itself (the index sits outside the database, pointing to data therein). Since the index is not part of the database, it cannot be used by one data construct to use, or be inferentially used by, another data construct.
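The distinction drawn above, between an index that sits outside the data and an identifier that is part of the data itself, can be illustrated generically. The sketch below is not the method of any cited reference, nor a description of the present invention; the record layout, the `guid` and `uses` field names, and both helper functions are assumptions, with Python's standard uuid module standing in for any scheme that yields globally unique identifiers.

```python
import uuid


def make_construct(payload, uses=()):
    """Build a record that carries its own identifier as part of its data."""
    return {"guid": str(uuid.uuid4()), "payload": payload, "uses": list(uses)}


def resolve(guid, *databases):
    """Find a construct by identifier, regardless of which database holds it."""
    for database in databases:
        for construct in database:
            if construct["guid"] == guid:
                return construct
    return None


# Two databases, imagined as maintained by different entities.
contacts = make_construct("contact list")
report = make_construct("mailing report", uses=[contacts["guid"]])
database_a = [contacts]
database_b = [report]

# The reference held by one construct can be followed into the other database.
referenced = resolve(report["uses"][0], database_b, database_a)
```

A purely positional index, by contrast, would assign position 0 to a different record in each database, so a reference by position alone would be ambiguous once the two databases are considered together.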
What is needed is an efficient and reliable brokering system for governing and encapsulating both the data and connected code constructs so that database work products can be built upon and exchanged. The present invention satisfies these needs.