Reliance on software-based data systems increases every year, both as data becomes more important and as languages provide more options for obtaining additional value from the data and systems. Access to the data in these systems is also becoming more important, even at the user level, as higher-level, abstract languages become more common. However, the data in the underlying data systems (e.g., databases, information management systems, hierarchical data systems, etc.) is often of a prescribed layout, format and content type. Therefore, before access can be optimized for a user of the data system, for instance through an intelligent data engine, the underlying metadata or metadata structure must be defined to the data engine or data system.
Unfortunately, many data systems do not contain data layout or data type definitions, which are important to defining the underlying metadata. Instead, these data systems define byte arrays of user-defined sizes, which are then available for storage or retrieval in response to access calls. As a result, since the data is not defined to the data system or database in these situations, a user is required to interpret the data in application code rather than being able to define the data to the database. Additionally, the user's interpretation of data, data types, and data definitions may not be consistent with that of a data system, as users often define data in terms of their business or use predefined data types offered by a data management system. The resulting differences in data definitions (including data types, data interpretation, data definitions, etc., referred to herein as either “data definitions” or “data types”) create further limitations in effectively using a data engine. Traditionally, a data type is defined as one of sixteen primitive types: BOOLEAN, CHAR, VARCHAR, CLOB (character large objects), TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, DATE, TIME, TIMESTAMP, BINARY, or BLOB (binary large objects), collectively “traditional data type” or “standard data type.”
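By way of illustration only, the following minimal sketch shows application code interpreting an opaque byte array that the data system stores without any layout or type definition. The record layout, the field meanings, and the function name `interpret` are hypothetical assumptions made for this example and are not drawn from any particular data system.

```python
from typing import Tuple

# Hypothetical 10-byte record whose layout is known only to the application:
#   bytes 0-3 : 32-bit big-endian integer (a quantity)
#   bytes 4-9 : 6-character product code in EBCDIC (code page 037)
record = bytes([0x00, 0x00, 0x01, 0x00]) + "WIDGET".encode("cp037")


def interpret(raw: bytes) -> Tuple[int, str]:
    """Apply the application's private knowledge of the layout to raw bytes."""
    quantity = int.from_bytes(raw[0:4], byteorder="big")
    code = raw[4:10].decode("cp037")
    return quantity, code


# The data system sees only ten bytes; the meaning lives in interpret().
quantity, code = interpret(record)

# A reader assuming a different byte order obtains a different number
# from the same four bytes.
misread_quantity = int.from_bytes(record[0:4], byteorder="little")
```

In this sketch the data system itself cannot tell which reading of the bytes is intended, since the layout exists only in the application code.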
For instance, a data engine is unable to accommodate the wide range of user-defined data types and is therefore ineffective in intelligently interpreting them. Often, the user-defined data types are unknown to or not recognized by data engines, or are non-standard, particularly when compared to the known data types of traditional data systems (i.e., “standard data types”).
An example of such a limitation is the situation where a user defines a DATE field as the number of days since a product became publicly available for sale. In this situation, using the traditional data type DATE (i.e., “standard data type” DATE), a database conversion sequence would interpret the integer read as the number of milliseconds from Jan. 1, 1970, yielding an incorrect result. Similar issues arise between EBCDIC, ASCII and UNICODE, between big endian and little endian, and between IEEE float and IBM float, for instance.
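The DATE example above may be sketched as follows. This is an illustrative sketch only: the launch date, the stored value, and the function names are hypothetical, and the standard conversion is assumed to treat the integer as milliseconds from Jan. 1, 1970 (UTC) as described above.

```python
from datetime import date, datetime, timedelta, timezone

# Hypothetical product launch date, chosen only for illustration.
PRODUCT_LAUNCH = date(2005, 3, 1)


def as_standard_date(raw: int) -> date:
    """Standard DATE reading: raw is milliseconds from Jan. 1, 1970 (UTC)."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return (epoch + timedelta(milliseconds=raw)).date()


def as_user_date(raw: int) -> date:
    """User-defined reading: raw is days since the product launch."""
    return PRODUCT_LAUNCH + timedelta(days=raw)


stored = 400  # the same integer, as read from the data system

standard_reading = as_standard_date(stored)  # lands on Jan. 1, 1970
user_reading = as_user_date(stored)          # 400 days after the launch
```

The same stored integer yields two unrelated dates, and only the user-defined reading matches what the user meant.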
This result is also becoming more common as the traditional sixteen primitive data types are being challenged by competing efforts such as XML (Extensible Markup Language) Schema, which comprises sixty-four extendable schema types, thereby providing for an infinite number of data types. Traditionally, a database field type consisted of a reference to one pre-defined conversion sequence in relation to one of the sixteen primitive data types, where the conversion sequence provided conversion between user semantic types and the way the data was to be physically stored as bytes in a database. Traditionally, a database field is a collection of a name (i.e., reference to a field), length (i.e., reference size of the stored data) and type (i.e., reference to a known bytes interpretation methodology). A limitation of this traditional approach is that a user is required both to keep user data types within the group of sixteen primitive types and to control the database and data bytes so as to ensure that the data types remain within the sixteen primitive types. Unfortunately, these restrictions are no longer realistic in today's business environment.
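The traditional field model described above (a name, a length, and a type referring to one pre-defined conversion sequence) may be sketched as follows. The sketch is illustrative only: the class `Field`, the table `CONVERSIONS`, and the two converters shown are hypothetical, and only two of the sixteen primitive types are included for brevity.

```python
import struct
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple

# Each primitive type refers to exactly one pre-defined conversion sequence:
# a pair of (value -> bytes, bytes -> value) functions. Hypothetical subset.
Conversion = Tuple[Callable[[Any], bytes], Callable[[bytes], Any]]
CONVERSIONS: Dict[str, Conversion] = {
    "INT": (lambda v: struct.pack(">i", v),
            lambda b: struct.unpack(">i", b)[0]),
    "CHAR": (lambda v: v.encode("ascii"),
             lambda b: b.decode("ascii")),
}


@dataclass(frozen=True)
class Field:
    name: str    # reference to the field
    length: int  # size of the stored data, in bytes
    type: str    # reference to a known bytes interpretation

    def to_bytes(self, value: Any) -> bytes:
        encode, _ = CONVERSIONS[self.type]  # KeyError for unknown types
        raw = encode(value)
        if len(raw) != self.length:
            raise ValueError(f"{self.name}: expected {self.length} bytes")
        return raw

    def from_bytes(self, raw: bytes) -> Any:
        _, decode = CONVERSIONS[self.type]
        return decode(raw[: self.length])


qty = Field(name="quantity", length=4, type="INT")
round_trip = qty.from_bytes(qty.to_bytes(42))
```

Under this model, a field whose type falls outside the fixed table (for example, a hypothetical `DAYS_SINCE_LAUNCH` type) has no conversion sequence to refer to, which reflects the restriction noted above.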
Similarly, given the wide range of user-defined data types, a data engine is unable to provide users with intelligent access options for their data that meet rising demands.
Therefore, it is highly desired to be able to provide a solution which overcomes the shortcomings and limitations of the present art and more particularly provides a method and system for referentially converting user-defined data types into recognizable standard data types for providing improved access to user data.
The present invention, in accordance with its various implementations herein, addresses such needs.