Computing systems increasingly must work with data from diverse sources, and that data often arrives in a variety of formats and representations. Although a large number of formats and representations exist, several major ways of representing data have evolved and become prevalent in modern computing environments. Each has its own strengths and weaknesses, and choosing among them is a common problem for software developers.
The relational data model makes it easy to query large amounts of information. However, this model does not allow the data that results from a query to be encapsulated. Additionally, relational data is not designed for transmission across the wire. The Extensible Markup Language (XML) model of data representation makes it easy to transmit information across the wire, but is not designed to be encapsulated and is difficult to query. The object model of data is designed around encapsulation principles, but cannot be easily transmitted across the wire. Further, the object model usually offers no effective query capabilities. No single data model provides everything in one package, and most computing applications require at least two, if not all three, of these data representation models.
Because each model represents data differently, converting data from one model to another usually requires a mapping between the models. Most programming languages do not provide a standard way to specify the mapping between the object model and either the relational or the XML model. Instead, each mapping system usually defines its own set of attributes or its own schema language to specify how data should map between the models.
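By way of illustration only, the following Python sketch shows what such a system-specific attribute set might look like. The `mapped` helper, the metadata keys `column`, `element`, and `convert`, and the `Customer` class are all hypothetical names invented for this example; they do not correspond to any particular mapping system. The sketch assumes a single table and a flat XML element.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field, fields


def mapped(column, element, convert):
    """Attach hypothetical mapping metadata to a dataclass field."""
    return field(metadata={"column": column, "element": element, "convert": convert})


@dataclass
class Customer:
    # Each field names the relational column and XML element it maps to.
    customer_id: int = mapped("id", "Id", int)
    name: str = mapped("full_name", "Name", str)


def from_row(cls, row):
    """Build an object from a relational row using the 'column' metadata."""
    return cls(**{f.name: row[f.metadata["column"]] for f in fields(cls)})


def from_xml(cls, element):
    """Build an object from an XML element using the 'element' metadata."""
    return cls(**{
        f.name: f.metadata["convert"](element.findtext(f.metadata["element"]))
        for f in fields(cls)
    })


# Relational source: an in-memory table with one row.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allows access to columns by name
conn.execute("CREATE TABLE customers (id INTEGER, full_name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
row = conn.execute("SELECT id, full_name FROM customers").fetchone()

# XML source: the same record in a different representation.
doc = ET.fromstring("<Customer><Id>1</Id><Name>Ada</Name></Customer>")

from_db = from_row(Customer, row)
from_doc = from_xml(Customer, doc)
print(from_db == from_doc)
```

Both sources yield equal `Customer` objects, but only because the class carries metadata in a vocabulary that this particular sketch defines. A different mapping system would typically require a different vocabulary for the same class.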
Most programming languages also require, as part of the mapping, that the data be copied. Usually, when data is mapped from either the XML or relational model onto the object model, the data is transformed from the XML or relational representation into an object. As a result, the data is disconnected from its original representation because it has been fully encapsulated in an object. To preserve a connection between the object representation and the original representation, either the encapsulating object must include additional information that maintains the connection, or an external component must track the connection between the object and its original data. With either approach, the software developer is usually limited in how object representations can be defined or must explicitly manage the relationship between the object representation and the original representation.
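The disconnection described above can be sketched in a few lines of Python. The class names `CopiedItem` and `ConnectedItem` and the sample XML are hypothetical and serve only to contrast a plain copy with an object that carries a reference back to its source; this is a minimal sketch of the first approach (additional information held by the object), not a description of any particular system.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class CopiedItem:
    """Plain copy: holds its own values and no link to the source element."""
    name: str
    price: float


class ConnectedItem:
    """Wrapper that keeps a reference to the source element and writes through to it."""

    def __init__(self, element):
        self._element = element  # the additional information that maintains the connection

    @property
    def price(self):
        return float(self._element.findtext("Price"))

    @price.setter
    def price(self, value):
        self._element.find("Price").text = str(value)


source = ET.fromstring("<Item><Name>Widget</Name><Price>9.5</Price></Item>")

# Copying: a change to the object never reaches the original representation.
copied = CopiedItem(source.findtext("Name"), float(source.findtext("Price")))
copied.price = 12.0
price_after_copy_edit = source.findtext("Price")

# Connected wrapper: the change is written back to the source element.
connected = ConnectedItem(source)
connected.price = 12.0
price_after_connected_edit = source.findtext("Price")

print(price_after_copy_edit, price_after_connected_edit)
```

The wrapper stays connected, but its definition is now tied to the structure of the source element, which illustrates the limitation on how object representations can be defined.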
Copying data as part of a mapping also has performance implications. When working with large amounts of information, the requirement that data be shuttled back and forth from one representation to another is memory intensive, both in terms of object allocations and in terms of the basic amount of copying required. Current systems fail to provide an efficient way for data to be used across representation paradigms.