1. Field
Embodiments of the invention relate generally to pluggable merge patterns for data access services. In particular, embodiments relate to using merge patterns to store data into data stores.
2. Description of the Related Art
Relational DataBase Management System (RDBMS) software uses a Structured Query Language (SQL) interface. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO).
An RDBMS uses relational techniques for storing and retrieving data in a relational database. Relational databases are computerized information storage and retrieval systems. Relational databases are organized into tables that consist of rows and columns of data. The rows may also be called tuples or records. A database typically has many tables, and each table typically has multiple records and multiple columns.
A majority of object-relational persistence frameworks assume that objects are read from, and stored into, a database in the same transaction. A framework may be described as a reusable design for a software system. Examples of such frameworks include: Java® Data Objects (JDO), Enterprise JavaBeans (EJB) 2.0 (which is a Java® Application Programming Interface (API) that encapsulates business logic at a server), Hibernate (which is an object-relational mapping solution for the Java® programming language), and Java® Persistence API (JPA, which is a framework for managing relational data) (Java is a trademark of Sun Microsystems in the United States, other countries, or both). Because of this assumption, these frameworks require that each persistence-capable object be augmented with code that allows a database engine to monitor how applications modify the object's state during a transaction. As objects are modified, they are deemed “dirty”. At the end of a transaction, the database engine scans all objects and stores each “dirty” object into the database. The database engine also keeps track of which objects were read from the database into the transaction and which objects were created or deleted during the transaction. Based on this lifecycle state information, the database engine determines whether to issue an SQL INSERT statement, an SQL DELETE statement, or an SQL UPDATE statement in order to update the objects' state in the database.
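The lifecycle tracking described above can be illustrated with a minimal sketch. It is not taken from any of the named frameworks; the names `Lifecycle`, `Tracked`, `statement_for`, and `flush` are hypothetical and exist only to show how tracked state could map to a statement type at the end of a transaction.

```python
from enum import Enum, auto


class Lifecycle(Enum):
    """Hypothetical lifecycle states a stateful engine might track per object."""
    NEW = auto()      # created during the transaction
    CLEAN = auto()    # read from the database, not modified
    DIRTY = auto()    # read from the database, then modified
    DELETED = auto()  # read from the database, then deleted


def statement_for(state):
    """Return the kind of SQL statement to issue for a state, or None."""
    return {
        Lifecycle.NEW: "INSERT",
        Lifecycle.DIRTY: "UPDATE",
        Lifecycle.DELETED: "DELETE",
        Lifecycle.CLEAN: None,  # nothing to write
    }[state]


class Tracked:
    """Wraps a record and marks it dirty when a field is assigned."""

    def __init__(self, fields, state):
        self.fields = dict(fields)
        self.state = state

    def set(self, name, value):
        self.fields[name] = value
        if self.state is Lifecycle.CLEAN:
            # A NEW object stays NEW: it still needs an INSERT.
            self.state = Lifecycle.DIRTY


def flush(tracked_objects):
    """At the end of a transaction, pair each changed object with its statement."""
    pending = []
    for obj in tracked_objects:
        kind = statement_for(obj.state)
        if kind is not None:
            pending.append((kind, obj.fields))
    return pending
```

The point of the sketch is that the statement type is derived entirely from state the engine accumulated while the objects were in use, which is why this model depends on reads and writes sharing one transaction.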
The model described above may be referred to as a “stateful” model and presents problems when applied in a web server environment. In a web server environment, this stateful model requires that the server maintain the state of persistence across multiple client requests. This stateful model also requires that every request from a particular client always be routed to the server that maintains the state for that client. Maintaining the server-side state for each client, and bypassing a workload manager (that routes requests to servers according to workload of the servers) in order to route client requests to a fixed server, limits scalability of services. Servers based on this stateful model can serve a limited number of clients in a closed corporate environment, but they cannot scale up to serve the open World Wide Web (WWW) with a large number of clients, especially as the number of clients is increasing rapidly.
A more natural and more scalable approach to operating in a web environment is to be stateless, such that objects are read and stored in different transactions. When a server receives a client request, the server reads the objects from the database, serializes them, and sends these serialized objects to the client. The server then forgets the request and any state associated with it. The client operates on the objects and, later on, may send another request to store some objects. The objects sent for storage may not necessarily be the same objects that the client originally read from the database. The objects sent for storage to the database may contain only a relevant subset of the data, or may be entirely different types of objects, possibly derived from the originally read objects. This shows that maintaining the state of objects originally requested by clients provides no benefit if the objects that the client requests to store later on are not the same. Moreover, there is no guarantee that the request for storing the objects will be routed to the same server that served the original objects to the client. Therefore, in a typical web application, it is beneficial for the server to assume that requests for object storage are independent of requests for object reads. In other words, even if the server could carry over the state from one request to another, meaningful information might not be carried over.
The introduction of intelligent, Asynchronous JavaScript and XML (AJAX) based clients makes the stateless server scenario even more challenging. AJAX may be described as a technique for developing interactive web applications. AJAX clients typically read graphs of data elements (e.g., eXtensible Markup Language (XML) data elements, JavaScript Object Notation (JSON) data elements, Java® objects, etc.) and cache them for a period of time. For example, a client may retrieve an existing order-graph (e.g., Order→LineItem→Product) from the server. The client may add new line items, modify existing line items, and delete existing line items. Once the client is done with processing the order, the client may want to merge the order-graph back into the database through the server. Because the server is stateless, the server needs sophisticated merge logic to determine how the database should be updated. There are a number of approaches for how the order-graph may be merged into the database without any state information, but which approach to use depends on the application. For example, the merge logic may use an “UPSERT” SQL-pattern to determine which line items need to be inserted or updated, and a “NOT IN” SQL-pattern to determine which line items should be deleted. Another pattern may be one in which line items that are determined to be deleted have a status field that is set to “deleted” rather than being physically deleted. Thus, a merge pattern describes how data elements are to be merged into a database.
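As an illustration only, the two example patterns above (UPSERT with NOT IN, and a status field set to “deleted”) might be sketched as follows. The sketch uses Python's built-in `sqlite3` module; the `line_item` table, its columns, and the `merge_order` function are hypothetical names introduced for this example, and the `ON CONFLICT ... DO UPDATE` clause assumes an SQLite version that supports it (3.24 or later). It also assumes a line item identifier is never reused across orders.

```python
import sqlite3

# Hypothetical schema for the line items of an order-graph.
SCHEMA = (
    "CREATE TABLE line_item ("
    "id INTEGER PRIMARY KEY, order_id INTEGER, "
    "product TEXT, qty INTEGER, status TEXT)"
)


def merge_order(conn, order_id, line_items, soft_delete=False):
    """Merge the line items a client sent for one order, using no saved state.

    line_items is a list of (item_id, product, qty) tuples.
    """
    with conn:  # commits on success, rolls back on error
        # UPSERT: insert new line items and update those that already exist.
        conn.executemany(
            "INSERT INTO line_item (id, order_id, product, qty, status) "
            "VALUES (?, ?, ?, ?, 'active') "
            "ON CONFLICT(id) DO UPDATE SET "
            "product = excluded.product, qty = excluded.qty, status = 'active'",
            [(item_id, order_id, product, qty)
             for item_id, product, qty in line_items],
        )
        # NOT IN: rows of this order that the client did not send are removed.
        ids = [item[0] for item in line_items]
        placeholders = ",".join("?" * len(ids))
        where = f"order_id = ? AND id NOT IN ({placeholders})"
        if soft_delete:
            # Alternative pattern: mark the row instead of physically deleting it.
            sql = f"UPDATE line_item SET status = 'deleted' WHERE {where}"
        else:
            sql = f"DELETE FROM line_item WHERE {where}"
        conn.execute(sql, [order_id, *ids])
```

Here the database contents, not remembered lifecycle state, decide between insert, update, and delete. The `soft_delete` flag stands in for the choice between merge patterns, which in practice depends on the application.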
That is, the prior art provides a number of different patterns, each suitable for a single use case. However, the prior art does not provide a single pattern that could cover a wide range of use cases.
Moreover, none of the available object-relational frameworks supports stateless merge. At best, conventional object-relational frameworks provide mapping metadata that can include some qualifiers, such as cascade delete, but the actual pattern that may be used to store the objects is fixed and covers a very limited number of scenarios. Therefore, the burden of implementing the stateless merge logic is always on the application side and falls to the application developer (also referred to as a “developer”). This is very time-consuming and error-prone for the application developer.
Thus, there is a need in the art for improved merging of data elements into databases.