This invention relates to a method and apparatus for implementing simultaneous usage of a database and durable storage of database updates, and more particularly to simultaneous usage of a database and durable storage for a workflow and process flow management system.
Database systems generally provide atomicity, consistency, isolation, and durability (ACID) properties to the systems making use of them. These ACID properties are invaluable to applications, allowing them great freedom and considerable simplification in programming with undiminished reliability.
Database systems typically require that a durable log mechanism be supported by the database system services that provide access to the data. These systems layer their ACID functionality atop this durable logging feature.
Database systems provide the needed reliability, but at substantial performance cost. To assure maximal reliability using a database management system, all updates must be durable (permanent) before they are acted upon. This generally requires that the database updates be committed.
A conventional database system maintains the data in a durable storage mechanism, such as a disk drive. The database system will also typically have a non-durable copy of an active portion of the database in a volatile memory cache. The data in volatile memory can be rapidly accessed, but can be destroyed and lost in the event of a system crash, program failure or similar abnormal termination. To maintain the integrity of the database, updates to the database must be guaranteed to be stored, i.e., committed, in the durable storage mechanism. A commit requires that the database system write all modified data held in the memory cache to the durable storage mechanism.
However, durable storage requires more access time than cache memory. A process that requests the database system to commit an update transaction must wait for an acknowledgment that the commit was successful. The database system only acknowledges the commit when the data has been updated in durable storage. Frequent commits, therefore, degrade system performance because of the time consumed by processes waiting for acknowledgment from the database system.
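The trade-off between cached updates and durable commits described above can be illustrated with a minimal sketch. It is not taken from any particular database system; the class name TinyStore, its methods, and the JSON file layout are hypothetical and chosen only for illustration.

```python
import json
import os


class TinyStore:
    """Hypothetical key-value store: a volatile cache backed by a durable file."""

    def __init__(self, path):
        self.path = path
        self.cache = {}      # volatile copy; lost if the process crashes
        self.dirty = False   # True when the cache holds uncommitted updates
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.cache = json.load(f)

    def update(self, key, value):
        # Fast path: only memory is touched, so nothing is durable yet.
        self.cache[key] = value
        self.dirty = True

    def commit(self):
        # Slow path: the caller blocks until the data is on durable storage.
        if not self.dirty:
            return
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.cache, f)
            f.flush()
            os.fsync(f.fileno())   # wait for the device to accept the data
        os.replace(tmp, self.path)  # swap the new file into place
        # Simplification: a fuller implementation would also sync the directory.
        self.dirty = False
```

In this sketch a call to update returns immediately, whereas commit returns only after the operating system reports that the file contents have been written, which is the waiting time that frequent commits accumulate.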
A database system can be made reliable by storing all state changes in a durable log file. However, the modified data, as represented by the state changes in the log file, would not be easily accessible and much of the utility of the database system would be lost.
One application for database systems is workflow systems. Workflow systems effect business processes by controlling the scheduling and parameters of activities, acquiring their results, and using the results in determining other activities to run. A business process is a description of the sequencing, timing, dependency, data, physical agent allocation, business rule and organization policy enforcement requirements of business activities needed to enact work. Most workflow systems utilize a relational, object-oriented, network or hierarchical database management system to store data relating to the business process.
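The accessibility problem of a log-only design can be illustrated with a minimal sketch, assuming a hypothetical log format of one JSON record per line; the function names append_change and read_current are illustrative only.

```python
import json
import os


def append_change(log_path, key, value):
    """Durably append one state change to the log as a JSON line."""
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"key": key, "value": value}) + "\n")
        f.flush()
        os.fsync(f.fileno())  # the change is durable once this returns


def read_current(log_path, key):
    """Recover the latest value of key by scanning the entire log."""
    latest = None
    with open(log_path, "r", encoding="utf-8") as f:
        for line in f:
            change = json.loads(line)
            if change["key"] == key:
                latest = change["value"]  # later records override earlier ones
    return latest
```

Every change is preserved, but answering even a single lookup requires reading the whole log, which suggests why a log alone does not offer the convenient access of a database.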
Workflow systems partition the data involved in the execution of a business process into three categories.
Process Specific Data (PSData) are those data used in effecting a business process that are of no concern to any of the individual activities contained therein.
Application Specific Data (ASData) are those data used in effecting a business process that are of concern to one or more of the activities therein but not of concern to the scheduling or controlling of the activities.
Process Relevant Data (PRData) are those data used in effecting a business process that are of concern to one or more of the activities therein and to the scheduling or controlling of those activities and may include system queues.
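The three categories can be restated as a simple decision rule. The following sketch is illustrative only; the names Category and categorize and the two boolean parameters are assumptions introduced here, not terms used by any workflow system.

```python
from enum import Enum


class Category(Enum):
    PSDATA = "Process Specific Data"
    ASDATA = "Application Specific Data"
    PRDATA = "Process Relevant Data"


def categorize(used_by_activities: bool, used_by_scheduling: bool) -> Category:
    """Classify a data item by who is concerned with it."""
    if not used_by_activities:
        # Of no concern to any individual activity.
        return Category.PSDATA
    if used_by_scheduling:
        # Of concern to activities and to their scheduling or control.
        return Category.PRDATA
    # Of concern to activities only.
    return Category.ASDATA
```

For example, a document body that an activity edits but that never influences routing would fall under ASData, while a field that both an activity reads and the scheduler tests would fall under PRData.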
Workflow systems define business processes through one or more definition languages. These languages may be fundamentally graphical in nature, or may resemble concurrent programming languages. They can be closely tied to software design methodologies and tools.
In general, workflow systems perform a wide range of tasks. For instance, they can provide a method for defining and managing the flow of a work process or support the definition of resources and their attributes. In addition, they can assign resources to work, determine which steps will be executed next within a work process and when they will be executed, and ensure that the workflow process continues until proper termination. Moreover, they can notify resources about pending work, enforce administrative policies such as access control, track execution, and support user inquiries of status.
In addition to the data above, it is useful for some workflow process applications to have access to historical data regarding state changes within the system. Historical data takes the form of an audit trail for completed workflow processes and is useful for the collection of statistical data for process and resource bottleneck analysis, flow optimization and automatic workload balancing. While workflow systems need full ACID properties over some of their data, they do not need them for the historical data. Existing workflow systems that use databases either record historical data along with current data in their database, discard the historical data entirely, or store it non-durably.
Since data in a workflow system represents work that needs to be done or has already been done, the database generally needs to provide a high degree of reliability. Loss of the data related to a completed work event can mean the loss of the work performed by the work event. It is also quite useful for some applications to provide convenient access to historical data from the system.
It is therefore desirable to provide a high degree of reliability in a database by durably storing updated data and to simultaneously provide rapid access to the database. It is particularly desirable in workflow systems to not lose data modifications related to work assignments. It is also desirable in a workflow system to durably store the historical data of the system and provide convenient access to the data.