Database systems typically use a schema for defining how a database is to be used. Such a schema may specify the types of records that a database system manages and also the rules and relationships associated with these record types. Thus a schema may specify that a user is allowed to perform actions (operations) on a record such as cancel, delete and modify. The schema may also define, for example, exactly which fields in a record may be modified and who is allowed to make such changes.
By way of example only, change control systems exist today for tracking changes made to the source code for software projects. Such change control systems typically use a database schema of the type discussed above. A database schema for a change control system may define record types of: defect; feature; test record; verification record etc. Rules will then define how such records may be inserted, modified, deleted etc.
Software projects are increasingly using iterative development in order to flush out problems and resolve risks and issues as early as possible. The project manager may require updates to the change control system at the beginning of an iteration. This means that the schema used by the change control system may need to be updated many times during the duration of the software project.
A change control schema may have levels of security defined which allow user defined roles to update particular fields for each record type. For example, only an administrator may change the ownership of a defect but any one can add a comment to a defect.
The schema may also be used to integrate the Change Control system with other tools, such as a test case management system and/or a requirements management system.
In order to create a schema, a schema administrator needs to understand:
What the schema will be used for
Who will use the schema and their associated user role
What other tools or services will interact with the schema
The workflow (process) associated with the schema.
Without some kind of intervention, schema migration may cause records within the database to exist in a redundant and thus invalid state. For example schema version 1 may allow records to be in a state of cancelled. The subsequent version of the schema may no longer define the cancelled state. Some records in the database will, however, probably have a cancelled state associated with them and this will cause an exception to be thrown when migration is attempted. Such an exception will typically mean that it is not possible to migrate the database to the new schema.
As schema changes can be quite complex it is often not until the actual upgrade is executed that any data integrity problems are discovered (i.e. during a change slot to upgrade the production system which may be outside working hours).
Additionally, schema changes are not necessarily applied immediately to schema items. Indeed multiple schema changes may be batched together and applied to schema items in one go. This means the schema administrator must know the consequence of all the changes in order to identify the required manual changes to the database containing the schema items. If the schema changes span many months, or even years, this can be difficult.
Furthermore, many software projects involve teams located in multiple geographic locations. Copies of the database exist in each location and changes are replicated between locations to keep them in sync with each other. This also applies to schema changes. The schema changes and associated data migration must be made at all locations before users can update a database in any one location. Whilst the schema updates can be automated, the data migration can not. This can lead to lengthy outages and complex logistics in order to bring all the databases and their associated data up to the required level. This also requires local system administrators to understand the changes required in order to apply them.
Solutions describing workflow and schema evolution are known.
Other documents of interest are:
“Scientific Workflow Management by Database Management” by A Ailamaki, Y loannidis, M Livny (Department of Computer Sciences, University of Wisconsin); “Managing Evolving Workflow Specifications with Schema Versioning and Migration Rules” by G Joeris and O Herzog (Intelligent Systems Department, TZI—Center for Computing Technologies University of Bermen); and “Schema Evolution in Process Management Systems” by Stefanie Beate Rinderle from Memmingen (Dissertation in October 2004); and “Dynamic Workflow Schema Evolution Based on Workflow Type Versioning and Workflow Migration” by Markus Kradolfer and Andreas Gepper.
These documents describe the need for workflow engines to cope with change but none describes how to cope with the removal of a state from a schema and what is to happen to affected records.
“Workflow Evolution” by F Casati, S Ceri, B Pernici and G Pozzi (Dipartimento di Elettronica e Informazione—Politecnico di Milano) also deals with the concept of a changing schema and does briefly discuss the removal of a task from a workflow, in the form of a “RemoveSuccessor” primitive. This document does not however discuss the detail of how to cope with database records in a redundant state as a result of the deletion of a state from a database schema.