Since the mid-1990s, web applications have become among the principal means of communication between a business and its customers, employees and partners. More generally, distributed applications of all kinds, including web applications, are used by businesses to communicate, transact and manage information.
Distributed applications have three main layers: the data layer, the application layer, and the presentation layer. The data layer contains two main types of data: the business content, and the supporting data required by the applications.
The current application development process is as follows. Applications must first be carefully planned and designed. Then, a database model for the application must be designed. After the database is fully designed, the physical database is constructed. Then, the application is programmed to access information from the database, process and manage the data, and present the results to the user. The application may also request input from the user, process and manage the data, and insert data into the database.
Despite the importance and popularity of distributed applications, application development has remained a largely non-automated, technical, risk-prone, and costly business process. It is particularly difficult to design and maintain large-scale applications, especially as the data model changes over time.
A data model is the product of the database design process that aims to identify and organize the required data logically and physically. A physical data model describes what information is to be contained in a database and how the items in the database will be related to each other. A properly designed physical data model will structure the data in a way that accurately reflects actual meaning of the data and its relationships. It requires great skill, technical and business insight, and disciplined use of good form to build a good physical data model for a software application.
There are various data-modeling tools available to assist developers with the data-modeling process, however, these tools are not typically utilized once the data model design is complete. That is, software applications written by to access the database are executed independently of the data-modeling tools because the applications are interacting directly with the physical database.
Physical data models are therefore difficult to change once the database is configured, and especially once the application data has been inserted into the database. Consequently, in complex systems, compromises are often made to allow the data model to remain unchanged or to change in a way that is easier but not optimal. For example, it is often more convenient to leave data labels unchanged even when the contents to be described by those labels have changed. This leads to confusion or errors for users not familiar with the original data model design.
Because of the difficulty in creating a well-designed physical data model, and because of the sub-optimal nature of the way data models are changed over time, physical data models often do not properly or intuitively reflect the intended meaning of the data they contain.
Furthermore, physical data models are limited in the concepts that they inherently support. For example, relationships between two pieces of data are represented by a physical join type (e.g., one-to-many) but not by a meaningful relationship label (e.g., “Authors” where a relationship between people and publications is established.). Also, it can be cumbersome and non-intuitive to navigate a physical data model in order to write the code required to insert and retrieve data from the database.
Importantly, software applications are designed to interoperate with a physical database (e.g., SQL Server, Oracle). If the database changes, the applications must be manually updated to accommodate the change. Applications, like the databases they rely on, are hard to maintain over time. For example, applications that utilize relational databases often use the SQL query language to exchange information with the data layer. As the database is modified, the SQL must also be modified to accommodate the changes. After changes are made, the application must also be carefully re-tested, which is a time-consuming process.
Similarly, if the applications change, the underlying database often requires adjustment. This is true because neither the database nor the application has any “awareness” of the other; the layers operate independently.
Another area of inefficiency in the application development process is the re-development of common components that occurs because of the difficulty of code re-use. Many business applications share common concepts such as workflow, security, and content management screens. But because applications rely on the physical database, and the physical databases vary in structure, application concepts must be largely or completely rewritten for each application. For example, security is a general concept that may refer to having the application provide access on the basis of user groups or conditions. Exactly how security will be implemented from application to application can vary dramatically. The database (e.g., SQL Sever, Oracle) may provide a security framework, but the developer must extend that framework into the application layer, and ultimately to the presentation layer. Therefore, much time and effort is spent redeveloping and implementing features that are common to many applications. Even within a single organization or small development team, it is difficult to reuse application code because the specific configuration of the physical data model requires that the application code be adjusted as the code is implemented.
A web application, or most any distributed application development process, depends on a well-designed and maintained data model. But because both applications and physical data models are hard to maintain, data models are often sub-optimal and applications are often left unchanged when the change is needed. Also, because the application development process interacts with the physical database, the process is complicated, prone to error and inefficient.