Consistent and accurate information is of strategic importance to business. If programmers and users incorporated common data elements and structures in programs, files, reports, and databases, consistent and accurate information would be expected. However, despite dictionaries having been in existence for over 20 years, this result has not been achieved. Inconsistency of data from one system to the next continues to be a major problem confronting businesses. MIS managers will readily admit that the problem is caused by programmers not using a common dictionary when coding the data definition portion of their programs. Those who have tried complain about the bureaucracy involved, the complexity, and the amount of time taken away from getting their job done. The lack of use of the data dictionary by programmers can be directly attributed to the perception that the dictionary gets in the way of productivity. Most MIS managers wish the data problem would go away, but it only seems to get worse as more and more systems are developed without compliance with a single set of dictionary standards.
Many MIS managers have been led to believe that they can get around the issue of implementing the dictionary by using CASE tools instead. Using CASE tools without a strong dictionary foundation only compounds the problem, because CASE tools have been used to implement single applications controlling their own files. Organization-wide data definition was the original problem that dictionaries were designed to solve, and CASE tools do not solve it. This invention is a solution to this most vexing and long-lived problem confronting system development managers: implementing the dictionary so that it addresses data incompatibility and data inconsistency while simultaneously providing productivity advantages to programmers.
The data dictionary is a tool for standardizing data elements and structures that exist or get created in programs, file layouts, and Data Base Management System (DBMS) schema. Dictionaries maintain information about data structures and their associated data elements in conceptual, logical, and physical models. The conceptual model depicts the data from a business or functional point of view. The logical model depicts the data from an application point of view. The transformation from the conceptual to the logical model takes into account the effects of allowing physical objects and relationships to be represented as information concepts. The physical model depicts the data in the form that is implemented in the program, file layout, or DBMS schema. The transformation from the logical model to the physical model is based on the capabilities of existing technology. Dictionaries initially supported physical models, specifically those required by the DBMS. As they have evolved to support the logical and conceptual models, the complexity of the dictionary has grown considerably.
Some success has been achieved in those installations using a DBMS or CASE tool exclusively, and having very strong data administration. But in most computer installations, system development has been and continues to be carried out in a laissez-faire mode, and data administration is not a significant force. Therefore, most programmers work in environments without a dictionary to support their day-to-day programming and maintenance activities. The reluctance to use existing DBMS-oriented dictionaries stems from the following design factors:
1. Dictionaries have been implemented as a control tool for the DBMS, oriented towards the administrator; this becomes a bottleneck inasmuch as it prevents programmers from using the dictionary directly in their day-to-day work without having to deal with the administrator.
2. Dictionaries use the DBMS (or a proprietary storage mechanism) to store their internal data. In order to get some benefit out of the dictionary, the programmer has to migrate all the existing data structure definitions from the familiar operating system directory/file structure into an unfamiliar scheme.
3. Dictionaries impose a considerable administrative burden because of the implied top-down workflow model built into them. If, for some reason, the administrative resources are not applied, the dictionary's accuracy and integrity are quickly compromised, and programmers soon bypass the dictionary altogether.
4. The user interface in existing data dictionaries is complex and cumbersome and mirrors the complexity of the underlying dictionary schema. The user has to be knowledgeable about dictionary concepts (i.e., the schema) and must also navigate through several layers of security built into the system. This prevents the average user from accessing and sharing the data in the dictionary in an intuitive and open manner.
Although Fourth Generation Languages and CASE tools have embedded data dictionaries, which are maintained by application of the methodology that comes with the tool, these dictionaries are designed for the specific tool, and maintain the metadata in proprietary structures resulting in some of the same problems outlined earlier. Data dictionaries are also associated with reverse engineering tools, which examine existing source code to come up with conceptual and logical data structures that can be used in future applications development. These dictionaries suffer from the implementation problems discussed earlier and also require considerable expertise and training to operate.
Repositories have recently been touted as a means to address the "data dictionary problem". The functioning of repositories is predicated on having the conceptual and logical models defined by the process of enterprise analysis. Additional complexity and administrative overhead are introduced by the fact that, in addition to data models, process models and network configurations have to be maintained. Repository implementations contain the same deficiencies found in existing dictionaries, yet require the user to manage and integrate considerably larger amounts of information about data, process, and network over a life-cycle (configuration management).
Existing dictionaries unwittingly make it easier for the programmer to bypass the dictionary and create a new version of a data element definition than to use the current version that exists in the dictionary. The net result is that dictionaries have been bypassed by programmers.
The technology paradigm that this invention operates under is quite different from existing dictionaries. It is based on the notion that the programmer should not have to adapt to the dictionary but that the dictionary should adapt to the programmer, tools and environment. The following methods are used:
1. Making an existing set of disparate Include Files across the network available to this invention via links.
2. Providing hot key access to the data element definitions directly from the editing session.
3. Measuring Include File compliance with dictionary standards via a dictionary compliance index.
4. Providing feedback to the programmer by reporting data errors on compilation reports and by providing cross-references to Include Files and programs.
5. Providing a data definition facility that uses data rules to assist programmers in proposing new data element definitions.
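As an illustration of method 3, a dictionary compliance index might be computed as sketched below. The text above does not define the index, so this sketch assumes it is the fraction of fields in an Include File whose name, type, and size all match a dictionary entry. The names `ElementDef` and `compliance_index`, and the sample element names, are hypothetical and are used for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementDef:
    """Hypothetical data element definition: name, type, and size."""
    name: str
    data_type: str
    size: int


def compliance_index(include_fields, dictionary):
    """Return the fraction of fields that exactly match the dictionary.

    Assumption: a field is compliant only if the dictionary holds an
    entry with the same name, data type, and size.
    """
    if not include_fields:
        return 1.0  # assumption: an empty Include File is trivially compliant
    compliant = sum(
        1 for field in include_fields if dictionary.get(field.name) == field
    )
    return compliant / len(include_fields)


# Hypothetical dictionary standards, keyed by element name.
dictionary = {
    "CUST-ID": ElementDef("CUST-ID", "numeric", 8),
    "CUST-NAME": ElementDef("CUST-NAME", "alphanumeric", 30),
    "ORDER-DT": ElementDef("ORDER-DT", "date", 8),
}

# Hypothetical fields parsed from one Include File.
include_file = [
    ElementDef("CUST-ID", "numeric", 8),          # matches the dictionary
    ElementDef("CUST-NAME", "alphanumeric", 25),  # size differs
    ElementDef("ORDER-DT", "date", 8),            # matches the dictionary
    ElementDef("CUST-NM2", "alphanumeric", 30),   # not in the dictionary
]

index = compliance_index(include_file, dictionary)
print(f"Compliance index: {index:.2f}")
```

Under these assumptions, two of the four fields match, so the index for the sample Include File is 0.5. Other definitions, such as weighting by field usage or treating a name match alone as partial compliance, would be equally plausible.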
This invention supports the way in which programmers code Include Files and use them in program development. It does so by utilizing the programmer's knowledge and experience at specific points in the day-to-day work in an incremental fashion, so that organizations can get their data structures under control without expending enormous resources. This invention mimics operating system functionalities wherever possible, so that the programmer does not have to learn new methods. The programmer will find that the time required to learn to use the system is minimal. The feedback mechanism gives the programmer the means to comply with data standards in a self-controlling manner.
This invention reduces the bureaucratic burden by specifically providing the following benefits to programmers:
Reduce the time required to research the definition of a data item that appears in an existing Include File;
Take the drudgery away from having to figure out new data element definitions, names, sizes, and edit/validation rules;
Make all proposed data definitions widely visible to other programmers on the project and/or network, so that there is an assurance that no new data elements are being created independently or in a vacuum;
Allow the programmer to view, copy and modify an existing Include File; if the Include File is from a third party software package, ensure that a comparable standardized version is provided;
Allow access to data element definitions directly from the editing session, so that the time required to learn to use this facility is minimal.
The continued use of this invention should provide the following results:
1. A set of data element definitions that conform to the organization's naming conventions.
2. Programs, file structures, and database schema that incorporate common data definitions.
3. Tracking of relationships between programs and data elements.
4. Documentation of application systems.
5. Notification of proposed changes to data definitions.
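Results 3 and 5 can be illustrated with a minimal cross-reference sketch. It assumes that the relationships between programs and data elements are derived from which Include Files each program uses and which elements each Include File declares. The function name `build_cross_reference` and all program, Include File, and element names are hypothetical.

```python
from collections import defaultdict


def build_cross_reference(program_includes, include_elements):
    """Map each data element to the sorted list of programs that use it.

    program_includes: program name -> list of Include File names it uses.
    include_elements: Include File name -> list of element names it declares.
    """
    xref = defaultdict(set)
    for program, includes in program_includes.items():
        for include in includes:
            # An Include File unknown to the dictionary contributes nothing.
            for element in include_elements.get(include, ()):
                xref[element].add(program)
    return {element: sorted(programs) for element, programs in xref.items()}


# Hypothetical inputs for illustration.
program_includes = {
    "BILLING": ["CUSTREC", "ORDERREC"],
    "SHIPPING": ["ORDERREC"],
}
include_elements = {
    "CUSTREC": ["CUST-ID", "CUST-NAME"],
    "ORDERREC": ["ORDER-ID", "CUST-ID"],
}

xref = build_cross_reference(program_includes, include_elements)

# Programs to notify if a change to CUST-ID is proposed.
print(xref["CUST-ID"])
```

With such a mapping, a proposed change to a data element definition could be routed to the owners of every program listed for that element.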